Skip to content

Reduce ctest wall-time without losing coverage#844

Open
mjp41 wants to merge 6 commits intomicrosoft:mainfrom
mjp41:test_time_reduction
Open

Reduce ctest wall-time without losing coverage#844
mjp41 wants to merge 6 commits intomicrosoft:mainfrom
mjp41:test_time_reduction

Conversation

@mjp41
Copy link
Copy Markdown
Member

@mjp41 mjp41 commented May 6, 2026

Add a --smoke mode for perf tests and trim redundant outer repetitions in func/memory.

Perf tests:

  • New src/test/perf_setup.h exposes perf_iterations(opt, name, cli_default, smoke_value). Returns the smoke value when the binary was invoked with --smoke (which CMake now passes for every perf test under ctest), and the full count otherwise so direct CLI invocation is unchanged.
  • Concurrency-stress tests (perf-contention, perf-msgpass, perf-large_producer_consumer, perf-lotsofthreads) are on an exclusion list inside the helper: they receive --smoke from ctest but the helper still returns the full iteration count, so their scheduler-coverage capability is preserved.
  • CMake injects the canonical test name as SNMALLOC_TEST_NAME on both the per-flavour executable and the shared OBJLIB, so the helper can identify itself without a third source of truth.
  • Wired into: singlethread, contention, msgpass, external_pointer, large_alloc, memcpy.

func/memory:

  • TEST(...) outer-repeat 50 -> 3. The inner tests already do size-class sweeps and per-offset loops; three re-entries still catches leak-across-reentry bugs without 50x the work.
  • test_external_pointer_large is moved out of the TEST(...) macro and run once: each invocation walks ~512 MB of interior pointers, which is its own internal stress.
  • test_static_sized_allocs default max_size 2^23 -> 2^20.

Coverage delta on src/snmalloc/ vs origin/main: zero functions lost, zero lines lost, one branch fewer missed in
backend_helpers/subrange.h (improvement) and one extra missed in ds/combininglock.h (timing-noise on the spinlock contention path).

@mjp41 mjp41 force-pushed the test_time_reduction branch from 667e991 to 3509333 Compare May 7, 2026 07:00
mjp41 and others added 6 commits May 7, 2026 14:38
Add a `--smoke` mode for perf tests and trim redundant outer
repetitions in func/memory.

Perf tests:
* New `src/test/perf_setup.h` exposes `perf_iterations(opt, name,
  cli_default, smoke_value)`. Returns the smoke value when the
  binary was invoked with `--smoke` (which CMake now passes for
  every perf test under ctest), and the full count otherwise so
  direct CLI invocation is unchanged.
* Concurrency-stress tests (`perf-contention`, `perf-msgpass`,
  `perf-large_producer_consumer`, `perf-lotsofthreads`) are on an
  exclusion list inside the helper: they receive `--smoke` from
  ctest but the helper still returns the full iteration count, so
  their scheduler-coverage capability is preserved.
* CMake injects the canonical test name as `SNMALLOC_TEST_NAME`
  on both the per-flavour executable and the shared OBJLIB, so
  the helper can identify itself without a third source of truth.
* Wired into: singlethread, contention, msgpass, external_pointer,
  large_alloc, memcpy.

func/memory:
* TEST(...) outer-repeat 50 -> 3. The inner tests already do
  size-class sweeps and per-offset loops; three re-entries still
  catches leak-across-reentry bugs without 50x the work.
* test_external_pointer_large is moved out of the TEST(...) macro
  and run once: each invocation walks ~512 MB of interior
  pointers, which is its own internal stress.
* test_static_sized_allocs default max_size 2^23 -> 2^20.

Coverage delta on src/snmalloc/ vs origin/main: zero functions
lost, zero lines lost, one branch fewer missed in
backend_helpers/subrange.h (improvement) and one extra missed in
ds/combininglock.h (timing-noise on the spinlock contention
path).
The benchmark's purpose is to stress the cross-thread free path
under contention, which is already observable at much lower
iteration counts. Debug builds carry full instrumentation and run
~10x slower per iteration, so 200k iterations across 8 threads
makes this single test dominate Debug ctest wall-time without
producing any additional contention coverage relative to a smaller
count.

Divide iterations by 10 when NDEBUG is not defined. Release builds
keep the original counts (200000 / 50000 depending on platform and
sanitizer) so the benchmark's signal is unchanged where it matters.

Local measurement on Debug:
  perf-lotsofthreads-fast:  ~136s -> 13.6s
  perf-lotsofthreads-check: ~500s -> 49.7s

The test is on the concurrency-stress exclusion list and so is
deliberately not affected by the --smoke knob; reducing iterations
in Debug here is the right lever.
Review feedback: perf_setup.h was over-engineered for what amounts
to a single CLI query and a per-test conditional. Each call site
already knew which knobs to scale and by how much; the helper added
a name lookup table, a CMake-injected SNMALLOC_TEST_NAME macro, and
an indirection that obscured what each test actually does.

Replace the helper with direct `opt.has("--smoke")` checks at each
call site:

  * external_pointer, large_alloc, memcpy, singlethread:
    `size_t x = opt.has("--smoke") ? smoke : default;`

  * contention, msgpass: these have CLI overrides for the same
    knobs (--swapcount/--swapsize, --batches), so make `--smoke`
    lower the *default* fed to opt.is(...). Explicit command-line
    arguments still win, which is what the user expects from a
    smoke flag.

The "concurrency-stress exclusion list" inside the helper turns
into nothing: tests that don't read --smoke (perf-lotsofthreads,
perf-large_producer_consumer) just ignore it. CMake unconditionally
passing --smoke to every perf test under ctest is harmless for
those tests and gives us a single uniform invocation.

Drop the SNMALLOC_TEST_NAME injection in CMakeLists.txt (no longer
needed since the test no longer needs to identify itself), and
delete src/test/perf_setup.h.
Functional tests don't currently read --smoke (most use `int main()`
with no args, the rest UNUSED(argc, argv)) so passing it is a silent
no-op. But there's no reason to gate the flag on TEST_CATEGORY:
keeping the perf/non-perf split forces every future test that wants
a smoke mode to also touch CMake.

Collapse the four-arm add_test() block into two arms (release vs
not), each unconditionally appending --smoke. Any test that wants to
honour it just reads the flag; any test that doesn't, ignores it.
@mjp41 mjp41 force-pushed the test_time_reduction branch from 9ee0b11 to 322fca5 Compare May 7, 2026 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant