Reduce ctest wall-time without losing coverage#844
Open
mjp41 wants to merge 6 commits intomicrosoft:mainfrom
Open
Reduce ctest wall-time without losing coverage#844mjp41 wants to merge 6 commits intomicrosoft:mainfrom
mjp41 wants to merge 6 commits intomicrosoft:mainfrom
Conversation
Add a `--smoke` mode for perf tests and trim redundant outer repetitions in func/memory. Perf tests: * New `src/test/perf_setup.h` exposes `perf_iterations(opt, name, cli_default, smoke_value)`. Returns the smoke value when the binary was invoked with `--smoke` (which CMake now passes for every perf test under ctest), and the full count otherwise so direct CLI invocation is unchanged. * Concurrency-stress tests (`perf-contention`, `perf-msgpass`, `perf-large_producer_consumer`, `perf-lotsofthreads`) are on an exclusion list inside the helper: they receive `--smoke` from ctest but the helper still returns the full iteration count, so their scheduler-coverage capability is preserved. * CMake injects the canonical test name as `SNMALLOC_TEST_NAME` on both the per-flavour executable and the shared OBJLIB, so the helper can identify itself without a third source of truth. * Wired into: singlethread, contention, msgpass, external_pointer, large_alloc, memcpy. func/memory: * TEST(...) outer-repeat 50 -> 3. The inner tests already do size-class sweeps and per-offset loops; three re-entries still catches leak-across-reentry bugs without 50x the work. * test_external_pointer_large is moved out of the TEST(...) macro and run once: each invocation walks ~512 MB of interior pointers, which is its own internal stress. * test_static_sized_allocs default max_size 2^23 -> 2^20. Coverage delta on src/snmalloc/ vs origin/main: zero functions lost, zero lines lost, one branch fewer missed in backend_helpers/subrange.h (improvement) and one extra missed in ds/combininglock.h (timing-noise on the spinlock contention path).
The benchmark's purpose is to stress the cross-thread free path under contention, which is already observable at much lower iteration counts. Debug builds carry full instrumentation and run ~10x slower per iteration, so 200k iterations across 8 threads makes this single test dominate Debug ctest wall-time without producing any additional contention coverage relative to a smaller count. Divide iterations by 10 when NDEBUG is not defined. Release builds keep the original counts (200000 / 50000 depending on platform and sanitizer) so the benchmark's signal is unchanged where it matters. Local measurement on Debug: perf-lotsofthreads-fast: ~136s -> 13.6s perf-lotsofthreads-check: ~500s -> 49.7s The test is on the concurrency-stress exclusion list and so is deliberately not affected by the --smoke knob; reducing iterations in Debug here is the right lever.
Review feedback: perf_setup.h was over-engineered for what amounts
to a single CLI query and a per-test conditional. Each call site
already knew which knobs to scale and by how much; the helper added
a name lookup table, a CMake-injected SNMALLOC_TEST_NAME macro, and
an indirection that obscured what each test actually does.
Replace the helper with direct `opt.has("--smoke")` checks at each
call site:
* external_pointer, large_alloc, memcpy, singlethread:
`size_t x = opt.has("--smoke") ? smoke : default;`
* contention, msgpass: these have CLI overrides for the same
knobs (--swapcount/--swapsize, --batches), so make `--smoke`
lower the *default* fed to opt.is(...). Explicit command-line
arguments still win, which is what the user expects from a
smoke flag.
The "concurrency-stress exclusion list" inside the helper turns
into nothing: tests that don't read --smoke (perf-lotsofthreads,
perf-large_producer_consumer) just ignore it. CMake unconditionally
passing --smoke to every perf test under ctest is harmless for
those tests and gives us a single uniform invocation.
Drop the SNMALLOC_TEST_NAME injection in CMakeLists.txt (no longer
needed since the test no longer needs to identify itself), and
delete src/test/perf_setup.h.
Functional tests don't currently read --smoke (most use `int main()` with no args, the rest UNUSED(argc, argv)) so passing it is a silent no-op. But there's no reason to gate the flag on TEST_CATEGORY: keeping the perf/non-perf split forces every future test that wants a smoke mode to also touch CMake. Collapse the four-arm add_test() block into two arms (release vs not), each unconditionally appending --smoke. Any test that wants to honour it just reads the flag; any test that doesn't, ignores it.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Add a
--smokemode for perf tests and trim redundant outer repetitions in func/memory.Perf tests:
src/test/perf_setup.hexposesperf_iterations(opt, name, cli_default, smoke_value). Returns the smoke value when the binary was invoked with--smoke(which CMake now passes for every perf test under ctest), and the full count otherwise so direct CLI invocation is unchanged.perf-contention,perf-msgpass,perf-large_producer_consumer,perf-lotsofthreads) are on an exclusion list inside the helper: they receive--smokefrom ctest but the helper still returns the full iteration count, so their scheduler-coverage capability is preserved.SNMALLOC_TEST_NAMEon both the per-flavour executable and the shared OBJLIB, so the helper can identify itself without a third source of truth.func/memory:
Coverage delta on src/snmalloc/ vs origin/main: zero functions lost, zero lines lost, one branch fewer missed in
backend_helpers/subrange.h (improvement) and one extra missed in ds/combininglock.h (timing-noise on the spinlock contention path).