Commit 6e2d89a

docs(runbooks): add test-strategy.md (RAN-46 AC #5)
Closes the last missing runbook from the RAN-46 acceptance list: test-strategy.md per engineering-standards.md §4.

Scope:
- Test layers (unit / integration / e2e quality) with discriminators matching the codeiq codebase (Spring profile rule, @TempDir rule)
- Coverage targets aligned to engineering-standards.md §1 (project-wide ≥85% floor, new-code ≥80%, critical paths 100%)
- Detector test contract: positive + negative + determinism (per CLAUDE.md "Adding a New Detector")
- Flake policy: same-PR resolution, no retry-loops in CI
- Regression suite scope: mvn verify is the gate; E2E quality tests are nightly + on-demand, not per-PR

Independent of the (A) ratify-Sonar-stack vs (B) revert-to-OSS-CLI ruling pending on RAN-46: the content references only the testing tiers, coverage targets, and flake policy that hold either way.
1 parent 19c6619 commit 6e2d89a

1 file changed

Lines changed: 90 additions & 0 deletions

shared/runbooks/test-strategy.md

@@ -0,0 +1,90 @@
# Test Strategy: codeiq

> **Single source of truth (SSoT) for testing policy: layers, coverage targets, flake handling, regression scope.** Owner: QA. Until codeiq has a dedicated QA hire, the engineer who owns the change owns its tests. Pairs with [`engineering-standards.md`](engineering-standards.md) §1 (quality gates) and §4 (testing tiers); this file is the operational expansion of those two sections.

If a rule here conflicts with `engineering-standards.md`, the standards file wins. This runbook is **how**; the standards file is **what**.

---

## 1. Test layers (what runs where)

codeiq runs three test tiers. Every change picks the lightest tier that gives useful signal.

| Layer | Definition | Where | Runs in CI | Wall-clock target |
|---|---|---|---|---|
| **Unit** | Pure logic; no I/O; no Spring context; no Neo4j; no filesystem beyond `@TempDir`. The bulk of tests. | `src/test/java/.../<package>/*Test.java` (Surefire) | Every PR + push (`mvn test`) | < 10 ms / test, < 60 s suite |
| **Integration** | Real H2 cache, real Neo4j Embedded, real `@TempDir` filesystem, real ANTLR/JavaParser. Spring context allowed when needed. | `src/test/java/.../analyzer/`, `.../graph/`, `.../intelligence/`, `.../e2e/` (Failsafe: `*IT.java` or `@IntegrationTest`) | Every PR + push (`mvn verify`) | < 5 s / test, < 5 min suite |
| **E2E quality** | Full pipeline (`index → enrich → serve`) against a real cloned external repo (Spring PetClinic, etc.); endpoint responses validated against Context7-sourced ground-truth JSON. | `E2EQualityTest`, ground-truth at `src/test/resources/e2e/ground-truth-*.json` | On demand + nightly cron | < 10 min / repo |

**Discriminator:** if a test starts an `ApplicationContext`, touches Neo4j, or reads the filesystem outside `@TempDir`, it is integration, not unit. Move it to `src/test/java/.../analyzer/` or `.../e2e/`. Keep `src/test/java/.../detector/` unit-only: detectors are stateless beans, so their tests should never need Spring.

**Spring profile rule:** any `@SpringBootTest` MUST have `@ActiveProfiles("test")` so that Neo4j Embedded does not start during unit-context runs. Omitting it has caused real bugs; see `CLAUDE.md` § "Gotchas".
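
As a rough illustration, the discriminator above can be read as a pure predicate over three facts about a test. The sketch below is plain Java with no codeiq dependencies; `TestTier`, `TierDiscriminator`, and the parameter names are hypothetical and exist only for this example.

```java
// Hypothetical helper: encodes the unit-vs-integration discriminator as a pure function.
enum TestTier { UNIT, INTEGRATION }

final class TierDiscriminator {
    private TierDiscriminator() {}

    /** A test is integration as soon as any one of the three conditions holds. */
    static TestTier classify(boolean startsApplicationContext,
                             boolean touchesNeo4j,
                             boolean readsOutsideTempDir) {
        boolean integration = startsApplicationContext || touchesNeo4j || readsOutsideTempDir;
        return integration ? TestTier.INTEGRATION : TestTier.UNIT;
    }
}

class TierDiscriminatorDemo {
    public static void main(String[] args) {
        // Detector test: no Spring, no Neo4j, files only under @TempDir.
        System.out.println(TierDiscriminator.classify(false, false, false)); // prints UNIT
        // Test that boots a Spring context.
        System.out.println(TierDiscriminator.classify(true, false, false));  // prints INTEGRATION
    }
}
```

Any single `true` is enough to move a test out of the unit tier, which is why the detector package can stay Spring-free.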

---

## 2. Coverage targets

| Scope | Target | Floor (build fails below) | Tool |
|---|---|---|---|
| Project-wide line | ≥ 90% | **85%** (JaCoCo `BUNDLE` / `LINE` / `COVEREDRATIO` rule in `pom.xml`) | `jacoco-maven-plugin` |
| New code (per PR) | ≥ 90% | **80%** | SonarCloud "new code" gate (active per `engineering-standards.md` §1) |
| Critical paths (auth, path-traversal, max-bytes, deserialization, anything in `api/` security checks) | 100% line + branch | 100%: no merge with gaps | JaCoCo + manual review |
| Detectors | Positive match + negative match + determinism (run twice, assert identical output) | All three present | Test convention (per `CLAUDE.md`) |

Coverage exclusions live in the `<jacoco>` config in `pom.xml`. Adding to that list requires TechLead sign-off and a one-line justification per entry. Generated ANTLR sources, the Spring `application/` main, and pure data records are pre-excluded.

**Coverage is a signal, not a target.** 100% coverage with assertion-free tests is worse than 60% with meaningful ones. Don't chase the number; chase the failure modes the code can actually have.
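
To make the floors in the table concrete: a covered ratio is covered items divided by covered plus missed items, and the build fails when that ratio drops below the floor. The sketch below is a minimal plain-Java illustration of that arithmetic, not JaCoCo's implementation; `CoverageGate` and its constants are hypothetical names, and treating an empty bundle as passing is a choice made only for this example.

```java
// Hypothetical helper: illustrates the floor arithmetic only.
final class CoverageGate {
    static final double PROJECT_LINE_FLOOR = 0.85; // project-wide line floor from the table
    static final double NEW_CODE_FLOOR = 0.80;     // new-code floor from the table

    private CoverageGate() {}

    /** Ratio of covered items to all items (covered + missed). */
    static double coveredRatio(long covered, long missed) {
        long total = covered + missed;
        if (total == 0) {
            return 1.0; // assumption of this sketch: nothing to cover counts as passing
        }
        return (double) covered / total;
    }

    /** True when the ratio meets or exceeds the floor. */
    static boolean passes(long covered, long missed, double floor) {
        return coveredRatio(covered, missed) >= floor;
    }
}

class CoverageGateDemo {
    public static void main(String[] args) {
        // 850 of 1000 lines covered: exactly on the 85% floor, so the gate passes.
        System.out.println(CoverageGate.passes(850, 150, CoverageGate.PROJECT_LINE_FLOOR)); // prints true
        // One line fewer covered: 84.9%, below the floor.
        System.out.println(CoverageGate.passes(849, 151, CoverageGate.PROJECT_LINE_FLOOR)); // prints false
    }
}
```

The floor is inclusive: a build sitting exactly at 85% passes, and a single additional uncovered line fails it.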

---

## 3. What every new detector test must include

Per `CLAUDE.md` § "Adding a New Detector":

1. **Positive match:** at least one synthetic input that should produce the expected node/edge.
2. **Negative match:** an input that *looks* close but should NOT match. This is a regression guard against the framework-false-positive class (e.g., generic `router.get` patterns wrongly attributed to Quarkus).
3. **Determinism:** run `detect()` twice on the same input and assert a byte-identical `DetectorResult`. This catches `Set` iteration leaks, mutable static state, and race conditions in shared helpers.

Discriminator-guard detectors (Quarkus, Fastify, Micronaut, NestJS, etc.) need at least **two** negative cases: (a) framework-not-imported, (b) different-framework-imported.
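
The three-part contract, including the two negative cases for a discriminator-guard detector, could be sketched as follows. This is a toy written in plain Java so it runs without JUnit or Spring; `ToyFastifyDetector`, `ToyDetectorResult`, `RouteNode`, and the regexes are hypothetical stand-ins, not codeiq's real detector API. In the real suite each check would be a test method in the detector's `*Test.java`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-ins: codeiq's real detector and result types are not shown here.
record RouteNode(String method, String path) {}
record ToyDetectorResult(List<RouteNode> nodes) {}

final class ToyFastifyDetector {
    // Discriminator guard: only attribute routes to Fastify when Fastify is imported.
    private static final Pattern FASTIFY_IMPORT =
            Pattern.compile("require\\('fastify'\\)|from 'fastify'");
    private static final Pattern GET_ROUTE = Pattern.compile("\\.get\\('([^']+)'");

    ToyDetectorResult detect(String source) {
        List<RouteNode> nodes = new ArrayList<>();
        if (FASTIFY_IMPORT.matcher(source).find()) {
            Matcher m = GET_ROUTE.matcher(source);
            while (m.find()) {
                nodes.add(new RouteNode("GET", m.group(1)));
            }
        }
        // An immutable list in match order keeps the result stable across runs.
        return new ToyDetectorResult(List.copyOf(nodes));
    }
}

final class ToyFastifyDetectorContract {
    static final String FASTIFY_SOURCE =
            "const app = require('fastify')();\napp.get('/users', handler);";
    static final String NO_FRAMEWORK_SOURCE = "router.get('/users', handler);";
    static final String EXPRESS_SOURCE =
            "const app = require('express')();\napp.get('/users', handler);";

    private static final ToyFastifyDetector DETECTOR = new ToyFastifyDetector();

    // 1. Positive match: the expected node is produced.
    static boolean positiveMatch() {
        return DETECTOR.detect(FASTIFY_SOURCE).nodes()
                .equals(List.of(new RouteNode("GET", "/users")));
    }

    // 2a. Negative match: framework not imported.
    static boolean negativeWhenNotImported() {
        return DETECTOR.detect(NO_FRAMEWORK_SOURCE).nodes().isEmpty();
    }

    // 2b. Negative match: a different framework is imported.
    static boolean negativeWhenOtherFramework() {
        return DETECTOR.detect(EXPRESS_SOURCE).nodes().isEmpty();
    }

    // 3. Determinism: two runs on the same input give equal results and equal text.
    static boolean deterministic() {
        ToyDetectorResult first = DETECTOR.detect(FASTIFY_SOURCE);
        ToyDetectorResult second = DETECTOR.detect(FASTIFY_SOURCE);
        return first.equals(second) && first.toString().equals(second.toString());
    }

    public static void main(String[] args) {
        System.out.println("positive: " + positiveMatch());
        System.out.println("negative (not imported): " + negativeWhenNotImported());
        System.out.println("negative (other framework): " + negativeWhenOtherFramework());
        System.out.println("deterministic: " + deterministic());
    }
}
```

Comparing the string form as well as `equals` is a cheap proxy for the byte-identical requirement; a real test might compare a serialized form instead.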

---

## 4. Flake policy: a flaky test is a broken test

A flaky test is broken. Resolve it in the same PR; do not merge code that makes flakiness worse.

| State | Resolution |
|---|---|
| Flake reproduced locally | Fix the timing / order assumption, then re-run 50× before declaring it solved |
| Flake in CI only, can't reproduce | Add deterministic seeding, isolate from shared state, and retry **once**, solely to gather a second log; if still flaky, quarantine |
| Quarantine | `@Disabled("flaky — RAN-XXX")` with a tracked Paperclip issue; never silently delete the test, and never loop past the flake with `@RepeatedTest` in CI |
| Three quarantines on the same suite | The suite is unsound; rewrite or delete it. Don't accumulate `@Disabled` debt |

**Never** retry-loop in CI to mask a flake. That hides real concurrency / timing bugs, and codeiq runs heavily on virtual threads, which is exactly where those bugs hide.
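
The local "re-run 50×" check is the opposite of a retry loop: every run must pass, and the first red run means the fix is not done. A minimal plain-Java sketch of that idea follows; `RepeatRunner` is a hypothetical helper for illustration, and in a JUnit suite the same effect would normally come from running the test repeatedly on a developer machine rather than from custom code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: repeats a check and reports the first failure instead of retrying past it.
final class RepeatRunner {
    private RepeatRunner() {}

    /**
     * Runs the check up to {@code runs} times and stops at the first failure.
     * Returns the zero-based index of the failing run, or -1 if every run passed.
     */
    static int firstFailure(int runs, Runnable check) {
        for (int i = 0; i < runs; i++) {
            try {
                check.run();
            } catch (RuntimeException | AssertionError e) {
                return i; // one red run is enough: the flake is still there
            }
        }
        return -1;
    }
}

class RepeatRunnerDemo {
    public static void main(String[] args) {
        // A stable check passes all 50 runs.
        System.out.println(RepeatRunner.firstFailure(50, () -> { })); // prints -1

        // A simulated flake that fails on its 7th invocation.
        AtomicInteger calls = new AtomicInteger();
        System.out.println(RepeatRunner.firstFailure(50, () -> {
            if (calls.incrementAndGet() == 7) {
                throw new IllegalStateException("simulated flake");
            }
        })); // prints 6
    }
}
```

A retry loop would report success as long as any run passed; this helper reports failure as soon as any run fails, which is the behaviour the policy asks for.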

---

## 5. Regression suite

The regression suite is **everything** in Surefire + Failsafe. There is no separate "regression" phase: `mvn verify` is it. Total wall-clock target: < 7 min on CI's `ubuntu-latest`.

E2E quality tests are **not** part of the per-PR gate (they are too slow and require cloning an external repo). They run nightly and on demand via `E2E_PETCLINIC_DIR=... mvn -Dtest=E2EQualityTest test`. A red E2E nightly opens a `RAN-*` issue with the diff against ground truth attached.

Ground-truth files (`src/test/resources/e2e/ground-truth-*.json`) are versioned. Updating one requires either: (a) the underlying upstream repo legitimately changed (link the upstream PR), or (b) a bug fix landed and the prior ground truth was wrong (link the codeiq PR).
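
One way to picture the "diff against ground truth" is as two sorted sets: entries the ground truth expects but the run did not produce, and entries the run produced that the ground truth does not list. The sketch below assumes both sides have already been reduced to sets of strings; `GroundTruthDiff` and the sample endpoint strings are hypothetical, and the actual ground-truth JSON schema is not shown in this runbook.

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical diff type: sorted sets keep the report order stable between runs.
record GroundTruthDiff(SortedSet<String> missing, SortedSet<String> unexpected) {

    /** True when actual output matches the ground truth exactly. */
    boolean isClean() {
        return missing.isEmpty() && unexpected.isEmpty();
    }

    /** Computes what the run lost (missing) and what it added (unexpected). */
    static GroundTruthDiff between(Set<String> expected, Set<String> actual) {
        SortedSet<String> missing = new TreeSet<>(expected);
        missing.removeAll(actual);
        SortedSet<String> unexpected = new TreeSet<>(actual);
        unexpected.removeAll(expected);
        return new GroundTruthDiff(missing, unexpected);
    }
}

class GroundTruthDiffDemo {
    public static void main(String[] args) {
        // Sample data invented for this example.
        Set<String> expected = Set.of("GET /owners", "GET /vets");
        Set<String> actual = Set.of("GET /owners", "GET /pets");

        GroundTruthDiff diff = GroundTruthDiff.between(expected, actual);
        System.out.println("clean: " + diff.isClean());
        System.out.println("missing: " + diff.missing());
        System.out.println("unexpected: " + diff.unexpected());
    }
}
```

A non-empty `missing` set usually points at a regression, while a non-empty `unexpected` set is the case where rule (a) or (b) above decides whether the ground truth itself should change.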

---

## 6. What we do NOT test (out of scope here)

- **No live network.** Tests must pass behind a corporate firewall or in an air-gapped environment (per `~/.claude/rules/build.md`). Anything that needs `https://` goes in `E2EQualityTest` with explicit external-repo opt-in via env var.
- **No browser / E2E UI.** The React UI is bundled in the JAR; SPA boot is smoke-tested through the `serve`-command CLI smoke check (per `first-time-setup.md` §3), not in the test suite.
- **No load / stress testing.** Performance work uses `pom.xml` JMH harnesses where they exist. Microbenchmarks are not regression tests; they are decision-support for `~/.claude/rules/performance.md`.

---

## 7. References

- [`engineering-standards.md`](engineering-standards.md) §1 (quality gates), §4 (test tiers), §6 (style)
- [`first-time-setup.md`](first-time-setup.md) §3 (build, test, run loops)
- [`/CLAUDE.md`](../../CLAUDE.md) "Testing" + "Adding a New Detector"
- `pom.xml`: `jacoco-maven-plugin` rules, `surefire`/`failsafe` config
