Commit 0c1a4c8

aksOps and claude authored
feat: sub-project 1 Phase 4-6 — resolver pipeline wiring + 4 Java detector migrations (#104)
* feat(analyzer): wire ResolverRegistry bootstrap + per-file resolve

Phase 4 of sub-project 1 — pipeline wiring (plan tasks 19-21). The orchestration boundary for the symbol-resolution pass that sits between parse and detect.

ResolverRegistry becomes a new constructor dependency on Analyzer: @Autowired primary ctor adds it as the 10th arg; the 6-arg backward-compat ctor defaults to `new ResolverRegistry(List.of())` so existing tests + direct constructor call-sites still work and observe the same behaviour (every ctx.resolved() reads back as Optional.of(EmptyResolved.INSTANCE)).

Two private helpers do the work:

- bootstrapResolvers(Path) — called exactly once at the top of every pipeline entry point (run / runBatchedIndex / runSmartIndex), before any file iteration. ResolverRegistry already swallows per-resolver failures; this catches the registry-itself-blowing-up case so the pipeline keeps going with NOOP resolvers.
- resolveFor(DiscoveredFile, Object) — called per file at all three DetectorContext build sites (analyzeFile, the batched-index variant, and the regex-only fallback). Catches ResolutionException + RuntimeException and falls back to EmptyResolved.INSTANCE so one file's resolver blow-up cannot disrupt the rest of the pass.

Every DetectorContext now reads back ctx.resolved() == Optional.of(...) — either the language's resolver result or EmptyResolved.INSTANCE. Detectors that don't care simply ignore the field; migrations to consume the resolved view follow in Phase 6.

IndexCommand reaches the resolver via Analyzer.runSmartIndex, so plan task 21 ("mirror in IndexCommand") lands automatically with the analyzer wiring — no separate command-side changes required.
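
As an illustration of the per-file fallback described for resolveFor, here is a minimal sketch. It is not the project's code: `Resolved`, `EmptyResolved`, `ResolutionException` and `SymbolResolver` are hypothetical stand-ins declared inline so the example compiles on its own, and the real method takes a DiscoveredFile rather than a path string.

```java
import java.util.Optional;

/** Sketch only: every type here is a hypothetical stand-in, not the project's real class. */
final class ResolveForSketch {

    /** Stand-in for the resolver result type. */
    interface Resolved {}

    /** Singleton "nothing resolved" value, mirroring the EmptyResolved.INSTANCE idea. */
    enum EmptyResolved implements Resolved { INSTANCE }

    /** Stand-in for the checked resolver failure. */
    static final class ResolutionException extends Exception {
        ResolutionException(String message) { super(message); }
    }

    /** Stand-in for a per-language resolver. */
    @FunctionalInterface
    interface SymbolResolver {
        Resolved resolve(String path, Object parsedAst) throws ResolutionException;
    }

    /** Never throws and never returns an empty Optional: failures degrade to EmptyResolved. */
    static Optional<Resolved> resolveFor(SymbolResolver resolver, String path, Object parsedAst) {
        if (resolver == null) {
            return Optional.of(EmptyResolved.INSTANCE); // no resolver registered for this language
        }
        try {
            Resolved result = resolver.resolve(path, parsedAst);
            return Optional.of(result != null ? result : EmptyResolved.INSTANCE);
        } catch (ResolutionException | RuntimeException e) {
            // One file's failure must not disrupt the rest of the pass.
            return Optional.of(EmptyResolved.INSTANCE);
        }
    }

    public static void main(String[] args) {
        SymbolResolver broken = (path, ast) -> { throw new IllegalStateException("boom"); };
        System.out.println(resolveFor(broken, "A.java", "class A {}").get()); // prints INSTANCE
    }
}
```

The point of the shape is that callers can always call `.get()` on the result: a detector that ignores resolution sees no difference, and a detector that wants it only has to check the concrete type.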
8 wiring tests cover:

- bootstrap called exactly once per run (single file, many files, empty repo)
- bootstrap path is the normalised absolute repoPath
- resolverFor("java") called for each java file
- ctx.resolved() is Optional.of(EmptyResolved.INSTANCE) when no resolver is registered for the language
- legacy 6-arg ctor still produces a working analyzer with the same observable resolved() shape

Plan: docs/plans/2026-04-27-sub-project-1-resolver-spi-and-java-pilot.md (tasks 19, 20, 21).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(resolver/java): lazy-parse Java source so ctx.resolved() carries JavaResolved

The orchestrator (Analyzer) only parses structured-language files at the top level (YAML/JSON/etc.) — Java is parsed independently inside AbstractJavaParserDetector via its ThreadLocal pool. Without an extra hook, ctx.resolved() always reads back as EmptyResolved for Java because the SPI's resolve(file, parsedAst) was never given a CompilationUnit.

Two minimal changes flip ctx.resolved() to JavaResolved for Java files:

1. JavaSymbolResolver.resolve() now accepts a String source as well as a CompilationUnit. When given a String, it parses with a fresh JavaParser configured with the symbol solver, so resolution is attached to the resulting CU. Per-call JavaParser allocation is intentional (JavaParser instances aren't thread-safe and resolve() is invoked from virtual threads concurrently); cost is small relative to the parse itself.
2. Analyzer.resolveFor() now takes content as a 3rd arg and uses it as the parsedAst fallback when the orchestrator's structured parser produced nothing. The 3 call sites (analyzeFile, the batched-index variant, and the regex-only fallback) all pass content.

Permissive parsing: JavaParser produces a CompilationUnit even for files with syntax errors (with attached Problems).
The resolver returns JavaResolved in that case — production analysis must keep going across malformed files instead of failing the whole pass. EmptyResolved is only returned when getResult().isEmpty(), which JavaParser reserves for hard configuration-level failures.

New tests:

- JavaSymbolResolverTest: 4 new cases — valid source string parses, junk input doesn't throw or null, empty source produces an empty CU, unknown AST type (e.g. a Path) → EmptyResolved (replaces the old "wrong AST type" String case).
- AnalyzerResolverWiringTest: javaFilePicksUpJavaResolvedWhenResolverRegistered asserts ctx.resolved() is JavaResolved (not EmptyResolved) once a JavaSymbolResolver is registered with the registry.

This is the bridge that lets detector migrations (Phase 6 / tasks 24-29) actually consume RESOLVED-tier resolution from ctx.resolved().

Plan: docs/plans/2026-04-27-sub-project-1-resolver-spi-and-java-pilot.md (unblocks tasks 24-29).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector/jpa): consume ctx.resolved() for RESOLVED-tier MAPS_TO edges

Phase 6 task 26 (and the JPA-relationship part of task 24) — first detector migration to the resolver SPI.

When ctx.resolved() carries a JavaResolved (Analyzer registered a JavaSymbolResolver), the detector now:

1. Uses the resolver-parsed CompilationUnit (which has the symbol solver attached) instead of the local ThreadLocal-pool parse — no double-parse, and Type.resolve() works inside the AST walk.
2. Attempts to resolve each @OneToMany / @ManyToOne / @OneToOne / @ManyToMany field's target type to a fully-qualified name via the symbol solver:
   - Generic-arg case: List<Owner> → resolves the type argument
   - Direct-field case: Owner → resolves the field type
3. On resolution success, attaches target_fqn to the edge properties and stamps Confidence.RESOLVED + source = "jpa_entity". The simple-name edge ID + target placeholder are unchanged so EntityLinker's post-pass keeps working — target_fqn rides as the canonical pointer.
4. On resolution failure (missing classpath, unsolvable symbol, etc.), falls back gracefully to the existing simple-name path with the base-class default confidence.

Existing detector behaviour is unchanged when ctx.resolved() is empty or carries EmptyResolved — the 29 pre-existing JpaEntityDetectorExtended tests still pass without modification.

5 new tests in JpaEntityDetectorResolvedTest cover the three plan-required modes (resolved, fallback, mixed) plus generic-arg resolution and the no-resolved-at-all legacy ctx path:

- resolvedModeProducesResolvedEdgeWithTargetFqn — two Owner classes in different packages; with resolution, the imported one wins and edge.target_fqn = "com.example.a.Owner" + RESOLVED.
- resolvedModeFindsCollectionGenericArg — @OneToMany List<Owner> resolves the generic arg, not the List type.
- fallbackModeMatchesPreSpecBaseline — EmptyResolved → no target_fqn + raw-default confidence (orchestrator stamps SYNTACTIC at boundary).
- fallbackModeWhenContextHasNoResolvedAtAll — Optional.empty() also produces the same baseline shape (legacy ctx path safety).
- mixedModeUsesResolverWhereAvailable — one resolvable + one unresolvable relationship in the same class; the resolvable edge is RESOLVED + target_fqn, the unresolvable falls back.

Plan: docs/plans/2026-04-27-sub-project-1-resolver-spi-and-java-pilot.md (task 26).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector/repository): consume ctx.resolved() for RESOLVED-tier QUERIES edges

Phase 6 task 25 (Spring Data repository migration). Same shape as the JpaEntityDetector migration applied to RepositoryDetector — promote the QUERIES edge from SYNTACTIC → RESOLVED with a stable target FQN when the resolver can pin the entity type.
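
The migration shape shared by the JPA and repository detectors (keep the simple-name edge ID and placeholder, add target_fqn + RESOLVED only when the solver succeeds) could look roughly like the sketch below. All names are assumptions for illustration: `Edge`, `Confidence` and `emit` are hypothetical stand-ins, the `"*:"` placeholder format is borrowed from the Spring REST description, and the `solver` function stands in for the symbol solver call.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Sketch only: hypothetical stand-ins for the edge model and the symbol solver. */
final class ResolvedEdgeSketch {

    enum Confidence { LEXICAL, SYNTACTIC, RESOLVED }

    static final class Edge {
        final String id;
        final String target;
        final Map<String, String> properties = new LinkedHashMap<>();
        Confidence confidence; // null means "left for the orchestrator's default stamp"
        String source;

        Edge(String id, String target) {
            this.id = id;
            this.target = target;
        }
    }

    /**
     * Emits one edge. The ID and target placeholder are built from the simple name in
     * both modes; resolution only adds target_fqn, RESOLVED confidence and a source tag.
     */
    static Edge emit(String from, String kind, String simpleName,
                     Function<String, Optional<String>> solver, String detectorSource) {
        Edge edge = new Edge(from + "->" + kind + "->" + simpleName, "*:" + simpleName);
        Optional<String> fqn;
        try {
            fqn = solver.apply(simpleName);
        } catch (RuntimeException e) {
            fqn = Optional.empty(); // unsolvable symbol or classpath gap: fall back
        }
        fqn.ifPresent(name -> {
            edge.properties.put("target_fqn", name);
            edge.confidence = Confidence.RESOLVED;
            edge.source = detectorSource;
        });
        return edge;
    }

    public static void main(String[] args) {
        Edge resolved = emit("Pet", "maps_to", "Owner",
                name -> Optional.of("com.example.a." + name), "jpa_entity");
        Edge fallback = emit("Pet", "maps_to", "Owner", name -> Optional.empty(), "jpa_entity");
        System.out.println(resolved.id.equals(fallback.id)); // prints true
    }
}
```

Because the edge ID never depends on whether resolution succeeded, a post-pass that links by simple name keeps working in both modes.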
The detector stays regex-first (the inheritance regex is the cheapest positive signal for "this file is a Spring Data repo"), and uses ctx.resolved() purely for the FQN upgrade.

When ctx.resolved() carries a JavaResolved, the detector walks JavaResolved.cu(), finds the interface declaration matching the regex-extracted name, takes the first type argument of its first extended type (e.g. JpaRepository<User, Long> → User), and resolves it via the attached symbol solver.

On success:

- repo node gains entity_fqn (so the RepositoryNode can be reasoned about without a join through the entity index).
- QUERIES edge gains target_fqn + Confidence.RESOLVED + source = "spring_repository".

On failure (no ctx.resolved(), EmptyResolved, no parent type with generics, solver can't find the type), behaviour is unchanged from before this commit — the simple-name placeholder edge with default confidence is what shipped before, and tests confirm that path is intact.

4 new tests in RepositoryDetectorResolvedTest cover the same three modes as the JpaEntity migration:

- resolvedModeProducesResolvedEdgeWithTargetFqn — two User classes in different packages; the imported one wins on entity_fqn + target_fqn.
- fallbackModeMatchesPreSpecBaseline — EmptyResolved → no FQN properties + no RESOLVED stamp.
- fallbackModeWhenContextHasNoResolvedAtAll — Optional.empty() also safe.
- mixedModeFallsBackForUnreachableEntityType — repo whose entity has no source: solver fails → fallback to simple-name + default tier.

Plan: docs/plans/2026-04-27-sub-project-1-resolver-spi-and-java-pilot.md (task 25).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector/spring-rest): emit RESOLVED MAPS_TO edges for @RequestBody DTOs

Phase 6 task 29 (SpringRestDetector migration). Per the plan: "Resolves @RequestBody UserDto dto and @PathVariable types. Edge: MAPS_TO from endpoint node to the resolved DTO class."

When ctx.resolved() carries a JavaResolved, the detector now:

1. Uses the resolver-parsed CompilationUnit (symbol solver attached) instead of the local ThreadLocal-pool parse — Type.resolve() works inside the AST walk for parameter type resolution.
2. After emitting each ENDPOINT node + its EXPOSES edge, scans the method's parameters for @RequestBody. For each binding parameter whose type is a class/interface, attempts to resolve to a stable fully-qualified name via the symbol solver.
3. On success, emits a MAPS_TO edge: endpoint --MAPS_TO--> *:<simpleName> stamped with target_fqn / parameter_kind=request_body / parameter_name properties + Confidence.RESOLVED + source = "spring_rest". Target node uses NodeKind.CLASS so EntityLinker can resolve the FQN to a concrete class node post-pass.
4. On failure (primitive type, classpath gap, generic variable, etc.), no MAPS_TO edge is emitted — endpoint extraction itself is unaffected. The endpoint's `parameters` property still records the simple type name for the lexical / SYNTACTIC tier.

This is purely additive: when ctx.resolved() is empty / EmptyResolved, the detector behaves identically to before the migration. The 27 existing SpringRestDetectorExtended tests pass unchanged.

4 new tests in SpringRestDetectorResolvedTest cover:

- resolvedModeProducesResolvedMapsToEdge — two UserDto in different packages; imported FQN wins on edge.target_fqn + RESOLVED stamp + parameter_name property.
- fallbackModeProducesNoMapsToEdge — EmptyResolved → endpoint still emitted, but no MAPS_TO (additive contract).
- fallbackModeWhenContextHasNoResolvedAtAll — Optional.empty() also produces no MAPS_TO.
- mixedModeFallsBackForUnreachableType — endpoint with one resolvable DTO + one unresolvable: only the resolvable one gets MAPS_TO.

Plan: docs/plans/2026-04-27-sub-project-1-resolver-spi-and-java-pilot.md (task 29).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(detector/class-hierarchy): consume ctx.resolved() for RESOLVED EXTENDS/IMPLEMENTS edges

Phase 6 task 24-style migration applied to ClassHierarchyDetector. Class hierarchy is high-leverage for resolution: simple-name superclass references like "extends Service" are routine across unrelated codebases, and EXTENDS / IMPLEMENTS edges are downstream-load-bearing for blast-radius / dead-code / cycle / topology analysis. Pinning the target FQN turns "Service-named-something" into a stable cross-file reference.

When ctx.resolved() carries a JavaResolved, the detector now:

1. Uses the resolver-parsed CompilationUnit (symbol solver attached).
2. For each parent type in extendedTypes / implementedTypes, calls a single new helper addHierarchyEdge() that:
   - tries to resolve the type via the symbol solver
   - on success, attaches target_fqn to edge properties + stamps Confidence.RESOLVED + source = "java.class_hierarchy"
   - on failure (and always when ctx.resolved() is empty), emits the existing simple-name placeholder edge with raw default confidence (orchestrator stamps SYNTACTIC at the boundary).
3. The 4 prior in-line edge-emission blocks (class-extends, interface-extends, class-implements, enum-implements) collapse to two-line iterations through addHierarchyEdge — net is fewer LOC plus the new resolution capability.

Existing 30 ClassHierarchyDetectorExtended tests pass unchanged — node emission, regex fallback, the property shapes, and the simple-name edge IDs / target placeholders are all preserved.

5 new tests in ClassHierarchyDetectorResolvedTest:

- resolvedModeStampsResolvedTierOnExtendsEdge — two BaseService in different packages; imported one wins on edge.target_fqn.
- resolvedModeStampsResolvedTierOnImplementsEdge — same shape for interface implements.
- fallbackModeMatchesPreSpecBaseline — EmptyResolved → no FQN, no RESOLVED stamp.
- fallbackModeWhenContextHasNoResolvedAtAll — Optional.empty() also safe.
- mixedModeFallsBackForUnreachableType — class extends a known type, implements an unknown one: EXTENDS is RESOLVED, IMPLEMENTS falls back gracefully.

This brings the migrated-detector count to 4 (JpaEntityDetector, RepositoryDetector, SpringRestDetector, ClassHierarchyDetector) — at the lower bound of the plan's "4-6 Java detectors migrated as proof of value".

Plan: docs/plans/2026-04-27-sub-project-1-resolver-spi-and-java-pilot.md (spirit of tasks 24-29 — using the actual detectors that exist in this repo, not the plan's hypothetical names).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): document resolver pipeline wiring + 4 Java detector migrations

Extends the [Unreleased] entry with the Phase 4 + 6 follow-up work shipped on this branch — the resolver is now wired end-to-end into Analyzer and four Java detectors consume ctx.resolved() to emit RESOLVED-tier edges with stable target FQNs:

- JpaEntityDetector — @OneToMany / @ManyToOne MAPS_TO targets
- RepositoryDetector — JpaRepository<T, ID> entity FQN
- SpringRestDetector — @RequestBody DTO MAPS_TO edges
- ClassHierarchyDetector — EXTENDS / IMPLEMENTS FQN targets

Also covers the JavaSymbolResolver lazy-parse extension that lets the orchestrator pass raw source content for Java (the structured parser doesn't cover Java, so without this the resolver could never receive a CompilationUnit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(resolver/java): add Layer 3 + Layer 6 — concurrency stress + determinism

Phase 7 of the resolver-and-Java-pilot plan, top two highest-leverage aggressive-testing layers.

Layer 6 — JavaSymbolResolverDeterminismTest (4 tests):

- sameInputResolvesToSameFqnEveryTime — single resolver, 25 iterations over the same source must produce the same resolved FQN ("com.example.a.Owner"). Pins the value-stable contract under repeated calls (different identity, same value).
- twoResolverInstancesOverSameProjectAgree — two independent resolver instances bootstrapped against the same root must produce the same FQN for the same source — establishes that bootstrap is value-stable across instances, not just within one.
- rebootstrapStillProducesSameFqn — resolve, rebootstrap, resolve again; FQN is unchanged. The orchestrator calls bootstrap once, but if the resolver were ever refreshed mid-run, the value contract must still hold.
- deeperFqnsAreAlsoStable — same shape on a 3-segment package (com.example.inner.deep.Marker) so a divergence on a deeper lookup can't hide behind a 1-level passing test.

Layer 3 — JavaSymbolResolverConcurrencyTest (3 tests):

- parallelResolveNeverThrowsAndAlwaysAgrees — 256 virtual threads each resolve the same source; the aggregated FQN set must be of size 1. Catches "thread X's CU bleeds into thread Y" / shared-mutable-state classes of races. Runs cleanly on the per-call-fresh-JavaParser contract.
- parallelResolveAcrossDistinctFilesProducesPerFileResults — 200 distinct Consumer files each resolved on a virtual thread; aggregate FQN set = {com.example.api.Target}. Catches "one thread's resolved state survives into another thread's resolution" classes of bugs.
- parallelResolveOnGarbageInputDoesNotThrow — 256 virtual threads each pass garbage strings; no exceptions escape and no thread returns null. The resolver's "no throw, no null" contract holds under concurrency.

Plan: docs/plans/2026-04-27-sub-project-1-resolver-spi-and-java-pilot.md (tasks 30 + 31).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* checkpoint: pre-yolo 2026-04-28T01:16:10

* checkpoint: pre-yolo 2026-04-28T01:20:26

* fix(resolver): four robustness fixes from dual-agent (superpowers + codex) brainstorm

Both reviewers independently identified the same four corner-cases in the Phase 4 + 6 wiring; this lands the converged fix list.

1. JavaSymbolResolver — `volatile` on `solver` and `combined`
   bootstrap() publishes; resolve() and the public accessors read from arbitrary virtual-thread carriers. The JLS Thread Start Rule covers the executor.submit() path but does NOT cover callers that read the public accessors after bootstrap on a different thread. Cheap fence, closes the visibility hole.
2. JavaSymbolResolver.resolve(String) — strict parse-success check
   JavaParser is permissive and may return a partial CompilationUnit even when the source has parse problems. Resolving against a partial CU silently emits simple-name-only edges and looks like coverage even though resolution is broken. Treat any non-success as EmptyResolved so the graph never carries phantom RESOLVED-tier edges from broken parses.
3. Analyzer.resolveFor — catch StackOverflowError
   Pathological generic / type-cycle inputs can blow JavaSymbolSolver's recursion stack. Catching the Error keeps the virtual-thread worker alive and degrades that file's resolution to lexical. Other Errors (OOM, ThreadDeath) remain fatal and propagate.
4. JavaSourceRootDiscovery.containsJavaFile — try-with-resources on Files.walk
   Files.walk holds an open directory stream; without a close, the file descriptor leaks for every plain-layout fallback scan. Cheap fix.

mvn test: 3592 tests / 0 failures / 31 skipped (full suite, no regressions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* checkpoint: pre-yolo 2026-04-28T01:32:51

* checkpoint: pre-yolo 2026-04-28T01:42:48

* test(resolver): aggressive-testing layers 1, 4, 5, 7, 8 + Layer 9 PIT profile

Phase 7 of the sub-project 1 plan. Spec §12's testing matrix lands as five new test classes (26 tests) plus a non-default Maven profile. Layers 3 + 6 were already shipped in the prior commit on this branch.
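
The fourth robustness fix listed above (closing the `Files.walk` stream) is easy to show in isolation. The following is a sketch of the pattern only, not the body of JavaSourceRootDiscovery.containsJavaFile; the class name and the choice to return `false` on I/O errors are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

/** Sketch only: a hypothetical helper showing the try-with-resources pattern. */
final class ContainsJavaFileSketch {

    /** Returns true if any regular file under root ends in ".java". */
    static boolean containsJavaFile(Path root) {
        if (!Files.isDirectory(root)) {
            return false;
        }
        // Files.walk keeps directory handles open until the stream is closed,
        // so the stream is scoped to a try-with-resources block.
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.anyMatch(p ->
                    Files.isRegularFile(p) && p.getFileName().toString().endsWith(".java"));
        } catch (IOException | UncheckedIOException e) {
            return false; // assumption: an unreadable tree counts as "no Java files"
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("sketch");
        Files.writeString(root.resolve("A.java"), "class A {}");
        System.out.println(containsJavaFile(root)); // prints true
    }
}
```

`anyMatch` short-circuits on the first hit, which is exactly the case where an unclosed stream would otherwise leave handles open.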
Layer 1 — JavaSymbolResolverLayer1ExtendedTest (16):
  Spec §12 Layer 1 cases not exercised by the existing JavaSymbolResolverTest — deep generics (Map<String, List<Set<UUID>>>), inner classes (static + non-static), records, sealed hierarchies, enums with abstract methods, default-method interfaces, abstract classes, annotation types, same simple name in different packages pinned by import direction, JDK Optional/Stream/List via ReflectionTypeSolver, multi-source-root cross-reference (src/main ↔ src/test), wildcard imports, cyclic imports both directions.

Layer 4 — JavaSymbolResolverPathologicalTest (3):
  10K-line class, 1000 imports (most unresolvable), 10-deep generic nesting (programmatically built so brackets are provably balanced). @Timeout per-test is the regression sentinel against quadratic memoization; Surefire's default heap covers the spec's -Xmx512m target many times over so we don't pin it explicitly.

Layer 5 — JavaSymbolResolverAdversarialTest (5):
  Unbalanced braces (strict-success → EmptyResolved, strong assertion), mis-tagged Kotlin (no exception/null, branch-agnostic — JavaParser's permissiveness for "fun ... { }" is implementation-specific), mis-tagged random bytes, mixed source root with .java + .txt siblings (only .java enters the solver), empty source root (no Java files anywhere) bootstraps via ReflectionTypeSolver alone.

Layer 7 — E2EResolverPetclinicTest (1, env-gated):
  Runs JavaSymbolResolver against every .java under $E2E_PETCLINIC_DIR and asserts bootstrap < 10 s (spec §9 budget), no exception, > 50% files produce JavaResolved (i.e. strict-success isn't false-rejecting valid Java). Lighter than spec §12 Layer 7's full precision/recall comparison — that needs a pre-resolver baseline JSON checked into test resources, captured at implementation time. This stand-in is the strongest signal we have until that baseline lands.
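
For Layer 4's "programmatically built so brackets are provably balanced" input, one possible construction is sketched below. The actual test's generator is not shown on this page, so the class and method names here are hypothetical; the idea is simply that every opening bracket is appended by the same loop count as its closing bracket.

```java
/** Sketch only: a hypothetical generator for deeply nested generic types. */
final class NestedGenericsSketch {

    /** Builds e.g. java.util.List<java.util.List<String>> for depth 2. */
    static String nestedListType(int depth, String leaf) {
        if (depth < 0) {
            throw new IllegalArgumentException("depth must be >= 0");
        }
        StringBuilder type = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            type.append("java.util.List<"); // one opening bracket per level
        }
        type.append(leaf);
        for (int i = 0; i < depth; i++) {
            type.append('>'); // one closing bracket per level, same count
        }
        return type.toString();
    }

    /** Wraps the nested type in a minimal class so it can be fed to a parser. */
    static String classWithField(int depth) {
        return "class Deep { " + nestedListType(depth, "String") + " field; }";
    }

    public static void main(String[] args) {
        System.out.println(nestedListType(2, "String"));
        // prints java.util.List<java.util.List<String>>
    }
}
```

Generating the input rather than hand-writing it means a parse failure in the test points at the resolver, not at a typo in a ten-level type literal.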
Layer 8 — JavaSymbolResolverRandomizedTest (1, 100 samples):
  Hand-rolled randomized generator with fixed seed (0xC0DE197042L). Per the plan's license guidance, jqwik (EPL-2.0) isn't on the project's preferred-license list (~/.claude/rules/dependencies.md prefers MIT/Apache/BSD); this is the documented JUnit + java.util.Random fallback. Properties: never throws unchecked, never returns null, completes per-file in < 1 s budget.

Layer 9 — mutation Maven profile (non-default):
  Adds pitest-maven 1.18.0 (Apache-2.0) targeting intelligence.resolver.* and model.Confidence. Run with

    mvn -P mutation org.pitest:pitest-maven:mutationCoverage \
      -Dfrontend.skip=true -Ddependency-check.skip=true

  Reports under target/pit-reports/. Non-gating per the plan; the ≥ 80% target is a follow-up signal once a first run lands.

Full suite: mvn test → 3618 / 0 failures / 32 skipped (1 new skip is the env-gated E2EResolverPetclinicTest).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* checkpoint: pre-yolo 2026-04-28T01:47:06

* docs(claude+summary): close out sub-project 1 plan tasks 40 + 41

Plan §40 / §41 close-out — the resolver SPI, Confidence schema, CACHE_VERSION bump, and runtime lifecycle gotchas now land in CLAUDE.md and PROJECT_SUMMARY.md, not just CHANGELOG.md.

CLAUDE.md Gotchas:

- Cache versioning bullet updated 4 → 5 with reason (Confidence/source schema), so future agents reading the gotcha don't propagate the stale "4" forward.
- "Symbol resolver runs at index-time only" — bootstrapResolvers and resolveFor are wired into run / runBatchedIndex / runSmartIndex only. Never reached at serve. Prevents future agents from reaching for ResolverRegistry from serve-mode code paths.
- "Confidence + source mandatory on every CodeNode/CodeEdge" — DetectorEmissionDefaults stamps the floor; RESOLVED is opt-in via ctx.resolved(). Reading legacy data is non-throwing.
- "JavaSymbolResolver.resolve() allocates a fresh JavaParser per call" — intentional thread-safety boundary for virtual-thread fan-out, not a perf bug.
- "Strict parse-success check" — resolve(String) returns EmptyResolved on any JavaParser problem so the graph never carries phantom RESOLVED-tier edges from partial-CU outputs.
- "Volatile fields on JavaSymbolResolver" — closes the public-accessor visibility race per the dual-agent brainstorm fix.

PROJECT_SUMMARY.md:

- Tech-stack row updated to "AST + symbols (Java) | JavaParser 3.28.0 + javaparser-symbol-solver-core 3.28.0".
- Cache dir line updated CACHE_VERSION=4 → 5.
- Two new gotchas (resolver-is-index-time-only, strict parse-success) cross-referencing the canonical CLAUDE.md entries.

No source changes. Full mvn test was last green at 3618 / 0 / 32 skipped (unchanged for this docs-only commit).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
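
The "fresh parser per call" gotcha and the Layer 3 agreement check (every parallel call yields the same FQN, so the aggregated set has size 1) can be illustrated with a JDK-only analogue. This is a sketch under loud assumptions: `resolveFqn` is a hypothetical regex stand-in for the real resolver, a `java.util.regex.Matcher` plays the role of the non-thread-safe parser that is allocated per call, and a fixed thread pool is swapped in for the virtual threads used by the actual tests so the example also runs on older JDKs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: a regex stand-in for the resolver, not JavaParser. */
final class ParallelAgreementSketch {

    // Pattern is immutable and safe to share; Matcher is not, so it is created per call.
    private static final Pattern PACKAGE = Pattern.compile("package\\s+([\\w.]+)\\s*;");
    private static final Pattern TYPE = Pattern.compile("\\b(?:class|interface|enum)\\s+(\\w+)");

    /** Hypothetical resolver: returns package + first type name, or "" if no type is found. */
    static String resolveFqn(String source) {
        Matcher type = TYPE.matcher(source); // fresh, call-local mutable state
        if (!type.find()) {
            return "";
        }
        Matcher pkg = PACKAGE.matcher(source);
        return pkg.find() ? pkg.group(1) + "." + type.group(1) : type.group(1);
    }

    /** Resolves the same source from many tasks and returns the set of distinct results. */
    static Set<String> resolveInParallel(String source, int calls)
            throws InterruptedException, ExecutionException {
        Set<String> results = ConcurrentHashMap.newKeySet();
        List<Callable<Boolean>> tasks = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            tasks.add(() -> results.add(resolveFqn(source)));
        }
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            for (Future<Boolean> done : pool.invokeAll(tasks)) {
                done.get(); // rethrows any failure raised inside a task
            }
        } finally {
            pool.shutdown();
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        String source = "package com.example.a; public class Owner {}";
        System.out.println(resolveInParallel(source, 256)); // prints [com.example.a.Owner]
    }
}
```

If the mutable state were shared between calls instead of allocated inside `resolveFqn`, this is the kind of test that would start returning more than one distinct value.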
1 parent 47e6404 commit 0c1a4c8

26 files changed

Lines changed: 2827 additions & 69 deletions

AGENTS.md

Lines changed: 9 additions & 0 deletions
@@ -41,3 +41,12 @@ If the task asks for product/feature work and `shared/runbooks/release.md` is mi
 ## Auth escalation
 
 If you hit something requiring GitHub App / PAT / OAuth that the runtime cannot satisfy (org admin escalation, Sonatype Central re-namespace, OpenSSF Best Practices form, etc.), do **not** improvise auth: PATCH the issue to `blocked` with the exact ask and `@`-mention the board.
+
+
+<claude-mem-context>
+# Memory Context
+
+# [codeiq] recent context, 2026-04-28 1:14am UTC
+
+No previous sessions found.
+</claude-mem-context>

CHANGELOG.md

Lines changed: 101 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -82,8 +82,51 @@ for that specific tag for the per-commit details.
8282
(detectors that explicitly stamp survive untouched). 11 atomic commits
8383
ship with ~290 new tests covering happy paths, legacy-data fallbacks,
8484
malformed inputs, determinism, concurrency-safe construction, and singleton
85-
invariants. Detector migrations to consume `ctx.resolved()` and the
86-
resolver-bootstrap-into-Analyzer hook follow in sub-project 1 Phase 5.
85+
invariants.
86+
87+
- **Resolver pipeline wiring + Java pilot detectors** (sub-project 1, plan
88+
Phases 4 + 6 — follow-up to the SPI scaffolding above): the resolver
89+
is now actually invoked end-to-end and four Java detectors consume
90+
`ctx.resolved()` to emit RESOLVED-tier edges with stable
91+
fully-qualified-name targets.
92+
- `Analyzer` now bootstraps `ResolverRegistry` exactly once per pipeline
93+
entry point (`run` / `runBatchedIndex` / `runSmartIndex`) and threads a
94+
`Resolved` onto every `DetectorContext` at all three detect call sites
95+
(`analyzeFile`, the batched-index variant, the regex-only fallback).
96+
Per-file `ResolutionException` + `RuntimeException` are swallowed and
97+
fall back to `EmptyResolved.INSTANCE`, so one resolver blow-up cannot
98+
take down the whole pass.
99+
- `JavaSymbolResolver.resolve()` now lazy-parses raw source `String`
100+
content with a fresh symbol-solver-configured `JavaParser` per call —
101+
a small per-call allocation that lets `Analyzer` pass the file content
102+
directly (the orchestrator-level structured parser doesn't cover Java).
103+
Permissive parsing returns `JavaResolved` with a possibly-error-laden
104+
`CompilationUnit` rather than refusing — production analysis must keep
105+
going across files with syntax errors.
106+
- Four detectors migrated to consume `ctx.resolved()` (purely additive —
107+
every existing detector test passes unchanged):
108+
- **JpaEntityDetector**`MAPS_TO` edges between entities now carry
109+
`target_fqn` and `Confidence.RESOLVED` when the symbol solver can
110+
pin the relationship target's FQN (handles `@OneToMany List<Owner>`,
111+
`@ManyToOne Owner`, both direct-field and generic-arg cases).
112+
- **RepositoryDetector** — Spring Data repo `QUERIES` edges plus the
113+
repo node carry the resolved entity FQN (`entity_fqn` /
114+
`target_fqn`) when `JpaRepository<User, Long>` resolves.
115+
- **SpringRestDetector** — endpoints emit a `MAPS_TO` edge to the
116+
`@RequestBody` DTO class when the parameter type resolves, with
117+
`parameter_kind=request_body` + `parameter_name` properties for
118+
downstream consumers (SPA, MCP).
119+
- **ClassHierarchyDetector**`EXTENDS` / `IMPLEMENTS` edges across
120+
classes, interfaces, and enums now stamp `Confidence.RESOLVED` +
121+
`target_fqn` when the parent type resolves, collapsing four
122+
duplicated in-line edge-emission blocks into a single
123+
`addHierarchyEdge` helper as a side-benefit.
124+
- Backward compatibility is total: when no resolver is registered or
125+
`JavaSymbolResolver.bootstrap` fails, every detector returns the
126+
same simple-name-targeted edge shape it shipped before this slice.
127+
- 18 new wiring + resolved-mode tests on top of the SPI's ~290 — every
128+
migration ships with the plan-required three-mode coverage (resolved,
129+
fallback, mixed).
87130
- **AKS read-only deploy hardening** (sub-project 2): runbook at
88131
[`shared/runbooks/aks-read-only-deploy.md`](shared/runbooks/aks-read-only-deploy.md),
89132
JVM-flag-preset launcher at [`scripts/aks-launch.sh`](scripts/aks-launch.sh),
@@ -97,6 +140,62 @@ for that specific tag for the per-commit details.
97140
`-XX:ErrorFile` / `-XX:HeapDumpPath` overrides. Spec at
98141
[`docs/specs/2026-04-28-aks-read-only-deploy-design.md`](docs/specs/2026-04-28-aks-read-only-deploy-design.md).
99142

143+
- **Resolver aggressive-testing layers** (sub-project 1, plan Phase 7 —
144+
Layers 1, 3, 4, 5, 6, 7, 8, 9): the spec §12 testing matrix lands as
145+
six new test classes plus a non-default Maven profile.
146+
- **Layer 1**`JavaSymbolResolverLayer1ExtendedTest` (16 tests):
147+
deeply-nested generics, static / non-static inner classes, records,
148+
sealed hierarchies, enum-with-abstract-methods, default-method
149+
interfaces, abstract classes, annotation types, same simple name in
150+
different packages by import, JDK `Optional` / `Stream` / `List` via
151+
`ReflectionTypeSolver`, multi-source-root cross-references
152+
(`src/main``src/test`), wildcard imports, cyclic imports.
153+
- **Layer 3**`JavaSymbolResolverConcurrencyTest` (already shipped
154+
in the prior commit): virtual-thread fan-out under `N=200` files /
155+
`256` concurrent calls, garbage-input variant.
156+
- **Layer 4**`JavaSymbolResolverPathologicalTest` (3 tests):
157+
10K-line class, 1000 imports (most unresolvable), 10-deep generic
158+
nesting; per-test `@Timeout` is the regression sentinel against
159+
quadratic memoization.
160+
- **Layer 5**`JavaSymbolResolverAdversarialTest` (5 tests):
161+
unbalanced braces (strict-success → `EmptyResolved`), mis-tagged
162+
Kotlin / random-bytes (no exception, no null), mixed source root
163+
with `.java` + `.txt` siblings, empty source root (no Java files
164+
anywhere) bootstraps via `ReflectionTypeSolver` alone.
165+
- **Layer 6**`JavaSymbolResolverDeterminismTest` (already shipped):
+  same input → same FQN 25× in a row, two independent resolvers
+  agree, rebootstrap is observably idempotent, deeper FQNs are stable.
+- **Layer 7** `E2EResolverPetclinicTest` (env-gated): runs the
+  resolver against every `.java` under `$E2E_PETCLINIC_DIR`, asserts
+  bootstrap < 10 s, no exception, > 50% files produce `JavaResolved`
+  (i.e. strict-success isn't false-rejecting valid Java). Lighter than
+  spec §12 Layer 7's full precision/recall comparison — that requires
+  a pre-resolver baseline JSON checked into test resources, captured
+  at implementation time. This stand-in is the strongest signal until
+  that baseline lands.
+- **Layer 8** `JavaSymbolResolverRandomizedTest` (1 test, 100
+  samples): hand-rolled randomized generator with fixed seed; per the
+  plan's license guidance, jqwik (EPL-2.0) is not on the preferred-
+  license list, and this is the documented JUnit + `java.util.Random`
+  fallback. Properties: never throws, never returns null, completes
+  per file in < 1 s.
+- **Layer 9** `mutation` Maven profile (non-default): adds
+  `pitest-maven` 1.18.0 (Apache-2.0) targeting
+  `intelligence.resolver.*` and `model.Confidence`. Run with
+  `mvn -P mutation org.pitest:pitest-maven:mutationCoverage
+  -Dfrontend.skip=true -Ddependency-check.skip=true`. Reports under
+  `target/pit-reports/`.
+- Four robustness fixes from a dual-agent (superpowers + codex)
+  brainstorm landed on the same branch: `volatile` on
+  `JavaSymbolResolver`'s `solver` / `combined` fields, strict
+  parse-success check in the String-source branch (was silently
+  emitting partial-CU edges on broken parses), `StackOverflowError`
+  catch in `Analyzer.resolveFor` (pathological generics no longer kill
+  virtual threads), `try-with-resources` on the `Files.walk` in
+  `JavaSourceRootDiscovery.containsJavaFile` (fd leak fix). 26 new
+  tests on top of the resolver wiring slice's 18 — full suite at 3618
+  / 0 / 32 skipped, +1 skip is the env-gated E2E petclinic test.
+
 ### Changed
 
 - Documentation count drift fixed: detector total updated from **97 → 99**
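
The Layer 8 entry above describes a hand-rolled, fixed-seed randomized test built on `java.util.Random`. A minimal sketch of that pattern could look like the following. The class name, the fragment list, and the `Function<String, Object>` stand-in for the resolver are illustrative assumptions, not the project's actual test code.

```java
import java.util.Random;
import java.util.function.Function;

// Hypothetical sketch: fixed-seed input generation plus three property checks.
final class RandomizedPropertySketch {
    // Illustrative fragments, deliberately including broken Java.
    private static final String[] FRAGMENTS = {
        "class A {", "}", "void m() {", "int x = 1;", "import java.util.List;",
        "List<String> xs;", "@Override", "/* unterminated", "\"",
        "<T extends Comparable<T>>"
    };

    /** Builds one pseudo-random source string; equal Random state gives equal text. */
    static String randomSource(Random rnd) {
        int parts = 1 + rnd.nextInt(12);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts; i++) {
            sb.append(FRAGMENTS[rnd.nextInt(FRAGMENTS.length)]).append('\n');
        }
        return sb.toString();
    }

    /** Feeds `samples` generated inputs to `subject` and counts property violations. */
    static int countViolations(long seed, int samples, Function<String, Object> subject) {
        Random rnd = new Random(seed); // fixed seed keeps failures reproducible
        int violations = 0;
        for (int i = 0; i < samples; i++) {
            String src = randomSource(rnd);
            long start = System.nanoTime();
            try {
                if (subject.apply(src) == null) {
                    violations++; // property: never returns null
                }
            } catch (RuntimeException | StackOverflowError e) {
                violations++;     // property: never throws
            }
            if (System.nanoTime() - start > 1_000_000_000L) {
                violations++;     // property: completes in under 1 s
            }
        }
        return violations;
    }
}
```

In a JUnit test the `subject` argument would wrap the real resolver call, and the assertion would be that `countViolations` returns zero. Because the seed is fixed, a failing sample can be replayed exactly.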

CLAUDE.md

Lines changed: 6 additions & 1 deletion
@@ -421,7 +421,12 @@ bean for code paths that haven't been ported yet.
 - **`@ActiveProfiles("test")`**: Required on any `@SpringBootTest` to avoid Neo4j startup conflicts.
 - **Dead code detection**: Must filter by semantic edges only (calls, imports, depends_on). Exclude structural edges (contains, defines) and entry points (endpoints, config files).
 - **H2 reserved words**: `key`, `value`, `order` are reserved in H2 SQL. Use `meta_key`, `meta_value` etc. in CREATE TABLE statements.
-- **Cache versioning**: `AnalysisCache` has a `CACHE_VERSION` constant (currently `4`). Bump it when changing the hash algorithm or H2 schema so stale caches are auto-cleared on next run.
+- **Cache versioning**: `AnalysisCache` has a `CACHE_VERSION` constant (currently `5`, bumped from `4` for the resolver `confidence` + `source` schema). Bump it when changing the hash algorithm, H2 schema, or any field that becomes mandatory on cached nodes/edges so stale caches are auto-cleared on next run.
+- **Symbol resolver runs at index-time only.** `Analyzer.bootstrapResolvers()` and `Analyzer.resolveFor()` are wired into `run` / `runBatchedIndex` / `runSmartIndex` paths only — never at `serve`. The resolver SPI lives under `intelligence/resolver/`. If you find yourself reaching for `ResolverRegistry` from a serve-mode code path, stop — the graph is the source of truth at serve.
+- **`Confidence` + `source` are mandatory on every `CodeNode` / `CodeEdge`.** `DetectorEmissionDefaults.applyDefaults` stamps the per-detector floor (`LEXICAL` for regex bases, `SYNTACTIC` for AST/JavaParser/structured bases) at the orchestration boundary; detectors that consume `ctx.resolved()` upgrade to `Confidence.RESOLVED` and attach a `target_fqn` property. Reading legacy data without these fields is non-throwing — they read back as `LEXICAL` / null.
+- **`JavaSymbolResolver.resolve()` allocates a fresh `JavaParser` per call.** JavaParser instances aren't thread-safe and `resolve()` is invoked from virtual threads concurrently. Per-call allocation is intentional, not a perf bug — don't "optimize" by sharing one parser across calls.
+- **`JavaSymbolResolver.resolve(String)` enforces strict parse-success.** When JavaParser flags any problem (`!parseResult.isSuccessful()`), the resolver returns `EmptyResolved.INSTANCE` rather than a partial-CU `JavaResolved`. This prevents silent simple-name-only edges from broken parses that look like RESOLVED-tier coverage. Detectors must treat `ctx.resolved()` returning `EmptyResolved` as "lexical fallback" — never assume RESOLVED edges land for every Java file.
+- **`JavaSymbolResolver` fields are `volatile`.** `combined` and `solver` are written by `bootstrap()` and read by `resolve()` + the public accessors from arbitrary virtual-thread carriers. The JLS Thread Start Rule covers the `executor.submit()` path; `volatile` covers post-bootstrap callers on other threads. Don't drop the keyword.
 - **FileHasher uses SHA-256**: Changed from MD5. Hash output is 64 hex chars (not 32). Tests must expect 64-char hashes.
 - **SnakeYAML parses `on` as Boolean.TRUE**: In YAML files, bare `on` key becomes `Boolean.TRUE`. Use `String.valueOf(key)` comparisons, not `Boolean.TRUE.equals(key)` (SonarCloud S2159).
 - **Regex possessive quantifiers**: Use `*+` instead of `*` for nested quantifiers like `([^"\\]*(?:\\.[^"\\]*)*)` → `([^"\\]*+(?:\\.[^"\\]*+)*+)` to prevent stack overflow (SonarCloud S5998).
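
The notes above ask detectors to treat an `EmptyResolved` result as a lexical fallback and to upgrade to `Confidence.RESOLVED` only when a real resolution is present. A rough sketch of that decision follows. The nested types here are simplified stand-ins (the real `JavaResolved` is certainly richer than a single FQN string), and `emitEdge` and `Edge` are hypothetical names used only for illustration.

```java
import java.util.Optional;

// Hypothetical sketch of the "EmptyResolved means fall back" contract.
final class ResolvedFallbackSketch {
    enum Confidence { LEXICAL, SYNTACTIC, RESOLVED }

    // Simplified stand-ins for the resolver SPI result types.
    interface Resolved {}
    enum EmptyResolved implements Resolved { INSTANCE }
    record JavaResolved(String targetFqn) implements Resolved {}

    // Illustrative edge shape: targetFqn stays null unless resolution succeeded.
    record Edge(String target, Confidence confidence, String targetFqn) {}

    /**
     * Emits an edge at the detector's floor confidence, upgrading to RESOLVED
     * only when the context actually carries a JavaResolved.
     */
    static Edge emitEdge(String simpleName,
                         Optional<? extends Resolved> resolved,
                         Confidence floor) {
        if (resolved.isPresent() && resolved.get() instanceof JavaResolved jr) {
            return new Edge(simpleName, Confidence.RESOLVED, jr.targetFqn());
        }
        // EmptyResolved (or nothing at all): keep the floor, attach no target_fqn.
        return new Edge(simpleName, floor, null);
    }
}
```

The point of the sketch is that the fallback branch is the default: a broken parse yields the same edge the detector would have produced before the resolver existed.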

PROJECT_SUMMARY.md

Lines changed: 5 additions & 3 deletions
@@ -23,7 +23,7 @@ Read directly from the `pom.xml` `<properties>` block and `src/main/frontend/pac
 | Graph DB | Neo4j Embedded 2026.02.3 (Community) | `pom.xml` `<neo4j.version>` |
 | MCP | Spring AI 2.0.0-M3 (`spring-ai-starter-mcp-server-webmvc`) | `pom.xml` `<spring-ai.version>` |
 | CLI | Picocli 4.7.7 (`picocli-spring-boot-starter`) | `pom.xml` `<picocli.version>` |
-| AST (Java) | JavaParser 3.28.0 | `[CLAUDE.md]` `pom.xml` references via dep |
+| AST + symbols (Java) | JavaParser 3.28.0 + `javaparser-symbol-solver-core` 3.28.0 (Apache-2.0) | `pom.xml` `javaparser` deps; `intelligence/resolver/java/JavaSymbolResolver.java` |
 | Parsers (35+ langs) | ANTLR 4.13.2 (TS/JS, Python, Go, C#, Rust, C++) | `[CLAUDE.md]` |
 | Cache | H2 in embedded mode (incremental analysis cache) | `src/main/java/io/github/randomcodespace/iq/cache/AnalysisCache.java` |
 | Frontend | React 18.3 + AntD 5.24 + ECharts 5.6 + react-router 7 | `src/main/frontend/package.json` |
@@ -114,7 +114,7 @@ CI gate is `mvn verify` — runs unit + integration tests **plus** SpotBugs and
 **Required env / external services:** none. codeiq is offline-first by design — Neo4j and H2 are embedded; no external server, no network calls at runtime. Air-gapped install: `git clone` + Maven mirror + `mvn package`. See [`shared/runbooks/first-time-setup.md`](shared/runbooks/first-time-setup.md).
 
 **Cache + graph dirs at runtime** (created in your scanned repo):
-- `.codeiq/cache/` — H2 incremental analysis cache (`CACHE_VERSION=4` constant near the top of `cache/AnalysisCache.java`)
+- `.codeiq/cache/` — H2 incremental analysis cache (`CACHE_VERSION=5` constant near the top of `cache/AnalysisCache.java`; bumped from 4 for the resolver `confidence` + `source` schema, so stale v4 caches drop and rebuild on first run after upgrade)
 - `.codeiq/graph/graph.db/` — Neo4j Embedded data dir
 
 ## Conventions an agent must respect
@@ -138,7 +138,9 @@ CI gate is `mvn verify` — runs unit + integration tests **plus** SpotBugs and
 - **Edges must be attached to source nodes before `bulkSave()`.** Cypher `MATCH` silently returns 0 rows for missing source IDs — pre-validate.
 - **`@ActiveProfiles("test")` is required on every `@SpringBootTest`** to avoid Neo4j auto-startup conflicts.
 - **`AnalysisCache` uses a `ReentrantReadWriteLock`** (not `synchronized`). JEP 491 (Java 25) means lock primitives no longer pin virtual-thread carriers; the read/write lock is what prevents `ClosedChannelException` on H2's MVStore under concurrent virtual-thread access. Don't "simplify" to `synchronized`.
-- **Bump `CACHE_VERSION` in `cache/AnalysisCache.java`** (top of file) when you change the file-hash algorithm or H2 schema. Stale caches auto-clear on next run.
+- **Bump `CACHE_VERSION` in `cache/AnalysisCache.java`** (top of file) when you change the file-hash algorithm or H2 schema. Stale caches auto-clear on next run. Currently `5` (bumped from 4 for the resolver `confidence` + `source` schema).
+- **Symbol resolver is index-time only.** `Analyzer.bootstrapResolvers()` is reached from `run` / `runBatchedIndex` / `runSmartIndex` only — never at `serve`. The SPI lives at `intelligence/resolver/`; the Java backend wraps `javaparser-symbol-solver-core`. RESOLVED-tier edges and `target_fqn` properties land at index-time and are then served read-only from Neo4j.
+- **`JavaSymbolResolver.resolve(String)` enforces strict parse-success.** Partial-CU outputs from JavaParser problems are converted to `EmptyResolved` so the graph never carries phantom RESOLVED edges from broken parses. Detectors must handle `EmptyResolved` as "lexical fallback".
 - **SnakeYAML parses bare `on` as `Boolean.TRUE`.** Compare YAML keys with `String.valueOf(key)`, not `Boolean.TRUE.equals(key)` (SonarCloud S2159).
 - **Determinism gate:** every new detector needs a determinism test (run twice, assert equal output) — see existing `*DetectorTest.java` for the pattern.
 - **First `mvn verify` downloads ~1 GB NVD database** for OWASP dependency-check. Override locally with `-Ddependency-check.skip=true`.
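
The `CACHE_VERSION` convention above says that a stale cache is dropped and rebuilt when the constant changes. One way such a check might work is sketched below. This is an assumption-laden illustration: an in-memory `Map` stands in for the H2 store, and the `__cache_version__` key and every method name are hypothetical rather than taken from `AnalysisCache`.

```java
import java.util.Map;

// Hypothetical sketch of version-gated cache invalidation.
final class VersionedCacheSketch {
    static final int CACHE_VERSION = 5;                             // value quoted in the notes above
    private static final String VERSION_KEY = "__cache_version__"; // assumed marker key

    private final Map<String, String> store;

    VersionedCacheSketch(Map<String, String> store) {
        this.store = store;
        String current = String.valueOf(CACHE_VERSION);
        // A missing or different stored version means the layout may have changed:
        // drop everything and stamp the current version.
        if (!current.equals(store.get(VERSION_KEY))) {
            store.clear();
            store.put(VERSION_KEY, current);
        }
    }

    String get(String fileHash) {
        return store.get(fileHash);
    }

    void put(String fileHash, String payload) {
        store.put(fileHash, payload);
    }

    /** Number of cached entries, excluding the version marker itself. */
    int entryCount() {
        return store.size() - 1;
    }
}
```

Opening a store stamped with version 4 leaves it empty apart from the new marker, which matches the "stale v4 caches drop and rebuild" behaviour described in the diff.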

pom.xml

Lines changed: 36 additions & 0 deletions
@@ -489,6 +489,42 @@
     </build>
 
     <profiles>
+        <!--
+            Mutation testing profile — Phase 7 Layer 9 (non-gating).
+            Run: mvn -P mutation org.pitest:pitest-maven:mutationCoverage
+                 -Dfrontend.skip=true -Ddependency-check.skip=true
+            Targets the resolver SPI surface and Confidence model. Reports under
+            target/pit-reports/. Apache-2.0 licensed (preferred-license tier).
+        -->
+        <profile>
+            <id>mutation</id>
+            <build>
+                <plugins>
+                    <plugin>
+                        <groupId>org.pitest</groupId>
+                        <artifactId>pitest-maven</artifactId>
+                        <version>1.18.0</version>
+                        <configuration>
+                            <targetClasses>
+                                <param>io.github.randomcodespace.iq.intelligence.resolver.*</param>
+                                <param>io.github.randomcodespace.iq.intelligence.resolver.java.*</param>
+                                <param>io.github.randomcodespace.iq.model.Confidence</param>
+                            </targetClasses>
+                            <targetTests>
+                                <param>io.github.randomcodespace.iq.intelligence.resolver.*</param>
+                                <param>io.github.randomcodespace.iq.intelligence.resolver.java.*</param>
+                                <param>io.github.randomcodespace.iq.model.ConfidenceTest</param>
+                            </targetTests>
+                            <outputFormats>
+                                <outputFormat>HTML</outputFormat>
+                                <outputFormat>XML</outputFormat>
+                            </outputFormats>
+                            <timestampedReports>false</timestampedReports>
+                        </configuration>
+                    </plugin>
+                </plugins>
+            </build>
+        </profile>
         <profile>
             <id>release</id>
             <build>
