Merged
16 commits
9 changes: 9 additions & 0 deletions AGENTS.md
@@ -41,3 +41,12 @@ If the task asks for product/feature work and `shared/runbooks/release.md` is mi
## Auth escalation

If you hit something requiring GitHub App / PAT / OAuth that the runtime cannot satisfy (org admin escalation, Sonatype Central re-namespace, OpenSSF Best Practices form, etc.), do **not** improvise auth: PATCH the issue to `blocked` with the exact ask and `@`-mention the board.


<claude-mem-context>
# Memory Context

# [codeiq] recent context, 2026-04-28 1:14am UTC

No previous sessions found.
</claude-mem-context>
103 changes: 101 additions & 2 deletions CHANGELOG.md
@@ -82,8 +82,51 @@ for that specific tag for the per-commit details.
(detectors that explicitly stamp survive untouched). 11 atomic commits
ship with ~290 new tests covering happy paths, legacy-data fallbacks,
malformed inputs, determinism, concurrency-safe construction, and singleton
invariants. Detector migrations to consume `ctx.resolved()` and the
resolver-bootstrap-into-Analyzer hook follow in sub-project 1 Phase 5.
invariants.

- **Resolver pipeline wiring + Java pilot detectors** (sub-project 1, plan
Phases 4 + 6 — follow-up to the SPI scaffolding above): the resolver
is now actually invoked end-to-end and four Java detectors consume
`ctx.resolved()` to emit RESOLVED-tier edges with stable
fully-qualified-name targets.
- `Analyzer` now bootstraps `ResolverRegistry` exactly once per pipeline
entry point (`run` / `runBatchedIndex` / `runSmartIndex`) and threads a
`Resolved` onto every `DetectorContext` at all three detect call sites
(`analyzeFile`, the batched-index variant, the regex-only fallback).
Per-file `ResolutionException` + `RuntimeException` are swallowed and
fall back to `EmptyResolved.INSTANCE`, so one resolver blow-up cannot
take down the whole pass.
- `JavaSymbolResolver.resolve()` now lazy-parses raw source `String`
content with a fresh symbol-solver-configured `JavaParser` per call —
a small per-call allocation that lets `Analyzer` pass the file content
directly (the orchestrator-level structured parser doesn't cover Java).
Permissive parsing returns `JavaResolved` with a possibly-error-laden
`CompilationUnit` rather than refusing — production analysis must keep
going across files with syntax errors.
- Four detectors migrated to consume `ctx.resolved()` (purely additive —
every existing detector test passes unchanged):
- **JpaEntityDetector** — `MAPS_TO` edges between entities now carry
`target_fqn` and `Confidence.RESOLVED` when the symbol solver can
pin the relationship target's FQN (handles `@OneToMany List<Owner>`,
`@ManyToOne Owner`, both direct-field and generic-arg cases).
- **RepositoryDetector** — Spring Data repo `QUERIES` edges plus the
repo node carry the resolved entity FQN (`entity_fqn` /
`target_fqn`) when `JpaRepository<User, Long>` resolves.
- **SpringRestDetector** — endpoints emit a `MAPS_TO` edge to the
`@RequestBody` DTO class when the parameter type resolves, with
`parameter_kind=request_body` + `parameter_name` properties for
downstream consumers (SPA, MCP).
- **ClassHierarchyDetector** — `EXTENDS` / `IMPLEMENTS` edges across
classes, interfaces, and enums now stamp `Confidence.RESOLVED` +
`target_fqn` when the parent type resolves, collapsing four
duplicated in-line edge-emission blocks into a single
`addHierarchyEdge` helper as a side-benefit.
- Backward compatibility is total: when no resolver is registered or
`JavaSymbolResolver.bootstrap` fails, every detector returns the
same simple-name-targeted edge shape it shipped before this slice.
- 18 new wiring + resolved-mode tests on top of the SPI's ~290 — every
migration ships with the plan-required three-mode coverage (resolved,
fallback, mixed).
- **AKS read-only deploy hardening** (sub-project 2): runbook at
[`shared/runbooks/aks-read-only-deploy.md`](shared/runbooks/aks-read-only-deploy.md),
JVM-flag-preset launcher at [`scripts/aks-launch.sh`](scripts/aks-launch.sh),
@@ -97,6 +140,62 @@ for that specific tag for the per-commit details.
`-XX:ErrorFile` / `-XX:HeapDumpPath` overrides. Spec at
[`docs/specs/2026-04-28-aks-read-only-deploy-design.md`](docs/specs/2026-04-28-aks-read-only-deploy-design.md).

- **Resolver aggressive-testing layers** (sub-project 1, plan Phase 7 —
Layers 1, 3, 4, 5, 6, 7, 8, 9): the spec §12 testing matrix lands as
six new test classes plus a non-default Maven profile.
- **Layer 1** — `JavaSymbolResolverLayer1ExtendedTest` (16 tests):
deeply-nested generics, static / non-static inner classes, records,
sealed hierarchies, enum-with-abstract-methods, default-method
interfaces, abstract classes, annotation types, same simple name in
different packages by import, JDK `Optional` / `Stream` / `List` via
`ReflectionTypeSolver`, multi-source-root cross-references
(`src/main` ↔ `src/test`), wildcard imports, cyclic imports.
- **Layer 3** — `JavaSymbolResolverConcurrencyTest` (already shipped
in the prior commit): virtual-thread fan-out under `N=200` files /
`256` concurrent calls, garbage-input variant.
- **Layer 4** — `JavaSymbolResolverPathologicalTest` (3 tests):
10K-line class, 1000 imports (most unresolvable), 10-deep generic
nesting; per-test `@Timeout` is the regression sentinel against
quadratic memoization.
- **Layer 5** — `JavaSymbolResolverAdversarialTest` (5 tests):
unbalanced braces (strict-success → `EmptyResolved`), mis-tagged
Kotlin / random-bytes (no exception, no null), mixed source root
with `.java` + `.txt` siblings, empty source root (no Java files
anywhere) bootstraps via `ReflectionTypeSolver` alone.
- **Layer 6** — `JavaSymbolResolverDeterminismTest` (already shipped):
same input → same FQN 25× in a row, two independent resolvers
agree, rebootstrap is observably idempotent, deeper FQNs are stable.
- **Layer 7** — `E2EResolverPetclinicTest` (env-gated): runs the
resolver against every `.java` under `$E2E_PETCLINIC_DIR`, asserts
bootstrap < 10 s, no exception, > 50% files produce `JavaResolved`
(i.e. strict-success isn't false-rejecting valid Java). Lighter than
spec §12 Layer 7's full precision/recall comparison — that requires
a pre-resolver baseline JSON checked into test resources, captured
at implementation time. This stand-in is the strongest signal until
that baseline lands.
- **Layer 8** — `JavaSymbolResolverRandomizedTest` (1 test, 100
samples): hand-rolled randomized generator with fixed seed; per the
plan's license guidance, jqwik (EPL-2.0) is not on the preferred-
license list, and this is the documented JUnit + `java.util.Random`
fallback. Properties: never throws, never returns null, completes
per file in < 1 s.
- **Layer 9** — `mutation` Maven profile (non-default): adds
`pitest-maven` 1.18.0 (Apache-2.0) targeting
`intelligence.resolver.*` and `model.Confidence`. Run with
`mvn -P mutation org.pitest:pitest-maven:mutationCoverage
-Dfrontend.skip=true -Ddependency-check.skip=true`. Reports under
`target/pit-reports/`.
- Four robustness fixes from a dual-agent (superpowers + codex)
brainstorm landed on the same branch: `volatile` on
`JavaSymbolResolver`'s `solver` / `combined` fields, strict
parse-success check in the String-source branch (was silently
emitting partial-CU edges on broken parses), `StackOverflowError`
catch in `Analyzer.resolveFor` (pathological generics no longer kill
virtual threads), `try-with-resources` on the `Files.walk` in
`JavaSourceRootDiscovery.containsJavaFile` (fd leak fix). 26 new
tests on top of the resolver wiring slice's 18 — full suite at 3618
/ 0 / 32 skipped, +1 skip is the env-gated E2E petclinic test.
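
The per-file fallback described in the entries above (swallowing `ResolutionException`, `RuntimeException`, and `StackOverflowError`, then handing detectors an empty result) can be sketched roughly as follows. This is a minimal illustration, not the project's `Analyzer.resolveFor`: every type and signature below (`Resolved`, `EmptyResolved`, `FileResolved`, `SymbolResolver`, and this `resolveFor`) is a hypothetical stand-in defined only for the example.

```java
/** Hypothetical stand-ins for the resolver types; names and signatures are assumptions. */
final class ResolveGuardSketch {

    /** Whatever a resolver hands back to detectors. */
    interface Resolved {}

    /** Shared "nothing resolved" value, playing the role the changelog gives EmptyResolved.INSTANCE. */
    enum EmptyResolved implements Resolved { INSTANCE }

    /** A successful resolution; the payload here is just the file path. */
    record FileResolved(String path) implements Resolved {}

    /** Checked failure type, assumed for illustration. */
    static final class ResolutionException extends Exception {
        ResolutionException(String message) { super(message); }
    }

    /** Hypothetical resolver contract: may fail on a single file. */
    interface SymbolResolver {
        Resolved resolve(String path, String content) throws ResolutionException;
    }

    /**
     * Resolves one file and degrades to the empty result on any per-file failure,
     * so a single bad input cannot abort the whole analysis pass.
     */
    static Resolved resolveFor(SymbolResolver resolver, String path, String content) {
        if (resolver == null) {
            return EmptyResolved.INSTANCE; // no resolver registered for this language
        }
        try {
            Resolved result = resolver.resolve(path, content);
            return result != null ? result : EmptyResolved.INSTANCE;
        } catch (ResolutionException | RuntimeException | StackOverflowError e) {
            // StackOverflowError is an Error, not an Exception, so it has to be named
            // explicitly; deeply nested generics are the scenario the changelog cites.
            return EmptyResolved.INSTANCE;
        }
    }
}
```

Catching `StackOverflowError` alone, rather than `Throwable`, keeps unrelated fatal errors such as `OutOfMemoryError` propagating as usual.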

### Changed

- Documentation count drift fixed: detector total updated from **97 → 99**
7 changes: 6 additions & 1 deletion CLAUDE.md
@@ -421,7 +421,12 @@ bean for code paths that haven't been ported yet.
- **`@ActiveProfiles("test")`**: Required on any `@SpringBootTest` to avoid Neo4j startup conflicts.
- **Dead code detection**: Must filter by semantic edges only (calls, imports, depends_on). Exclude structural edges (contains, defines) and entry points (endpoints, config files).
- **H2 reserved words**: `key`, `value`, `order` are reserved in H2 SQL. Use `meta_key`, `meta_value` etc. in CREATE TABLE statements.
- **Cache versioning**: `AnalysisCache` has a `CACHE_VERSION` constant (currently `4`). Bump it when changing the hash algorithm or H2 schema so stale caches are auto-cleared on next run.
- **Cache versioning**: `AnalysisCache` has a `CACHE_VERSION` constant (currently `5`, bumped from `4` for the resolver `confidence` + `source` schema). Bump it when changing the hash algorithm, H2 schema, or any field that becomes mandatory on cached nodes/edges so stale caches are auto-cleared on next run.
- **Symbol resolver runs at index-time only.** `Analyzer.bootstrapResolvers()` and `Analyzer.resolveFor()` are wired into `run` / `runBatchedIndex` / `runSmartIndex` paths only — never at `serve`. The resolver SPI lives under `intelligence/resolver/`. If you find yourself reaching for `ResolverRegistry` from a serve-mode code path, stop — the graph is the source of truth at serve.
- **`Confidence` + `source` are mandatory on every `CodeNode` / `CodeEdge`.** `DetectorEmissionDefaults.applyDefaults` stamps the per-detector floor (`LEXICAL` for regex bases, `SYNTACTIC` for AST/JavaParser/structured bases) at the orchestration boundary; detectors that consume `ctx.resolved()` upgrade to `Confidence.RESOLVED` and attach a `target_fqn` property. Reading legacy data without these fields is non-throwing — they read back as `LEXICAL` / null.
- **`JavaSymbolResolver.resolve()` allocates a fresh `JavaParser` per call.** JavaParser instances aren't thread-safe and `resolve()` is invoked from virtual threads concurrently. Per-call allocation is intentional, not a perf bug — don't "optimize" by sharing one parser across calls.
- **`JavaSymbolResolver.resolve(String)` enforces strict parse-success.** When JavaParser flags any problem (`!parseResult.isSuccessful()`), the resolver returns `EmptyResolved.INSTANCE` rather than a partial-CU `JavaResolved`. This prevents silent simple-name-only edges from broken parses that look like RESOLVED-tier coverage. Detectors must treat `ctx.resolved()` returning `EmptyResolved` as "lexical fallback" — never assume RESOLVED edges land for every Java file.
- **`JavaSymbolResolver` fields are `volatile`.** `combined` and `solver` are written by `bootstrap()` and read by `resolve()` + the public accessors from arbitrary virtual-thread carriers. The JLS Thread Start Rule covers the `executor.submit()` path; `volatile` covers post-bootstrap callers on other threads. Don't drop the keyword.
- **FileHasher uses SHA-256**: Changed from MD5. Hash output is 64 hex chars (not 32). Tests must expect 64-char hashes.
- **SnakeYAML parses `on` as Boolean.TRUE**: In YAML files, bare `on` key becomes `Boolean.TRUE`. Use `String.valueOf(key)` comparisons, not `Boolean.TRUE.equals(key)` (SonarCloud S2159).
- **Regex possessive quantifiers**: Use `*+` instead of `*` for nested quantifiers like `([^"\\]*(?:\\.[^"\\]*)*)` → `([^"\\]*+(?:\\.[^"\\]*+)*+)` to prevent stack overflow (SonarCloud S5998).
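
The "upgrade to `Confidence.RESOLVED` or fall back to the detector floor" rule from the resolver bullets above might look something like this on the detector side. It is a sketch under stated assumptions: `Edge`, the `Resolved` lookup interface, `fqnOf`, and `emitEdge` are hypothetical names invented for the example and do not reflect the project's actual `DetectorContext` or `CodeEdge` API. Only the `target_fqn` property name and the three confidence tiers come from the notes above.

```java
import java.util.Map;
import java.util.Optional;

/** Illustrative only: every type here is a hypothetical stand-in. */
final class ResolvedEdgeSketch {

    enum Confidence { LEXICAL, SYNTACTIC, RESOLVED }

    /** Simplified edge: kind, endpoints, confidence tier, and string properties. */
    record Edge(String kind, String sourceId, String target,
                Confidence confidence, Map<String, String> properties) {}

    /** Assumed view of a resolver result: maps a simple type name to an FQN when it can. */
    interface Resolved {
        Optional<String> fqnOf(String simpleName);
    }

    /** Empty result: resolves nothing, so callers take the fallback path. */
    static final Resolved EMPTY = simpleName -> Optional.empty();

    /**
     * Emits a RESOLVED edge carrying target_fqn when the type resolves; otherwise
     * emits the same simple-name edge at the detector's floor confidence.
     */
    static Edge emitEdge(String kind, String sourceId, String targetSimpleName,
                         Confidence floor, Resolved resolved) {
        Resolved view = (resolved != null) ? resolved : EMPTY;
        return view.fqnOf(targetSimpleName)
                .map(fqn -> new Edge(kind, sourceId, targetSimpleName,
                        Confidence.RESOLVED, Map.of("target_fqn", fqn)))
                .orElseGet(() -> new Edge(kind, sourceId, targetSimpleName,
                        floor, Map.of()));
    }
}
```

Because the fallback branch keeps the simple-name target and adds no properties, a file whose parse was rejected produces the same edge shape as before, just without the RESOLVED stamp.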
8 changes: 5 additions & 3 deletions PROJECT_SUMMARY.md
@@ -23,7 +23,7 @@ Read directly from the `pom.xml` `<properties>` block and `src/main/frontend/pac
| Graph DB | Neo4j Embedded 2026.02.3 (Community) | `pom.xml` `<neo4j.version>` |
| MCP | Spring AI 2.0.0-M3 (`spring-ai-starter-mcp-server-webmvc`) | `pom.xml` `<spring-ai.version>` |
| CLI | Picocli 4.7.7 (`picocli-spring-boot-starter`) | `pom.xml` `<picocli.version>` |
| AST (Java) | JavaParser 3.28.0 | `[CLAUDE.md]` — `pom.xml` references via dep |
| AST + symbols (Java) | JavaParser 3.28.0 + `javaparser-symbol-solver-core` 3.28.0 (Apache-2.0) | `pom.xml` `javaparser` deps; `intelligence/resolver/java/JavaSymbolResolver.java` |
| Parsers (35+ langs) | ANTLR 4.13.2 (TS/JS, Python, Go, C#, Rust, C++) | `[CLAUDE.md]` |
| Cache | H2 in embedded mode (incremental analysis cache) | `src/main/java/io/github/randomcodespace/iq/cache/AnalysisCache.java` |
| Frontend | React 18.3 + AntD 5.24 + ECharts 5.6 + react-router 7 | `src/main/frontend/package.json` |
@@ -114,7 +114,7 @@ CI gate is `mvn verify` — runs unit + integration tests **plus** SpotBugs and
**Required env / external services:** none. codeiq is offline-first by design — Neo4j and H2 are embedded; no external server, no network calls at runtime. Air-gapped install: `git clone` + Maven mirror + `mvn package`. See [`shared/runbooks/first-time-setup.md`](shared/runbooks/first-time-setup.md).

**Cache + graph dirs at runtime** (created in your scanned repo):
- `.codeiq/cache/` — H2 incremental analysis cache (`CACHE_VERSION=4` constant near the top of `cache/AnalysisCache.java`)
- `.codeiq/cache/` — H2 incremental analysis cache (`CACHE_VERSION=5` constant near the top of `cache/AnalysisCache.java`; bumped from 4 for the resolver `confidence` + `source` schema, so stale v4 caches drop and rebuild on first run after upgrade)
- `.codeiq/graph/graph.db/` — Neo4j Embedded data dir

## Conventions an agent must respect
@@ -138,7 +138,9 @@ CI gate is `mvn verify` — runs unit + integration tests **plus** SpotBugs and
- **Edges must be attached to source nodes before `bulkSave()`.** Cypher `MATCH` silently returns 0 rows for missing source IDs — pre-validate.
- **`@ActiveProfiles("test")` is required on every `@SpringBootTest`** to avoid Neo4j auto-startup conflicts.
- **`AnalysisCache` uses a `ReentrantReadWriteLock`** (not `synchronized`). JEP 491 (Java 25) means lock primitives no longer pin virtual-thread carriers; the read/write lock is what prevents `ClosedChannelException` on H2's MVStore under concurrent virtual-thread access. Don't "simplify" to `synchronized`.
- **Bump `CACHE_VERSION` in `cache/AnalysisCache.java`** (top of file) when you change the file-hash algorithm or H2 schema. Stale caches auto-clear on next run.
- **Bump `CACHE_VERSION` in `cache/AnalysisCache.java`** (top of file) when you change the file-hash algorithm or H2 schema. Stale caches auto-clear on next run. Currently `5` (bumped from 4 for the resolver `confidence` + `source` schema).
- **Symbol resolver is index-time only.** `Analyzer.bootstrapResolvers()` is reached from `run` / `runBatchedIndex` / `runSmartIndex` only — never at `serve`. The SPI lives at `intelligence/resolver/`; the Java backend wraps `javaparser-symbol-solver-core`. RESOLVED-tier edges and `target_fqn` properties land at index-time and are then served read-only from Neo4j.
- **`JavaSymbolResolver.resolve(String)` enforces strict parse-success.** Partial-CU outputs from JavaParser problems are converted to `EmptyResolved` so the graph never carries phantom RESOLVED edges from broken parses. Detectors must handle `EmptyResolved` as "lexical fallback".
- **SnakeYAML parses bare `on` as `Boolean.TRUE`.** Compare YAML keys with `String.valueOf(key)`, not `Boolean.TRUE.equals(key)` (SonarCloud S2159).
- **Determinism gate:** every new detector needs a determinism test (run twice, assert equal output) — see existing `*DetectorTest.java` for the pattern.
- **First `mvn verify` downloads ~1 GB NVD database** for OWASP dependency-check. Override locally with `-Ddependency-check.skip=true`.
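
The determinism gate mentioned in the list above ("run twice, assert equal output") can be approximated with a small helper like the one below. This is a sketch, not the pattern used in the project's existing `*DetectorTest.java` files, which are not shown here: `detect` is a toy regex detector and `isDeterministic` is a hypothetical helper, both invented for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative determinism check; both methods are hypothetical. */
final class DeterminismCheckSketch {

    private static final Pattern CLASS_DECL = Pattern.compile("\\bclass\\s+(\\w+)");

    /** Toy detector: one entry per declared class name, sorted so output order is stable. */
    static List<String> detect(String source) {
        List<String> out = new ArrayList<>();
        Matcher m = CLASS_DECL.matcher(source);
        while (m.find()) {
            out.add("CLASS:" + m.group(1));
        }
        Collections.sort(out); // sorting removes any dependence on discovery order
        return out;
    }

    /** Runs the detector repeatedly on the same input and compares each result to the first. */
    static <T> boolean isDeterministic(Function<String, T> detector, String input, int runs) {
        T first = detector.apply(input);
        for (int i = 1; i < runs; i++) {
            if (!first.equals(detector.apply(input))) {
                return false;
            }
        }
        return true;
    }
}
```

In a JUnit test this would typically reduce to a single assertion that `isDeterministic(DeterminismCheckSketch::detect, source, 2)` holds for a representative fixture.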
36 changes: 36 additions & 0 deletions pom.xml
@@ -489,6 +489,42 @@
</build>

<profiles>
<!--
Mutation testing profile — Phase 7 Layer 9 (non-gating).
Run: mvn -P mutation org.pitest:pitest-maven:mutationCoverage
-Dfrontend.skip=true -Ddependency-check.skip=true
Targets the resolver SPI surface and Confidence model. Reports under
target/pit-reports/. Apache-2.0 licensed (preferred-license tier).
-->
<profile>
<id>mutation</id>
<build>
<plugins>
<plugin>
<groupId>org.pitest</groupId>
<artifactId>pitest-maven</artifactId>
<version>1.18.0</version>
<configuration>
<targetClasses>
<param>io.github.randomcodespace.iq.intelligence.resolver.*</param>
<param>io.github.randomcodespace.iq.intelligence.resolver.java.*</param>
<param>io.github.randomcodespace.iq.model.Confidence</param>
</targetClasses>
<targetTests>
<param>io.github.randomcodespace.iq.intelligence.resolver.*</param>
<param>io.github.randomcodespace.iq.intelligence.resolver.java.*</param>
<param>io.github.randomcodespace.iq.model.ConfidenceTest</param>
</targetTests>
<outputFormats>
<outputFormat>HTML</outputFormat>
<outputFormat>XML</outputFormat>
</outputFormats>
<timestampedReports>false</timestampedReports>
</configuration>
</plugin>
</plugins>
</build>
</profile>
<profile>
<id>release</id>
<build>