Commit 15bd098

aksOps and claude committed
Eliminate all code duplication + add 150 tests for 20 dark detectors

Utils migration:
- 63 detector files now import from detectors/utils.py
- 0 remaining ctx.content.decode duplicates (was 63)
- 0 remaining private _find_line_number copies (was 9)

Test coverage:
- 150 new tests across 20 previously untested detectors
- Covers: Spring REST/JPA/Kafka/Events/gRPC, Flask/Django/FastAPI/SQLAlchemy/Celery, Express/NestJS/GraphQL/TypeORM, C++, PowerShell
- 565 total tests, all passing in 0.92s

Benchmark: contoso-real-estate
488 files, 2,313 nodes, 2,905 edges, 3.7s, no performance regression, +56 files from new extensions

Updated CLAUDE.md with benchmark requirements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent 5689c01 commit 15bd098

87 files changed

Lines changed: 2540 additions & 151 deletions

CLAUDE.md

Lines changed: 25 additions & 2 deletions
@@ -59,9 +59,32 @@ FileDiscovery → Parsers → Detectors → GraphBuilder (buffered) → Linkers
 
 ## Testing
 
-- `pytest tests/ -x -q` — must always pass (currently 361 tests)
+- `pytest tests/ -x -q` — must always pass (currently 565 tests)
 - Every detector needs: positive match test, negative match test, determinism test
-- Benchmark on spring-boot (10K files) for performance regression checks
+- All detectors use shared `detectors/utils.py` — decode_text, find_line_number, etc.
+
+## Benchmark Requirements
+
+**After every change**, run a clean benchmark on a small project to verify:
+1. No performance regression (time should not increase significantly)
+2. 100% determinism (2 runs produce identical node/edge counts)
+3. Coverage doesn't decrease (file/node/edge counts should not drop)
+
+**Benchmark procedure:**
+```bash
+rm -rf ~/projects/testDir/contoso-real-estate/.code-intelligence/
+find ~/projects/testDir/contoso-real-estate -name ".code_intelligence_cache*" -delete
+# Run twice
+time code-intelligence analyze ~/projects/testDir/contoso-real-estate --full -j 8
+time code-intelligence analyze ~/projects/testDir/contoso-real-estate --full -j 8
+```
+
+If `testDir/contoso-real-estate` is not available, clone an official secure project:
+```bash
+git clone --depth 1 https://github.com/Azure-Samples/contoso-real-estate.git ~/projects/testDir/contoso-real-estate
+```
+
+**Baseline (contoso-real-estate, 488 files):** 2,313 nodes, 2,905 edges, ~3.7s
 - Cross-backend parity test on contoso-real-estate for data quality
 
 ## Key Files
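
The three benchmark checks added to CLAUDE.md (no performance regression, identical counts across two runs, no drop in coverage) could be automated with a small comparison helper. The following is only a sketch and is not part of this commit: `RunStats`, `check_benchmark`, and the 25% slowdown tolerance are hypothetical, and the counts would need to be copied from the tool's output by hand, since its output format is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    """Counts and wall time from one benchmark run (hypothetical container)."""
    files: int
    nodes: int
    edges: int
    seconds: float


def check_benchmark(baseline: RunStats, run_a: RunStats, run_b: RunStats,
                    max_slowdown: float = 1.25) -> list[str]:
    """Return a list of problems; an empty list means all three checks pass."""
    problems: list[str] = []

    # Determinism: both runs must report identical counts.
    counts_a = (run_a.files, run_a.nodes, run_a.edges)
    counts_b = (run_b.files, run_b.nodes, run_b.edges)
    if counts_a != counts_b:
        problems.append(f"non-deterministic: {counts_a} != {counts_b}")

    # Coverage: no count may fall below the baseline.
    for name in ("files", "nodes", "edges"):
        if getattr(run_a, name) < getattr(baseline, name):
            problems.append(f"coverage dropped: {name}")

    # Performance: the faster run must stay within the assumed tolerance.
    if min(run_a.seconds, run_b.seconds) > baseline.seconds * max_slowdown:
        problems.append("performance regression")

    return problems


# Baseline figures taken from the commit message.
BASELINE = RunStats(files=488, nodes=2313, edges=2905, seconds=3.7)
```

Comparing against the faster of the two runs is a design choice made here to reduce noise from cold caches; the tolerance value is an assumption, since the source only says that time "should not increase significantly".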

README.md

Lines changed: 3 additions & 3 deletions
@@ -22,9 +22,9 @@
 <!-- DYNAMIC:vulnerabilities --><a href="https://github.com/RandomCodeSpace/code-iq/security/dependabot"><img src="https://img.shields.io/badge/vulnerabilities-0-brightgreen?style=flat-square&logo=hackthebox&logoColor=white" alt="0 Vulnerabilities"></a><!-- /DYNAMIC:vulnerabilities -->
 <!-- DYNAMIC:detectors --><a href="https://github.com/RandomCodeSpace/code-iq"><img src="https://img.shields.io/badge/detectors-75-brightgreen?style=flat-square&logo=codefactor&logoColor=white" alt="75 Detectors"></a><!-- /DYNAMIC:detectors -->
 <!-- DYNAMIC:languages --><a href="https://github.com/RandomCodeSpace/code-iq"><img src="https://img.shields.io/badge/languages-35-blue?style=flat-square&logo=stackblitz&logoColor=white" alt="35 Languages"></a><!-- /DYNAMIC:languages -->
-<!-- DYNAMIC:tests --><a href="https://github.com/RandomCodeSpace/code-iq"><img src="https://img.shields.io/badge/tests-415%20passed-brightgreen?style=flat-square&logo=pytest&logoColor=white" alt="415 passed Tests"></a><!-- /DYNAMIC:tests -->
-<!-- DYNAMIC:files --><a href="https://github.com/RandomCodeSpace/code-iq"><img src="https://img.shields.io/badge/files-204-informational?style=flat-square&logo=files&logoColor=white" alt="204 Files"></a><!-- /DYNAMIC:files -->
-<!-- DYNAMIC:loc --><a href="https://github.com/RandomCodeSpace/code-iq"><img src="https://img.shields.io/badge/LOC-23%2C370-informational?style=flat-square&logo=codacy&logoColor=white" alt="23,370 Loc"></a><!-- /DYNAMIC:loc -->
+<!-- DYNAMIC:tests --><a href="https://github.com/RandomCodeSpace/code-iq"><img src="https://img.shields.io/badge/tests-565%20passed-brightgreen?style=flat-square&logo=pytest&logoColor=white" alt="565 passed Tests"></a><!-- /DYNAMIC:tests -->
+<!-- DYNAMIC:files --><a href="https://github.com/RandomCodeSpace/code-iq"><img src="https://img.shields.io/badge/files-226-informational?style=flat-square&logo=files&logoColor=white" alt="226 Files"></a><!-- /DYNAMIC:files -->
+<!-- DYNAMIC:loc --><a href="https://github.com/RandomCodeSpace/code-iq"><img src="https://img.shields.io/badge/LOC-25%2C736-informational?style=flat-square&logo=codacy&logoColor=white" alt="25,736 Loc"></a><!-- /DYNAMIC:loc -->
 </p>
 
 ---

src/code_intelligence/detectors/auth/certificate_auth.py

Lines changed: 2 additions & 1 deletion
@@ -6,6 +6,7 @@
 from dataclasses import dataclass
 
 from code_intelligence.detectors.base import DetectorContext, DetectorResult
+from code_intelligence.detectors.utils import decode_text
 from code_intelligence.models.graph import GraphNode, NodeKind, SourceLocation
 
 
@@ -83,7 +84,7 @@ class CertificateAuthDetector:
 
     def detect(self, ctx: DetectorContext) -> DetectorResult:
         result = DetectorResult()
-        text = ctx.content.decode("utf-8", errors="replace")
+        text = decode_text(ctx)
         lines = text.split("\n")
 
         # Track which lines already produced a node (first match wins per line).
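
The body of `detectors/utils.py` is not shown in this part of the diff. Judging only from the line it replaces, `decode_text(ctx)` presumably wraps the same UTF-8 decode with `errors="replace"`. The sketch below illustrates that idea under stated assumptions: `FakeContext` is a hypothetical stand-in for `DetectorContext`, and the `find_line_number(text, offset)` signature is a guess, since the commit only mentions the helper by name.

```python
from dataclasses import dataclass


@dataclass
class FakeContext:
    """Hypothetical stand-in for DetectorContext; only `content` is modelled."""
    content: bytes


def decode_text(ctx: FakeContext) -> str:
    """Decode raw file bytes as UTF-8, replacing undecodable bytes with U+FFFD."""
    return ctx.content.decode("utf-8", errors="replace")


def find_line_number(text: str, offset: int) -> int:
    """Return the 1-based line number containing character `offset` (assumed signature)."""
    return text.count("\n", 0, offset) + 1


# Usage: the invalid byte 0xFF is replaced rather than raising UnicodeDecodeError.
ctx = FakeContext(content=b"first\nsec\xffond\nthird")
text = decode_text(ctx)
third_line = find_line_number(text, text.index("third"))
```

Centralizing the decode in one helper means a later change to the encoding policy touches a single function instead of 63 detector files.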

src/code_intelligence/detectors/auth/ldap_auth.py

Lines changed: 2 additions & 1 deletion
@@ -5,6 +5,7 @@
 import re
 
 from code_intelligence.detectors.base import DetectorContext, DetectorResult
+from code_intelligence.detectors.utils import decode_text
 from code_intelligence.models.graph import GraphNode, NodeKind, SourceLocation
 
 # -- Java patterns --
@@ -56,7 +57,7 @@ def detect(self, ctx: DetectorContext) -> DetectorResult:
         if ctx.language not in _LANGUAGE_PATTERNS:
             return result
 
-        text = ctx.content.decode("utf-8", errors="replace")
+        text = decode_text(ctx)
         lines = text.split("\n")
         patterns = _LANGUAGE_PATTERNS[ctx.language]
         seen_lines: set[int] = set()
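
CLAUDE.md requires a positive match test, a negative match test, and a determinism test for every detector. The sketch below shows what those three tests might look like for a line-based detector that uses a `seen_lines` set so that the first matching pattern wins per line. It is illustrative only: `detect_lines` and the two regexes are hypothetical and do not reproduce the actual LDAP detector or its `DetectorResult` output.

```python
import re

# Hypothetical patterns; the real detector's patterns are not shown in this diff.
_PATTERNS = [
    ("ldap_url", re.compile(r"ldaps?://")),
    ("ldap_class", re.compile(r"LdapContextSource")),
]


def detect_lines(text: str) -> list[tuple[int, str]]:
    """Return (line number, pattern name) pairs, at most one per line."""
    seen_lines: set[int] = set()
    hits: list[tuple[int, str]] = []
    lines = text.split("\n")
    for name, pattern in _PATTERNS:
        for lineno, line in enumerate(lines, start=1):
            if lineno in seen_lines:
                continue  # an earlier pattern already claimed this line
            if pattern.search(line):
                seen_lines.add(lineno)
                hits.append((lineno, name))
    return sorted(hits)  # stable ordering keeps output deterministic


SAMPLE = 'new LdapContextSource("ldap://host")\nint x = 1;\nLdapContextSource src;'


def test_positive_match():
    assert detect_lines(SAMPLE) == [(1, "ldap_url"), (3, "ldap_class")]


def test_negative_match():
    assert detect_lines("int x = 1;\nreturn x;") == []


def test_determinism():
    assert detect_lines(SAMPLE) == detect_lines(SAMPLE)
```

Line 1 of the sample matches both patterns but is reported once, under the pattern that appears first in the list.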

src/code_intelligence/detectors/auth/session_header_auth.py

Lines changed: 2 additions & 1 deletion
@@ -6,6 +6,7 @@
 from dataclasses import dataclass
 
 from code_intelligence.detectors.base import DetectorContext, DetectorResult
+from code_intelligence.detectors.utils import decode_text
 from code_intelligence.models.graph import GraphNode, NodeKind, SourceLocation
 
 
@@ -84,7 +85,7 @@ def detect(self, ctx: DetectorContext) -> DetectorResult:
         if ctx.language not in self.supported_languages:
             return result
 
-        text = ctx.content.decode("utf-8", errors="replace")
+        text = decode_text(ctx)
         lines = text.split("\n")
         seen_lines: set[int] = set()
 
src/code_intelligence/detectors/config/batch_structure.py

Lines changed: 2 additions & 1 deletion
@@ -5,6 +5,7 @@
 import re
 
 from code_intelligence.detectors.base import DetectorContext, DetectorResult
+from code_intelligence.detectors.utils import decode_text
 from code_intelligence.models.graph import (
     EdgeKind,
     GraphEdge,
@@ -28,7 +29,7 @@ def detect(self, ctx: DetectorContext) -> DetectorResult:
         result = DetectorResult()
 
         try:
-            text = ctx.content.decode("utf-8", errors="replace")
+            text = decode_text(ctx)
         except Exception:
             return result
 
src/code_intelligence/detectors/config/ini_structure.py

Lines changed: 2 additions & 1 deletion
@@ -5,6 +5,7 @@
 import configparser
 
 from code_intelligence.detectors.base import DetectorContext, DetectorResult
+from code_intelligence.detectors.utils import decode_text
 from code_intelligence.models.graph import (
     EdgeKind,
     GraphEdge,
@@ -41,7 +42,7 @@ def detect(self, ctx: DetectorContext) -> DetectorResult:
 
         # Parse INI from raw content
         try:
-            text = ctx.content.decode("utf-8", errors="replace")
+            text = decode_text(ctx)
             parser = configparser.ConfigParser()
             parser.read_string(text)
         except Exception:

src/code_intelligence/detectors/config/pyproject_toml.py

Lines changed: 2 additions & 1 deletion
@@ -6,6 +6,7 @@
 import sys
 
 from code_intelligence.detectors.base import DetectorContext, DetectorResult
+from code_intelligence.detectors.utils import decode_text
 from code_intelligence.models.graph import (
     EdgeKind,
     GraphEdge,
@@ -40,7 +41,7 @@ def detect(self, ctx: DetectorContext) -> DetectorResult:
             return result
 
         try:
-            data = tomllib.loads(ctx.content.decode("utf-8", errors="replace"))
+            data = tomllib.loads(decode_text(ctx))
         except Exception:
             return result
 
src/code_intelligence/detectors/config/sql_structure.py

Lines changed: 2 additions & 1 deletion
@@ -5,6 +5,7 @@
 import re
 
 from code_intelligence.detectors.base import DetectorContext, DetectorResult
+from code_intelligence.detectors.utils import decode_text
 from code_intelligence.models.graph import (
     EdgeKind,
     GraphEdge,
@@ -45,7 +46,7 @@ def detect(self, ctx: DetectorContext) -> DetectorResult:
         result = DetectorResult()
 
         try:
-            text = ctx.content.decode("utf-8", errors="replace")
+            text = decode_text(ctx)
         except Exception:
             return result
 
src/code_intelligence/detectors/config/toml_structure.py

Lines changed: 2 additions & 1 deletion
@@ -5,6 +5,7 @@
 import sys
 
 from code_intelligence.detectors.base import DetectorContext, DetectorResult
+from code_intelligence.detectors.utils import decode_text
 from code_intelligence.models.graph import (
     EdgeKind,
     GraphEdge,
@@ -52,7 +53,7 @@ def detect(self, ctx: DetectorContext) -> DetectorResult:
 
         # Parse TOML from raw content
         try:
-            data = tomllib.loads(ctx.content.decode("utf-8", errors="replace"))
+            data = tomllib.loads(decode_text(ctx))
         except Exception:
             return result
 