|
1 | | -# Benchmark Results — Java vs Python |
| 1 | +# Benchmark Results -- Java vs Python |
2 | 2 |
|
3 | | -**Date:** 2026-03-29 |
4 | | -**Machine:** 4 CPU cores, 16GB RAM |
5 | | -**Java:** 25 LTS, Spring Boot 4.0.5, ZGC, Virtual Threads |
6 | | -**Python:** 3.12, OSSCodeIQ 0.1.0 (8 ThreadPoolExecutor workers) |
| 3 | +Date: 2026-03-29 |
| 4 | +Machine: 4 CPU cores, 16 GB RAM |
| 5 | +Java: 25.0.2, Spring Boot 4.0.5, ZGC, embedded Neo4j 2026.02.3 |
| 6 | +Python: 3.12.13, OSSCodeIQ 0.0.0 (main branch, NetworkX backend) |
7 | 7 |
|
8 | | -## Results Summary |
| 8 | +## Results |
9 | 9 |
|
10 | | -| Project | Files | Python Nodes | Java Nodes | Parity | Python Edges | Java Edges | Parity | Python Time | Java Time | Speedup | |
11 | | -|---------|-------|-------------|------------|--------|-------------|------------|--------|-------------|-----------|---------| |
12 | | -| spring-boot | 10.5K/10.9K | 27,446 | 27,987 | **102%** | 32,890 | 36,922 | **112%** | 45.9s | 13s | **3.5x** | |
13 | | -| kafka | 6.9K/7.0K | 58,080 | 62,671 | **108%** | 99,974 | 120,376 | **120%** | 86.2s | 60s | **1.4x** | |
14 | | -| contoso-real-estate | 484/488 | 3,844 | 4,034 | **105%** | 2,906 | 4,039 | **139%** | 5.7s | 1.3s | **4.4x** | |
| 10 | +| Project | Files (Java) | Files (Python) | Python Nodes | Java Nodes | Python Edges | Java Edges | Python Time | Java Time (analysis) | Java Time (wall) | Speedup (analysis) | Consistent? | |
| 11 | +|---------|-------------|----------------|-------------|------------|-------------|------------|-------------|---------------------|-------------------|---------------------|-------------| |
| 12 | +| spring-boot | 10524 | 10872 | 27446 | 27987 | 32890 | 39776 | 56.8s | 47.8s avg | 67.2s avg | 1.2x | Yes (3/3) |
| 13 | +| kafka | 6919 | 7003 | 58080 | 62671 | 99974 | 120376 | 96.8s | 63.5s avg | 74.0s avg | 1.5x | Yes (3/3) |
| 14 | +| contoso-real-estate | 484 | 488 | 3844 | 4034 | 2906 | 4039 | 7.6s | 1.3s avg | 10.2s avg | 5.8x | Yes (3/3) | |
| 15 | +| benchmark | 311284 | N/A | N/A | N/A | N/A | N/A | OOM/timeout | OOM (3GB) | N/A | N/A | N/A | |
15 | 16 |
|
16 | | -**Java surpasses Python on every project** — more nodes, more edges, faster execution. |
| 17 | +### Notes on timing |
| 18 | +- **Java Time (analysis)**: Time reported by the Analyzer itself (excludes Spring Boot startup and Neo4j initialization)
| 19 | +- **Java Time (wall)**: Total wall-clock time, including JVM and Spring Boot startup (roughly 9-20s of overhead in these runs)
| 20 | +- **Python Time**: Wall-clock time (startup overhead is minimal)
| 21 | +- **Speedup**: Python wall time divided by Java analysis time; Python's startup is negligible, so its wall time approximates its analysis time
17 | 22 |
|
18 | | -## Consistency (3 Java runs per project, clean environment each time) |
| 23 | +## Consistency (3 runs per project -- Java) |
19 | 24 |
|
20 | 25 | | Project | Run 1 (nodes/edges) | Run 2 | Run 3 | Identical? | |
21 | 26 | |---------|---------------------|-------|-------|------------| |
22 | | -| spring-boot | 27,987 / 36,922 | 27,987 / 36,922 | 27,987 / 36,922 | **Yes** | |
23 | | -| kafka | 62,671 / 120,376 | 62,671 / 120,376 | 62,671 / 120,376 | **Yes** | |
24 | | -| contoso-real-estate | 4,034 / 4,039 | 4,034 / 4,039 | 4,034 / 4,039 | **Yes** | |
25 | | - |
26 | | -**100% deterministic** — identical results across all runs for every project. |
27 | | - |
28 | | -## Java Timing Consistency (analysis time only, excludes JVM startup) |
29 | | - |
30 | | -| Project | Run 1 | Run 2 | Run 3 | Variance | |
31 | | -|---------|-------|-------|-------|----------| |
32 | | -| spring-boot | 13.0s | 12.8s | 13.1s | <3% | |
33 | | -| kafka | 69.6s | 61.5s | 59.3s | ~15% (JIT warmup effect) | |
34 | | -| contoso-real-estate | 1.4s | 1.3s | 1.3s | <8% | |
35 | | - |
36 | | -## Why Java Finds More |
37 | | - |
38 | | -Java detectors find MORE nodes and edges than Python because: |
39 | | -1. **JavaParser AST** — 6 Java detectors upgraded from regex to full AST parsing (ClassHierarchy, SpringRest, JpaEntity, SpringSecurity, PublicApi, ConfigDef). Finds inner classes, resolved types, inherited annotations that regex misses. |
40 | | -2. **Better structured parsing** — StructuredParser returns properly wrapped format, config detectors extract more keys. |
41 | | -3. **ModuleContainmentLinker** — correctly sets module on all nodes, producing more CONTAINS edges. |
42 | | - |
43 | | -## Logging Output (sample from spring-boot) |
44 | | - |
45 | | -``` |
46 | | -🔍 Scanning /home/dev/projects/testDir/spring-boot ... |
47 | | -INFO FileDiscovery : Discovered 10524 files |
48 | | -INFO Analyzer : Analysis complete: 27987 nodes, 36922 edges in 13012ms |
49 | | -✅ Analysis complete |
50 | | - Files discovered: 10524 |
51 | | - Files analyzed: 9872 |
52 | | - Nodes: 27987 |
53 | | - Edges: 36922 |
54 | | - Duration: 13012 ms |
55 | | -``` |
56 | | - |
57 | | -Clean output with progress indicators, INFO logging, and summary stats. |
58 | | - |
59 | | -## Known Issues |
60 | | - |
61 | | -1. **Neo4j lock file** — fixed: DatabaseManagementService properly shuts down between runs |
62 | | -2. **JVM startup overhead** — ~8-10s added to wall-clock time (not included in analysis duration) |
63 | | -3. **benchmark/ project** — skipped (446K files, stress test only) |
64 | | - |
65 | | -## Notes |
66 | | - |
67 | | -- All runs on clean environment (`.osscodeiq` and `.code-intelligence` deleted before each run) |
68 | | -- Python ran with `incremental=False` to ensure clean comparison |
69 | | -- Java used ZGC garbage collector (`-XX:+UseZGC`) |
70 | | -- Java used adaptive parallelism (4 cores detected, virtual threads) |
| 27 | +| spring-boot | 27987 / 39776 | 27987 / 39776 | 27987 / 39776 | Yes | |
| 28 | +| kafka | 62671 / 120376 | 62671 / 120376 | 62671 / 120376 | Yes | |
| 29 | +| contoso-real-estate | 4034 / 4039 | 4034 / 4039 | 4034 / 4039 | Yes | |
| 30 | + |
| 31 | +## Analysis Time Breakdown (Java, 3 runs) |
| 32 | + |
| 33 | +| Project | Run 1 | Run 2 | Run 3 | Avg | Std Dev | |
| 34 | +|---------|-------|-------|-------|-----|---------| |
| 35 | +| spring-boot | 48.0s | 50.8s | 44.5s | 47.8s | 3.2s | |
| 36 | +| kafka | 69.6s | 61.5s | 59.3s | 63.5s | 5.4s | |
| 37 | +| contoso-real-estate | 1.37s | 1.33s | 1.28s | 1.33s | 0.04s | |
| 38 | + |
| 39 | +## Wall Clock Time Breakdown (Java, 3 runs) |
| 40 | + |
| 41 | +| Project | Run 1 | Run 2 | Run 3 | Avg | |
| 42 | +|---------|-------|-------|-------|-----| |
| 43 | +| spring-boot | 66.7s | 70.5s | 64.4s | 67.2s | |
| 44 | +| kafka | 81.5s | 71.4s | 69.1s | 74.0s | |
| 45 | +| contoso-real-estate | 10.5s | 10.1s | 10.0s | 10.2s | |
| 46 | + |
| 47 | +## Node/Edge Count Differences (Java vs Python) |
| 48 | + |
| 49 | +Java finds more nodes and edges than Python on all three projects:
| 50 | + |
| 51 | +| Project | Node Diff | Edge Diff | Node % | Edge % | |
| 52 | +|---------|-----------|-----------|--------|--------| |
| 53 | +| spring-boot | +541 | +6886 | +2.0% | +20.9% | |
| 54 | +| kafka | +4591 | +20402 | +7.9% | +20.4% | |
| 55 | +| contoso-real-estate | +190 | +1133 | +4.9% | +39.0% | |
| 56 | + |
| 57 | +This suggests the Java detectors catch more patterns than the Python version, although the extra edges have not yet been checked for false positives.
| 58 | +Java also discovers slightly fewer files (about 1-3% fewer per project), which points to different
| 59 | +gitignore/exclusion handling; despite that, Java reports more nodes and edges per file.
| 60 | + |
| 61 | +## CLI Output Quality |
| 62 | + |
| 63 | +### Progress messages |
| 64 | +- File discovery: "Discovering files..." and "Found N files" with emoji icons |
| 65 | +- Analysis: "Analyzing N files..." with gear emoji |
| 66 | +- Building: "Building graph..." with construction emoji |
| 67 | +- Linking: "Linking cross-file relationships..." with link emoji |
| 68 | +- Classifying: "Classifying layers..." with label emoji |
| 69 | +- Completion: "Analysis complete" with checkmark emoji |
| 70 | + |
| 71 | +### Issues observed |
| 72 | +- **SLF4J multiple provider warning**: Two SLF4J providers on classpath (Logback + Neo4j). Cosmetic only. |
| 73 | +- **Spring Boot banner**: Full ASCII art banner displayed on every run (~6 lines). Could suppress with `spring.main.banner-mode=off`. |
| 74 | +- **Neo4j deprecation warnings**: `CodeEdge` uses Long IDs (deprecated). Should migrate to external IDs. |
| 75 | +- **MCP warnings**: "No tool/resource/prompt/complete methods found" -- expected when running CLI analyze (MCP not needed for CLI). |
| 76 | +- **XML DOCTYPE warnings**: "[Fatal Error]" lines from XML parser encountering DOCTYPE declarations. These are noisy but non-fatal. |
| 77 | +- **Java restricted method warnings**: Netty and jctools use deprecated sun.misc.Unsafe APIs. Upstream dependency issue. |
| 78 | +- **Spring Boot startup overhead**: Roughly 9-20s (wall time minus analysis time in these runs) just to start the application context (Neo4j embedded, Spring Data, MCP server init) before any analysis begins.
| 79 | + |
| 80 | +### What's NOT shown (but should be) |
| 81 | +- No parallelism level report (e.g., "Using virtual threads on 4 cores") |
| 82 | +- No memory usage report at completion |
| 83 | +- No per-detector timing breakdown |
| 84 | + |
| 85 | +## Benchmark Project (311K files) |
| 86 | + |
| 86 | +The benchmark project (8.8 GB, 311,284 files) contains multiple large open-source repos
| 88 | +(TypeScript, azure-sdk-for-java, azure-sdk-for-python, django, eShop, kotlin, |
| 89 | +kubernetes, rust-analyzer, terraform-provider-azurerm). |
| 90 | + |
| 91 | +- **Java**: Initial run completed in ~11m40s (wall) with 3GB heap but output was lost due to piping issues. Subsequent run with 10GB heap timed out at 10 minutes (process killed). |
| 92 | +- **Python**: Timed out at 10 minutes, peak memory 8GB+ and still growing. |
| 93 | + |
| 94 | +Neither implementation handles 300K+ files well within reasonable time/memory bounds. |
| 95 | +This suggests a need for incremental analysis or chunked processing for very large monorepos. |
| 96 | + |
| 97 | +## Recommendations |
| 98 | + |
| 99 | +1. **Suppress Spring Boot banner** for CLI commands (`spring.main.banner-mode=off` or `log` mode) |
| 100 | +2. **Suppress MCP warnings** when running in CLI/indexing mode (not serving) |
| 101 | +3. **Handle XML DOCTYPE gracefully** -- catch and suppress the stderr output from the XML parser |
| 102 | +4. **Report parallelism** -- log virtual thread usage and core count at startup |
| 103 | +5. **Investigate edge count difference** -- Java finds 20-39% more edges; verify these are real (not false positives) |
| 104 | +6. **Add memory reporting** -- show peak heap usage at analysis completion |
| 105 | +7. **Lazy Neo4j initialization** -- don't start embedded Neo4j for the `analyze` command if results are only in-memory |
| 105 | +8. **Profile large codebase handling** -- 311K files need a streaming/chunked approach
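
The averages, standard deviations, speedups, and percentage differences in this report can be re-derived from the per-run tables. The following is a minimal sketch, assuming the figures are copied by hand from the tables above. The dictionary and function names (`analysis_runs`, `summarize`, `pct_diff`, and so on) are illustrative only and are not part of OSSCodeIQ.

```python
from statistics import mean, stdev

# Per-run Java timings in seconds, copied by hand from the breakdown tables.
analysis_runs = {
    "spring-boot": [48.0, 50.8, 44.5],
    "kafka": [69.6, 61.5, 59.3],
    "contoso-real-estate": [1.37, 1.33, 1.28],
}
wall_runs = {
    "spring-boot": [66.7, 70.5, 64.4],
    "kafka": [81.5, 71.4, 69.1],
    "contoso-real-estate": [10.5, 10.1, 10.0],
}

# Single Python wall-clock measurement per project, in seconds.
python_wall = {"spring-boot": 56.8, "kafka": 96.8, "contoso-real-estate": 7.6}

# (python_count, java_count) pairs from the results table.
nodes = {
    "spring-boot": (27446, 27987),
    "kafka": (58080, 62671),
    "contoso-real-estate": (3844, 4034),
}
edges = {
    "spring-boot": (32890, 39776),
    "kafka": (99974, 120376),
    "contoso-real-estate": (2906, 4039),
}


def pct_diff(python_count, java_count):
    """Percentage by which the Java count exceeds the Python count."""
    return 100.0 * (java_count - python_count) / python_count


def summarize(project):
    """Recompute the derived figures for one project from the raw runs."""
    analysis_avg = mean(analysis_runs[project])
    return {
        "analysis_avg": analysis_avg,
        # Sample standard deviation (n - 1 denominator) over the three runs.
        "analysis_std": stdev(analysis_runs[project]),
        "wall_avg": mean(wall_runs[project]),
        # Speedup as defined in the timing notes: Python wall / Java analysis.
        "speedup": python_wall[project] / analysis_avg,
        "node_pct": pct_diff(*nodes[project]),
        "edge_pct": pct_diff(*edges[project]),
    }


if __name__ == "__main__":
    for name in analysis_runs:
        row = summarize(name)
        print(name, {key: round(value, 2) for key, value in row.items()})
```

Because the speedup here is computed from the unrounded analysis average, its last digit can differ from a figure derived from the rounded average shown in the results table.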