Conversation
WalkthroughAdds an integration test harness, multiple manual integration tests and fixtures, a benchmarking and export CLI, and a new opengraph package providing JSON types, validation, load (persist to DB) and export (serialize from DB) to enable graph persistence and round‑trip testing. Changes
Sequence Diagram(s)sequenceDiagram
participant Test as Test Harness
participant FS as Testdata JSON
participant OG as opengraph (Load/Export)
participant DB as Graph Database
Test->>FS: open fixture.json
Test->>OG: Load(ctx, db, reader)
OG->>DB: WriteTransaction (create nodes)
DB-->>OG: node IDs
OG->>DB: create relationships (resolve IDs)
DB-->>OG: commit
Test->>OG: Export(ctx, db, writer)
OG->>DB: ReadTransaction (iterate nodes & relationships)
DB-->>OG: nodes & relationships
OG-->>Test: encoded JSON
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (4)
integration/shortest_path_test.go (1)
50-53: Isolate subtests by clearing graph state before each dataset load.Using a shared DB without
ClearGraphcan make subtests interdependent as fixtures grow.♻️ Suggested change
for _, tt := range tests { t.Run(tt.dataset, func(t *testing.T) { + ClearGraph(t, db, ctx) idMap := LoadDataset(t, db, ctx, tt.dataset) paths := QueryPaths(t, ctx, db, ASPQuery(idMap, tt.start, tt.end)) AssertPaths(t, paths, idMap, tt.expected) }) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration/shortest_path_test.go` around lines 50 - 53, The subtests reuse a shared DB and don't clear graph state, causing interdependent fixtures; before loading each dataset call ClearGraph to reset DB state so LoadDataset starts from a clean graph. Update the subtest setup in the loop where LoadDataset, ASPQuery, and QueryPaths are invoked to invoke ClearGraph(ctx, db) (or the project’s equivalent ClearGraph function) immediately prior to LoadDataset(t, db, ctx, tt.dataset) so each t.Run uses an isolated graph state.integration/traversal_test.go (1)
48-51: Clear DB state per subtest to keep traversal fixtures independent.This keeps expected reachability assertions insulated from prior fixture loads.
♻️ Suggested change
for _, tt := range tests { t.Run(tt.dataset, func(t *testing.T) { + ClearGraph(t, db, ctx) idMap := LoadDataset(t, db, ctx, tt.dataset)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration/traversal_test.go` around lines 48 - 51, Subtests share DB state causing fixture interference; before calling LoadDataset(t, db, ctx, tt.dataset) inside t.Run, reset the DB to a clean state so each subtest's fixtures are independent (e.g., invoke a Reset/Truncate helper or run a transaction rollback/clean function against the shared db). Ensure the reset runs for each iteration (before LoadDataset) and optionally defer a per-subtest cleanup to guarantee isolation for the idMap and subsequent reachability assertions.opengraph/export.go (1)
57-65: Fail fast instead of silently synthesizing endpoint IDs.If an edge endpoint is missing from
idToString, export should return an error rather than emitting potentially inconsistent references.♻️ Suggested change
startStr, ok := idToString[rel.StartID] if !ok { - startStr = strconv.FormatInt(int64(rel.StartID), 10) + return fmt.Errorf("relationship %d has unmapped start node ID %d", rel.ID, rel.StartID) } endStr, ok := idToString[rel.EndID] if !ok { - endStr = strconv.FormatInt(int64(rel.EndID), 10) + return fmt.Errorf("relationship %d has unmapped end node ID %d", rel.ID, rel.EndID) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@opengraph/export.go` around lines 57 - 65, The code currently synthesizes string IDs when a mapping for rel.StartID or rel.EndID is missing in idToString; change this to fail fast by returning an error from the export function (the function that iterates relations and uses idToString) whenever a StartID or EndID lookup is missing, instead of generating a numeric fallback. Locate the loop using idToString and rel.StartID/rel.EndID, replace the fallback branches with early returns of a descriptive error that includes the missing numeric ID and whether it was the start or end, and update the function signature and any callers to propagate/handle the error accordingly.integration/match_test.go (1)
119-121: Apply the same graph reset pattern inTestMatchReturnNodes.This test is the only one in this file that skips
ClearGraphbetween dataset subtests.♻️ Suggested change
for _, tt := range tests { t.Run(tt.dataset, func(t *testing.T) { + ClearGraph(t, db, ctx) idMap := LoadDataset(t, db, ctx, tt.dataset)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration/match_test.go` around lines 119 - 121, TestMatchReturnNodes currently omits the ClearGraph call between dataset subtests, causing state leakage; update TestMatchReturnNodes to follow the same pattern used elsewhere by calling ClearGraph(ctx, db) (or the test helper that resets the graph) before each t.Run/LoadDataset invocation and after each subtest as appropriate so each dataset subtest starts with a clean graph; locate TestMatchReturnNodes and insert the ClearGraph invocation around the LoadDataset/tt.dataset usage to mirror other tests' reset pattern.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration/harness.go`:
- Around line 163-167: ASPQuery currently indexes idMap[start] and idMap[end]
directly which returns 0 for missing keys; modify ASPQuery to accept a
*testing.T parameter, validate both keys exist in the idMap (use the comma-ok
form) and call t.Fatalf with a clear message if either is missing, then build
and return the query using the validated IDs; update the single call site in
integration/shortest_path_test.go to pass the test *testing.T into ASPQuery.
In `@opengraph/load.go`:
- Around line 54-60: When persisting nodes in opengraph/load.go, preserve the
logical node.ID by adding it into the node properties (e.g., under key
"_opengraph_id") before calling
tx.CreateNode(graph.AsProperties(node.Properties), ...), so the original ID is
stored alongside the DB-generated ID; continue to populate nodeMap[node.ID] =
dbNode.ID as before. Then update opengraph/export.go to prefer the stored
"_opengraph_id" property when exporting nodes, falling back to the DB ID only if
"_opengraph_id" is absent. Ensure you reference the CreateNode call and nodeMap
handling in load.go and the export path in export.go when making these changes.
- Around line 49-53: WriteGraph currently dereferences g (len(g.Nodes)) and will
panic if a nil *Graph is passed; add a nil check at the start of WriteGraph
(e.g., if g == nil) and return a zero-valued IDMap (or nil) with a descriptive
error instead of continuing; adjust callers/tests if they rely on panics and
ensure the check is placed before nodeMap := make(IDMap, len(g.Nodes)) so
symbols to modify are WriteGraph, g and g.Nodes.
In `@opengraph/opengraph_integration_test.go`:
- Around line 173-174: The test currently ignores the error from
json.Marshal(original) before calling Load; change the marshal to capture and
check the error (e.g., inputBytes, err := json.Marshal(original); if err != nil
{ t.Fatalf("json.Marshal failed: %v", err) } or use your test helper like
require.NoError) so marshal failures are reported immediately instead of
surfacing as confusing Load errors; update the code around the json.Marshal call
and ensure bytes.NewReader(inputBytes) is only used after the successful
marshal.
---
Nitpick comments:
In `@integration/match_test.go`:
- Around line 119-121: TestMatchReturnNodes currently omits the ClearGraph call
between dataset subtests, causing state leakage; update TestMatchReturnNodes to
follow the same pattern used elsewhere by calling ClearGraph(ctx, db) (or the
test helper that resets the graph) before each t.Run/LoadDataset invocation and
after each subtest as appropriate so each dataset subtest starts with a clean
graph; locate TestMatchReturnNodes and insert the ClearGraph invocation around
the LoadDataset/tt.dataset usage to mirror other tests' reset pattern.
In `@integration/shortest_path_test.go`:
- Around line 50-53: The subtests reuse a shared DB and don't clear graph state,
causing interdependent fixtures; before loading each dataset call ClearGraph to
reset DB state so LoadDataset starts from a clean graph. Update the subtest
setup in the loop where LoadDataset, ASPQuery, and QueryPaths are invoked to
invoke ClearGraph(ctx, db) (or the project’s equivalent ClearGraph function)
immediately prior to LoadDataset(t, db, ctx, tt.dataset) so each t.Run uses an
isolated graph state.
In `@integration/traversal_test.go`:
- Around line 48-51: Subtests share DB state causing fixture interference;
before calling LoadDataset(t, db, ctx, tt.dataset) inside t.Run, reset the DB to
a clean state so each subtest's fixtures are independent (e.g., invoke a
Reset/Truncate helper or run a transaction rollback/clean function against the
shared db). Ensure the reset runs for each iteration (before LoadDataset) and
optionally defer a per-subtest cleanup to guarantee isolation for the idMap and
subsequent reachability assertions.
In `@opengraph/export.go`:
- Around line 57-65: The code currently synthesizes string IDs when a mapping
for rel.StartID or rel.EndID is missing in idToString; change this to fail fast
by returning an error from the export function (the function that iterates
relations and uses idToString) whenever a StartID or EndID lookup is missing,
instead of generating a numeric fallback. Locate the loop using idToString and
rel.StartID/rel.EndID, replace the fallback branches with early returns of a
descriptive error that includes the missing numeric ID and whether it was the
start or end, and update the function signature and any callers to
propagate/handle the error accordingly.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0474eeed-7027-43be-b15d-107ed10ce53a
📒 Files selected for processing (17)
.gitignoreintegration/harness.gointegration/match_test.gointegration/shortest_path_test.gointegration/testdata/dead_end.jsonintegration/testdata/diamond.jsonintegration/testdata/direct_shortcut.jsonintegration/testdata/disconnected.jsonintegration/testdata/linear.jsonintegration/testdata/wide_diamond.jsonintegration/traversal_test.goopengraph/export.goopengraph/load.goopengraph/opengraph_integration_test.goopengraph/types.goopengraph/validate.goopengraph/validate_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
integration/match_test.go (1)
122-125:⚠️ Potential issue | 🟡 MinorAdd validation for missing fixture ID key.
Direct map access
idMap[tt.queryID]returns0if the key doesn't exist, silently producing a query with an invalid node ID. This mirrors the same pattern previously flagged forASPQuery. Validate the key exists to fail fast on test configuration errors.Proposed fix
t.Run(tt.dataset, func(t *testing.T) { + ClearGraph(t, db, ctx) idMap := LoadDataset(t, db, ctx, tt.dataset) + startID, ok := idMap[tt.queryID] + if !ok { + t.Fatalf("fixture ID %q not found in dataset %q", tt.queryID, tt.dataset) + } + cypher := fmt.Sprintf( "MATCH (s)-[:EdgeKind1]->(e) WHERE id(s) = %d RETURN e", - idMap[tt.queryID], + startID, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration/match_test.go` around lines 122 - 125, The test constructs a Cypher query using idMap[tt.queryID] which returns 0 for missing keys and produces an invalid query; before building cypher, look up the ID with the comma-ok pattern (e.g., value, ok := idMap[tt.queryID]) and if !ok fail the test immediately (t.Fatalf or t.Fatalf-like helper) with a clear message mentioning tt.queryID and idMap, then use the retrieved value when formatting the cypher string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration/match_test.go`:
- Around line 118-129: TestMatchReturnNodes is missing a ClearGraph call,
causing datasets to accumulate across subtests; before calling LoadDataset(t,
db, ctx, tt.dataset) inside the t.Run loop, invoke ClearGraph(t, db, ctx) so
each subtest starts with a clean DB state; update the loop where LoadDataset is
used (referencing LoadDataset, ClearGraph, QueryNodeIDs, and AssertIDSet) to
call ClearGraph first to prevent cross-test contamination.
---
Duplicate comments:
In `@integration/match_test.go`:
- Around line 122-125: The test constructs a Cypher query using
idMap[tt.queryID] which returns 0 for missing keys and produces an invalid
query; before building cypher, look up the ID with the comma-ok pattern (e.g.,
value, ok := idMap[tt.queryID]) and if !ok fail the test immediately (t.Fatalf
or t.Fatalf-like helper) with a clear message mentioning tt.queryID and idMap,
then use the retrieved value when formatting the cypher string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: dfbb35a1-b0d6-45a0-afc7-086b5b8e8e51
📒 Files selected for processing (4)
integration/harness.gointegration/match_test.gointegration/shortest_path_test.gointegration/traversal_test.go
✅ Files skipped from review due to trivial changes (1)
- integration/shortest_path_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- integration/traversal_test.go
2121e36 to
f6372ea
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
🧹 Nitpick comments (2)
integration/harness.go (2)
38-41: Flag names may conflict with other test packages.Package-level
flag.Stringcalls use generic names like-driverand-connection. If other packages in the same test binary define flags with the same names, this will cause a panic at init time.Consider prefixing with a namespace (e.g.,
-integration.driver,-integration.connection) to avoid collisions.♻️ Suggested change
var ( - driverFlag = flag.String("driver", "pg", "database driver to test against (pg, neo4j)") - connStrFlag = flag.String("connection", "", "database connection string (overrides PG_CONNECTION_STRING env var)") + driverFlag = flag.String("integration.driver", "pg", "database driver to test against (pg, neo4j)") + connStrFlag = flag.String("integration.connection", "", "database connection string (overrides PG_CONNECTION_STRING env var)") )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration/harness.go` around lines 38 - 41, The package-level flags driverFlag and connStrFlag use generic names ("driver", "connection") that can collide with flags from other test packages; change their flag.String key names to a namespaced prefix (for example "integration.driver" and "integration.connection") and update the corresponding usage/help strings to reflect the new names so consumers know the namespace, ensuring the variables driverFlag and connStrFlag remain but the first argument to flag.String is the prefixed name to avoid init-time conflicts.
198-214: Mutating input slices may cause unexpected side effects.
sort.Strings(got)andsort.Strings(expected)modify the caller's slices in place. While this may not cause issues in current tests, it's defensive to copy slices before sorting to avoid surprises if callers reuse them.♻️ Suggested fix
func AssertIDSet(t *testing.T, got, expected []string) { t.Helper() - sort.Strings(got) - sort.Strings(expected) + gotSorted := make([]string, len(got)) + copy(gotSorted, got) + sort.Strings(gotSorted) + + expectedSorted := make([]string, len(expected)) + copy(expectedSorted, expected) + sort.Strings(expectedSorted) - if len(got) != len(expected) { - t.Fatalf("ID set length: got %d, want %d\n got: %v\n want: %v", len(got), len(expected), got, expected) + if len(gotSorted) != len(expectedSorted) { + t.Fatalf("ID set length: got %d, want %d\n got: %v\n want: %v", len(gotSorted), len(expectedSorted), gotSorted, expectedSorted) } - for i := range got { - if got[i] != expected[i] { - t.Fatalf("ID set mismatch at index %d:\n got: %v\n want: %v", i, got, expected) + for i := range gotSorted { + if gotSorted[i] != expectedSorted[i] { + t.Fatalf("ID set mismatch at index %d:\n got: %v\n want: %v", i, gotSorted, expectedSorted) } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration/harness.go` around lines 198 - 214, AssertIDSet mutates the caller's slices by calling sort.Strings on the passed-in got and expected; make shallow copies first (e.g., gotCopy and expectedCopy using make and copy) and call sort.Strings on those copies, then perform the length check and element comparisons against the copies so the original slices passed into AssertIDSet are not modified.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@integration/harness.go`:
- Around line 64-75: When pg.NewPool succeeds but dawgs.Open fails, the created
pool is leaked; after creating the pool (pg.NewPool) assign it to cfg.Pool and
then call dawgs.Open, but if dawgs.Open returns an error explicitly Close the
pool (call pool.Close() or the pool's appropriate cleanup method) before calling
t.Fatalf; reference the symbols driverFlag, pg.DriverName, pg.NewPool, cfg.Pool
and dawgs.Open so the cleanup is done in the same block where the pool is
created.
In `@integration/match_test.go`:
- Around line 122-125: The test builds a Cypher using idMap[tt.queryID] without
checking the map membership, which silently yields 0 for missing keys; update
the test around the cypher construction to first lookup the id with the comma-ok
pattern (checking idMap and tt.queryID), and if the id is missing call t.Fatalf
(or t.Fatalf equivalent) with a clear error message so the test fails fast
instead of producing "id(s) = 0"; only then use the validated id to build the
cypher via fmt.Sprintf.
In `@integration/traversal_test.go`:
- Around line 46-56: The test loop calls LoadDataset without clearing prior
data, causing data accumulation; before calling LoadDataset in the t.Run
anonymous function add a call to ClearGraph (or the test helper that empties the
DB) so each subtest starts with a fresh graph; specifically modify the anonymous
test body containing LoadDataset(t, db, ctx, tt.dataset) to invoke
ClearGraph(ctx, db) (or the existing ClearGraph helper) immediately prior to
LoadDataset, ensuring isolation for the QueryNodeIDs/assertions.
- Around line 50-53: The code constructs the Cypher query using idMap[tt.start]
which returns 0 when the key is missing and can produce an invalid/unintended
query; change the lookup in the test (where cypher := fmt.Sprintf(...
idMap[tt.start] ...)) to validate presence first—e.g., check the map with the
comma-ok pattern and fail the test (t.Fatalf) or call a small helper (e.g.,
getIDOrFail/idFromMapOrFail) that panics/fails the test if tt.start is not
present—then use the validated id value when building the cypher string.
---
Nitpick comments:
In `@integration/harness.go`:
- Around line 38-41: The package-level flags driverFlag and connStrFlag use
generic names ("driver", "connection") that can collide with flags from other
test packages; change their flag.String key names to a namespaced prefix (for
example "integration.driver" and "integration.connection") and update the
corresponding usage/help strings to reflect the new names so consumers know the
namespace, ensuring the variables driverFlag and connStrFlag remain but the
first argument to flag.String is the prefixed name to avoid init-time conflicts.
- Around line 198-214: AssertIDSet mutates the caller's slices by calling
sort.Strings on the passed-in got and expected; make shallow copies first (e.g.,
gotCopy and expectedCopy using make and copy) and call sort.Strings on those
copies, then perform the length check and element comparisons against the copies
so the original slices passed into AssertIDSet are not modified.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 2ec080fd-87b7-4eed-8415-8aeed18b1735
📒 Files selected for processing (5)
integration/harness.gointegration/match_test.gointegration/shortest_path_test.gointegration/traversal_test.goopengraph/load.go
✅ Files skipped from review due to trivial changes (1)
- integration/shortest_path_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- opengraph/load.go
Support loading arbitrary opengraph datasets by deriving the schema from the dataset's node/edge kinds instead of hardcoding them. Add batch edge writes for performance and a flag-driven test for large local datasets.
…opengraph doc and one to run benchmarks
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (7)
integration/shortest_path_test.go (1)
57-60: Direct map indexing may produce invalid queries for missing fixture IDs.
idMap[tt.start]andidMap[tt.end]return0if the keys don't exist, creating queries withid(s) = 0that may silently match nothing or an unintended node. For controlled test fixtures this is low-risk, but adding validation would catch dataset/test case mismatches early.♻️ Optional validation
+ startID, ok := idMap[tt.start] + if !ok { + t.Fatalf("fixture ID %q not found in dataset %q", tt.start, tt.dataset) + } + endID, ok := idMap[tt.end] + if !ok { + t.Fatalf("fixture ID %q not found in dataset %q", tt.end, tt.dataset) + } + cypher := fmt.Sprintf( "MATCH p = allShortestPaths((s)-[:EdgeKind1*1..]->(e)) WHERE id(s) = %d AND id(e) = %d RETURN p", - idMap[tt.start], idMap[tt.end], + startID, endID, )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration/shortest_path_test.go` around lines 57 - 60, The cypher string is built using direct map indexing idMap[tt.start] and idMap[tt.end], which yields 0 for missing keys and can produce invalid queries; update the test to look up values with the comma-ok form (e.g., startID, ok := idMap[tt.start]) and similarly for endID, validate ok for both and call t.Fatalf or t.Fatalf-like failure with a clear message if either is missing, then use the validated startID/endID in fmt.Sprintf to build the cypher; reference idMap, tt.start, tt.end, cypher and fmt.Sprintf when making the change.integration/harness.go (1)
110-120: Usedeferfor file close to ensure cleanup on all exit paths.Currently
f.Close()is called inline afterParseDocument, but ifParseDocumentpanics, the file handle leaks. Usingdeferensures the file is closed regardless of how the function exits.♻️ Suggested fix
for _, name := range datasets { f, err := os.Open(datasetPath(name)) if err != nil { t.Fatalf("failed to open dataset %q for kind scanning: %v", name, err) } + defer f.Close() doc, err := opengraph.ParseDocument(f) - f.Close() if err != nil { t.Fatalf("failed to parse dataset %q: %v", name, err) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@integration/harness.go` around lines 110 - 120, The file opened with os.Open(datasetPath(name)) must be closed via defer immediately after a successful open to guarantee cleanup on all exit paths; update the loop around opengraph.ParseDocument to call defer f.Close() right after the os.Open success, and to avoid accumulating defers across iterations either wrap the per-dataset logic in a short anonymous func (e.g., func() { f, err := os.Open(...); if err != nil { ... }; defer f.Close(); doc, err := opengraph.ParseDocument(f); ... }()) or explicitly close f on every return/error path—modify the loop that uses os.Open, datasetPath, opengraph.ParseDocument and f.Close accordingly.cmd/benchmark/scenarios.go (1)
80-99: Missing key validation in query builders may produce invalid queries.
shortestPathQuery,traversalQuery, andmatchReturnQueryuse direct map indexing (idMap[start],idMap[end]) without checking if the keys exist. If a key is missing, the query will use0as the node ID, which may silently match nothing or an unintended node.In contrast,
phantomScenarios(lines 188, 218-219) validates key presence before building queries. Apply the same pattern here for consistency and to fail fast on misconfigured scenarios.♻️ Suggested validation pattern
func shortestPathQuery(idMap opengraph.IDMap, start, end string) func(tx graph.Transaction) error { + startID, startOK := idMap[start] + endID, endOK := idMap[end] + if !startOK || !endOK { + return func(tx graph.Transaction) error { + return fmt.Errorf("missing fixture ID: start=%q (found=%v), end=%q (found=%v)", start, startOK, end, endOK) + } + } return cypherQuery(fmt.Sprintf( "MATCH p = allShortestPaths((s)-[*1..]->(e)) WHERE id(s) = %d AND id(e) = %d RETURN p", - idMap[start], idMap[end], + startID, endID, )) }Apply similar changes to
traversalQueryandmatchReturnQuery.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/benchmark/scenarios.go` around lines 80 - 99, shortestPathQuery, traversalQuery, and matchReturnQuery currently index idMap[start] / idMap[end] without checking presence; update each builder to validate keys like phantomScenarios does and fail fast: check for start (and end in shortestPathQuery) using the comma-ok idiom (e.g., id, ok := idMap[start]); if a key is missing return/build a failing func or panic with a clear message (e.g., "missing idMap entry for start/ end") instead of using zero, so the generated cypherQuery isn't silently invalid. Ensure you reference and modify the functions shortestPathQuery, traversalQuery, and matchReturnQuery accordingly.cmd/benchmark/main.go (3)
125-125: Usefilepath.Joinfor cross-platform path construction.String concatenation with
/may not work correctly on Windows.♻️ Proposed fix
Add import:
import "path/filepath"Then:
- path := *datasetDir + "/" + ds + ".json" + path := filepath.Join(*datasetDir, ds+".json")Apply similarly in
scanKindsat line 178.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/benchmark/main.go` at line 125, Replace the manual string concat that builds file paths (the assignment to variable "path" using *datasetDir + "/" + ds + ".json") with filepath.Join to ensure cross-platform correctness; add the import "path/filepath" and change the construction to join *datasetDir and ds + ".json" (and make the same change inside the scanKinds function where paths are built) so Windows path separators are handled correctly.
174-196: Silent errors inscanKindscould mask configuration issues.If all dataset files fail to open or parse (e.g., wrong
datasetDirpath), the function returns empty kinds, leading to a schema with no node/edge types. Consider logging failures to stderr for debugging.♻️ Proposed fix
func scanKinds(datasetDir string, datasets []string) (graph.Kinds, graph.Kinds) { var nodeKinds, edgeKinds graph.Kinds for _, ds := range datasets { path := datasetDir + "/" + ds + ".json" f, err := os.Open(path) if err != nil { + fmt.Fprintf(os.Stderr, "warning: could not open %s: %v\n", path, err) continue } doc, err := opengraph.ParseDocument(f) f.Close() if err != nil { + fmt.Fprintf(os.Stderr, "warning: could not parse %s: %v\n", path, err) continue } nk, ek := doc.Graph.Kinds() nodeKinds = nodeKinds.Add(nk...) edgeKinds = edgeKinds.Add(ek...) } return nodeKinds, edgeKinds }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/benchmark/main.go` around lines 174 - 196, The scanKinds function currently swallows file-open and parse errors which can produce an empty schema silently; update scanKinds to log failures to stderr (including the dataset path and the error) when os.Open or opengraph.ParseDocument returns an error so configuration/path issues are visible. Specifically, inside scanKinds where os.Open(path) and opengraph.ParseDocument(f) now just continue, add logging calls that print the path and err (using fmt.Fprintf(os.Stderr, ...) or log.Printf) before continuing; keep returning nodeKinds and edgeKinds as before but ensure every open/parse error for the given dataset is reported for debugging.
50-56: Consider a driver-agnostic env var name.
PG_CONNECTION_STRINGis used as fallback even when-driver=neo4j. A generic name likeDB_CONNECTION_STRINGwould be clearer, or add separate env vars per driver.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/benchmark/main.go` around lines 50 - 56, The code uses PG_CONNECTION_STRING as the fallback for conn (derived from connStr) even when a different driver is selected; update the fallback logic in the block that initializes conn (the conn := *connStr section) to use a driver-agnostic env var (e.g., DB_CONNECTION_STRING) or to check driver-specific env vars based on the -driver flag (inspect the driver flag variable, then try DRIVER-specific env var like NEO4J_CONNECTION_STRING before falling back to DB_CONNECTION_STRING); ensure the fatal(...) message remains accurate if no connection string is found.cmd/benchmark/report.go (1)
34-56: Consider handling or removing the error return.
writeMarkdownalways returnsnil, making the error return type misleading. Either propagate write errors fromfmt.Fprintfor simplify the signature.♻️ Option A: Propagate write errors
func writeMarkdown(w io.Writer, r Report) error { - fmt.Fprintf(w, "# Benchmarks — %s @ %s (%s, %d iterations)\n\n", r.Driver, r.GitRef, r.Date, r.Iterations) - fmt.Fprintf(w, "| Query | Dataset | Median | P95 | Max |\n") - fmt.Fprintf(w, "|-------|---------|-------:|----:|----:|\n") + if _, err := fmt.Fprintf(w, "# Benchmarks — %s @ %s (%s, %d iterations)\n\n", r.Driver, r.GitRef, r.Date, r.Iterations); err != nil { + return err + } + if _, err := fmt.Fprintf(w, "| Query | Dataset | Median | P95 | Max |\n"); err != nil { + return err + } + if _, err := fmt.Fprintf(w, "|-------|---------|-------:|----:|----:|\n"); err != nil { + return err + } for _, res := range r.Results { // ... + if _, err := fmt.Fprintf(w, "| %s | %s | %s | %s | %s |\n", ...); err != nil { + return err + } } - fmt.Fprintln(w) + _, err := fmt.Fprintln(w) + return err - return nil }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cmd/benchmark/report.go` around lines 34 - 56, The function writeMarkdown currently declares an error return but always returns nil; update writeMarkdown to either propagate write errors from the fmt.Fprintf / fmt.Fprintln calls (capture each call's returned error and return it on first failure, e.g., if err := fmt.Fprintf(...); err != nil { return err }) or remove the error return from the signature and return nothing; locate writeMarkdown and the fmt.Fprintf / fmt.Fprintln invocations to implement the chosen fix (ensure callers are updated if you change the signature).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cmd/benchmark/runner.go`:
- Around line 67-77: computeStats currently assumes durations is non-empty and
will panic on index access if called with an empty slice; update computeStats to
check n := len(durations) and if n == 0 return a zero-valued Stats (e.g.,
Stats{}) immediately, otherwise proceed with the existing sort and index-based
Median/P95/Max computation so the function never panics when given an empty
slice.
In `@integration/shortest_path_test.go`:
- Around line 53-65: The tests are accumulating data because each subtest calls
LoadDataset without clearing prior data; update the subtest body inside the
t.Run to invoke ClearGraph (the existing ClearGraph helper) before calling
LoadDataset so the DB is reset per subtest (i.e., call ClearGraph with the same
context/db parameters used by LoadDataset and QueryPaths at the start of the
anonymous func in the for loop). Ensure the ClearGraph call uses the correct
signature (ctx, db or t, db, ctx as appropriate) so idMap from LoadDataset
refers to a fresh graph for each tt.
In `@opengraph/load.go`:
- Around line 67-103: The current Load implementation writes nodes via
db.WriteTransaction and then creates edges via db.BatchOperation which uses a
separate transaction, risking orphan nodes if edge creation fails; update Load
so node creation and edge creation occur in the same transactional context
(e.g., perform both node CreateNode and relationship creation using
CreateRelationshipByIDs inside the same db.WriteTransaction callback or, if the
driver exposes a way to obtain a batch from the active graph.Transaction, reuse
that instead of calling db.BatchOperation), ensuring you still use
graph.Transaction, CreateNode and CreateRelationshipByIDs and preserve the
nodeMap mapping; alternatively, if you keep separate calls, add a clear comment
in Load documenting that partial writes are possible and acceptable for
test-only usage.
---
Nitpick comments:
In `@cmd/benchmark/main.go`:
- Line 125: Replace the manual string concat that builds file paths (the
assignment to variable "path" using *datasetDir + "/" + ds + ".json") with
filepath.Join to ensure cross-platform correctness; add the import
"path/filepath" and change the construction to join *datasetDir and ds + ".json"
(and make the same change inside the scanKinds function where paths are built)
so Windows path separators are handled correctly.
- Around line 174-196: The scanKinds function currently swallows file-open and
parse errors which can produce an empty schema silently; update scanKinds to log
failures to stderr (including the dataset path and the error) when os.Open or
opengraph.ParseDocument returns an error so configuration/path issues are
visible. Specifically, inside scanKinds where os.Open(path) and
opengraph.ParseDocument(f) now just continue, add logging calls that print the
path and err (using fmt.Fprintf(os.Stderr, ...) or log.Printf) before
continuing; keep returning nodeKinds and edgeKinds as before but ensure every
open/parse error for the given dataset is reported for debugging.
- Around line 50-56: The code uses PG_CONNECTION_STRING as the fallback for conn
(derived from connStr) even when a different driver is selected; update the
fallback logic in the block that initializes conn (the conn := *connStr section)
to use a driver-agnostic env var (e.g., DB_CONNECTION_STRING) or to check
driver-specific env vars based on the -driver flag (inspect the driver flag
variable, then try DRIVER-specific env var like NEO4J_CONNECTION_STRING before
falling back to DB_CONNECTION_STRING); ensure the fatal(...) message remains
accurate if no connection string is found.
In `@cmd/benchmark/report.go`:
- Around line 34-56: The function writeMarkdown currently declares an error
return but always returns nil; update writeMarkdown to either propagate write
errors from the fmt.Fprintf / fmt.Fprintln calls (capture each call's returned
error and return it on first failure, e.g., if err := fmt.Fprintf(...); err !=
nil { return err }) or remove the error return from the signature and return
nothing; locate writeMarkdown and the fmt.Fprintf / fmt.Fprintln invocations to
implement the chosen fix (ensure callers are updated if you change the
signature).
In `@cmd/benchmark/scenarios.go`:
- Around line 80-99: shortestPathQuery, traversalQuery, and matchReturnQuery
currently index idMap[start] / idMap[end] without checking presence; update each
builder to validate keys like phantomScenarios does and fail fast: check for
start (and end in shortestPathQuery) using the comma-ok idiom (e.g., id, ok :=
idMap[start]); if a key is missing return/build a failing func or panic with a
clear message (e.g., "missing idMap entry for start/ end") instead of using
zero, so the generated cypherQuery isn't silently invalid. Ensure you reference
and modify the functions shortestPathQuery, traversalQuery, and matchReturnQuery
accordingly.
In `@integration/harness.go`:
- Around line 110-120: The file opened with os.Open(datasetPath(name)) must be
closed via defer immediately after a successful open to guarantee cleanup on all
exit paths; update the loop around opengraph.ParseDocument to call defer
f.Close() right after the os.Open success, and to avoid accumulating defers
across iterations either wrap the per-dataset logic in a short anonymous func
(e.g., func() { f, err := os.Open(...); if err != nil { ... }; defer f.Close();
doc, err := opengraph.ParseDocument(f); ... }()) or explicitly close f on every
return/error path—modify the loop that uses os.Open, datasetPath,
opengraph.ParseDocument and f.Close accordingly.
In `@integration/shortest_path_test.go`:
- Around line 57-60: The cypher string is built using direct map indexing
idMap[tt.start] and idMap[tt.end], which yields 0 for missing keys and can
produce invalid queries; update the test to look up values with the comma-ok
form (e.g., startID, ok := idMap[tt.start]) and similarly for endID, validate ok
for both and call t.Fatalf or t.Fatalf-like failure with a clear message if
either is missing, then use the validated startID/endID in fmt.Sprintf to build
the cypher; reference idMap, tt.start, tt.end, cypher and fmt.Sprintf when
making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 6f6f9137-eb21-4d6c-acf7-aa8d2ab33c2c
📒 Files selected for processing (15)
cmd/benchmark/README.mdcmd/benchmark/main.gocmd/benchmark/report.gocmd/benchmark/runner.gocmd/benchmark/scenarios.gocmd/export/README.mdcmd/export/main.gointegration/BENCHMARKS.mdintegration/harness.gointegration/local_dataset_test.gointegration/match_test.gointegration/shortest_path_test.gointegration/traversal_test.goopengraph/load.goopengraph/types.go
✅ Files skipped from review due to trivial changes (4)
- integration/BENCHMARKS.md
- cmd/export/README.md
- cmd/benchmark/README.md
- integration/traversal_test.go
Add opengraph package and integration test harness
Summary
opengraph/package for graph serialization (load, export, validate) backed bygraph.Databaseinterfacesintegration/test harness that loads JSON graph datasets and runs Cypher queries against a real database to verify correctness-driverflag (defaults topg)What this enables
Today dawgs only tests "did the SQL generate correctly" (translation tests). This PR adds tests that verify "did I get the correct answer" — load a known graph, run a query, assert the results. This is the safety net needed before making changes to the BFS harness,
edges_to_path, or the query planner.opengraph package
Load(ctx, db, reader)— decode JSON, validate, write nodes/edges, return ID mappingExport(ctx, db, writer)— read all nodes/edges, write indented JSONValidate(doc)— structural checks (duplicate IDs, dangling edge refs, empty fields)integration test harness
SetupDB(t)— connect to database, assert schema, cleanup on teardownLoadDataset(t, db, ctx, name)— load a named JSON fixture fromtestdata/QueryPaths,AssertPaths,AssertIDSet— query and assertion helpersClearGraph— wipe nodes/edges between subtestsintegration/testdata/local/gitignored for custom datasetsTest suites (5 suites, 23 tests)
TestMatchNodesByKindTestMatchEdgesByKindTestMatchReturnNodesTestAllShortestPathsallShortestPathsTestVariableLengthTraversalRunning
Summary by CodeRabbit
New Features
Tests
Documentation
Chores