Commit 2c38749

aksOps and claude authored
block 6: testing & CI hardening (govulncheck, npm audit, fuzz, flake gate, smokes, pipeline integration) (#70)
* ci: add govulncheck step to Go test job

  Runs govulncheck ./... on every PR touching Go code. Reachability-based scan catches High/Critical CVEs in the call graph before tests run; aligns CI with ~/.claude/rules/security.md policy. Installs the tool on-the-fly since it's a first-party golang.org/x module and dependabot can bump the install target as needed.

  Local dogfood: "No vulnerabilities found".

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci(ui): fail build on moderate+ npm advisories

  Adds npm audit --audit-level=moderate between npm ci and typecheck in the UI job. Short-circuits the rest of the job on a failing audit so CVE-introducing dep bumps surface immediately. Matches the rule-book policy in ~/.claude/rules/security.md.

  Local dogfood: "found 0 vulnerabilities".

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test: add FuzzSearchTokenize and FuzzMCPToolArgs to fuzz-smoke

  Two new fuzz targets wired into the existing fuzz (smoke) CI job:

  - FuzzSearchTokenize exercises Store.SearchNotes against arbitrary FTS5-grammar inputs; asserts no "malformed MATCH expression" leaks, which would indicate missing pre-sanitisation at the HTTP boundary.
  - FuzzMCPToolArgs exercises stringArg/intArg/projectArg against any JSON payload an MCP client might send; asserts no helper panics on unexpected types.

  Each runs 30s on every PR. Local 15s smoke: both targets PASS with new-interesting counts of 31 and 171 respectively; ~84k and ~554k execs in 15s.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test,ci: enforce flake-register — every skip gets a tracked issue

  Annotates the 8 existing t.Skip() sites with TODO(#N) references and adds a CI grep gate (in the test job) that fails if any future t.Skip or test.skip lacks an adjacent TODO(#N): comment.

  Issues filed:

  - #62 large-tar import test under -short
  - #63 1000-note scale test under -short
  - #64 10k HNSW benchmarks (-short, -race)
  - #65 environmental skips (platform/tool availability)

  Converts silent skips into a queryable backlog without changing test behaviour. Fuzz-callback skips (*_fuzz_test.go) are excluded: those are input filtering, not flake-register entries.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(ui): add 404, unauthed, and upload-happy-path Playwright smokes

  Three high-risk flows get dedicated specs:

  - 404.spec.ts (2 tests): unknown and deep-nested unknown routes render the NotFound component while keeping the Shell mounted. Guards against react-router catch-all regressions.
  - auth.spec.ts (2 tests, .fixme): asserts a visible auth-required affordance when /api/* returns 401. Today the UI has no such affordance (apiFetch throws into React Query error states with no recognisable copy) so the tests are .fixme'd with TODO(#66): tracked in flake-register.
  - upload.spec.ts (1 test): opens DocumentsList > Upload, attaches a fixture markdown to the <input type=file>, stubs POST /api/upload, asserts the dialog closes on success. Mirrors the real UploadModal flow (onChange auto-submit, no explicit submit button).

  Local run: 10 specs, 8 pass + 2 fixme. Playwright job grows from 5 to 10; regressions in any of the three flows now fail CI in ~10s.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(llm): deterministic mock provider for tests

  internal/llm/mock implements llm.Provider with:

  - Complete: substring-matched canned JSON for entity/relationship/claim extraction prompts and a TITLE:/SUMMARY: formatted response for community summarisation prompts. Schema matches internal/extractor and internal/community exactly so parsing succeeds.
  - Embed/EmbedBatch: SHA-256-derived L2-normalised vectors, 128-dim by default. Equal text yields equal vectors; determinism is the only semantic contract.

  Intended for integration tests; not exposed outside internal/. No network, no API keys, no external processes.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(pipeline): end-to-end integration test over markdown corpus

  New integration test drives pipeline.New().IndexPath().Finalize() over 5 small markdown files with the mock LLM provider, then asserts:

  - Document count is exactly 5.
  - Chunk count is in the 5..50 band.
  - Embedding count equals chunk count (Phase 2 invariant).
  - Entity count is in the 2..2*chunks band (mock returns 2 entities per extraction prompt; dedup collapses duplicates).
  - Relationship count is >=1.
  - LocalSearch("Apollo program", topK=5) returns >=1 chunk containing "Apollo".

  Gated by //go:build integration && sqlite_fts5 so the default `go test ./...` path is unaffected. The test-integration CI job picks it up automatically via its existing -tags "sqlite_fts5 integration" invocation. No CI workflow change needed.

  Local run: 0.05s without -race, 1.07s with -race.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(go): bump toolchain 1.25.5 -> 1.25.7 to close stdlib CVEs

  Block 6's new govulncheck gate flagged two crypto/tls CVEs (GO-2026-4340, GO-2026-4337) in the 1.25.5 stdlib. 1.25.7 carries both fixes. Bump go.mod directive so setup-go picks the patched toolchain in CI.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(go): bump toolchain 1.25.7 -> 1.25.9 for additional stdlib CVEs

  Second govulncheck sweep surfaced crypto/x509 + archive/tar + os + net/url fixes landed in 1.25.8 and 1.25.9. Jump straight to 1.25.9 to close every known-reachable stdlib vuln in one commit.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
1 parent 08f0c56 commit 2c38749
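
Reviewer sketch: the count bands that the commit message lists for the pipeline integration test can be written as one small check. The test file itself is not part of this excerpt, so the `Counts` type and `checkInvariants` function below are hypothetical names used only for illustration; only the numeric bands come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// Counts is a hypothetical snapshot of store totals after indexing.
type Counts struct {
	Docs, Chunks, Embeddings, Entities, Relationships int
}

// checkInvariants applies the bands listed in the commit message and
// returns every violated band joined into a single error (nil if none).
func checkInvariants(c Counts) error {
	var errs []error
	if c.Docs != 5 {
		errs = append(errs, fmt.Errorf("documents: got %d, want 5", c.Docs))
	}
	if c.Chunks < 5 || c.Chunks > 50 {
		errs = append(errs, fmt.Errorf("chunks: got %d, want 5..50", c.Chunks))
	}
	if c.Embeddings != c.Chunks {
		errs = append(errs, fmt.Errorf("embeddings: got %d, want %d", c.Embeddings, c.Chunks))
	}
	if c.Entities < 2 || c.Entities > 2*c.Chunks {
		errs = append(errs, fmt.Errorf("entities: got %d, want 2..%d", c.Entities, 2*c.Chunks))
	}
	if c.Relationships < 1 {
		errs = append(errs, fmt.Errorf("relationships: got %d, want >=1", c.Relationships))
	}
	return errors.Join(errs...)
}

func main() {
	fmt.Println(checkInvariants(Counts{Docs: 5, Chunks: 7, Embeddings: 7, Entities: 10, Relationships: 3}))
	fmt.Println(checkInvariants(Counts{Docs: 5, Chunks: 7, Embeddings: 6, Entities: 10, Relationships: 3}))
}
```

Bands rather than exact counts keep such a test stable if chunking parameters change, while the embeddings-equals-chunks equality still catches a dropped embedding.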

22 files changed

Lines changed: 762 additions & 1 deletion

.github/workflows/ci.yml

Lines changed: 51 additions & 0 deletions
@@ -23,6 +23,9 @@ jobs:
       - name: Install UI dependencies
         run: npm --prefix ui ci
 
+      - name: npm audit
+        run: npm --prefix ui audit --audit-level=moderate
+
       - name: Type check
         run: npm --prefix ui run typecheck
 
@@ -92,12 +95,60 @@ jobs:
       - name: go vet (cgo + fts5)
         run: CGO_ENABLED=1 go vet -tags sqlite_fts5 $(go list ./... | grep -v /ui/node_modules/)
 
+      - name: govulncheck
+        run: |
+          set -eu
+          # govulncheck is a first-party golang.org/x module; @latest is
+          # acceptable here and dependabot can bump the install target.
+          go install golang.org/x/vuln/cmd/govulncheck@latest
+          CGO_ENABLED=1 govulncheck -tags sqlite_fts5 ./...
+
       - name: go test (cgo + fts5)
         run: CGO_ENABLED=1 go test -tags sqlite_fts5 -timeout 300s $(go list ./... | grep -v /ui/node_modules/)
 
       - name: go build (cgo + fts5)
         run: CGO_ENABLED=1 go build -tags sqlite_fts5 -o docsiq ./
 
+      - name: flake-register (every t.Skip / test.skip has a tracked TODO)
+        run: |
+          set -euo pipefail
+          # Every skip must be either:
+          # (a) on a line with an inline `// TODO(#N):` comment, OR
+          # (b) immediately preceded by a `// TODO(#N):` comment line.
+          # Fuzz-callback skips (input filtering) are excluded: they are
+          # not flake-register entries and carry no issue.
+          echo "Scanning for t.Skip( without a tracked TODO..."
+          violations=0
+          # Go side
+          while IFS=: read -r file lineno _; do
+            if sed -n "${lineno}p" "$file" | grep -qE '// TODO\(#[0-9]+\):'; then
+              continue
+            fi
+            prev=$((lineno - 1))
+            if [ "$prev" -gt 0 ] && sed -n "${prev}p" "$file" | grep -qE '// TODO\(#[0-9]+\):'; then
+              continue
+            fi
+            echo "::error file=$file,line=$lineno::t.Skip without TODO(#N): annotation"
+            violations=$((violations + 1))
+          done < <(grep -rn 't\.Skip(' --include='*.go' . | grep -v '_fuzz_test\.go' | grep -v node_modules || true)
+          # TypeScript side
+          while IFS=: read -r file lineno _; do
+            if sed -n "${lineno}p" "$file" | grep -qE '// TODO\(#[0-9]+\):'; then
+              continue
+            fi
+            prev=$((lineno - 1))
+            if [ "$prev" -gt 0 ] && sed -n "${prev}p" "$file" | grep -qE '// TODO\(#[0-9]+\):'; then
+              continue
+            fi
+            echo "::error file=$file,line=$lineno::test.skip without TODO(#N): annotation"
+            violations=$((violations + 1))
+          done < <(grep -rn 'test\.skip(' --include='*.ts' --include='*.tsx' ui/ 2>/dev/null | grep -v node_modules || true)
+          if [ "$violations" -gt 0 ]; then
+            echo "::error::Found $violations skipped test(s) without a tracking issue. File a flake-register issue and add // TODO(#N): <why> adjacent to the skip."
+            exit 1
+          fi
+          echo "All skips accounted for."
+
       - name: Upload docsiq binary
         uses: actions/upload-artifact@043fb46d1a93c77aae656e7c1c64a875d1fc6a0a # v7
         with:
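
Reviewer sketch: the adjacency rule the flake-register step enforces (a skip passes if a `// TODO(#N):` comment sits on the same line or the line directly above) can be stated compactly in Go. This is not how the workflow implements it; the workflow uses grep and sed as shown in the hunk. `untrackedSkips` is a hypothetical helper that mirrors the rule for a single file's contents.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// todoRe matches the tracked-issue annotation the CI gate looks for.
var todoRe = regexp.MustCompile(`// TODO\(#[0-9]+\):`)

// untrackedSkips returns the 1-based line numbers of every t.Skip( call
// that has no TODO(#N): annotation on the same line or the line above.
// Hypothetical helper: it mirrors the gate's rule, not its shell code.
func untrackedSkips(src string) []int {
	lines := strings.Split(src, "\n")
	var out []int
	for i, line := range lines {
		if !strings.Contains(line, "t.Skip(") {
			continue
		}
		if todoRe.MatchString(line) {
			continue // inline annotation
		}
		if i > 0 && todoRe.MatchString(lines[i-1]) {
			continue // annotation on the preceding line
		}
		out = append(out, i+1)
	}
	return out
}

func main() {
	src := strings.Join([]string{
		"if testing.Short() {",
		"\t// TODO(#62): large-tar import test skipped under -short",
		"\tt.Skip(\"short\")",
		"}",
		"t.Skip(\"no reason given\")",
	}, "\n")
	fmt.Println(untrackedSkips(src)) // [5]
}
```

Only the immediately preceding line counts, so a TODO separated from its skip by a blank line or another statement would be reported, which matches the `prev=$((lineno - 1))` lookup in the step above.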

.github/workflows/fuzz.yml

Lines changed: 2 additions & 0 deletions
@@ -27,6 +27,8 @@ jobs:
           targets=(
             "./internal/crawler::FuzzResolveURL"
             "./internal/chunker::FuzzChunker"
+            "./internal/store::FuzzSearchTokenize"
+            "./internal/mcp::FuzzMCPToolArgs"
           )
           for entry in "${targets[@]}"; do
             pkg="${entry%%::*}"
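
Reviewer sketch: each `targets` entry packs a package path and a fuzz function name around a `::` separator, and `${entry%%::*}` keeps everything before the first `::`. The same split in Go is shown below. The rest of the loop is cut off in this hunk, so the `go test` command line printed here is an assumption built from standard `go test` flags and the 30s duration stated in the commit message, not a copy of the workflow.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTarget splits a "package::FuzzName" entry at the first "::",
// the same cut that ${entry%%::*} makes for the package half.
func splitTarget(entry string) (pkg, fuzz string, ok bool) {
	return strings.Cut(entry, "::")
}

func main() {
	entries := []string{
		"./internal/store::FuzzSearchTokenize",
		"./internal/mcp::FuzzMCPToolArgs",
	}
	for _, e := range entries {
		pkg, fuzz, ok := splitTarget(e)
		if !ok {
			fmt.Println("malformed entry:", e)
			continue
		}
		// Assumed invocation; the real loop body is not shown in the hunk.
		fmt.Printf("go test -tags sqlite_fts5 -run '^$' -fuzz '^%s$' -fuzztime 30s %s\n", fuzz, pkg)
	}
}
```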

go.mod

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 module github.com/RandomCodeSpace/docsiq
 
-go 1.25.5
+go 1.25.9
 
 require (
 	github.com/google/uuid v1.6.0

internal/api/notes_import_limits_test.go

Lines changed: 1 addition & 0 deletions
@@ -78,6 +78,7 @@ func TestImportTar_EntryCountCap(t *testing.T) {
 // MaxImportTotalBytes must be rejected with 413.
 func TestImportTar_TotalBytesCap(t *testing.T) {
 	if testing.Short() {
+		// TODO(#62): large-tar import test skipped under -short; tracked in flake-register.
 		t.Skip("skipping large-tar test in -short mode")
 	}
 	h, slug, _ := setupNotesRouter(t)

internal/hookinstaller/installer_test.go

Lines changed: 1 addition & 0 deletions
@@ -262,6 +262,7 @@ func TestClaudeInstaller(t *testing.T) {
 
 	t.Run("symlinked_config_is_written_through", func(t *testing.T) {
 		if runtime.GOOS == "windows" {
+			// TODO(#65): environmental skip (windows symlink admin); tracked in flake-register.
 			t.Skip("symlink support requires admin on Windows")
 		}
 		home := fakeHome(t)

internal/llm/mock/mock.go

Lines changed: 164 additions & 0 deletions
@@ -0,0 +1,164 @@
+// Package mock provides a deterministic llm.Provider implementation for
+// tests. It does NOT require any network, API key, or external process.
+// Callers import it directly (no build tag) — the package lives under
+// internal/ so it cannot leak into the public API surface.
+package mock
+
+import (
+	"context"
+	"crypto/sha256"
+	"encoding/binary"
+	"fmt"
+	"math"
+	"strings"
+
+	"github.com/RandomCodeSpace/docsiq/internal/llm"
+)
+
+// DefaultDims is the default embedding dimensionality.
+const DefaultDims = 128
+
+// Provider is a deterministic, in-memory llm.Provider useful for unit
+// and integration tests. It inspects the prompt for known substrings
+// and returns canned, schema-valid JSON; embeddings are derived from a
+// SHA-256 of the input so equal text yields equal vectors.
+type Provider struct {
+	Dims int
+}
+
+// Compile-time check that *Provider satisfies llm.Provider.
+var _ llm.Provider = (*Provider)(nil)
+
+// New returns a mock provider. Pass 0 for DefaultDims (128).
+func New(dims int) *Provider {
+	if dims <= 0 {
+		dims = DefaultDims
+	}
+	return &Provider{Dims: dims}
+}
+
+func (p *Provider) Name() string { return "mock" }
+func (p *Provider) ModelID() string { return "mock-llm" }
+
+// Complete returns a deterministic response chosen by prompt substring.
+// Schema must match what internal/extractor and internal/community
+// expect; see entityPrompt in internal/extractor/entities.go and
+// communityPrompt in internal/community/summarizer.go.
+func (p *Provider) Complete(ctx context.Context, prompt string, _ ...llm.Option) (string, error) {
+	if err := ctx.Err(); err != nil {
+		return "", err
+	}
+	lower := strings.ToLower(prompt)
+
+	switch {
+	case strings.Contains(lower, "knowledge graph") && strings.Contains(lower, "entities"):
+		// Entity + relationship extraction. The pipeline parses this
+		// JSON via internal/extractor — schema must match exactly.
+		// Stable entity names derived from prompt-hash so different
+		// chunks yield different graphs; dedup then collapses across
+		// the corpus.
+		tag := hashTag(prompt, 2)
+		return fmt.Sprintf(`{
+  "entities": [
+    {"name": "Entity_%s_A", "type": "Concept", "description": "deterministic mock entity A"},
+    {"name": "Entity_%s_B", "type": "Concept", "description": "deterministic mock entity B"}
+  ],
+  "relationships": [
+    {"source": "Entity_%s_A", "target": "Entity_%s_B", "predicate": "relates_to", "description": "mock edge", "weight": 1.0}
+  ]
+}`, tag, tag, tag, tag), nil
+
+	case strings.Contains(lower, "claim"):
+		tag := hashTag(prompt, 2)
+		return fmt.Sprintf(`{
+  "claims": [
+    {"subject": "Entity_%s_A", "predicate": "is", "object": "mock claim", "description": "deterministic"}
+  ]
+}`, tag), nil
+
+	case strings.Contains(lower, "community") || strings.Contains(lower, "summar"):
+		// Must match parseCommunityReport which looks for "TITLE:" and "SUMMARY:" prefixes.
+		return "TITLE: Mock community\nSUMMARY: A deterministic, test-only paragraph describing the community of entities in scope.", nil
+
+	default:
+		// Unknown prompt — return empty JSON so whatever caller gets
+		// it can proceed without a parse error.
+		return `{}`, nil
+	}
+}
+
+// Embed returns a Dims-length vector derived from SHA-256(text). Equal
+// text yields equal vectors.
+func (p *Provider) Embed(ctx context.Context, text string) ([]float32, error) {
+	if err := ctx.Err(); err != nil {
+		return nil, err
+	}
+	return hashEmbedding(text, p.Dims), nil
+}
+
+func (p *Provider) EmbedBatch(ctx context.Context, texts []string) ([][]float32, error) {
+	out := make([][]float32, len(texts))
+	for i, t := range texts {
+		v, err := p.Embed(ctx, t)
+		if err != nil {
+			return nil, err
+		}
+		out[i] = v
+	}
+	return out, nil
+}
+
+// hashEmbedding derives a stable dims-length unit vector from SHA-256(text).
+// Runs SHA-256 repeatedly with a counter suffix until dims float32s have
+// been produced, then L2-normalises. O(dims) time, zero allocations in
+// the hot path beyond the output slice.
+func hashEmbedding(text string, dims int) []float32 {
+	if dims <= 0 {
+		dims = DefaultDims
+	}
+	out := make([]float32, dims)
+	seed := []byte(text)
+	var i int
+	for counter := uint32(0); i < dims; counter++ {
+		var ctrBuf [4]byte
+		binary.LittleEndian.PutUint32(ctrBuf[:], counter)
+		h := sha256.New()
+		h.Write(seed)
+		h.Write(ctrBuf[:])
+		sum := h.Sum(nil)
+		// Each sha256 gives 32 bytes → 8 float32s via uint32 LE.
+		for j := 0; j < len(sum) && i < dims; j += 4 {
+			u := binary.LittleEndian.Uint32(sum[j : j+4])
+			// Map uint32 into (-1, 1).
+			out[i] = float32(int32(u))/float32(math.MaxInt32) - 0
+			i++
+		}
+	}
+	// L2-normalise so cosine similarity stays well defined.
+	var norm float64
+	for _, v := range out {
+		norm += float64(v) * float64(v)
+	}
+	if norm == 0 {
+		out[0] = 1
+		return out
+	}
+	inv := float32(1.0 / math.Sqrt(norm))
+	for k := range out {
+		out[k] *= inv
+	}
+	return out
+}
+
+// hashTag returns the first n hex chars of SHA-256(s) — used as a
+// stable, short identifier in canned entity names.
+func hashTag(s string, n int) string {
+	sum := sha256.Sum256([]byte(s))
+	const hex = "0123456789abcdef"
+	out := make([]byte, n*2)
+	for i := 0; i < n; i++ {
+		out[2*i] = hex[sum[i]>>4]
+		out[2*i+1] = hex[sum[i]&0x0f]
+	}
+	return string(out)
+}

internal/mcp/tools_fuzz_test.go

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+//go:build sqlite_fts5
+
+package mcp
+
+import (
+	"encoding/json"
+	"strings"
+	"testing"
+)
+
+// FuzzMCPToolArgs asserts that the argument-coercion helpers (stringArg,
+// intArg, and the `project` shortcut projectArg) never panic on any JSON
+// payload an MCP client might send. We fuzz a JSON blob, unmarshal it
+// into map[string]any (the exact type the real handlers receive via
+// mcpgo.CallToolRequest.GetArguments()), and poke each helper with the
+// known keys plus a couple of keys that intentionally don't exist.
+func FuzzMCPToolArgs(f *testing.F) {
+	// Seeds cover the shapes that flow through the real tool registrations
+	// in tools.go: strings, numbers (float64 after JSON round-trip),
+	// booleans, nulls, nested objects, and arrays. Malformed JSON is fed
+	// via the "ignore" branch — the unmarshal error is expected and
+	// skipped so it does not count as a fuzzer-discovered crash.
+	seeds := []string{
+		`{}`,
+		`{"query":"hello"}`,
+		`{"query":""}`,
+		`{"top_k":5}`,
+		`{"top_k":5.5}`,
+		`{"top_k":-1}`,
+		`{"top_k":"not a number"}`,
+		`{"project":null}`,
+		`{"project":true}`,
+		`{"project":["nested","array"]}`,
+		`{"project":{"nested":"object"}}`,
+		`{"entity_name":"foo","depth":2}`,
+		`{"community_level":0}`,
+		`{"` + strings.Repeat("a", 1024) + `":"long-key"}`,
+		`not json at all`,
+		``,
+	}
+	for _, s := range seeds {
+		f.Add(s)
+	}
+
+	// All known argument keys used across internal/mcp/tools.go and
+	// notes_tools.go. Exhaustive is cheap; if a new tool adds a new
+	// key this list lags but the fuzz target still covers the helpers.
+	keys := []string{
+		"query", "top_k", "doc_type", "project",
+		"community_level", "entity_name", "depth",
+		"from", "to", "predicate",
+		"note_key", "content", "tags", "limit",
+		"max_nodes", "graph_depth", "doc_id", "type",
+		"nonexistent_key_for_default_path",
+	}
+
+	f.Fuzz(func(t *testing.T, raw string) {
+		var args map[string]any
+		if err := json.Unmarshal([]byte(raw), &args); err != nil {
+			// Not valid JSON — not our target. MCP transport layer
+			// already rejects these before they reach tool handlers.
+			t.Skip()
+		}
+		if args == nil {
+			// JSON "null" at the top level — nothing to coerce.
+			return
+		}
+
+		for _, k := range keys {
+			_ = stringArg(args, k, "default")
+			_ = intArg(args, k, 0)
+		}
+		// projectArg lives in server.go and wraps stringArg for "project".
+		_ = projectArg(args)
+	})
+}
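
Reviewer sketch: the helpers under fuzz are not part of this diff, so the bodies below are an assumption about what panic-free coercion of `map[string]any` values could look like. Only the call shapes `stringArg(args, key, default)` and `intArg(args, key, default)` come from the fuzz target; the real implementations in internal/mcp may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stringArg returns args[key] if it holds a string, else def.
// Hypothetical body: a comma-ok type assertion never panics, even when
// the key is missing or the map is nil.
func stringArg(args map[string]any, key, def string) string {
	if s, ok := args[key].(string); ok {
		return s
	}
	return def
}

// intArg returns args[key] as an int when it holds a JSON number
// (float64 after encoding/json decoding), else def. Hypothetical body.
func intArg(args map[string]any, key string, def int) int {
	switch v := args[key].(type) {
	case float64:
		return int(v) // truncates toward zero
	case int:
		return v
	default:
		return def
	}
}

func main() {
	var args map[string]any
	raw := `{"top_k":5.5,"project":["nested","array"]}`
	if err := json.Unmarshal([]byte(raw), &args); err != nil {
		panic(err)
	}
	fmt.Println(intArg(args, "top_k", 0))              // 5
	fmt.Println(stringArg(args, "project", "default")) // default
}
```

A type switch with a default branch is what makes seeds such as `{"project":true}` or `{"top_k":"not a number"}` fall through to the default value instead of panicking, which is the property the fuzz target asserts.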

internal/notes/history_test.go

Lines changed: 1 addition & 0 deletions
@@ -13,6 +13,7 @@ import (
 func skipIfNoGit(t *testing.T) {
 	t.Helper()
 	if _, err := exec.LookPath("git"); err != nil {
+		// TODO(#65): environmental skip (git binary missing); tracked in flake-register.
 		t.Skip("git not available")
 	}
 }

internal/notes/notes_test.go

Lines changed: 1 addition & 0 deletions
@@ -227,6 +227,7 @@ func TestUnicodeKey(t *testing.T) {
 
 func TestScale_1000Notes(t *testing.T) {
 	if testing.Short() {
+		// TODO(#63): 1000-note scale test skipped under -short; tracked in flake-register.
 		t.Skip("skipping 1000-note scale test in -short mode")
 	}
 	dir := t.TempDir()
