
ci: tier-1 fuzz smoke (crawler + chunker) #28

Merged
aksOps merged 5 commits into main from ci-fuzz-smoke on Apr 23, 2026


@aksOps aksOps commented Apr 23, 2026

Summary

Adds native Go fuzzing at the `FuzzXxx` level — Tier 1 of the OSS-Fuzz roadmap. Scorecard's Fuzzing check detects these targets.

  • `FuzzResolveURL`: found and fixed a second edge case in `resolveURL`. When the *base* URL is not http(s), a relative href inherits the base's scheme through `url.ResolveReference`, so the result could escape the scheme allow-list. Added a belt-and-braces check on the resolved URL's scheme. 1.76M executions clean at 30s per target.
  • `FuzzChunker`: exercises the text splitter on UTF-8 edge cases, zero bytes, CRLF line endings, and large inputs.
  • `.github/workflows/fuzz.yml`: runs each target for 30s on every PR and every push to main.
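
The edge case described above can be sketched as follows. This is a minimal illustration, not the code in this PR: the `resolveURL` signature, the `allowedScheme` helper, and the seed inputs are assumptions made for the example. Only the technique comes from the description: an http(s)-only allow-list, applied a second time to the scheme of the resolved URL.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"testing"
)

// allowedScheme reports whether scheme is http or https, ignoring case.
// Hypothetical helper; the real crawler may structure this differently.
func allowedScheme(scheme string) bool {
	s := strings.ToLower(scheme)
	return s == "http" || s == "https"
}

// resolveURL is a hypothetical stand-in for the crawler's function.
// It resolves href against base and rejects anything that is not http(s).
func resolveURL(base, href string) (string, bool) {
	b, err := url.Parse(base)
	if err != nil {
		return "", false
	}
	h, err := url.Parse(href)
	if err != nil {
		return "", false
	}
	// First check: an absolute href must itself be http(s).
	if h.Scheme != "" && !allowedScheme(h.Scheme) {
		return "", false
	}
	resolved := b.ResolveReference(h)
	// Belt-and-braces check: a relative href inherits the base's scheme,
	// so the resolved URL is checked again.
	if !allowedScheme(resolved.Scheme) {
		return "", false
	}
	return resolved.String(), true
}

// FuzzResolveURL asserts that any accepted result starts with an http(s)
// scheme. In a real repository this would live in a _test.go file.
func FuzzResolveURL(f *testing.F) {
	f.Add("https://example.com/a/", "b")
	f.Add("file:///etc/", "passwd")
	f.Fuzz(func(t *testing.T, base, href string) {
		out, ok := resolveURL(base, href)
		if !ok {
			return
		}
		lower := strings.ToLower(out)
		if !strings.HasPrefix(lower, "http:") && !strings.HasPrefix(lower, "https:") {
			t.Fatalf("accepted non-http(s) URL %q (base=%q href=%q)", out, base, href)
		}
	})
}

func main() {
	for _, c := range [][2]string{
		{"https://example.com/a/", "b"},
		{"file:///etc/", "passwd"},
		{"https://example.com/", "javascript:alert(1)"},
	} {
		out, ok := resolveURL(c[0], c[1])
		fmt.Printf("base=%q href=%q -> %q ok=%v\n", c[0], c[1], out, ok)
	}
}
```

Without the second check, the `file:///etc/` base with the relative href `passwd` would pass the first check, because the href has no scheme of its own to reject.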

Depends on #19 for the initial `resolveURL` scheme allow-list. Once #19 merges, this PR collapses to just the fuzz additions and the belt-and-braces delta.

Follow-up (tier 2)

Register with google/oss-fuzz for continuous fuzzing — separate PR, ~1 week review turnaround.

Test plan

  • `go test -fuzz=FuzzResolveURL -fuzztime=30s` clean (1.76M execs)
  • `go test -fuzz=FuzzChunker -fuzztime=10s` clean
  • CI fuzz job passes

aksOps and others added 2 commits April 23, 2026 00:00
Addresses 3 open code-scanning alerts:

- CodeQL go/incomplete-url-scheme-check (high): crawler's denylist only
  covered mailto:/javascript: — data:, vbscript:, tel:, file:, blob:
  could slip through. Replaced with an http(s)-only allow-list on the
  parsed URL scheme (case-insensitive). Added table-driven tests.

- SonarCloud go:S2612 (x2): hookinstaller wrote config with 0o644 and
  hook scripts with 0o755 — both world-readable. Tightened to 0o600 /
  0o700 since these files live in the user's own ~/.claude dir and
  only the owner needs access.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- FuzzResolveURL — found and fixed a second edge case in resolveURL:
  non-http(s) base URLs could escape the scheme allow-list via
  ResolveReference. Added belt-and-braces scheme check on the resolved
  URL. 1.76M executions clean at 30s/target.
- FuzzChunker — exercises text splitter on UTF-8 edges, zero bytes,
  CRLF, large inputs, and random (size, overlap) pairs.
- .github/workflows/fuzz.yml — runs each target for 30s on every PR +
  main push. Scorecard's Fuzzing check detects Go native fuzz targets.

Not a replacement for continuous fuzzing; see OSS-Fuzz tier-2 for that.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@aksOps aksOps enabled auto-merge (squash) April 23, 2026 00:09
@aksOps aksOps merged commit ab864e5 into main Apr 23, 2026
12 checks passed
@aksOps aksOps deleted the ci-fuzz-smoke branch April 23, 2026 01:03