Skip to content

feat(validation): probe repository URL reachability at publish (#395)#1266

Open
rhinocap wants to merge 1 commit intomodelcontextprotocol:mainfrom
rhinocap:rav-101-validate-repo-reachable
Open

feat(validation): probe repository URL reachability at publish (#395)#1266
rhinocap wants to merge 1 commit intomodelcontextprotocol:mainfrom
rhinocap:rav-101-validate-repo-reachable

Conversation

@rhinocap
Copy link
Copy Markdown

@rhinocap rhinocap commented May 7, 2026

Closes #395.

The PGA-golf submission (com.pga/pga-golfhttps://github.com/pgahq/mcp-pga-com) is currently live on the production registry with a repository URL that 404s, because semantic validation only checks URL shape, not whether the repo actually resolves. This PR adds a publish-time HTTP HEAD probe that catches that class of bug.

Summary

  • New ProbeRepositoryReachable(ctx, repoURL) in internal/validators/repository_reachability.go issues a HEAD with a 10s timeout, follows up to 5 redirects, and rejects 4xx/5xx with a clear error including the URL + status code. Mirrors the existing prior-art HEAD probe in internal/validators/registries/mcpb.go.
  • Wired into ValidatePublishRequest so it runs before package-registry ownership checks. No-ops when repository is unset (preserves the field's optionality) or when the new flag is off.
  • New MCP_REGISTRY_ENABLE_REPOSITORY_REACHABILITY_CHECK config flag (default true). Kept independent of MCP_REGISTRY_ENABLE_REGISTRY_VALIDATION so operators can flip just this check without disabling npm/PyPI/OCI ownership validation. Threaded through docker-compose.yml, tests/integration/docker-compose.integration-test.yml, and the README's offline-dev quick-start.
  • Unit tests (net/http/httptest) cover: 200, 301→200 follow, 404, 410, 5xx (500/502/503), client-side timeout, redirect-chain cap, malformed URL, and unreachable host.

Community alignment (re: 8 comments on #395)

This PR implements the minimum check the maintainers converged on, not the larger ownership-binding follow-on:

  • @domdomegg: "Currently repository is optional… maybe fine to say if it is there it must exist as a public repo." → exactly what this PR does. Optional field stays optional; when present it must reach.
  • @tadasant: argued for a stronger check that also binds name to the server.json at repository.url (with optional subfolder), to disambiguate canonical-vs-platform-republished entries.
  • @mgyarmathy (PGA author): confirmed pgahq/mcp-pga-com is a private repo — agreed offline with @tadasant to omit repository.url for private repos. This PR's behavior matches: private/404 repos must drop the field.
  • @Avish34 (issue assignee): asked about private-repo policy and whether to document that authors should omit repository.url for private repos. Documentation of that convention is intentionally not in this PR — happy to follow up in a docs-only patch, or defer to @Avish34 if she's already drafting the larger ownership-binding work.

The ownership-binding (matching name against the repo's hosted server.json) is a meaningful design effort with subfolder-search semantics still being debated in-thread; this PR does not block it. It just stops the simplest 404 case that surfaced #395 in the first place.

Notes for reviewers

  • @Avish34 is the assigned issue owner. Happy to close this PR in favor of her larger patch if that's already in flight — flag here or on the issue. This stayed scoped to the URL-reachability slice precisely so it can land independently.
  • No new dependency: net/http HEAD only.
  • Probe runs on publish, not update — keeps blast radius small.
  • Status codes accepted: 2xx and 3xx (covers GitHub's legitimate 301/308 patterns and 304s). The spec wording in the issue mentioned "200, 301, 302" and this is the consistent superset.

Test plan

  • go test ./internal/validators/... — all probe tests pass (200, 301→200, 404, 410, 500/502/503, timeout, redirect-chain cap, malformed URL, unreachable host).
  • go vet ./... clean.
  • golangci-lint run clean against new files (internal/validators/repository_reachability*.go, internal/config/config.go, internal/validators/validators.go).
  • CI: full unit + integration suite (postgres-dependent suites couldn't run locally, but no DB code is touched).

…contextprotocol#395)

Issue modelcontextprotocol#395 surfaced a published server (com.pga/pga-golf) whose
repository.url was a 404 — the registry had accepted it because semantic
validation only verifies URL shape, not reachability. Add a publish-time
HTTP HEAD probe that follows up to 5 redirects and rejects 4xx/5xx
responses with the URL and status code in the error message. Reuses the
prior-art HEAD-probe pattern from internal/validators/registries/mcpb.go.

The probe is gated behind a new MCP_REGISTRY_ENABLE_REPOSITORY_REACHABILITY_CHECK
config flag (default true; set to false for offline dev), kept independent
of MCP_REGISTRY_ENABLE_REGISTRY_VALIDATION so operators can disable just
this check.

Tests cover 200, 301→200, 404, 410, 5xx, timeout, redirect-chain limit,
malformed URL, and unreachable host using net/http/httptest.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Validate provided repositories are publicly reachable (PGA golf server's github link is wrong)

1 participant