feat: Add llms.txt catalog and execute backstop with hard read/write/delete gating by egorpavlikhin · Pull Request #80 · OctopusDeploy/mcp-server

egorpavlikhin · 2026-05-07T07:36:56Z

Summary

Closes AIF-359 and AIF-377. Two phases shipped together as one progressive change.

Phase 1 — catalog resource + grep tool

octopus://api/llms.txt resource with a 5-min in-memory TTL cache, shared with the grep tool so a search doesn't trigger a re-fetch.
octopus://api/capabilities resource: server version, enabled toolsets, available tools, feature flags.
grep_llms_txt tool — same GNU-grep parameter shape as grep_task_log. The catalog body is ~360 KB; reading it directly is what we're avoiding.
Lifted grepLines into src/helpers/grepLines.ts so both grep tools share one implementation. Old test migrated alongside.
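
The shared 5-minute cache described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `TtlCache`, its constructor parameters, and the injectable clock are all hypothetical names chosen for the example.

```typescript
// Hypothetical sketch: a single-entry TTL cache that both the resource
// handler and the grep tool could share. None of these names come from the PR.
class TtlCache<T> {
  private entry: { promise: Promise<T>; expiresAt: number } | undefined;

  constructor(
    private readonly ttlMs: number,
    private readonly load: () => Promise<T>,
    private readonly now: () => number = () => Date.now(),
  ) {}

  get(): Promise<T> {
    const t = this.now();
    let entry = this.entry;
    if (entry === undefined || t >= entry.expiresAt) {
      // Cache the promise itself so concurrent callers share one fetch.
      const promise = this.load();
      entry = { promise, expiresAt: t + this.ttlMs };
      this.entry = entry;
      // Assumption: a failed fetch should not stay cached for the full TTL.
      promise.catch(() => {
        if (this.entry !== undefined && this.entry.promise === promise) {
          this.entry = undefined;
        }
      });
    }
    return entry.promise;
  }
}
```

With one instance handed to both consumers, a grep issued shortly after a resource read reuses the cached body instead of calling the upstream API again.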

Phase 2 — execute backstop with hard read/write/delete tiering

HTTP method is the authoritative classifier — never an isWrite flag the LLM sets. classifyMethod(method) maps GET → read, POST/PUT/PATCH → write, DELETE → delete.
New --allow-deletes CLI flag + ToolsetConfig.allowDeletes. DELETE requires both --no-read-only AND --allow-deletes — a deliberate two-flag opt-in for irreversible operations.
Six gates inside execute.ts, in order: sensitive denylist → tier mode gate → path allowlist by enabled toolset → elicitation (DELETE gets a stronger "IRREVERSIBLE" message) → dispatch → audit-on-stderr.
Sensitive denylist covers API-key endpoints and catastrophic deletes (DELETE on /api/users/{id} and /api/spaces/{id}); enforced even with both flags on.
Glob engine in pathGlob.ts is shared between allowlist and denylist. Only * and ** are wildcards; every other character is escaped, so denylist patterns can't be turned into regex injection.
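
A rough sketch of the method-driven tier gate, assuming a config object with `readOnly` and `allowDeletes` fields. `classifyMethod` and the two reason strings appear in this PR; `tierGate`, `TierConfig`, the `readOnly` field name, and the choice of reason for a DELETE under read-only mode are assumptions made for illustration.

```typescript
type Tier = "read" | "write" | "delete";
type HttpMethod = "GET" | "POST" | "PUT" | "PATCH" | "DELETE";

// The HTTP method alone decides the tier; nothing the caller passes can override it.
function classifyMethod(method: HttpMethod): Tier {
  switch (method) {
    case "GET":
      return "read";
    case "POST":
    case "PUT":
    case "PATCH":
      return "write";
    case "DELETE":
      return "delete";
  }
}

// Hypothetical config shape; the PR only names ToolsetConfig.allowDeletes.
interface TierConfig {
  readOnly: boolean;
  allowDeletes: boolean;
}

type GateResult =
  | { allowed: true }
  | { allowed: false; reason: "readOnlyMode" | "deletesNotAllowed" };

function tierGate(method: HttpMethod, config: TierConfig): GateResult {
  const tier = classifyMethod(method);
  if (tier === "read") return { allowed: true };
  // Assumption: read-only mode blocks writes and deletes with the same reason.
  if (config.readOnly) return { allowed: false, reason: "readOnlyMode" };
  if (tier === "delete" && !config.allowDeletes) {
    return { allowed: false, reason: "deletesNotAllowed" };
  }
  return { allowed: true };
}
```

Under this sketch, DELETE passes only when `readOnly` is false and `allowDeletes` is true, which mirrors the two-flag opt-in.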

Refactors landed in the same change

Shared compilePathGlob between allow/deny (was duplicated).
activeToolsetConfig lives under src/helpers/ (was misplaced under resources/catalog/ since execute also reads it).
grepLines test migrated to src/helpers/__tests__/; grepTaskLog.ts no longer needs to re-export it.
Audit emission in execute.ts uses a closure that captures the start time, so each gate's emit is one line (audit("blocked", "readOnlyMode")) instead of six.
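
The closure-based audit helper might look something like the sketch below. The record fields, the `makeAudit` name, and the injected `write` and `now` parameters are hypothetical; only the `audit("blocked", "readOnlyMode")` call shape is taken from the description above.

```typescript
type AuditOutcome = "allowed" | "blocked";

// Hypothetical factory: captures request context and the start time once,
// so every gate can emit an audit line with a single call.
function makeAudit(
  method: string,
  path: string,
  write: (line: string) => void,
  now: () => number = () => Date.now(),
): (outcome: AuditOutcome, reason?: string) => void {
  const startedAt = now();
  return (outcome, reason) => {
    // JSON.stringify drops `reason` when it is undefined.
    write(
      JSON.stringify({
        method,
        path,
        outcome,
        reason,
        durationMs: now() - startedAt,
      }),
    );
  };
}
```

In a real server, `write` would be something that prints to stderr; passing it in keeps the sketch testable without touching process streams.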

Out of scope (intentionally — Phase 3)

AIF-360 (JSON schema resources for write payloads) and AIF-367 (describe tool) are deferred. Once this lands, agents have grep_llms_txt to discover endpoint shapes — adequate as a stopgap.

Test plan

npm run build — clean
npm run lint — clean
npx vitest run --exclude '**/*.integration.test.ts' — 137 unit tests pass (60+ new)
Live smoke test against https://main.testoctopus.app: read octopus://api/capabilities, run grep_llms_txt for 'POST /releases', exercise the read tier of execute with GET /api/spaces, verify the readOnlyMode gate blocks a POST, verify the deletesNotAllowed gate blocks a DELETE under --no-read-only only.
Pre-existing integration test failures (11) for findInterruptions/listTenants are unchanged — they fail on a missing 'Octopus Server' test space, unrelated to this work.

Coordination

The Linear tickets AIF-359 and AIF-377 describe the work as written before the three-tier DELETE design was confirmed. Worth updating the ticket bodies post-merge to reflect:

AIF-359: scope grew to include grep_llms_txt and the grepLines helper extraction (~250 LOC, was ~150).
AIF-377: three-tier read/write/delete classification with the new --allow-deletes flag, plus catastrophic-delete entries on the sensitive denylist.

🤖 Generated with Claude Code

…delete gating

Closes AIF-359 and AIF-377 in one progressive change.

Phase 1 — catalog resource + grep tool:
- octopus://api/llms.txt resource (5-minute in-memory TTL cache, shared with the grep tool so a search doesn't trigger a re-fetch).
- octopus://api/capabilities resource composing server version, enabled toolsets, available tools, and feature flags.
- grep_llms_txt tool with the GNU-grep parameter shape already used by grep_task_log. SERVER_INSTRUCTIONS steers agents to grep the catalog rather than reading the ~360 KB body.
- grepLines pure function lifted into src/helpers/grepLines.ts; both grep tools share one implementation. Tests migrated alongside.

Phase 2 — execute backstop with hard read/write/delete classification:
- HTTP method enum is the authoritative classifier; classifyMethod in src/helpers/methodTier.ts maps GET → read, POST/PUT/PATCH → write, DELETE → delete. Three-tier semantics, not two.
- New --allow-deletes CLI flag plus ToolsetConfig.allowDeletes; DELETE requires both --no-read-only and --allow-deletes (deliberate two-flag opt-in for irreversible operations).
- Six gates in execute.ts, in order: sensitive denylist (always-on), tier-based mode gate, path allowlist by enabled toolset, elicitation on every non-GET (stronger 'IRREVERSIBLE' message for DELETE), dispatch, audit-on-stderr.
- Sensitive denylist covers API-key endpoints (any method) and catastrophic deletes (DELETE on /api/users/{id} and /api/spaces/{id}), enforced even with both flags set.
- Path allowlist is a static toolset → glob map. Conservative initial coverage; expand based on real usage. Glob engine in pathGlob.ts is shared between allowlist and denylist; only `*` and `**` wildcards, every other character escaped so denylist patterns can't be turned into regex injection.

60+ new tests added; build, lint, and the full unit suite pass. The 11 pre-existing integration test failures (live testoctopus space lookup) are unchanged — not caused by this work.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…s flag

- Add an "API Catalog & Backstop" subsection covering grep_llms_txt, the execute tool's tier gating, and the catalog resources.
- Update Read-Only Mode with the three-tier read/write/delete classification and the new --allow-deletes opt-in for DELETE requests.
- Add --no-read-only and --allow-deletes to "Other command line arguments".
- Update Security Considerations to spell out method-driven gating and the sensitive denylist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

akirayamamoto

PR Review Notes

Nice direction overall. The split between read, write, and delete modes makes
sense, and I like that DELETE has a separate opt-in rather than being bundled
into general write access.

A couple of UX suggestions:

Consider renaming `--no-read-only`

--no-read-only works, but it asks users to reason through a double negative.
Since the flag enables write operations across the MCP server, I think
--allow-writes would be easier to understand at a glance.

You could keep --no-read-only as an alias for compatibility, but document
--allow-writes as the preferred flag:

--allow-writes
--allow-deletes

That reads more naturally: writes are off by default, and deletes require an
additional opt-in.

Make `--allow-deletes` clearer when read-only mode is still active

Since --allow-deletes only takes effect when write mode is also enabled, I’d
consider either rejecting this combination at startup:

--allow-deletes

without:

--no-read-only

or printing a clear startup warning.

Otherwise a user may think they have enabled deletes when read-only mode still
blocks them. That is safe behavior, but it could be confusing during setup or
debugging.

Something like this would help:

--allow-deletes was provided, but read-only mode is still enabled. DELETE
requests remain blocked unless write mode is enabled.
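
As a rough sketch of the warning option, the check could be a pure function that returns the message when the flag combination is inert. `CliFlags` and `inertDeleteFlagWarning` are hypothetical names; how the server actually parses its flags is not shown in this review.

```typescript
// Hypothetical parsed-flag shape for illustration only.
interface CliFlags {
  readOnly: boolean; // true unless --no-read-only was passed
  allowDeletes: boolean; // true when --allow-deletes was passed
}

// Returns a warning when --allow-deletes cannot take effect, otherwise undefined.
function inertDeleteFlagWarning(flags: CliFlags): string | undefined {
  if (flags.allowDeletes && flags.readOnly) {
    return (
      "--allow-deletes was provided, but read-only mode is still enabled. " +
      "DELETE requests remain blocked unless write mode is enabled."
    );
  }
  return undefined;
}
```

At startup the caller would print the returned string to stderr, or exit with an error if the stricter "reject this combination" behaviour is preferred.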

Not a blocker from my side, but I think these tweaks would make the safety model
easier to understand.

Additional Codex Findings

Codex found a few issues worth looking at before merge:

1. Path checks should run on a canonical path

execute appears to run the allowlist and denylist checks against the raw path
argument, then passes that path to the API client. Codex found that paths with
dot segments, such as:

/api/spaces/Spaces-1/../../users/me/apikeys

can pass the /api/spaces/** allowlist check while resolving to a different API
path. I’d suggest normalizing or rejecting paths with .., encoded slashes,
backslashes, query strings, or fragments before any safety checks run.
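
One way to sketch the "reject before any safety check" option is a small validator that refuses the constructs listed above rather than trying to normalize them. `rejectUnsafePath` and its reason strings are hypothetical, and the rule set here covers only the cases named in this finding.

```typescript
type PathCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical validator: rejects rather than rewrites, so the string that
// passes the allowlist is exactly the string sent to the API client.
function rejectUnsafePath(path: string): PathCheck {
  const fail = (reason: string): PathCheck => ({ ok: false, reason });
  if (path.indexOf("?") !== -1) return fail("query string");
  if (path.indexOf("#") !== -1) return fail("fragment");
  if (path.indexOf("\\") !== -1) return fail("backslash");
  // %2F and %5C are the percent-encoded forms of "/" and "\".
  if (/%2f|%5c/i.test(path)) return fail("percent-encoded slash");
  if (path.split("/").some((segment) => segment === "..")) {
    return fail("dot segment");
  }
  return { ok: true };
}
```

Rejecting is simpler to reason about than canonicalizing, because there is no second form of the path that the upstream server might interpret differently.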

2. `/api/spaces/**` makes the toolset allowlist too broad

Codex also found that core includes /api/spaces/**. Since core is always
enabled and checked first, this appears to allow all space-scoped endpoints
through execute, even when the owning toolset is disabled.

That makes the documented “toolset filtering as a kill switch” model weaker than
it looks. I’d suggest narrowing core to space discovery paths only, then adding
explicit /api/spaces/{space}/... patterns to each owning toolset.
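
To illustrate why `/api/spaces/**` under core defeats per-toolset filtering, here is a minimal glob sketch. It assumes `*` matches within a single path segment and `**` matches across segments; the actual semantics in pathGlob.ts are not shown in this review. `compileGlob`, `isPathAllowed`, and the pattern lists are hypothetical.

```typescript
// Hypothetical glob compiler: only * and ** are wildcards, every other
// character is regex-escaped before being placed in the pattern.
function compileGlob(pattern: string): RegExp {
  let source = "";
  for (let i = 0; i < pattern.length; i++) {
    const ch = pattern.charAt(i);
    if (ch === "*") {
      if (pattern.charAt(i + 1) === "*") {
        source += ".*"; // ** crosses "/" boundaries
        i++;
      } else {
        source += "[^/]*"; // * stays inside one segment (assumption)
      }
    } else {
      source += ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp("^" + source + "$");
}

// Illustrative narrowed allowlist: core covers space discovery only.
const allowlist = new Map<string, string[]>([
  ["core", ["/api/spaces", "/api/spaces/*"]],
  [
    "certificates",
    ["/api/spaces/*/certificates", "/api/spaces/*/certificates/**"],
  ],
]);

function isPathAllowed(path: string, enabledToolsets: string[]): boolean {
  return enabledToolsets.some((toolset) =>
    (allowlist.get(toolset) ?? []).some((p) => compileGlob(p).test(path)),
  );
}
```

With these patterns, a certificates path is reachable only when the `certificates` toolset is enabled, whereas a core entry of `/api/spaces/**` would match it regardless.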

3. `octopus://api/capabilities` may describe `execute` as read-only

Codex found that execute is registered as readOnly: true so it remains
available in read-only mode for GET requests. That makes sense internally, but
the capabilities resource may lead agents to think execute is fully read-only,
even when write mode allows POST/PUT/PATCH.

It may be clearer to expose execute as method-gated, or include effective
tiers such as read, write, and delete.
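
A sketch of the "effective tiers" idea, assuming the same two session flags discussed elsewhere in this PR. The field names `methodGated` and `tiersAvailable`, and the `SessionConfig` shape, are illustrative rather than a confirmed schema for the capabilities resource.

```typescript
type EffectiveTier = "read" | "write" | "delete";

// Hypothetical session flags for illustration.
interface SessionConfig {
  readOnly: boolean;
  allowDeletes: boolean;
}

// Derives which tiers execute can actually serve in the current session.
function tiersAvailable(config: SessionConfig): EffectiveTier[] {
  if (config.readOnly) return ["read"];
  return config.allowDeletes ? ["read", "write", "delete"] : ["read", "write"];
}

// Capability entry for execute: flagged as method-gated instead of read-only.
function executeCapability(config: SessionConfig) {
  return {
    name: "execute",
    methodGated: true,
    tiersAvailable: tiersAvailable(config),
  };
}
```

An agent reading this entry can tell from `tiersAvailable` whether a POST or DELETE has any chance of succeeding before it attempts the call.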

4. Minor docs issue: `audit` toolset

Codex noticed a couple of references to an audit toolset in comments/docs, but
that does not seem to be a current toolset. This is minor, but it could confuse
users reading the safety model.

…te capabilities

Addresses the Codex findings and the operator-warning UX nit from #80 review.

Path canonicalization (Codex #1):
- New validateExecutePath helper rejects '..' segments, backslashes, query strings, fragments, double slashes, and percent-encoded slashes (%2F/%5C) before any allowlist/denylist runs. Without this, a path like /api/spaces/Spaces-1/../../users/me/apikeys could pass the legacy /api/spaces/** core allowlist while resolving to a different endpoint server-side.
- Wired in as Gate 0 in execute.ts (runs before sensitive denylist) and surfaces a structured 'invalidPath' reason on rejection.

Narrower core toolset (Codex #2):
- Dropped /api/spaces/** from core. core now covers /api/spaces (list) and /api/spaces/* (single-space metadata) only — every per-resource path beneath a space must register against its owning toolset.
- Each toolset now declares BOTH /api/<single>/X and /api/spaces/<single>/X prefix forms. Introduced a small spaceScoped() helper to avoid hand-typing four patterns per resource family.
- This restores the "toolset filtering as kill switch" promise: disabling 'certificates' now actually makes /api/spaces/Spaces-1/certificates unreachable.

Accurate execute capabilities (Codex #3):
- The CapabilityToolEntry for execute now carries methodGated:true and a tiersAvailable array reflecting the current session — ['read'] in read-only mode, ['read','write'] with --no-read-only, and the full ['read','write','delete'] with both flags. Static read-only tools keep the existing readOnly:true and omit the new fields.

Stray 'audit' references (Codex #4):
- Removed the non-existent 'audit' toolset from the pathAllowlist JSDoc, the execute tool description, and the README. The example now references 'certificates', a real toolset.

Operator warning when --allow-deletes is inert (review UX #2):
- src/index.ts emits a clear stderr warning at startup if --allow-deletes is passed without --no-read-only, so an operator doesn't think they've opened up DELETE while the read-only gate still blocks everything.

Tests: 13 new tests (validateExecutePath unit, execute Gate 0 coverage, core-narrowing regression, methodGated/tiersAvailable behaviour). 150 unit tests pass; build and lint clean.

Not addressed (intentionally — out of scope, called out as non-blocking in the review): rename --no-read-only -> --allow-writes alias.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The /api/experimental/llms.txt endpoint shipped in 2026.2.3916. Set the minimum on grep_llms_txt's tool registration so --list-tools-by-version reports it correctly, and note the requirement on the catalog resource description and in the README so older-Octopus users see the version constraint before they hit a 404 from the upstream API.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

egorpavlikhin and others added 2 commits May 7, 2026 17:35

akirayamamoto approved these changes May 8, 2026

egorpavlikhin and others added 2 commits May 8, 2026 11:36

egorpavlikhin merged commit 868e4cf into main May 8, 2026
1 check passed

egorpavlikhin deleted the egorpavlikhin/aif-359-llms-txt-catalog-and-grep branch May 8, 2026 01:48

github-actions Bot mentioned this pull request May 8, 2026

chore(main): release 2.0.0 #72

Open

egorpavlikhin merged 4 commits into main from egorpavlikhin/aif-359-llms-txt-catalog-and-grep
