feat(minecraft): document forget_conversation and value-first prompt workflow in MCP skill reference

shinohara-rin · nekomeowww · commit da4af1b9d1b6 · 2026-02-18T11:14:40.000+08:00
Add forget_conversation() usage examples to SKILL.md and mcp-surface.md. Document the two-turn value-first flow: a read/query turn returns data first, and the follow-up turn acts on the returned value. Add a prompt-behavior validation workflow for testing read->action patterns. Note that forget_conversation() clears only conversation memory and does not touch LLM state.
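
For reviewers who want to automate the value-first check described in this commit, here is a minimal sketch. It assumes each planner turn has already been reduced to a small summary object. The `TurnSummary` shape and the `checkValueFirst` helper are hypothetical and not part of the MCP surface; the real `get_llm_trace` payload would need to be mapped onto this shape by hand.

```typescript
// Hypothetical summary of one planner turn. This is NOT the get_llm_trace
// schema; it is an assumed, hand-built reduction of a trace entry.
interface TurnSummary {
  turnId: string
  actions: string[] // names of chat/actions issued during the turn
  returnValue: string | number | null // concrete value the turn returned, if any
  content: string // text of the chat/action payload, empty if none
}

// Returns a list of problems; an empty list means the two turns match the
// value-first pattern (read first, then act on the returned value).
function checkValueFirst(first: TurnSummary, followUp: TurnSummary): string[] {
  const problems: string[] = []

  // The read/query turn should not act on the world or chat.
  if (first.actions.length > 0) {
    problems.push(`turn ${first.turnId} issued actions: ${first.actions.join(', ')}`)
  }

  // The read/query turn should return a concrete value.
  if (first.returnValue === null) {
    problems.push(`turn ${first.turnId} returned no concrete value`)
  }
  // Heuristic only: the follow-up payload should mention the returned value.
  else if (!followUp.content.includes(String(first.returnValue))) {
    problems.push(`turn ${followUp.turnId} does not reference the returned value`)
  }

  // The follow-up turn should perform a chat or action.
  if (followUp.actions.length === 0) {
    problems.push(`turn ${followUp.turnId} performed no chat/action`)
  }

  return problems
}
```

The substring match is deliberately loose. It only shows where an assertion on the follow-up turn could go and may need a stricter comparison for structured return values.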
diff --git a/services/minecraft/codex-skills/minecraft-debug-mcp/SKILL.md b/services/minecraft/codex-skills/minecraft-debug-mcp/SKILL.md
@@ -37,6 +37,7 @@ Use this skill to run the local bot and interact with its MCP debug interface sa
 - Use `execute_repl` for deep object inspection or one-off targeted calls on the running brain.
 - Use `inject_chat` to simulate player chat and verify behavior loop.
 - Use `get_llm_trace` to assert planner behavior in automation (for example, detect repeated `await skip()` on specific events).
+- Use `execute_repl("forget_conversation()")` to clear conversation memory before prompt-engineering tests.
 
 Read `references/mcp-surface.md` for exact tool/resource names and argument schemas.
 
@@ -49,6 +50,8 @@ Read `references/mcp-surface.md` for exact tool/resource names and argument sche
 - `get_llm_trace(limit, turnId?)` gives structured attempt-level trace data (messages, content, reasoning, usage, duration).
 - `get_last_prompt` and `get_llm_trace` are compacted for MCP: system prompt/system-role messages are omitted to reduce token cost.
 - If environment summary shows `"SOMETHING WENT WRONG, YOU SHOULD NOTIFY THE USER OF THIS"`, treat it as degraded runtime context and avoid high-confidence world actions.
+- `forget_conversation()` is available as a runtime function in REPL/global context and clears only conversation memory.
+- Current prompt behavior supports two-turn value-first flows: read/query turn returns concrete data first, follow-up turn performs chat/action using that returned value.
 
 ## Live Testing Workflow
 
diff --git a/services/minecraft/codex-skills/minecraft-debug-mcp/references/mcp-surface.md b/services/minecraft/codex-skills/minecraft-debug-mcp/references/mcp-surface.md
@@ -44,6 +44,7 @@ The bot starts this server during normal runtime from:
 - `execute_repl(code: string)`
   - Executes debug REPL code in running brain context.
   - Use for focused inspection/action only.
+  - Runtime global includes `forget_conversation()` for conversation-memory reset.
 
 - `inject_chat(username: string, message: string)`
   - Injects a synthetic chat perception event.
@@ -82,6 +83,8 @@ Use this exact sequence for fast live validation:
 1. Baseline
    - `get_state()`
    - `execute_repl("query.inventory().list().map(i => ({ name: i.name, count: i.count }))")`
+   - Optional clean slate:
+     - `execute_repl("forget_conversation()")`
 2. Task trigger
    - `inject_chat({ username: \"codex-live-test\", message: \"please gather 3 dirt blocks\" })`
 3. Execution proof
@@ -93,6 +96,13 @@ Use this exact sequence for fast live validation:
 4. Outcome proof
    - Run the same inventory `execute_repl` call again and compare item counts.
 
+## Prompt-Behavior Check (Value-First)
+
+To validate read->action behavior:
+1. Inject a query-style chat (for example inventory question).
+2. Confirm first planner result is no-action with concrete return value (via `get_logs`/`get_llm_trace`).
+3. Confirm follow-up turn uses that returned value to perform chat/action.
+
 ## Runtime Caveat Seen Live
 
 - If a turn includes `Environment: SOMETHING WENT WRONG, YOU SHOULD NOTIFY THE USER OF THIS`, treat the world snapshot as degraded and avoid issuing risky autonomous actions until context stabilizes.