
Commit 59670f0

basiclines and Copilot authored
Copilot SDK-powered spec compiler (#7)
* feat: add compile build command using Copilot SDK

  Adds a new `build` subcommand to the spec compiler that uses `@github/copilot-sdk` to launch a Copilot agent session, feed it the generated compilation prompt, and stream the agent's work to the terminal.

  The build command:
  - Validates Copilot auth before starting (with clear error messages)
  - Prompts for model selection via `gum choose` (falls back to defaults)
  - Prompts for reasoning effort (skipped if model doesn't support it)
  - Creates an autopilot session (approveAll — no permission prompts)
  - Streams output in two modes:
    - Normal: compact phase-level status with tool activity
    - Verbose (--verbose): raw agent transcript with streaming deltas
  - Tracks metrics: tokens, tool calls, files written
  - Prints a compilation summary (time, files, LOC, tokens)
  - Auto-locks specs on success (skip with --no-lock)
  - Handles SIGINT for graceful cancellation

  New flags: --model, --effort, --verbose, --no-lock

  Existing commands (status, prompt, lock, clean) are unchanged.

* feat: always prompt for model, effort, and output dir with good defaults

  Flags (--model, --effort, --out) now pre-select the default value in the interactive prompt rather than skipping it entirely. Adds a new output directory picker via `gum input`.

  Flow order changed: auth runs first (fail fast), then interactive config, then prompt generation uses the user-chosen distDir.

  Non-TTY / no gum: falls back to sensible defaults silently.

* feat: add multi-pass compilation loop with improvement prompts

  After each compilation pass, the user is prompted to run another pass. Subsequent passes send the agent a focused improvement prompt that re-reads specs and fixes missed implementations, failing tests, and inconsistencies.

  - printSummary now shows pass number and cumulative pass count
  - Session stays alive between passes (only disconnects on exit)
  - gum confirm with readline fallback for pass prompt
  - Auto-lock deferred until all passes complete

* fix: count only agent-written files in build metrics

  Replace scanOutput() filesystem walk with countAgentOutput() that uses the tracked filesWritten set from tool execution events. This excludes node_modules and other non-agent files from the count.

  Also track file deletions separately so the summary shows 'N written, M deleted' when files are removed.

* fix: show lock file path when no dirty specs found

  Helps users understand which lock file is keeping specs clean.

* refactor: replace gum log with chalk for all status output

  Use chalk-based console.log consistently for compilation status messages instead of shelling out to gum log. Removes gumLog helper.

* fix: broaden tool name matching for file metrics and phase detection

  The Copilot SDK agent may use varying tool names (edit, create, write_to_file, str_replace_editor, etc.) depending on the model. Use substring matching on normalized tool names instead of exact string comparisons so file writes are always tracked.

  Also broadens the path argument lookup to check path, file_path, filePath, and file argument names.

* fix: use user-chosen dir directly as output, don't append target

  The output dir picker default is now dist/<target>/ (e.g. dist/bun/). If the user changes it to dist/bun-claude/, that's used as-is — no extra /<target> suffix appended.

  The prompt command still correctly joins distDir + target since it receives the base dist/ dir.

* style: remove gear icons from tool activity output

  Use indentation and dim gray text only for cleaner output.

* refactor: replace gum with @clack/prompts for interactive CLI

  Remove gum (external Go binary) dependency entirely. All interactive prompts (model picker, effort picker, output dir, multi-pass confirm) now use @clack/prompts which runs in-process with zero external deps.

  - Single code path instead of gum-vs-chalk branching
  - No install prerequisites beyond Bun
  - Add Copilot subscription to README prerequisites

* feat: show agent's last message after each pass completes

  Displays the agent's final message between the phase checkmarks and the summary stats. Gives visibility into what the agent accomplished in each pass without needing verbose mode.

* feat: reframe prompt for depth-first compilation

  Restructure the compilation prompt to prioritize a working interactive playground over surface coverage. Key changes:

  - Add 'depth over breadth' philosophy section
  - Require components to be fully complete (impl + tests + demo) before moving to the next one
  - Make --interactive the primary output, not a stub
  - Verification happens per-component, not just at the end
  - Multi-pass improvement prompt reinforces interactive demo priority

* feat: render agent summary as markdown in a box

  Use marked + marked-terminal to render the agent's final message with proper markdown formatting (headings, bold, code, lists) and wrap it in a boxen container with dim border and 'Agent summary' title.

* feat: add multi-pass summary instructions to system prompt

  Tell the agent to end each pass with a structured summary of what was accomplished and explicit next-pass priorities. This makes the boxed agent summary actionable and helps the user decide whether to run another pass.

* feat: instruct agent to verify dependencies via web browsing

  Add rule to system prompt telling the agent to check npm/GitHub for libraries before assuming they don't exist. Prevents knowledge cutoff issues where the agent falls back to polyfills for packages that are actually published.

* feat: lock after each pass to bank completed work

  Previously the lock only ran once after all passes finished. Now each successful pass (no errors) locks immediately, so completed components are banked incrementally. A bad subsequent pass won't lose the progress from earlier passes.

* feat: delegate lock to the agent per-component

  Remove auto-lock after passes. Instead, instruct the agent in the system prompt to run 'bun run compile lock --target <t> --component <Name>' after fully completing each component (tests pass + demo wired). This ensures only genuinely completed components get locked.

* feat: add --autopilot flag for unattended multi-pass builds

  Auto-continues passes without prompting, up to a max of (dirty specs + 5) passes. Shows pass count as 'Pass N/max' in the log. Useful for running full compilations unattended.

* docs: update README with build command, flags, and autopilot

  Rewrite the 'Compiling to a target' section to document:

  - All CLI commands in a table
  - Full build flags reference (including --autopilot)
  - Interactive vs autopilot workflow examples
  - Agent-driven per-component locking
  - Multi-pass philosophy (depth over breadth)

* docs: add 'Generating prompts manually' section to README

  Document the compile prompt command as an alternative to compile build, for users who want to feed prompts to external agents manually.

* docs: note that build sessions must run from dist target folder

* fix: address PR review -- no-lock suppresses agent instructions, fix prompt path docs

  - --no-lock now conditionally omits lock instructions from system prompt
  - Updated help text to reflect actual behavior
  - Fixed README prompt path: <out>/<target>/_compile-prompt.md
  - Fixed working directory guidance: repo root, not dist folder

* feat: use SDK session modes -- autopilot vs interactive

  Set session.rpc.mode.set() based on --autopilot flag:

  - Default: 'interactive' mode, user confirms each pass
  - --autopilot: 'autopilot' mode, agent runs autonomously

* feat: keep manual multi-pass loop in both modes, SDK mode sets agent behavior

* refactor: autopilot flag only sets SDK mode, no custom pass logic

* fix: move sessionMode declaration before first use

* docs: update README with new autopilot approach (SDK-driven, no custom loop)

* feat: offer manual passes in all modes, not just interactive

* refactor: unified clack timeline for build output

  Replace disconnected log blocks with @clack/prompts timeline:

  - printBuildHeader uses clack.log.step/info
  - Normal mode phases: step → tool messages → success
  - Agent summary uses clack.note (replaces boxen)
  - printSummary uses clack.log.success/message
  - Errors/warnings use clack.log.error/warn
  - Multi-pass labels use clack.log.step
  - clack.outro on cleanup and errors
  - Remove boxen dependency (replaced by clack.note)

* fix: compact tool call lines in clack timeline

  Use raw log with │ prefix for tool activity instead of clack.log.message() which adds extra blank lines between entries.

* fix: count files/LOC by scanning output directory

  Replace unreliable event-based file tracking (tool name heuristics didn't match actual SDK tool names → 0 files) with a post-compilation directory scan. Skips node_modules, .git, vendor, target, and other common dependency/build folders.

* fix: capture agent summary from deltas as fallback

  The assistant.message event can have empty content. Accumulate message_delta content and use it as fallback so the Agent summary note always shows the last assistant response.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
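
The last fix in the commit message (using accumulated deltas when the final `assistant.message` event arrives with empty content) could look roughly like the sketch below. It is only an illustration: the `AgentEvent` shapes, the `assistant.message_delta` event name, and the `SummaryCapture` class are hypothetical and are not taken from the Copilot SDK or from this commit's source.

```typescript
// Hypothetical event shapes; the real Copilot SDK event types may differ.
type AgentEvent =
  | { type: "assistant.message_delta"; delta: string }
  | { type: "assistant.message"; content: string };

class SummaryCapture {
  private pendingDeltas = ""; // streamed text for the message in progress
  private lastMessage = ""; // text of the most recent completed message

  handle(event: AgentEvent): void {
    if (event.type === "assistant.message_delta") {
      this.pendingDeltas += event.delta;
      return;
    }
    // Prefer the final message content; fall back to the streamed deltas
    // when the final event carries an empty string.
    const text = event.content.trim() !== "" ? event.content : this.pendingDeltas;
    if (text !== "") this.lastMessage = text;
    this.pendingDeltas = ""; // start fresh for the next assistant message
  }

  summary(): string {
    return this.lastMessage;
  }
}
```

Keeping the delta buffer per message, rather than per session, means the summary reflects only the last assistant response and not the whole transcript.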
1 parent 4b9fb7b commit 59670f0

4 files changed

Lines changed: 979 additions & 64 deletions


README.md

Lines changed: 109 additions & 32 deletions
@@ -27,6 +27,7 @@ flowchart LR
 ### Prerequisites
 
 - [Bun](https://bun.sh/) 1.1+ installed (`bun --version`)
+- A [GitHub Copilot](https://github.com/features/copilot) subscription (for the `compile build` command)
 
 ### Install dependencies
 
@@ -171,63 +172,139 @@ All normative sections (Visual rules, Behavior, Edge cases) use
 | `bun` | TypeScript | OpenTUI + React (Bun) | `targets/bun.md` |
 | `rust` | Rust | Ratatui + Crossterm | `targets/rust.md` |
 
+### Commands
+
+| Command | Purpose |
+| ------- | ------- |
+| `compile status` | Show dirty/locked specs per target |
+| `compile prompt` | Generate a compilation prompt file |
+| `compile build` | Run an agent session to compile specs (requires Copilot) |
+| `compile lock` | Lock spec hashes after verified compilation |
+| `compile clean` | Remove lock file for a target |
+
+### Build flags
+
+```
+bun run compile build --target <name> [flags]
+
+Required:
+  --target <name>       Target to compile (go, node, bun, rust)
+
+Optional:
+  --component <name>    Compile a single component/token only
+  --out <dir>           Output directory (default: dist/<target>)
+  --model <id>          Model to use (e.g. claude-sonnet-4, gpt-5)
+  --effort <level>      Reasoning effort: low | medium | high | xhigh
+  --verbose             Show full agent transcript (raw streaming)
+  --no-lock             Prevent the agent from locking components
+  --autopilot           Use SDK autopilot mode (agent runs fully autonomously)
+  --all-targets         Compile all targets sequentially
+```
+
 ### Workflow
 
 ```bash
 # 1. See what's changed
 bun run compile status
 
-# 2. Generate the compilation prompt
-bun run compile prompt --target go
-
-# 3. Feed dist/go/_compile-prompt.md to an LLM agent
-# The agent generates code into dist/go/
+# 2. Compile interactively (prompts for model, effort, output dir)
+bun run compile build --target bun
 
-# 4. Verify: run tests, check the demo CLI
-cd dist/go && go test ./... && go run ./cmd/demo
+# 3. Or compile non-interactively with all options
+bun run compile build --target bun --model claude-sonnet-4 --out dist/bun-claude
 
-# 5. Lock the hashes
-bun run compile lock --target go
+# 4. Or fire-and-forget with autopilot (SDK handles everything)
+bun run compile build --target bun --model claude-sonnet-4 --autopilot
 ```
 
 ### Multi-pass compilation
 
-A single compilation pass across the full component suite (17 components +
-tokens + demo) is usually not enough to reach production quality. We've found
-that **2–3 passes** produce notably better results:
+The compiler supports two modes, controlled by the `--autopilot` flag:
+
+**Interactive mode** (default): The SDK agent runs in `interactive` mode.
+After the initial compilation pass, the compiler asks whether to continue with
+another pass. Each pass sends an improvement prompt — the agent reviews, fixes,
+and extends its own work. You see a boxed markdown summary after each pass.
+
+**Autopilot mode** (`--autopilot`): Sets the SDK agent mode to `autopilot`.
+The agent runs fully autonomously — it decides when to iterate, how many passes
+to make, and when the work is complete. No user confirmation is needed.
 
 | Pass | Focus | Typical outcome |
 | ---- | ----- | --------------- |
-| **1st** | Initial generation | All components scaffold correctly, most tests pass, demo wires up. Expect rough edges — missing edge cases, incomplete keybindings, demo wiring bugs. |
-| **2nd** | Review & fix | Agent reviews its own output against specs, fixes test failures, fills in missing behavior, improves demo interactivity. Test count typically grows 30–50%. |
-| **3rd** | Polish | Catches subtle spec violations, improves accessibility, hardens demo `--snapshot` smoke tests. Diminishing returns after this point. |
+| **1st** | Initial generation | Core tokens, first components fully wired into interactive demo. |
+| **2nd** | Extend & fix | More components added, test failures fixed, demo polished. |
+| **3rd** | Polish | Catches subtle spec violations, hardens edge cases. |
 
-To run a follow-up pass, generate a new prompt and tell the agent to review
-and complete its existing work:
+The agent is instructed to follow a **depth-over-breadth** philosophy: it fully
+completes each component (implementation + tests + interactive demo) before
+moving to the next one.
 
-```bash
-# Generate a fresh prompt (it sees the current dist/ state)
-bun run compile prompt --target go
+### Component locking
 
-# Feed to the agent with instructions like:
-# "Review your existing implementation against the specs.
-# Fix any test failures, fill in missing behavior,
-# and ensure all --snapshot smoke tests pass."
+The agent locks components individually as it completes them by running:
+
+```bash
+bun run compile lock --target bun --component Select
 ```
 
-Each pass is fast because the agent builds on its own prior output rather than
-starting from scratch. The demo's `--list` and `--snapshot` flags make it easy
-for the agent to self-verify between passes.
+This records the spec hash so the component won't be recompiled unless its spec
+changes. You can also lock manually after verifying generated code:
+
+```bash
+# Lock a single component
+bun run compile lock --target go --component Input
+
+# Lock all specs for a target
+bun run compile lock --target go
+
+# Lock all targets
+bun run compile lock --all-targets
+```
 
 ### Custom output directory
 
-By default, compiled code goes to `dist/`. Override with `--out`:
+By default, compiled code goes to `dist/<target>/`. Override with `--out`:
+
+```bash
+# Output to a custom directory
+bun run compile build --target go --out dist/go-experimental
+
+# The prompt and generated code go directly to dist/go-experimental/
+```
+
+### Generating prompts manually
+
+If you prefer to feed the prompt to an external agent (Claude, ChatGPT, Copilot
+Chat, etc.) instead of using `compile build`, use the `prompt` command:
 
 ```bash
-# Output to a separate repo or directory
-bun run compile prompt --target go --out ~/my-tuikit-go
+# Generate a prompt for a target
+bun run compile prompt --target go
 
-# The prompt and generated code go to ~/my-tuikit-go/go/
+# Generate for a single component
+bun run compile prompt --target bun --component Select
+
+# Generate to a custom directory
+bun run compile prompt --target node --out ~/my-project
+```
+
+The prompt is written to `<out>/<target>/_compile-prompt.md` (e.g. `dist/go/_compile-prompt.md`). It contains:
+
+- The target definition (framework, paradigm, file structure)
+- An index of all dirty specs with file paths and summaries
+- Instructions for the agent (depth-first, verification steps)
+- Demo specification reference
+
+> **Important:** Any coding session that uses this prompt should set its working
+> directory to the repository root (where `components/`, `tokens/`, and `docs/`
+> live). The prompt references spec files using paths relative to the repo root.
+
+Feed this file to any LLM agent, then lock manually once verified:
+
+```bash
+# After the agent generates code and tests pass:
+bun run compile lock --target go
 ```
 
 ### Adding a new target
@@ -237,7 +314,7 @@ bun run compile prompt --target go --out ~/my-tuikit-go
 machine pattern, token access, styling, composition, test pattern, key
 mapping, dependencies, and demo CLI
 3. Run `bun run compile status` — your target will show up with all specs dirty
-4. Run `bun run compile prompt --target {name}` and compile
+4. Run `bun run compile build --target {name}` to compile
 
 ## Linting
 
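
The README changes above say that `compile lock` records a spec's hash so that a component is recompiled only when its spec changes. A minimal sketch of that idea follows. It is not the project's implementation: `LockFile`, `hashSpec`, `lockSpecs`, and `dirtySpecs` are hypothetical names, specs are modelled as an in-memory map of name to text, and the FNV-1a hash is a dependency-free stand-in for whatever hash the compiler really uses.

```typescript
// Hypothetical lock format: spec name -> hash of the spec text at lock time.
type LockFile = Record<string, string>;

// 32-bit FNV-1a, used only to keep the sketch dependency-free (an assumption,
// not the project's hash function).
function hashSpec(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Record current hashes for every spec, or for a single component when
// `only` is given (mirroring `--component <name>`).
function lockSpecs(specs: Record<string, string>, lock: LockFile, only?: string): LockFile {
  const next: LockFile = { ...lock };
  for (const [name, text] of Object.entries(specs)) {
    if (only === undefined || only === name) next[name] = hashSpec(text);
  }
  return next;
}

// A spec is dirty when it has no lock entry or its text changed since locking.
function dirtySpecs(specs: Record<string, string>, lock: LockFile): string[] {
  return Object.entries(specs)
    .filter(([name, text]) => lock[name] !== hashSpec(text))
    .map(([name]) => name);
}
```

Under these assumptions, locking only `Select` leaves every other spec dirty, which is the property that lets per-component locking bank finished work while the rest is still being compiled.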
