Transform complex projects into coordinated, verified, production-ready deliverables — with intelligent agents that research, plan, implement, test, and document autonomously.
A modular, high-performance multi-agent team designed for complex project execution, feature implementation, and automated verification.
Available in awesome-copilot — the official GitHub repository for Copilot extensions.
Traditional AI coding assistants hit walls when projects get complex:
- Context overload — One agent trying to hold everything leads to mistakes
- No specialization — Jack of all trades, master of none
- Sequential bottlenecks — Tasks execute one-by-one, wasting time
- Missing verification — Changes ship without proper testing
- No audit trail — What changed? Why? Who knows...
| Challenge | Gem Team Approach |
|---|---|
| 🧠 Context Overload | Specialized agents with focused expertise — each holds only what it needs |
| 🎯 Lack of Specialization | 7 expert agents: researcher, planner, implementer, tester, reviewer, devops, and documentation specialist |
| 🐢 Sequential Bottlenecks | DAG-based parallel execution — up to 4 agents work simultaneously |
| ❌ Missing Verification | Verification-first: no task completes without passing its verification command |
| 📜 No Audit Trail | Persistent plan.yaml state file tracks every decision, status, and outcome |
- 🚀 10x Faster Execution — Parallel agent execution eliminates bottlenecks
- 🎯 Higher Quality Output — Specialized agents + mandatory verification = fewer bugs
- 🔒 Built-in Security — Dedicated reviewer agent applies OWASP scanning on critical tasks
- 📊 Full Visibility — Real-time plan status, clear approval gates, comprehensive summaries
- 🔄 Resilient Workflows — Pre-mortem analysis, failure handling, and automatic replanning
- 📋 Strict Communication Protocol — Standardized input/output formats for reliable delegation and handoffs
- 🎯 Autonomous Execution — Most agents work independently without user intervention (except approval gates)
- 🔧 Context-Efficient Operations — Smart file reading (semantic search, 200-line limits) and batch operations for speed
- 📄 PRD Support — Machine-readable Product Requirements Document with state machines, error codes, and decision tracking
Gem Team follows a Delegation-First pattern. The Orchestrator never executes—it only detects phase, routes to agents, and synthesizes results. All state operations are managed directly by the Orchestrator.
┌─────────────────────────────────────────────────────────────────┐
│ USER GOAL │
└──────────────────────────────┬──────────────────────────────────┘
▼
┌──────────────────────────────────────────────────────────────────┐
│ ORCHESTRATOR │
│ • Detect phase • Route to agents │
│ • Synthesize results • Manage plan.yaml state │
│ • Manage todos • Never execute │
└──────────────────────────────┬───────────────────────────────────┘
▼
┌──────────────────────┴──────────────────────┐
▼ ▼
┌───────────────────┐ ┌────────────────────────┐
│ RESEARCHER │ ──────────────────▶ │ PLANNER │
│ Explore codebase │ findings │ DAG Task Decomposition│
└───────────────────┘ └────────────┬───────────┘
▼
┌────────────────────────┐
│ plan.yaml │
│ (Task DAG + State) │
└────────────┬───────────┘
▼
┌────────────────────┬─────────────────────────┼─────────────────────────┬────────────────────┐
▼ ▼ ▼ ▼ ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ IMPLEMENTER │ │ BROWSER │ │ DEVOPS │ │ REVIEWER │ │ DOC WRITER │
│ TDD Execution│ │ TESTER │ │ CI/CD + Infra│ │ Security Gate│ │ Technical │
└──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘ └──────────────┘
| Agent | Role | Primary Responsibility |
|---|---|---|
gem-orchestrator |
ORCHESTRATOR | Team Lead - Coordinate workflow with energetic announcements. Detect phase → Route to agents → Synthesize results. Manage plan.yaml state and todos. Never execute. |
gem-researcher |
RESEARCHER | Explore codebase, identify patterns, map dependencies. Deliver structured findings in YAML. Never implement. |
gem-planner |
PLANNER | Design DAG-based plans, decompose tasks, identify failure modes. Create plan.yaml. Never implement. |
gem-implementer |
IMPLEMENTER | Write code using TDD. Follow plan specifications. Ensure tests pass. Never review. |
gem-browser-tester |
BROWSER TESTER | Run E2E scenarios in browser (Chrome DevTools MCP, Playwright, Agent Browser). Verify UI/UX, accessibility. Deliver test results. Never implement. |
gem-devops |
DEVOPS | Deploy infrastructure, manage CI/CD, configure containers. Ensure idempotency. Never implement. |
gem-reviewer |
REVIEWER | Scan for security issues, detect secrets, verify spec compliance. Deliver audit report. Never implement. |
gem-documentation-writer |
DOCUMENTATION WRITER | Write technical docs, generate diagrams, maintain code-documentation parity. Never implement. |
flowchart TD
A[🎯 User Goal] --> B[🎭 Orchestrator]
B --> C[🔍 Researcher]
D -->|Feedback| C
D -->|Approved| E[📋 Planner]
E --> F[📄 plan.yaml]
F --> G{⏸️ Plan Approval}
G -->|Feedback| F
G -->|Approved| H[🚀 Parallel Execution]
H --> I[💻 Implementer]
H --> J[🌐 Chrome Tester]
H --> K[⚙️ DevOps]
H --> L[🛡️ Reviewer]
H --> M[📝 Doc Writer]
I & J & K & L & M --> N[🔄 Synthesize Results]
N --> O{All Tasks Done?}
O -->|No| H
O -->|Yes| P[📊 Walkthrough Summary]
- Detection — Orchestrator checks plan existence and task status
- Research — Parallel researchers gather context per focus area
- Planning — Planner creates DAG plan with pre-mortem analysis
- Execution — Up to 4 agents execute tasks in waves (dependencies first)
- Completion — Doc Writer finalizes walkthrough + PRD
Send a steer message to gem-orchestrator and it automatically redirects to the appropriate agent — researcher for new context, planner for plan updates — integrating your request into the active workflow.
The Orchestrator acts as an energetic team lead — announces phase/wave starts, celebrates wins, acknowledges setbacks. Keeps the team motivated with concise, action-oriented updates.
The Orchestrator identifies key domains or features and launches multiple Researcher agents in parallel, each targeting a specific focus_area. This ensures deep, specific context is gathered for every part of the system before the Planner synthesizes it all into a unified plan.yaml.
No task completes without passing its defined verification command. Implementers follow strict TDD discipline:
- Write tests FIRST
- Confirm tests FAIL
- Write MINIMAL code to pass
- Check
get_errorsafter every edit
Browser Tester supports Chrome DevTools MCP, Playwright, and Agent Browser — run E2E tests across different browser tools for maximum coverage.
The Reviewer agent acts as a security gatekeeper for critical tasks:
- OWASP Top 10 scanning
- Secrets/PII detection
- Compliance verification
- Tiered review depth (Full → Standard → Lightweight)
Planner identifies failure modes (likelihood, impact, mitigation) for complex plans BEFORE execution.
State in docs/plan/{plan_id}/plan.yaml provides recovery, retry handling, and full decision traceability.
Machine-readable spec at docs/prd.yaml — Planner creates draft, Doc Writer finalizes. Contains state machines, error codes, performance thresholds, and decision log.
User → ORCHESTRATOR → WORKERS (execute)
- Orchestrator:
disable-model-invocation: true— delegates ALL work, manages plan.yaml state and todos, never executes - Workers:
disable-model-invocation: false— execute tasks via tools- RESEARCHER, PLANNER, IMPLEMENTER, BROWSER TESTER, DEVOPS, REVIEWER, DOC WRITER
- Isolation: Workers cannot call other subagents — all collaboration mediated by Orchestrator
gem-team/
├── gem-*.agent.md # Agent definitions (7 agents)
├── docs/
│ ├── prd.yaml # Product Requirements Document (project-level)
│ └── plan/{plan_id}/
│ ├── plan.yaml # Task DAG + state
│ ├── research_findings_*.yaml # Researcher output
│ ├── walkthrough-*.md # Completion documentation
│ ├── evidence/{task_id}/ # Browser test failures
│ └── logs/ # Failure logs
└── README.md
| Agent | Generates | Path |
|---|---|---|
| gem-planner | plan.yaml, PRD (draft) | docs/plan/{plan_id}/plan.yaml, docs/prd.yaml |
| gem-researcher | findings YAML | docs/plan/{plan_id}/research_findings_{focus}.yaml |
| gem-documentation-writer | walkthrough, PRD (final) | docs/plan/{plan_id}/walkthrough-*.md, docs/prd.yaml |
| gem-browser-tester | evidence (on failure) | docs/plan/{plan_id}/evidence/{task_id}/ |
| All agents | failure logs | docs/plan/{plan_id}/logs/{agent}_{task_id}_{ts}.yaml |
Delegation (Input):
task_id, plan_id, plan_path, task_definition (agent-specific)Completion (Output):
{"status": "completed|failed|needs_revision", "task_id", "plan_id", "summary": "≤3 sentences", "extra": {}}- Output ONLY requested deliverable (code: code ONLY)
- Think-Before-Action via internal
<thought>block - Batch independent operations; context-efficient reads (≤200 lines)
- Agent-specific
verificationcriteria from plan.yaml
| Agent | Verification |
|---|---|
| Implementer | get_errors → typecheck → unit tests |
| Browser Tester | validation matrix → console → network → accessibility |
| Reviewer | OWASP scan → code quality → logic |
| DevOps | deployment → health checks → idempotency |
| Doc Writer | completeness → code parity → formatting |
- Most agents: Fully autonomous
- DevOps: Approval gates for production/security
- Planner: Mandatory
plan_reviewbefore execution - Orchestrator: Delegates all via
runSubagent
| Scenario | How Gem Team Helps |
|---|---|
| Large Feature Implementation | Decomposes into parallel subtasks, implements with TDD, verifies each component |
| Codebase Refactoring | Researches patterns, plans migration, executes incrementally with tests |
| Security Audit | Reviewer scans for OWASP issues, secrets, compliance gaps |
| Documentation Overhaul | Doc Writer generates accurate docs maintaining code-documentation parity |
| CI/CD Pipeline Setup | DevOps agent creates containers, pipelines, deploys with health checks |
| UI/UX Testing | Chrome Tester automates validation matrix, captures visual evidence |
This project is licensed under the Apache License 2.0 — see the LICENSE file for details.
Built for Gem Team — Precision. Parallelism. Progress.
Transform complexity into coordinated execution.