AI-powered penetration testing agent for defensive security analysis. Automates vulnerability assessment by combining reconnaissance tools with AI-powered code analysis.
Prerequisites: Docker, Anthropic API key in .env
# Setup
cp .env.example .env && edit .env # Set ANTHROPIC_API_KEY
# Prepare repo (REPO is a folder name inside ./repos/, not an absolute path)
git clone https://github.com/org/repo.git ./repos/my-repo
# or symlink: ln -s /path/to/existing/repo ./repos/my-repo
# Run
./shannon start URL=<url> REPO=my-repo
./shannon start URL=<url> REPO=my-repo CONFIG=./configs/my-config.yaml
# Workspaces & Resume
./shannon start URL=<url> REPO=my-repo WORKSPACE=my-audit # New named workspace
./shannon start URL=<url> REPO=my-repo WORKSPACE=my-audit # Resume (same command)
./shannon start URL=<url> REPO=my-repo WORKSPACE=<auto-name> # Resume auto-named run
./shannon workspaces # List all workspaces
# Monitor
./shannon logs # Real-time worker logs
# Temporal Web UI: http://localhost:8233
# Stop
./shannon stop # Preserves workflow data
./shannon stop CLEAN=true # Full cleanup including volumes
# Build
npm run build
Options: CONFIG=<file> (YAML config), OUTPUT=<path> (default: ./audit-logs/), WORKSPACE=<name> (named workspace; auto-resumes if exists), PIPELINE_TESTING=true (minimal prompts, 10s retries), REBUILD=true (force Docker rebuild), ROUTER=true (multi-model routing via claude-code-router)
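The KEY=value option style above can be modelled in a few lines. This is a minimal sketch only: the real `./shannon` CLI is not shown in this document, so `StartOptions` and `parseStartArgs` are hypothetical names, and only the `./audit-logs/` default comes from the options list.

```typescript
// Hypothetical option shape; field names mirror the documented KEY=value options.
interface StartOptions {
  readonly url: string;
  readonly repo: string;
  readonly output: string;
  readonly pipelineTesting: boolean;
  readonly config?: string;
  readonly workspace?: string;
}

function parseStartArgs(args: readonly string[]): StartOptions {
  // 1. Split each token at the first "=" only, so URLs with query strings survive
  const values = new Map<string, string>();
  for (const arg of args) {
    const separatorIndex = arg.indexOf("=");
    if (separatorIndex <= 0) {
      throw new Error(`Expected KEY=value, got "${arg}"`);
    }
    values.set(arg.slice(0, separatorIndex), arg.slice(separatorIndex + 1));
  }

  // 2. URL and REPO appear in every documented start command
  const url = values.get("URL");
  const repo = values.get("REPO");
  if (url === undefined || repo === undefined) {
    throw new Error("URL and REPO are required");
  }

  // 3. Apply the documented default and add optional keys only when present
  const config = values.get("CONFIG");
  const workspace = values.get("WORKSPACE");
  return {
    url,
    repo,
    output: values.get("OUTPUT") ?? "./audit-logs/",
    pipelineTesting: values.get("PIPELINE_TESTING") === "true",
    ...(config !== undefined ? { config } : {}),
    ...(workspace !== undefined ? { workspace } : {}),
  };
}
```

Calling `parseStartArgs(["URL=https://example.com", "REPO=my-repo"])` yields an object with `output` set to `./audit-logs/` and no `config` or `workspace` keys.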
- src/session-manager.ts — Agent definitions (AGENTS record). Agent types in src/types/agents.ts
- src/config-parser.ts — YAML config parsing with JSON Schema validation
- src/ai/claude-executor.ts — Claude Agent SDK integration with retry logic
- src/services/ — Business logic layer (Temporal-agnostic). Activities delegate here. Key: agent-execution.ts, error-handling.ts, container.ts
- src/types/ — Consolidated types: Result<T,E>, ErrorCode, AgentName, ActivityLogger, etc.
- src/utils/ — Shared utilities (file I/O, formatting, concurrency)
Durable workflow orchestration with crash recovery, queryable progress, intelligent retry, and parallel execution (5 concurrent agents in vuln/exploit phases).
- src/temporal/workflows.ts — Main workflow (pentestPipelineWorkflow)
- src/temporal/activities.ts — Thin wrappers: heartbeat loop, error classification, container lifecycle. Business logic delegated to src/services/
- src/temporal/activity-logger.ts — TemporalActivityLogger implementation of the ActivityLogger interface
- src/temporal/summary-mapper.ts — Maps PipelineSummary to WorkflowSummary
- src/temporal/worker.ts — Worker entry point
- src/temporal/client.ts — CLI client for starting workflows
- src/temporal/shared.ts — Types, interfaces, query definitions
- Pre-Recon (pre-recon) — External scans (nmap, subfinder, whatweb) + source code analysis
- Recon (recon) — Attack surface mapping from initial findings
- Vulnerability Analysis (5 parallel agents) — injection, xss, auth, authz, ssrf
- Exploitation (5 parallel agents, conditional) — Exploits confirmed vulnerabilities
- Reporting (report) — Executive-level security report
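The phase ordering above can be sketched as plain async TypeScript. This is an illustration only, not the actual `pentestPipelineWorkflow`: the `AgentRunner` callback, the `-vuln` / `-exploit` agent names, and the rule that an exploit agent runs only when its matching analysis confirmed a finding are assumptions made for the example.

```typescript
type VulnType = "injection" | "xss" | "auth" | "authz" | "ssrf";
const VULN_TYPES: readonly VulnType[] = ["injection", "xss", "auth", "authz", "ssrf"];

// Hypothetical runner: resolves to true when the agent confirmed at least one finding.
type AgentRunner = (agentName: string) => Promise<boolean>;

async function runPipelineSketch(runAgent: AgentRunner): Promise<string[]> {
  const executedAgents: string[] = [];

  async function runAndRecord(agentName: string): Promise<boolean> {
    const confirmedFinding = await runAgent(agentName);
    executedAgents.push(agentName);
    return confirmedFinding;
  }

  // 1. Sequential phases: recon builds on pre-recon output
  await runAndRecord("pre-recon");
  await runAndRecord("recon");

  // 2. Vulnerability analysis: five agents run concurrently
  const analysisOutcomes = await Promise.all(
    VULN_TYPES.map(async (vulnType) => ({
      vulnType,
      confirmed: await runAndRecord(`${vulnType}-vuln`),
    })),
  );

  // 3. Exploitation: only for types whose analysis confirmed something
  const confirmedTypes = analysisOutcomes
    .filter((outcome) => outcome.confirmed)
    .map((outcome) => outcome.vulnType);
  await Promise.all(confirmedTypes.map((vulnType) => runAndRecord(`${vulnType}-exploit`)));

  // 4. Reporting runs last
  await runAndRecord("report");
  return executedAgents;
}
```

In the real system Temporal provides durability and retries around each step; the sketch only shows the ordering and the conditional fan-out.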
- Configuration — YAML configs in configs/ with JSON Schema validation (config-schema.json). Supports auth settings, MFA/TOTP, and per-app testing parameters
- Prompts — Per-phase templates in prompts/ with variable substitution ({{TARGET_URL}}, {{CONFIG_CONTEXT}}). Shared partials in prompts/shared/ via src/services/prompt-manager.ts
- SDK Integration — Uses @anthropic-ai/claude-agent-sdk with maxTurns: 10_000 and bypassPermissions mode. Playwright MCP for browser automation, TOTP generation via MCP tool. Login flow template at prompts/shared/login-instructions.txt supports form, SSO, API, and basic auth
- Audit System — Crash-safe append-only logging in audit-logs/{hostname}_{sessionId}/. Tracks session metrics, per-agent logs, prompts, and deliverables. WorkflowLogger (audit/workflow-logger.ts) provides unified human-readable per-workflow logs, backed by the LogStream (audit/log-stream.ts) shared stream primitive
- Deliverables — Saved to deliverables/ in the target repo via the save_deliverable MCP tool
- Workspaces & Resume — Named workspaces via WORKSPACE=<name> or auto-named from URL+timestamp. Resume passes --workspace to the Temporal client (src/temporal/client.ts), which loads session.json to detect completed agents. loadResumeState() in src/temporal/activities.ts validates deliverable existence, restores git checkpoints, and cleans up incomplete deliverables. Workspace listing via src/temporal/workspaces.ts
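The resume decision described above (an agent is skipped only if it is marked complete and its deliverable still exists) can be sketched as follows. This is not the real `loadResumeState()`: the `SessionState` shape, the `completedAgents` field, and the `deliverableExists` callback are hypothetical stand-ins, and git checkpoint restoration is left out.

```typescript
// Hypothetical shape of the relevant part of session.json.
interface SessionState {
  readonly completedAgents: readonly string[];
}

interface ResumePlan {
  readonly skip: string[];
  readonly rerun: string[];
}

function planResume(
  allAgents: readonly string[],
  session: SessionState,
  deliverableExists: (agentName: string) => boolean,
): ResumePlan {
  const plan: ResumePlan = { skip: [], rerun: [] };

  for (const agentName of allAgents) {
    const markedComplete = session.completedAgents.includes(agentName);
    // An agent whose deliverable is missing is rerun even if the session says it finished
    const canSkip = markedComplete && deliverableExists(agentName);

    if (canSkip) {
      plan.skip.push(agentName);
    } else {
      plan.rerun.push(agentName);
    }
  }

  return plan;
}
```

Passing the existence check in as a callback keeps the sketch free of file I/O, so it can be exercised without a workspace on disk.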
- Define agent in src/session-manager.ts (add to AGENTS record). ALL_AGENTS / AgentName types live in src/types/agents.ts
- Create prompt template in prompts/ (e.g., vuln-newtype.txt)
- Two-layer pattern: add a thin activity wrapper in src/temporal/activities.ts (heartbeat + error classification). AgentExecutionService in src/services/agent-execution.ts handles the agent lifecycle automatically via the AGENTS registry
- Register activity in src/temporal/workflows.ts within the appropriate phase
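The registry-plus-thin-wrapper pattern from the steps above might look roughly like this. All shapes here are simplified assumptions: the real `AGENTS` record, `AgentName` type, and `AgentExecutionService` are not shown in this document, and `AgentDefinition`, `executeAgent`, and `runNewtypeVulnAgent` are hypothetical names. Heartbeats and error classification are omitted.

```typescript
// Hypothetical, simplified stand-ins for src/types/agents.ts and src/session-manager.ts.
type AgentName = "injection-vuln" | "newtype-vuln";

interface AgentDefinition {
  readonly displayName: string;
  readonly promptTemplate: string;
}

// Adding a key to AgentName forces a matching entry here at compile time
const AGENTS: Readonly<Record<AgentName, AgentDefinition>> = {
  "injection-vuln": { displayName: "Injection analysis", promptTemplate: "vuln-injection" },
  "newtype-vuln": { displayName: "New-type analysis", promptTemplate: "vuln-newtype" },
};

// Stand-in for the service layer: looks the agent up in the registry and runs it
function executeAgent(agentName: AgentName): string {
  const definition = AGENTS[agentName];
  return `ran ${definition.displayName} with prompts/${definition.promptTemplate}.txt`;
}

// Thin activity wrapper: no business logic, only delegation
function runNewtypeVulnAgent(): string {
  return executeAgent("newtype-vuln");
}
```

Because the wrapper only names the agent, the per-agent behaviour stays in the registry entry and the prompt template.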
- Variable substitution: {{TARGET_URL}}, {{CONFIG_CONTEXT}}, {{LOGIN_INSTRUCTIONS}}
- Shared partials in prompts/shared/ included via src/services/prompt-manager.ts
- Test with PIPELINE_TESTING=true for fast iteration
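A minimal sketch of `{{NAME}}` substitution is shown below. It is not the code in `src/services/prompt-manager.ts`; `renderPrompt` is a hypothetical helper, and failing on an unknown placeholder is an assumption made here so that typos in templates surface early.

```typescript
function renderPrompt(template: string, variables: ReadonlyMap<string, string>): string {
  // Matches placeholders such as {{TARGET_URL}}: upper-case letters and underscores only
  const placeholderPattern = /\{\{([A-Z_]+)\}\}/g;

  return template.replace(placeholderPattern, (placeholder: string, key: string): string => {
    const value = variables.get(key);
    if (value === undefined) {
      throw new Error(`No value provided for ${placeholder}`);
    }
    return value;
  });
}

const renderedPrompt = renderPrompt(
  "Target: {{TARGET_URL}}\nContext: {{CONFIG_CONTEXT}}",
  new Map([
    ["TARGET_URL", "https://example.com"],
    ["CONFIG_CONTEXT", "none"],
  ]),
);
console.log(renderedPrompt);
```

Shared partials could be handled the same way by resolving an include marker to file contents before this substitution pass, though the real include syntax is not described here.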
- Configuration-Driven — YAML configs with JSON Schema validation
- Progressive Analysis — Each phase builds on previous results
- SDK-First — Claude Agent SDK handles autonomous analysis
- Modular Error Handling — ErrorCode enum, Result<T,E> for explicit error propagation, automatic retry (3 attempts per agent)
- Services Boundary — Activities are thin Temporal wrappers; src/services/ owns business logic, accepts ActivityLogger, returns Result<T,E>. No Temporal imports in services
- DI Container — Per-workflow in src/services/container.ts. AuditSession excluded (parallel safety)
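The combination of `Result<T,E>` and a three-attempt retry can be sketched as below. The real `Result` and `ErrorCode` definitions in `src/types/` are not shown in this document, so the shapes here are assumptions: `ErrorCode` is modelled as a two-value string union rather than the project's enum, and stopping early on a `"FATAL"` code is a hypothetical classification rule.

```typescript
// Hypothetical minimal versions of the project's Result and ErrorCode types.
type Result<T, E> =
  | { readonly ok: true; readonly value: T }
  | { readonly ok: false; readonly error: E };

type ErrorCode = "TRANSIENT" | "FATAL";

const MAX_ATTEMPTS = 3;

async function runWithRetry<T>(
  operation: (attempt: number) => Promise<Result<T, ErrorCode>>,
): Promise<Result<T, ErrorCode>> {
  let lastResult: Result<T, ErrorCode> = { ok: false, error: "TRANSIENT" };

  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    lastResult = await operation(attempt);

    if (lastResult.ok) {
      return lastResult;
    }

    // Retrying cannot help with a non-retryable failure
    if (lastResult.error === "FATAL") {
      return lastResult;
    }
  }

  return lastResult;
}
```

Returning the failure as a value instead of throwing keeps the error path visible in the function signature, which is the point of the `Result<T,E>` convention.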
Defensive security tool only. Use only on systems you own or have explicit permission to test.
- Optimize for readability, not line count — three clear lines beat one dense expression
- Use descriptive names that convey intent
- Prefer explicit logic over clever one-liners
- Keep functions focused on a single responsibility
- Use early returns and guard clauses instead of deep nesting
- Never use nested ternary operators — use if/else or switch
- Extract complex conditions into well-named boolean variables
- Use function keyword for top-level functions (not arrow functions)
- Explicit return type annotations on exported/top-level functions
- Prefer readonly for data that shouldn't be mutated
- exactOptionalPropertyTypes is enabled — use spread for optional props, not direct undefined assignment
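The style rules above can be illustrated together in one short example. `AgentSummary`, `buildAgentSummary`, and `describeSummary` are hypothetical names invented for this sketch and do not come from the codebase.

```typescript
interface AgentSummary {
  readonly agentName: string;
  readonly durationMs: number;
  readonly costUsd?: number;
}

function buildAgentSummary(
  agentName: string,
  durationMs: number,
  costUsd: number | undefined,
): AgentSummary {
  // NOTE: With exactOptionalPropertyTypes, `costUsd: undefined` does not type-check,
  // so the key is spread in only when a value exists.
  return {
    agentName,
    durationMs,
    ...(costUsd !== undefined ? { costUsd } : {}),
  };
}

function describeSummary(summary: AgentSummary | undefined): string {
  if (summary === undefined) {
    return "no summary";
  }

  const finishedQuickly = summary.durationMs < 60_000;
  const hasCost = summary.costUsd !== undefined;

  if (finishedQuickly && !hasCost) {
    return `${summary.agentName}: fast, cost unknown`;
  }
  if (finishedQuickly) {
    return `${summary.agentName}: fast`;
  }
  return `${summary.agentName}: slow`;
}
```

The guard clause and the two named booleans replace what could otherwise be a nested ternary.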
- Combining multiple concerns into a single function to "save lines"
- Dense callback chains when sequential logic is clearer
- Sacrificing readability for DRY — some repetition is fine if clearer
- Abstractions for one-time operations
- Backwards-compatibility shims, deprecated wrappers, or re-exports for removed code — delete the old code, don't preserve it
Comments must be timeless — no references to this conversation, refactoring history, or the AI.
Patterns used in this codebase:
- /** JSDoc */ — file headers (after license) and exported functions/interfaces
- // N. Description — numbered sequential steps inside function bodies. Use when a function has 3+ distinct phases where at least one isn't immediately obvious from the code. Each step marks the start of a logical phase. Reference: AgentExecutionService.execute (steps 1-9) and injectModelIntoReport (steps 1-5)
- // === Section === — high-level dividers between groups of functions in long files, or to label major branching/classification blocks (e.g., // === SPENDING CAP SAFEGUARD ===). Not for sequential steps inside function bodies — use numbered steps for that
- // NOTE: / // WARNING: / // IMPORTANT: — gotchas and constraints
Never: obvious comments, conversation references ("as discussed"), history ("moved from X")
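As an illustration of how these comment patterns fit together, here is a small made-up function. `formatDuration` is a hypothetical example and is not taken from the codebase.

```typescript
// === Duration Formatting ===

/** Formats a millisecond duration as "Xm Ys" for log output. */
function formatDuration(durationMs: number): string {
  // 1. Reject values that cannot represent a duration
  if (!Number.isFinite(durationMs) || durationMs < 0) {
    throw new Error(`Invalid duration: ${durationMs}`);
  }

  // 2. Split into whole minutes and leftover seconds
  // NOTE: Sub-second remainders are truncated, not rounded.
  const totalSeconds = Math.floor(durationMs / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;

  // 3. Omit the minutes part for short durations
  if (minutes === 0) {
    return `${seconds}s`;
  }
  return `${minutes}m ${seconds}s`;
}
```

Each comment describes what the code does now; none of them mention how the function came to be written.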
Entry Points: src/temporal/workflows.ts, src/temporal/activities.ts, src/temporal/worker.ts, src/temporal/client.ts
Core Logic: src/session-manager.ts, src/ai/claude-executor.ts, src/config-parser.ts, src/services/, src/audit/
Config: shannon (CLI), docker-compose.yml, configs/, prompts/
- "Repository not found" — REPO must be a folder name inside ./repos/, not an absolute path. Clone or symlink your repo there first: ln -s /path/to/repo ./repos/my-repo
- "Temporal not ready" — Wait for health check or docker compose logs temporal
- Worker not processing — Check docker compose ps
- Reset state — ./shannon stop CLEAN=true
- Local apps unreachable — Use host.docker.internal instead of localhost
- Missing tools — Use PIPELINE_TESTING=true to skip nmap/subfinder/whatweb (graceful degradation)
- Container permissions — On Linux, may need sudo for docker commands