Building a component? → START-HERE.md. This is the ONLY place you need to start; it will route you to exactly what you need.
ai/
├── START-HERE.md ⭐ START HERE - Entry point for all tasks
├── README.md 📄 This file
│
├── decision-trees/ 🌳 Help AI choose approach
│ ├── api-type-selector.md (REST vs GraphQL vs SQL)
│ ├── auth-selector.md (OAuth vs API Key vs Basic)
│ └── pagination-selector.md (Offset vs Cursor vs Link)
│
├── recipes/ 📖 Complete working examples
│ ├── rest-api-extractor.md (REST API template)
│ ├── graphql-extractor.md (GraphQL template)
│ ├── sql-extractor.md (SQL database template)
│ └── pagination-cursor.md (Cursor pagination pattern)
│
├── build-new-component.md 📋 Complete implementation checklist
├── e2b-compatibility.md ☁️ Cloud sandbox requirements
├── error-patterns.md 🔧 Common errors and fixes
├── dependency-management.md 📦 requirements.txt and venv
│
├── checklists/ ✅ Validation rules
│ ├── COMPONENT_AI_CHECKLIST.md (57 validation rules)
│ ├── discovery_contract.md (Discovery requirements)
│ ├── connections_doctor_contract.md (Healthcheck)
│ └── metrics_events_contract.md (Telemetry)
│
└── llms/ 🤖 Detailed contracts
├── components.md (Component spec patterns)
├── drivers.md (Driver implementation)
├── testing.md (Test patterns)
└── overview.md (Determinism principles)
- Start with START-HERE.md
- It will route you based on your task
- You'll only read 4-6 docs (not all 20+)
- Read START-HERE.md → Prerequisites section
- Follow the link to ../human/CONCEPTS.md
- Return to START-HERE.md and choose your task
- Read error-patterns.md
- Find your error
- Apply the fix
- Validate with checklist
| Your Task | Start Here | Then Read |
|---|---|---|
| Build new component | START-HERE.md | Decision trees → Recipes → Checklist |
| Debug failing component | START-HERE.md → error-patterns.md | Apply fix → Validate |
| Understand architecture | START-HERE.md → Prerequisites | ../human/CONCEPTS.md |
| Add capability | build-new-component.md | Relevant checklist |
| Review component PR | COMPONENT_AI_CHECKLIST.md | Verify all 57 rules |
DON'T:
- ❌ Read all 20+ documents
- ❌ Start with llms/ directory (too detailed)
- ❌ Skip START-HERE.md
DO:
- ✅ Always start with START-HERE.md
- ✅ Follow the task-based routing
- ✅ Use decision trees before coding
- ✅ Validate with checklists
All docs in ai/ directory follow:
- Task-oriented structure (not reference dumps)
- Machine-verifiable rules (SPEC-001, DRV-002, etc.)
- Working code examples (copy-paste ready)
- Cross-references (no dead links)
- Version tracking (Last Updated dates)
- ../human/ - Human-friendly guides with explanations
- ../../reference/ - Complete schema references
- ../../adr/ - Architecture decision records
Remember: When in doubt, START-HERE.md is your answer. It routes to everything you need.
Purpose: Route AI agents to the correct documentation based on development intent.
Audience: AI agents, automated validators, CI systems
AI agents should:
- Identify intent from user request
- Look up intent in routing table below
- Load specified documents in order
- Generate/validate code according to loaded contracts
- Verify compliance using checklists
| Intent | Load These Documents (in order) | Purpose |
|---|---|---|
| Build extractor | llms/components.md → llms/drivers.md → checklists/COMPONENT_AI_CHECKLIST.md → checklists/discovery_contract.md | Generate extractor component with discovery |
| Build writer | llms/components.md → llms/drivers.md → checklists/COMPONENT_AI_CHECKLIST.md | Generate writer component |
| Build processor | llms/components.md → llms/drivers.md → checklists/COMPONENT_AI_CHECKLIST.md | Generate processor component |
| Validate connections | llms/connectors.md → checklists/connections_doctor_contract.md | Implement connection resolution and healthcheck |
| Implement driver | llms/drivers.md → checklists/COMPONENT_AI_CHECKLIST.md → checklists/metrics_events_contract.md | Implement Driver protocol |
| Implement discovery | llms/components.md → checklists/discovery_contract.md → schemas/discovery_output.schema.json | Add discovery mode to component |
| Emit telemetry | llms/drivers.md → checklists/metrics_events_contract.md → schemas/events.schema.json → schemas/metrics.schema.json | Add proper metric/event emission |
| Run CLI commands | llms/cli.md | Generate CLI command sequences |
| Write tests | llms/testing.md → checklists/COMPONENT_AI_CHECKLIST.md | Generate test cases |
| Full component audit | llms/overview.md → checklists/COMPONENT_AI_CHECKLIST.md → All checklists | Comprehensive validation |
ai/
├── README.md (this file) ← Router for AI agents
│
├── llms/ ← LLM contracts (how to generate code)
│ ├── overview.md ← Determinism, fingerprints, machine-readable outputs
│ ├── components.md ← Component spec generation
│ ├── connectors.md ← Connection resolution patterns
│ ├── drivers.md ← Driver implementation patterns
│ ├── cli.md ← CLI command generation
│ └── testing.md ← Test generation patterns
│
├── checklists/ ← Validation rules (what to verify)
│ ├── COMPONENT_AI_CHECKLIST.md ← 57 component rules
│ ├── discovery_contract.md ← Discovery mode requirements
│ ├── metrics_events_contract.md ← Telemetry requirements
│ └── connections_doctor_contract.md ← Connection/healthcheck requirements
│
└── schemas/ ← JSON schemas (machine-readable formats)
├── events.schema.json ← Event stream schema
├── metrics.schema.json ← Metrics stream schema
└── discovery_output.schema.json ← Discovery output format
User Request: "Build a Shopify extractor component"
1. Intent Recognition: "build extractor"
2. Load Documents:
- llms/components.md (learn spec format)
- llms/drivers.md (learn driver patterns)
- checklists/COMPONENT_AI_CHECKLIST.md (validation rules)
- checklists/discovery_contract.md (discovery requirements)
3. Generate:
- components/shopify.extractor/spec.yaml
- osiris/drivers/shopify_extractor_driver.py
4. Validate Against Checklists:
- SPEC-001 through SPEC-010 (spec completeness)
- DRIVER-001 through DRIVER-006 (driver protocol)
- DISC-001 through DISC-003 (discovery mode)
- LOG-001 through LOG-006 (telemetry)
5. Output:
- Generated code
- Validation report
- CLI commands to test
From llms/overview.md:
- All outputs MUST be deterministic (same input → same output)
- JSON keys MUST be sorted (`sort_keys=True`)
- Timestamps MUST be ISO 8601 UTC
- Evidence IDs MUST follow a stable pattern: `ev.<type>.<step_id>.<name>.<timestamp_ms>`
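These rules can be sketched in a few lines of Python. The helper names here are illustrative, not Osiris APIs; only the sorted-keys rule, the ISO 8601 UTC timestamps, and the evidence ID pattern come from the contract above:

```python
import json
from datetime import datetime, timezone

def dump_deterministic(obj: dict) -> str:
    """Serialize with sorted keys so the same input always yields the same bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))

def evidence_id(ev_type: str, step_id: str, name: str, ts_ms: int) -> str:
    """Build an ID following ev.<type>.<step_id>.<name>.<timestamp_ms>."""
    return f"ev.{ev_type}.{step_id}.{name}.{ts_ms}"

record = {"ts": datetime.now(timezone.utc).isoformat(), "b": 2, "a": 1}
print(dump_deterministic(record))  # keys emitted in sorted order: a, b, ts
print(evidence_id("metric", "extract_orders", "rows_read", 1700000000000))
```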
All generated code must produce:
- Structured logs: JSON Lines format
- Typed metrics: name, value, unit, tags
- Deterministic artifacts: Sorted keys, stable filenames
- Schema-compliant events: Validate against `schemas/events.schema.json`
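A minimal sketch of what the structured-log and typed-metric requirements imply. The record fields beyond name/value/unit/tags, and the function itself, are assumptions for illustration; the real format is defined by `schemas/metrics.schema.json`:

```python
import json
import sys
from datetime import datetime, timezone

def emit_metric(name: str, value: float, unit: str, tags: dict) -> None:
    """Write one typed metric as a JSON Lines record with sorted keys."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),  # ISO 8601 UTC
        "name": name,
        "value": value,
        "unit": unit,
        "tags": tags,
    }
    sys.stdout.write(json.dumps(record, sort_keys=True) + "\n")

emit_metric("rows_read", 1250, "rows", {"step_id": "extract_orders"})
```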
Before generating code:
- Load relevant checklists
- Understand MUST vs SHOULD rules
- Generate code that passes all MUST rules
- Add comments for SHOULD rules not implemented
Required Files:
- `components/<name>/spec.yaml`
- `osiris/drivers/<name>_driver.py`
Required Sections in spec.yaml:
- name, version, modes, capabilities, configSchema
- secrets (JSON Pointers)
- x-runtime.driver
Validation Command:
```bash
osiris components validate <name> --level strict --json
```

Expected Output:

```json
{
  "component": "<name>",
  "is_valid": true,
  "errors": []
}
```

Protocol Signature:

```python
def run(*, step_id: str, config: dict, inputs: dict | None, ctx: Any) -> dict:
```

Required Returns:
- Extractors: `{"df": pandas.DataFrame}`
- Writers: `{}`
- Processors: `{"df": pandas.DataFrame}`

Required Metrics:
- Extractors: `rows_read`
- Writers: `rows_written`
- Processors: `rows_processed`
Validation:

```python
import inspect

# Check protocol compliance
assert hasattr(driver, "run")
assert callable(driver.run)

# Check signature
sig = inspect.signature(driver.run)
assert all(p.kind == inspect.Parameter.KEYWORD_ONLY for p in list(sig.parameters.values())[1:])
```

Input (from user):

```yaml
config:
  connection: "@shopify.default"
```

Output (to driver):

```python
config = {
    "resolved_connection": {
        "shop_domain": "mystore.myshopify.com",
        "access_token": "actual_token_from_env"
    }
}
```

Validation Command:

```bash
osiris connections doctor --json
```

Expected Output:

```json
{
  "connections": [
    {
      "family": "shopify",
      "alias": "default",
      "ok": true,
      "latency_ms": 150,
      "category": "ok",
      "message": "Connection successful"
    }
  ]
}
```

When validation fails, AI agents should:
1. Parse the error output:

   ```json
   {
     "is_valid": false,
     "errors": [
       {
         "rule_id": "SPEC-001",
         "message": "Missing required field: modes",
         "fix_hint": "Add modes: [extract] to spec.yaml"
       }
     ]
   }
   ```

2. Apply fixes based on `fix_hint`
3. Re-validate until `is_valid: true`
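This validate-fix loop can be sketched as follows. `validate` and `apply_fix` are hypothetical hooks supplied by the agent; only the report shape (`is_valid`, `errors`, `fix_hint`) comes from the output format above:

```python
def fix_until_valid(validate, apply_fix, max_rounds: int = 5) -> dict:
    """Re-run validation, applying fix hints, until is_valid or rounds run out.

    validate() returns a report dict like the JSON above; apply_fix(error)
    is expected to act on error["fix_hint"].
    """
    for _ in range(max_rounds):
        report = validate()
        if report.get("is_valid"):
            return report
        for error in report.get("errors", []):
            apply_fix(error)
    raise RuntimeError("Validation still failing after applying fixes")
```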
Use these ready-to-go templates to instruct an LLM (like Claude) to generate new Osiris components automatically:
- build-new-component.md - Template for building a new component (fill in the placeholders for `<COMPONENT_NAME>`, `<API_OR_RESOURCE>`, `<connection_fields>`)
These templates include all necessary context from LLM contracts and checklists, formatted for direct use with AI assistants.
- Human Docs: ../human/README.md
- Reference: ../../reference/
- ADRs: ../../adr/
For AI Agents: Start with llms/overview.md to understand core principles, then use routing table above for specific tasks.