`kbc llm export`: Data lineage graph missing extractors, writers, and implicit SQL references #2523

## Problem

The `kbc llm export` command generates a data lineage graph (`indices/graph.jsonl`) that only contains `table:` and `transform:` node types. Non-transformation components — extractors, writers, and applications — are completely absent from the graph, even though they are the actual data sources and sinks of the pipeline.

## Current behavior

- **All nodes** in the graph are either `table:*` or `transform:*`
- **Missing from graph**: extractors, writers, applications — zero representation
- **Edge types**: only `consumed_by` and `produces`; no edges connect extractors to the tables they load, or writers and applications to the tables they read
- `components/index.json` correctly catalogs all components, but the lineage graph ignores everything except transformations
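The observation above can be reproduced with a short script. This is a minimal sketch, assuming each line of `indices/graph.jsonl` is an edge object with `source`, `target`, and `type` fields (the shape shown under "Expected behavior"); the table and transformation IDs in `sample` are hypothetical.

```python
import json
from collections import Counter

def node_kinds(jsonl_text):
    """Count node-type prefixes (the text before the first ':') over all edge endpoints."""
    kinds = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the file
        edge = json.loads(line)
        for endpoint in (edge["source"], edge["target"]):
            kinds[endpoint.split(":", 1)[0]] += 1
    return kinds

# Hypothetical two-edge sample; for a real export, pass the contents of
# indices/graph.jsonl instead.
sample = "\n".join([
    '{"source":"table:in.c-db/orders","target":"transform:my-transformation","type":"consumed_by"}',
    '{"source":"transform:my-transformation","target":"table:out.c-report/orders_daily","type":"produces"}',
])
kinds = node_kinds(sample)
print(sorted(kinds))
```

On an affected export, the only prefixes reported should be `table` and `transform`.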

## Impact on AI agents

The primary consumer of `kbc llm export` output is AI agents/LLMs. Without extractor/writer edges, an agent cannot:

1. **Trace data origin**: "Where does this table come from?" has no answer in the lineage graph (the table may be extracted from an external DB, but the graph doesn't show this)
2. **Trace data destination**: "Where does this reporting table go?" has no answer either (the table may feed a writer or a PowerBI refresh app, but the graph doesn't show this)
3. **Assess blast radius**: "If I change this extractor config, what transformations are affected?" requires manually cross-referencing bucket names with component configs
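To illustrate the third point: if extractor and writer edges were present, the blast-radius question would reduce to a plain graph traversal. The sketch below assumes the edge shape from "Expected behavior", with both `produces` and `consumed_by` edges pointing in the direction of data flow; all node IDs are hypothetical.

```python
from collections import defaultdict, deque

def downstream(edges, start):
    """Return every node reachable from `start` by following edges source -> target."""
    adjacency = defaultdict(list)
    for edge in edges:
        adjacency[edge["source"]].append(edge["target"])
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search
        node = queue.popleft()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}

# Hypothetical pipeline: extractor -> table -> transformation -> table -> writer
edges = [
    {"source": "extractor:ex-db:1", "target": "table:in.c-db/orders", "type": "produces"},
    {"source": "table:in.c-db/orders", "target": "transform:daily-report", "type": "consumed_by"},
    {"source": "transform:daily-report", "target": "table:out.c-report/orders_daily", "type": "produces"},
    {"source": "table:out.c-report/orders_daily", "target": "writer:wr-db:7", "type": "consumed_by"},
]
affected = downstream(edges, "extractor:ex-db:1")
affected_transforms = sorted(n for n in affected if n.startswith("transform:"))
print(affected_transforms)
```

With the current export the first edge does not exist, so the traversal has no starting point for an extractor.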

## Expected behavior

The lineage graph should include edges for all component types:

```jsonl
{"source":"extractor:component-id:config-id","target":"table:bucket/table-name","type":"produces"}
{"source":"table:bucket/table-name","target":"writer:component-id:config-id","type":"consumed_by"}
{"source":"table:bucket/table-name","target":"application:component-id:config-id","type":"consumed_by"}
```

This would make the lineage graph a true end-to-end representation of the data pipeline.
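A rough sketch of how such edges could be derived, not a description of how the CLI works internally. The `config` dictionary (`kind`, `component_id`, `config_id`, `inputs`, `outputs`) is a hypothetical simplified stand-in for whatever the export already knows about each component configuration and its tables.

```python
import json

def component_edges(config):
    """Build lineage edges for one component config (hypothetical simplified shape)."""
    node = f'{config["kind"]}:{config["component_id"]}:{config["config_id"]}'
    edges = []
    for table in config.get("inputs", []):
        # Tables read by the component: table -> component
        edges.append({"source": f"table:{table}", "target": node, "type": "consumed_by"})
    for table in config.get("outputs", []):
        # Tables written by the component: component -> table
        edges.append({"source": node, "target": f"table:{table}", "type": "produces"})
    return edges

# Hypothetical extractor config that loads one table and reads none.
extractor = {
    "kind": "extractor",
    "component_id": "component-id",
    "config_id": "config-id",
    "outputs": ["bucket/table-name"],
}
lines = [json.dumps(e, separators=(",", ":")) for e in component_edges(extractor)]
print("\n".join(lines))
```

Reusing the existing `produces` and `consumed_by` edge types means consumers of the graph only need to learn the new node prefixes.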

## Additional request: Implicit SQL table references

A secondary (but related) gap: the lineage graph is built solely from explicit input/output mappings declared in transformation configs. However, Snowflake transformations can reference tables by fully-qualified name directly in SQL code (e.g., `SELECT * FROM "bucket"."table"`) without declaring them in the input mapping. These implicit dependencies are invisible to the current lineage graph.

Ideally, the export could optionally perform a lightweight static analysis of SQL code blocks to detect `FROM`/`JOIN` clauses referencing fully-qualified table names that are not present in the declared input mapping, and add these as a separate edge type (e.g., `type: "implicit_ref"`).
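A minimal sketch of what that lightweight analysis might look like, assuming double-quoted `"bucket"."table"` references and table IDs written as `bucket/table`. The function name, the transformation node ID, and the direction of the `implicit_ref` edge (table to transformation) are illustrative assumptions. A regex like this ignores comments, string literals, unquoted identifiers, and comma-separated `FROM` lists, so it would only be a best-effort hint.

```python
import re

# FROM or JOIN followed by a double-quoted, two-part identifier: "bucket"."table"
QUALIFIED_REF = re.compile(r'\b(?:FROM|JOIN)\s+"([^"]+)"\."([^"]+)"', re.IGNORECASE)

def implicit_refs(sql, declared_inputs):
    """Return bucket/table IDs referenced in SQL but missing from the declared input mapping."""
    found = {f"{bucket}/{table}" for bucket, table in QUALIFIED_REF.findall(sql)}
    return sorted(found - set(declared_inputs))

sql = '''
SELECT o.id, c.name
FROM "in.c-db"."orders" o
JOIN "in.c-db"."customers" c ON c.id = o.customer_id
JOIN "local_tmp" t ON t.id = o.id
'''
declared = {"in.c-db/orders"}  # hypothetical declared input mapping
missing = implicit_refs(sql, declared)

# Hypothetical transformation node ID; edge direction is an assumption.
edges = [
    {"source": f"table:{table}", "target": "transform:daily-report", "type": "implicit_ref"}
    for table in missing
]
print(edges)
```

Here `"local_tmp"` is skipped because it is not fully qualified, and `orders` is skipped because it is already declared.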

## Environment

- KBC CLI version: v2.44.0

