docsiq

A single-binary GraphRAG knowledge base: index documents, extract an entity graph, ask questions across it, and browse the result in an embedded React UI or query it over MCP.

Three-minute onboarding

# 1. Install (Linux amd64 shown; macOS arm64 is published alongside)
VERSION=$(curl -s https://api.github.com/repos/RandomCodeSpace/docsiq/releases/latest | grep tag_name | cut -d '"' -f4)
curl -LO "https://github.com/RandomCodeSpace/docsiq/releases/latest/download/docsiq-${VERSION}-linux-amd64"
chmod +x "docsiq-${VERSION}-linux-amd64" && sudo mv "docsiq-${VERSION}-linux-amd64" /usr/local/bin/docsiq

# 2. Index the sample corpus
git clone https://github.com/RandomCodeSpace/docsiq && cd docsiq
docsiq init && docsiq index docs/samples/

# 3. Ask a question
docsiq search "What are the main themes in this corpus?"

For a UI session:

docsiq serve
# → http://localhost:8080 (or whichever port server.port is set to)

Full walk-through with expected output: docs/quickstart.md.

Screenshots

Home view · Graph view

More: Notes · Documents · MCP Console.

What it does

docsiq is a GraphRAG-powered knowledge base that runs as a single Go binary. It ingests unstructured documents, builds a knowledge graph with community detection, persists wikilinked markdown notes, and serves everything over MCP and an embedded React SPA on a single port.

Inspired by Microsoft GraphRAG. Storage is CGO-backed SQLite (mattn/go-sqlite3 with FTS5) plus the sqlite-vec extension for ANN vector search.

Features

  • GraphRAG pipeline — load → chunk → embed → extract entities / relationships / claims → detect communities, all in one docsiq index run.
  • Notes subsystem — markdown on disk with [[wikilinks]], project scopes, cross-project references, and a live note graph view. Works without any LLM configured.
  • Interactive graph — SVG force-directed viz with d3-zoom (pinch/wheel pan/zoom 0.1×–40×), hover-to-highlight neighbourhood, degree-scaled nodes.
  • Community detection — pure-Go Louvain, hierarchical, no external deps.
  • Three LLM providers — Azure OpenAI, OpenAI, Ollama — via tmc/langchaingo. Set provider: "none" to run the server in notes-only mode with no LLM.
  • MCP server — 12+ tools (local/global search, graph walk, community reports, note read/write, …) exposed at /mcp via Streamable HTTP transport with session handshake.
  • Embedded SPA — React 19 + Tailwind 4 + shadcn/ui, served from //go:embed ui/dist. PWA-installable with manifest + service worker.
  • Per-repo projects — each scope has its own SQLite store + notes directory, addressable by slug.
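
To make the notes feature concrete, here is a minimal sketch of pulling [[wikilink]] targets out of a note body, the kind of step a note graph builder needs. It is not docsiq's implementation: the function name extractWikilinks is hypothetical, and support for the [[target|alias]] form is an assumption borrowed from common wikilink dialects.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wikilinkRe matches [[target]] and [[target|alias]].
// The alias form is an assumption; the README only shows [[wikilinks]].
var wikilinkRe = regexp.MustCompile(`\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]`)

// extractWikilinks returns the distinct link targets found in a note body,
// in order of first appearance.
func extractWikilinks(markdown string) []string {
	seen := make(map[string]bool)
	var targets []string
	for _, m := range wikilinkRe.FindAllStringSubmatch(markdown, -1) {
		target := strings.TrimSpace(m[1])
		if target == "" || seen[target] {
			continue
		}
		seen[target] = true
		targets = append(targets, target)
	}
	return targets
}

func main() {
	note := "See [[architecture]] and [[pipeline|the indexing pipeline]]; [[architecture]] again."
	fmt.Println(extractWikilinks(note)) // prints: [architecture pipeline]
}
```

Each returned target would become an edge from the current note to the named note. Deduplicating keeps the graph simple, at the cost of losing link counts.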

UI

  • Stack: React 19, Vite 6, Tailwind 4, shadcn/ui primitives, Geist typography, Lucide icons.
  • Architecture: CSS lives in a single globals.css with an @layer components section; JSX uses semantic class names only; shadcn primitives are the only place Tailwind utilities live inline.
  • Navigation: labelled sidebar (Home · Notes · Documents · Graph · MCP) with ⌘K command palette.
  • Responsiveness: mobile drawer via shadcn Sheet; iOS safe-area respected; inputs forced to 16px below sm: to kill Safari auto-zoom.
  • PWA: manifest + 192/512 PNG icons + minimal service worker, installable on Android/iOS.
  • Hard reload: refresh button in the header purges service worker + CacheStorage and reloads from network — mobile-friendly ⌘⇧R substitute.

Keyboard shortcuts

Key            Action
⌘K / Ctrl+K    Command palette
G H            Home
G N            Notes
G D            Documents
G G            Graph
G M            MCP console
⌘/             Toggle tree drawer (Notes)
⌘L             Toggle links drawer (Notes)

MCP

docsiq speaks the MCP Streamable HTTP transport at POST /mcp. The UI's MCP Console (inspector-style) gives you the same tool list with typed argument forms. For external clients (Claude Desktop, Cursor, etc.) register the server URL directly, or use the hooks helper:

docsiq hooks install --client claude-desktop

Architecture

cmd/            CLI commands (cobra): index, serve, search, projects, init, hooks, vec
internal/
  api/          REST API + /mcp handler
  chunker/      Text splitting (textsplitter.RecursiveCharacter)
  community/    Louvain detection + summaries
  config/       Viper YAML config + env override
  crawler/      Web page crawler
  embedder/     Batched text → vector (nil-safe when provider=none)
  extractor/    LLM-based entity / relationship / claim extraction
  llm/          Provider abstraction (Azure, OpenAI, Ollama, none)
  loader/       Document loaders (PDF, DOCX, TXT, MD, web)
  mcp/          Streamable HTTP MCP server (12+ tools)
  notes/        Per-project markdown + wikilinks + graph builder
  pipeline/     5-phase indexing pipeline
  project/      Project registry (git-remote-scoped slugs)
  search/       Query engine (local + global + hybrid)
  store/        SQLite + FTS5 + vector index
  vectorindex/  HNSW ANN vector search
ui/             React 19 + Vite 6 SPA, embedded at compile time
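
The community/ package is listed above as Louvain detection. Louvain works by repeatedly moving nodes between communities to increase modularity, so the quantity it optimizes is worth seeing in code. The sketch below computes the modularity of a given partition for an unweighted, undirected graph. It is an illustration of the score only, not the Louvain loop and not docsiq's code; the edge type and the modularity function are hypothetical names.

```go
package main

import "fmt"

// edge is an undirected, unweighted edge between two node IDs.
type edge struct{ a, b int }

// modularity returns Q = sum over communities c of (L_c/m) - (d_c/2m)^2,
// where L_c is the number of edges inside c, d_c is the sum of the degrees
// of the nodes in c, and m is the total number of edges.
func modularity(edges []edge, community map[int]int) float64 {
	m := float64(len(edges))
	if m == 0 {
		return 0
	}
	internal := make(map[int]float64)  // L_c per community
	degreeSum := make(map[int]float64) // d_c per community
	for _, e := range edges {
		ca, cb := community[e.a], community[e.b]
		degreeSum[ca]++
		degreeSum[cb]++
		if ca == cb {
			internal[ca]++
		}
	}
	q := 0.0
	for c, d := range degreeSum {
		frac := d / (2 * m)
		q += internal[c]/m - frac*frac
	}
	return q
}

func main() {
	// Two triangles joined by a single bridge edge (2-3).
	edges := []edge{{0, 1}, {1, 2}, {0, 2}, {3, 4}, {4, 5}, {3, 5}, {2, 3}}
	split := map[int]int{0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
	merged := map[int]int{0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0}
	fmt.Printf("split=%.3f merged=%.3f\n", modularity(edges, split), modularity(edges, merged))
	// prints: split=0.357 merged=0.000
}
```

Splitting the graph at the bridge scores higher than lumping everything together, which is the signal a Louvain pass follows when it decides where to move a node.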

Configuration

Config lives at ~/.docsiq/config.yaml; every key can be overridden by an env var with prefix DOCSIQ_ (dots → underscores, uppercased). A fully annotated reference with every option, default, and env var is at configs/docsiq.example.yaml.

server:
  host: 0.0.0.0
  port: 37778
  api_key: ""          # if set, UI + API require Authorization: Bearer <key>

llm:
  provider: ollama     # azure | openai | ollama | none
  ollama:
    base_url: http://localhost:11434
    chat_model: llama3.2
    embed_model: nomic-embed-text

No LLM? Set provider: none. The server still runs notes, wikilinks, graph, tree, and notes-search. Endpoints that need the model (POST /api/search, POST /api/upload, /mcp tool calls that embed or extract) return 503 {"code": "llm_disabled"}.
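
A client that may talk to a notes-only server can check for this error and degrade gracefully. The sketch below inspects a status code and body that have already been read. Only the "code" field is taken from the text above; the apiError type, the isLLMDisabled helper, and the "overloaded" code in the demo are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// apiError mirrors the documented error body. Only "code" is decoded,
// because no other fields are documented.
type apiError struct {
	Code string `json:"code"`
}

// isLLMDisabled reports whether a response is the 503 llm_disabled error.
func isLLMDisabled(status int, body []byte) bool {
	if status != http.StatusServiceUnavailable {
		return false
	}
	var e apiError
	if err := json.Unmarshal(body, &e); err != nil {
		return false
	}
	return e.Code == "llm_disabled"
}

func main() {
	fmt.Println(isLLMDisabled(503, []byte(`{"code": "llm_disabled"}`))) // true
	fmt.Println(isLLMDisabled(503, []byte(`{"code": "overloaded"}`)))   // false; made-up code
}
```

Checking the code as well as the status keeps an unrelated 503, such as one from a proxy, from being mistaken for notes-only mode.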

Build from source

Prerequisites: Go ≥ 1.25, Node ≥ 22, and a working C toolchain (build-essential on Debian/Ubuntu, xcode-select --install on macOS, MinGW-w64 / MSYS2 on Windows). Without gcc on PATH, CGO is silently disabled. Because internal/sqlitevec/load.go is gated by //go:build cgo, the build then fails at the call site with a misleading undefined: sqlitevec.LoadInto instead of a clear toolchain error. Full list: docs/getting-started.md.

# First time on a connected machine
npm --prefix ui ci                          # install UI deps
go mod download                             # Go deps

# Build
npm --prefix ui run build                   # produces ui/dist/
CGO_ENABLED=1 go build -tags sqlite_fts5 -o docsiq ./

CI builds the UI first and passes ui/dist/ to each Go job as an artifact. ui/dist/ is not committed; the repo contains only a tiny placeholder ui/dist/index.html so that //go:embed ui/dist compiles.

Tests

# Go
CGO_ENABLED=1 go test -tags sqlite_fts5 ./...
# Go integration tests with the race detector
CGO_ENABLED=1 go test -tags "sqlite_fts5 integration" -race -timeout 1200s ./...

# UI
npm --prefix ui run typecheck
npm --prefix ui test -- --run --coverage
npm --prefix ui run build

License

MIT. See LICENSE.

