DeepRetrieval is a minimal, production-oriented implementation of a Retrieval-Augmented Generation (RAG) agent exposed through a Model Context Protocol (MCP) server, built without external agent frameworks.
Key Features
- Custom RAG Pipeline
  - Document chunking
  - TF-IDF / classical vector search
  - Context injection into prompts
- MCP Server (JSON-RPC)
  - Fully compliant MCP tool server
  - Stdio-based transport
- Agentic Tool Calling
  - Explicit tool routing
  - Deterministic execution
  - No hidden orchestration
- Local & Web Search Tools
  - Local document search
  - Optional web search integration
- Zero Framework Dependency
- Local LLM inference through Ollama, with streaming responses to queries
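
The MCP server feature listed above can be illustrated with a minimal sketch of a JSON-RPC 2.0 dispatcher for the `tools/list` and `tools/call` methods over newline-delimited stdio. The tool name `search_docs`, its placeholder body, and the `handle_request` and `serve` helpers are hypothetical and not taken from this repository. A full server would also handle the `initialize` handshake and notifications, which this sketch omits.

```python
import json
import sys


def search_docs(query: str) -> str:
    """Placeholder tool (hypothetical): a real one would run the TF-IDF search."""
    return f"results for: {query}"


# Hypothetical tool registry: name -> metadata and handler.
TOOLS = {
    "search_docs": {
        "description": "Search local documents",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
        "handler": search_docs,
    }
}


def handle_request(request: dict) -> dict:
    """Dispatch one JSON-RPC 2.0 request and return the response object."""
    req_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        tools = [
            {"name": name, "description": t["description"], "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]
        return {"jsonrpc": "2.0", "id": req_id, "result": {"tools": tools}}
    if method == "tools/call":
        params = request.get("params", {})
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return {"jsonrpc": "2.0", "id": req_id,
                    "error": {"code": -32602, "message": "Unknown tool"}}
        text = tool["handler"](**params.get("arguments", {}))
        return {"jsonrpc": "2.0", "id": req_id,
                "result": {"content": [{"type": "text", "text": text}], "isError": False}}
    return {"jsonrpc": "2.0", "id": req_id,
            "error": {"code": -32601, "message": "Method not found"}}


def serve(stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read one JSON message per line and write one JSON response per line."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        response = handle_request(json.loads(line))
        stdout.write(json.dumps(response) + "\n")
        stdout.flush()
```

Calling `serve()` starts the read loop on the process's standard input. Keeping `handle_request` separate from the transport makes the dispatch logic testable without spawning a process.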
User Query
↓
Agent Controller
↓
Tool Router
├── Document Search Tool (TF-IDF)
├── Web Search Tool
└── Utility Tools
↓
Context Composer
↓
LLM (Answer strictly from provided context)
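
The flow in the diagram can be sketched as follows. All names here (`ROUTES`, `route`, `compose_prompt`, `run_agent`, the `web:` prefix rule, and the stub tools) are hypothetical illustrations, not the project's actual code. The LLM call is left out so the example stays self-contained.

```python
from typing import Callable


def document_search(query: str) -> list[str]:
    """Stub for the TF-IDF document search tool."""
    return [f"[doc] passage matching '{query}'"]


def web_search(query: str) -> list[str]:
    """Stub for the optional web search tool."""
    return [f"[web] result for '{query}'"]


# Explicit routing table: every tool the agent can call is listed here.
ROUTES: dict[str, Callable[[str], list[str]]] = {
    "docs": document_search,
    "web": web_search,
}


def route(query: str) -> str:
    """Deterministic rule (an assumption): a 'web:' prefix selects web search."""
    return "web" if query.lower().startswith("web:") else "docs"


def compose_prompt(query: str, passages: list[str]) -> str:
    """Context composer: inject retrieved passages and restrict the answer to them."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer strictly from the provided context.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


def run_agent(query: str, audit_log: list[dict]) -> str:
    """Agent controller: route, run the tool, record the call, build the prompt."""
    tool_name = route(query)
    passages = ROUTES[tool_name](query)
    audit_log.append({"tool": tool_name, "query": query, "results": len(passages)})
    return compose_prompt(query, passages)


log: list[dict] = []
prompt = run_agent("What is chunk overlap?", log)
print(prompt)
print(log)
```

Because the routing rule is a plain function and every tool call is appended to `audit_log`, the same query always takes the same path and leaves a record of which tool ran.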
Why This Project Matters
Many production AI systems do not rely on public agent frameworks. This project demonstrates how to build the following from scratch:
- Reliable agents
- Transparent reasoning
- Auditable tool execution
- Vendor-neutral architectures
- Built from scratch - No LangChain dependency, which demonstrates an understanding of each underlying component
- Proper architecture - Clean separation of chunking, embedding, retrieval, and inference
- Working semantic search - Similarity scores (0.83, 0.74, 0.71) indicate that retrieval finds relevant content
- Database integration - Vector storage with proper indexing
- Token-aware chunking - An overlap strategy preserves context across chunk boundaries
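
The chunking and similarity-scoring points above can be sketched with the standard library alone. This is a minimal sketch, not the project's implementation: it approximates tokens by whitespace-separated words, and the helper names (`chunk_tokens`, `tfidf_vectors`, `cosine`, `search`), the chunk size, the overlap, and the smoothed IDF formula are all assumptions.

```python
import math
from collections import Counter


def chunk_tokens(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into chunks of `size` tokens; neighbours share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    tokens = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last chunk already reaches the end of the text
    return chunks


def tfidf_vectors(docs: list[str]) -> tuple[list[dict[str, float]], dict[str, float]]:
    """Return one sparse TF-IDF vector per document plus the IDF table."""
    tokenized = [d.lower().split() for d in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    # Smoothed IDF (an assumption) so that no term gets a zero weight.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query: str, chunks: list[str], top_k: int = 3) -> list[tuple[float, str]]:
    """Score every chunk against the query and return the top_k (score, chunk) pairs."""
    vectors, idf = tfidf_vectors(chunks)
    q_tf = Counter(query.lower().split())
    q_vec = {t: c * idf[t] for t, c in q_tf.items() if t in idf}
    scored = [(cosine(q_vec, v), c) for v, c in zip(vectors, chunks)]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:top_k]


text = ("Retrieval augmented generation injects retrieved passages into the prompt. "
        "TF-IDF weights rare terms more heavily than common ones. "
        "Overlapping chunks keep sentences that straddle a boundary searchable.")
chunks = chunk_tokens(text, size=8, overlap=2)
for score, chunk in search("overlapping chunks", chunks):
    print(f"{score:.2f}  {chunk}")
```

The overlap means a phrase that straddles a chunk boundary still appears whole in at least one chunk, at the cost of indexing some tokens twice.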
Aditya Katkar
