A quick way to get the BharatML Stack Online Feature Store platform up and running locally for development and testing.
- Docker and Docker Compose
- Go 1.22 or later
- nc (netcat) command for connectivity checks
- Bash shell
- grpcurl for testing gRPC API endpoints (install from https://github.com/fullstorydev/grpcurl)
For Kubernetes support:
- kind (Kubernetes in Docker) - Required if you want to use Kubernetes features
macOS:
brew install kind
Linux:
```shell
curl -Lo ./kind https://kind.sigs.k8s.io/dl/v0.20.0/kind-linux-amd64
chmod +x ./kind
sudo mv ./kind /usr/local/bin/kind
```
For other platforms or the latest version, visit: https://kind.sigs.k8s.io/docs/user/quick-start/#installation
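After installing, a quick sanity check (a convenience sketch, not part of the start script) confirms kind is on your PATH before anything depends on it:

```shell
# Print the installed kind version, or a hint if the binary is missing
if command -v kind >/dev/null 2>&1; then
  kind_status="$(kind version)"
else
  kind_status="kind not installed - see the installation steps above"
fi
echo "$kind_status"
```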
BharatMLStack's Online Feature Store consists of several interconnected services:
Infrastructure Services:
- ScyllaDB: NoSQL database for high-performance feature storage
- MySQL: Relational database for metadata and configuration
- Redis: In-memory data store for caching
- etcd: Distributed key-value store for service coordination
- Apache Kafka: Message broker for feature ingestion pipeline
Application Services:
- Online Feature Store gRPC API Server: High-performance gRPC interface (runs on port 8089)
- ONFS Consumer: Kafka consumer service for real-time feature ingestion from message streams (runs on port 8090)
- Horizon: Backend API service (runs on port 8082)
- Numerix: Matrix operations service (runs on port 8083)
- Trufflebox UI: Frontend web interface (runs on port 3000)
- Inferflow: Inference workflow service (runs on port 8085)
Management Tools:
- etcd Workbench: etcd management interface (runs on port 8081)
- Kafka UI: Kafka cluster management interface (runs on port 8084)
- Kubernetes (kind): Local Kubernetes cluster (optional, requires kind installation)
All services are orchestrated using Docker Compose with pre-built images from GitHub Container Registry (GHCR).
The start script provides an interactive service selector that allows you to choose which application services to run:
./start.sh

Interactive Options:
- All Services - Starts all application services (API Server, Consumer, Horizon, Numerix, TruffleBox UI, Inferflow)
- Custom Selection - Choose individual services to start
- Exit - Exit without starting
Infrastructure services (ScyllaDB, MySQL, Redis, etcd, Kafka) and Management Tools (etcd-workbench, kafka-ui) are always started.
Optional Services:
- Kubernetes (kind cluster) - Local Kubernetes cluster for container orchestration
Note: Predator is not a Docker Compose service. It is deployed via ArgoCD in Kubernetes. If you need to set up Predator, see the Predator Local Setup Guide for complete instructions.
During interactive mode, you'll be prompted to optionally include Kubernetes. For Predator setup, you'll need to follow the comprehensive guide in PREDATOR_SETUP.md which includes ArgoCD installation, GitHub App setup, and repository configuration.
For testing and development, you can initialize the databases with sample dummy data. This includes:
- Sample entities (user, catalog)
- Feature groups and features
- Example configurations for Inferflow and Numerix
- Test data in MySQL, ScyllaDB, and etcd
Usage:
# Interactive mode with dummy data
./start.sh --dummy-data
# Start all services with dummy data (non-interactive)
./start.sh --all --dummy-data
# Combine with version specification
ONFS_VERSION=v1.2.0 ./start.sh --all --dummy-data
# Combine with local builds
ONFS_VERSION=local ./start.sh --dummy-data

What gets initialized:
- ScyllaDB: Creates keyspace, tables, and inserts sample feature data
- MySQL: Creates all tables and inserts:
- API resolvers
- Role permissions
- Sample entities, feature groups, and features
- Inferflow and Numerix configurations
- Service deployable configurations
- etcd: Sets up configuration keys with:
- Security tokens
- Entity and feature group configurations
- Inferflow component configurations
- Model and expression configurations
Note: The dummy data initialization scripts run automatically when you use the --dummy-data flag. The regular initialization scripts (which create empty schemas) are used by default.
Services can run independently based on your needs:
- ONFS API Server - For direct gRPC feature operations
- ONFS Consumer - For real-time Kafka-based feature ingestion (independent of API Server)
- Horizon + TruffleBox UI - For web-based feature store management
- Numerix - For matrix operations
- Inferflow - For inference workflow management
You can specify versions for application services using environment variables:
# Specify individual service versions
ONFS_VERSION=v1.2.3 ./start.sh
# Specify consumer version
ONFS_CONSUMER_VERSION=v1.0.0-beta-d74137 ./start.sh
# Combine multiple versions
ONFS_VERSION=v1.2.0 HORIZON_VERSION=v2.1.0 TRUFFLEBOX_VERSION=v1.0.5 INFERFLOW_VERSION=v1.0.0 ./start.sh
# Start all services with specific versions
ONFS_VERSION=v1.2.0 ONFS_CONSUMER_VERSION=v1.0.0-beta-d74137 INFERFLOW_VERSION=v1.0.0 ./start.sh
# Build services from local source code
ONFS_VERSION=local HORIZON_VERSION=local INFERFLOW_VERSION=local ./start.sh

Available Environment Variables:
- ONFS_VERSION - Online Feature Store API Server version
- ONFS_CONSUMER_VERSION - ONFS Consumer version
- HORIZON_VERSION - Horizon Backend version
- NUMERIX_VERSION - Numerix Matrix Operations version
- TRUFFLEBOX_VERSION - TruffleBox UI version
- INFERFLOW_VERSION - Inferflow version
Version Formats:
- latest (default) - Latest stable release
- main - Latest development build
- v1.2.3 - Specific version tag
- sha-abcd1234 - Specific commit SHA
- local - Build from local Dockerfile (requires source code in workspace)
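If you script around the start script, it can help to validate a version string against the formats above before exporting it. This is a rough convenience sketch (the accepted patterns are inferred from the list above, not an exhaustive grammar):

```shell
# Return 0 if the argument looks like one of the documented version formats
valid_version() {
  case "$1" in
    latest|main|local) return 0 ;;          # keywords
    v[0-9]*.[0-9]*.[0-9]*) return 0 ;;      # tags like v1.2.3 (suffixes allowed)
    sha-*) return 0 ;;                      # commit SHAs like sha-abcd1234
    *) return 1 ;;
  esac
}

valid_version "v1.2.0" && echo "v1.2.0 accepted"
valid_version "banana" || echo "banana rejected"
```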
Non-interactive Mode:
# Start all services without prompts
./start.sh --all
# Start with specific versions non-interactively
ONFS_VERSION=v1.2.0 ./start.sh --all

You can build services from local source code by setting the version to local. This is useful for development and testing changes:
# Build a single service from local source
ONFS_VERSION=local ./start.sh
# Build multiple services from local source
ONFS_VERSION=local HORIZON_VERSION=local INFERFLOW_VERSION=local ./start.sh
# Build all services from local source
ONFS_VERSION=local ONFS_CONSUMER_VERSION=local HORIZON_VERSION=local \
NUMERIX_VERSION=local TRUFFLEBOX_VERSION=local INFERFLOW_VERSION=local ./start.sh

Requirements for Local Builds:
- Source code must be available in the parent directory (relative to quick-start/)
- Python 3 must be installed (for modifying docker-compose.yml)
- Docker must be able to build the Dockerfiles
How it works:
- The script copies the source directory to the workspace
- Modifies docker-compose.yml to use build instead of image
- Docker Compose builds the image from the local Dockerfile
- The built image is used to start the container
Note: Local builds require the source code structure to match the expected Dockerfile locations:
- online-feature-store/cmd/api-server/DockerFile for ONFS API Server
- online-feature-store/cmd/consumer/DockerFile for ONFS Consumer
- horizon/cmd/horizon/Dockerfile for Horizon
- numerix/Dockerfile for Numerix
- trufflebox-ui/DockerFile for TruffleBox UI
- inferflow/cmd/inferflow/Dockerfile for Inferflow
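Before attempting a local build, a pre-flight check can save a failed docker-compose run. This sketch assumes it is run from inside quick-start/ and checks the parent directory for the Dockerfile paths listed above:

```shell
# Verify the expected Dockerfiles exist relative to the parent directory
missing=0
for f in \
  online-feature-store/cmd/api-server/DockerFile \
  online-feature-store/cmd/consumer/DockerFile \
  horizon/cmd/horizon/Dockerfile \
  numerix/Dockerfile \
  trufflebox-ui/DockerFile \
  inferflow/cmd/inferflow/Dockerfile
do
  if [ ! -f "../$f" ]; then
    echo "missing: ../$f"
    missing=1
  fi
done
if [ "$missing" -eq 0 ]; then
  echo "all Dockerfiles present - safe to use local versions"
fi
```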
Advanced: Direct docker-compose Usage
You can also work directly with docker-compose in the workspace directory:
# Change to workspace directory
cd workspace
# Start specific service with version
ONFS_VERSION=main docker-compose up -d onfs-api-server
# Start multiple services with different versions
ONFS_VERSION=v1.2.0 HORIZON_VERSION=v2.1.0 docker-compose up -d onfs-api-server horizon
# Restart a service with a different version
ONFS_CONSUMER_VERSION=v1.0.0-beta-d74137 docker-compose up -d onfs-consumer
# View specific service
docker-compose ps onfs-api-server

This will:
- Check for Go installation (1.22+ required)
- Create a workspace directory with configuration files
- Pull and start all services using docker-compose up -d
- Wait for services to become healthy
- Initialize databases with required schemas
- Display access information and helpful commands
Once complete, you can access:
- Trufflebox UI: http://localhost:3000
- Horizon API: http://localhost:8082
- Online Feature Store gRPC API: http://localhost:8089
- ONFS Consumer: http://localhost:8090 (health check)
- Numerix: http://localhost:8083
- Inferflow: http://localhost:8085
- etcd Workbench: http://localhost:8081
- Kafka UI: http://localhost:8084
- Kubernetes (if enabled): Use kubectl cluster-info --context kind-bharatml-stack
- ArgoCD (if installed): http://localhost:8087 (requires port-forward - see Predator Setup Guide)
Note: Predator is accessed through ArgoCD UI after completing the setup in PREDATOR_SETUP.md.
To stop all services:
./stop.sh

To stop and completely purge all containers, volumes, and workspace:

./stop.sh --purge

TruffleBox UI:
- URL: http://localhost:3000
- Default admin credentials:
  - Email: admin@admin.com
  - Password: admin
- Horizon API: http://localhost:8082
  - Health check: http://localhost:8082/health
- ONFS gRPC API: http://localhost:8089
  - Health check: http://localhost:8089/health/self
- ONFS Consumer: http://localhost:8090
  - Health check: http://localhost:8090/health/self
- Numerix: http://localhost:8083
  - Health check: http://localhost:8083/health
- Inferflow: http://localhost:8085
  - Health check: http://localhost:8085/health/self
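The HTTP health endpoints above can be polled in one pass. A convenience sketch (not part of the repo's scripts; -f makes curl fail on HTTP errors, -sS keeps output quiet while still surfacing real errors):

```shell
# Print OK/FAIL for a health endpoint
check() {
  if curl -fsS --max-time 5 "$1" >/dev/null 2>&1; then
    echo "OK   $1"
  else
    echo "FAIL $1"
  fi
}

check http://localhost:8082/health       # Horizon
check http://localhost:8089/health/self  # ONFS gRPC API
check http://localhost:8090/health/self  # ONFS Consumer
check http://localhost:8083/health       # Numerix
check http://localhost:8085/health/self  # Inferflow
```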
Kubernetes (kind cluster):
- Cluster Name: bharatml-stack
- Context: kind-bharatml-stack
- Access: Use kubectl commands with the context:

```shell
kubectl cluster-info --context kind-bharatml-stack
kubectl get nodes --context kind-bharatml-stack
```
Note: For Predator setup on Kubernetes, see the Predator Local Setup Guide which includes cluster creation, node labeling, ArgoCD installation, and all required configurations.
Predator is a service deployment and management system that runs on Kubernetes and uses ArgoCD for GitOps-based deployments. It is not a Docker Compose service and requires a comprehensive setup process.
For complete Predator and ArgoCD setup instructions, see the Predator Local Setup Guide.
The PREDATOR_SETUP.md guide includes:
- Automated setup script for quick installation
- Step-by-step manual setup instructions
- Creating and configuring a Kubernetes cluster (kind, minikube, or Docker Desktop)
- Installing and configuring ArgoCD
- Setting up GitHub App for repository access
- Installing required CRDs (Flagger, KEDA) and PriorityClass
- Configuring ArgoCD repository connections
- Setting up automated application onboarding
- Complete Predator deployment workflow
The setup script handles all the complexity and provides an easy one-command installation. Follow the guide for detailed instructions.
Once you have Predator deployed in Kubernetes (via ArgoCD), you can connect Inferflow (running in Docker) to Predator for model inference. This setup enables Inferflow to call ML models hosted in your Kubernetes cluster.
- Kubernetes cluster running with Predator deployed (follow PREDATOR_SETUP.md)
- ArgoCD installed and accessible
- Predator service deployed via ArgoCD with app_name: predator in your namespace (e.g., prd-predator)
Inferflow runs in a Docker container, while Predator runs in Kubernetes. By default, Docker containers cannot reach services running on the host machine (localhost). The extra_hosts configuration in docker-compose.yml creates a special network route that maps a hostname to host-gateway, which is Docker's special DNS name that resolves to the host machine's IP address.
This is why you'll see this configuration in the Inferflow service:
inferflow:
extra_hosts:
- "predator.prd.meesho.int:host-gateway"

This allows the Inferflow container to reach Predator services running on your host machine (via port-forward).
1. Deploy Predator in Kubernetes
Follow the complete Predator Setup Guide to deploy Predator. Ensure your Predator deployment has:
- App Name: predator (or your chosen name matching the namespace)
- Namespace: prd-predator (or your environment-specific namespace)
- Service exposed: The Predator service should be accessible via a Kubernetes service
2. Port-Forward Predator Service to Host
Predator services in Kubernetes are not directly accessible from Docker containers. You need to create a port-forward tunnel from your host machine to the Kubernetes service:
# Port-forward Predator service to localhost:8090
# Format: kubectl -n <namespace> port-forward svc/<service-name> <local-port>:<service-port>
kubectl -n prd-predator port-forward svc/prd-predator 8090:80 &

Important Notes:
- Port 8090 is the local port on your host machine
- Port 80 is the Kubernetes service port (not the target pod port 8001)
- Keep this terminal session running or run it in the background with &
- The port-forward must be active for Inferflow to reach Predator
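kubectl port-forward exits whenever its underlying connection drops, so long-running sessions benefit from a small retry wrapper. This is a convenience sketch, not part of the stack; the example usage line is commented out because it assumes a live cluster with the names from this guide:

```shell
# retry <max-attempts> <command...> - rerun the command until it succeeds
# or the attempt budget is exhausted
retry() {
  n=0
  max="$1"
  shift
  until "$@"; do
    n=$((n + 1))
    if [ "$n" -ge "$max" ]; then
      return 1
    fi
    echo "attempt $n failed, retrying in 2s..." >&2
    sleep 2
  done
}

# Example: keep the Predator tunnel alive across drops (Ctrl-C to stop)
# retry 100 kubectl -n prd-predator port-forward svc/prd-predator 8090:80
```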
3. Configure Model Endpoint in etcd
In your Inferflow model configuration (stored in etcd), set the endpoint to match the extra_hosts configuration:
{
"model_end_point": "predator.prd.meesho.int:8090"
}

Key Points:
- Use the hostname from the extra_hosts configuration: predator.prd.meesho.int
- Use port 8090 (the local port from the port-forward command)
- Inferflow will resolve this hostname to the host gateway and connect via port-forward
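If you manage etcd keys by hand, writing this config might look like the sketch below. The key path and container name are hypothetical placeholders (adjust both to wherever your Inferflow model configuration actually lives in etcd):

```shell
# JSON payload matching the endpoint configuration above
payload='{"model_end_point": "predator.prd.meesho.int:8090"}'
echo "$payload"

# Hypothetical key path and container name - uncomment and adapt:
# docker exec etcd etcdctl put /config/inferflow/models/<model-name> "$payload"
```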
4. Update Inferflow Deadline (if needed)
If you experience DeadlineExceeded errors, increase the timeout in workspace/docker-compose.yml:
inferflow:
environment:
- EXTERNAL_SERVICE_PREDATOR_DEADLINE=5000 # Increase from 200ms to 5000ms

Port-forwarding to Kubernetes adds latency, so higher timeouts are recommended.
5. Configure Horizon with Deployable ID
After deploying Predator through Horizon's deployment workflow, you'll receive a deployable_id. This ID is stored in the service_deployable_configs table in MySQL.
To configure Horizon for testing:
a. Find the Deployable ID:
# Connect to MySQL
docker exec -it mysql mysql -uroot -proot testdb
# Query for your Predator deployable
SELECT id, service_name, app_name, environment
FROM service_deployable_configs
WHERE service_name = 'predator'
ORDER BY id DESC LIMIT 5;

b. Copy the id value from the query result
c. Update Horizon's environment variable in workspace/docker-compose.yml:
horizon:
environment:
- TEST_DEPLOYABLE_ID=<your-deployable-id>
- TEST_GPU_DEPLOYABLE_ID=<your-deployable-id> # If using GPU

d. Restart Horizon:

cd workspace && docker-compose restart horizon

6. Verify the Connection
Test the connectivity from Inferflow to Predator:
# Check if port-forward is active
netstat -an | grep 8090
# Test from within Docker network
docker exec inferflow-healthcheck nc -zv 172.18.0.1 8090
# Check Inferflow logs for connection status
docker logs inferflow 2>&1 | grep -i "predator\|deadline" | tail -20

7. Monitor for Issues
Common issues and their solutions:
| Issue | Solution |
|---|---|
| DeadlineExceeded errors | Increase EXTERNAL_SERVICE_PREDATOR_DEADLINE to 5000+ ms |
| connection refused | Verify the port-forward is running: ps aux \| grep port-forward |
| Invalid auth token | Check Predator authentication configuration |
| DNS resolution fails | Verify extra_hosts in docker-compose.yml |
8. Complete Configuration Example
Your final workspace/docker-compose.yml should have:
inferflow:
extra_hosts:
- "predator.prd.meesho.int:host-gateway"
environment:
- EXTERNAL_SERVICE_PREDATOR_PORT=8090
- EXTERNAL_SERVICE_PREDATOR_GRPC_PLAIN_TEXT=true
- EXTERNAL_SERVICE_PREDATOR_CALLER_ID=inferflow
- EXTERNAL_SERVICE_PREDATOR_CALLER_TOKEN=inferflow
- EXTERNAL_SERVICE_PREDATOR_DEADLINE=5000 # 5 seconds
horizon:
environment:
- TEST_DEPLOYABLE_ID=<your-deployable-id>
- TEST_GPU_DEPLOYABLE_ID=<your-deployable-id>

MySQL:
- Host: localhost
- Port: 3306
- Username: root
- Password: root
- Database: testdb

ScyllaDB:
- Host: localhost
- Port: 9042
- Keyspace: onfs

Redis:
- Host: localhost
- Port: 6379

etcd:
- Endpoint: http://localhost:2379
- Workbench: http://localhost:8081

Kafka:
- Bootstrap Servers: localhost:9092
- Kafka UI: http://localhost:8084
- Default Topic: online-feature-store.feature_ingestion
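The infrastructure ports above can be probed in one pass using the nc prerequisite. A convenience sketch (service-to-port mapping taken from the connection details above):

```shell
# Map each infrastructure service to its default port
infra_port() {
  case "$1" in
    mysql)    echo 3306 ;;
    scylladb) echo 9042 ;;
    redis)    echo 6379 ;;
    etcd)     echo 2379 ;;
    kafka)    echo 9092 ;;
    *) return 1 ;;
  esac
}

# nc -z exits 0 when something is listening on the port
for svc in mysql scylladb redis etcd kafka; do
  if nc -z localhost "$(infra_port "$svc")" 2>/dev/null; then
    echo "$svc reachable"
  else
    echo "$svc NOT reachable"
  fi
done
```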
# View logs for all services
cd workspace && docker-compose logs -f
# View logs for specific services
cd workspace && docker-compose logs -f horizon
cd workspace && docker-compose logs -f trufflebox-ui
cd workspace && docker-compose logs -f onfs-api-server
cd workspace && docker-compose logs -f onfs-consumer
cd workspace && docker-compose logs -f inferflow
cd workspace && docker-compose logs -f kafka
# View logs for multiple services
cd workspace && docker-compose logs -f onfs-api-server onfs-consumer inferflow

# Restart a specific service
cd workspace && docker-compose restart horizon
# Stop all services
cd workspace && docker-compose down
# Start services again
cd workspace && docker-compose up -d
# Check service status
cd workspace && docker-compose ps

- Port conflicts: The following ports must be available:

| Port | Service |
|---|---|
| 3000 | TruffleBox UI |
| 8081 | etcd Workbench |
| 8082 | Horizon API |
| 8083 | Numerix |
| 8084 | Kafka UI |
| 8085 | Inferflow |
| 8089 | ONFS gRPC API |
| 8090 | ONFS Consumer |
| 9092 | Kafka |
| 9042 | ScyllaDB |
| 3306 | MySQL |
| 6379 | Redis |
| 2379 | etcd |

If any ports are in use, stop conflicting services or modify the port mappings in docker-compose.yml.

- Docker network issues: If containers can't communicate, try recreating the network:

```shell
docker network rm onfs-network
docker network create onfs-network
```

- Service health checks failing: Check whether all infrastructure services (databases) are running:

```shell
cd workspace && docker-compose ps
```

- Image pull issues: Ensure you have access to GitHub Container Registry:

```shell
docker login ghcr.io
```

- Kafka consumer not receiving messages:

```shell
# Check if the Kafka topic exists
docker exec -it broker kafka-topics.sh --bootstrap-server localhost:9092 --list

# Check consumer group status
docker exec -it broker kafka-consumer-groups.sh --bootstrap-server localhost:9092 --group onfs-consumer-group --describe

# Verify via Kafka UI
open http://localhost:8084
```

- Consumer lag issues: Monitor consumer performance via Kafka UI (http://localhost:8084) to identify bottlenecks
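The port-availability check can be scripted in one pass. This sketch flags any of the required ports listed above that already have a listener (nc is a stated prerequisite):

```shell
# All ports the stack needs, from the port table above
REQUIRED_PORTS="3000 8081 8082 8083 8084 8085 8089 8090 9092 9042 3306 6379 2379"

conflicts=""
for p in $REQUIRED_PORTS; do
  # nc -z exits 0 if something is already listening on the port
  if nc -z localhost "$p" 2>/dev/null; then
    conflicts="$conflicts $p"
  fi
done

if [ -n "$conflicts" ]; then
  echo "Ports already in use:$conflicts"
else
  echo "All required ports are free"
fi
```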
Services start in the following order:
- Infrastructure services (ScyllaDB, MySQL, Redis, etcd, Kafka)
- Infrastructure init services (kafka-init, db-init) - only started if containers don't exist (preserves modifications)
- Application services (can run independently):
- ONFS API Server - depends on databases (ScyllaDB, MySQL, Redis, etcd)
- ONFS Consumer - depends on Kafka and databases (independent of API Server)
- Horizon - depends on databases and ScyllaDB
- Numerix - depends on etcd
- Inferflow - depends on etcd, ONFS API Server, and Numerix
- Trufflebox UI (depends on Horizon)
Key Points:
- ONFS Consumer can run without ONFS API Server
- Services can be started individually based on use case
- All services depend on their respective infrastructure components
If a service fails to start, check its dependencies are healthy first.
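Because each service only needs its own dependencies healthy, a small wait-for helper (a sketch using the nc prerequisite, not part of the repo's scripts) can gate a manual docker-compose start on a dependency's port:

```shell
# wait_for <host> <port> <timeout-seconds> - block until the TCP port
# answers, or give up when the timeout elapses
wait_for() {
  host="$1"
  port="$2"
  deadline=$(( $(date +%s) + $3 ))
  until nc -z "$host" "$port" 2>/dev/null; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep 1
  done
}

# Example: wait up to 60s for MySQL before starting Horizon on its own
# wait_for localhost 3306 60 && (cd workspace && docker-compose up -d horizon)
```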
The workspace directory contains all runtime configuration:
- workspace/docker-compose.yml - Complete service orchestration
- workspace/check_db_and_init.sh - Database initialization script
You can modify environment variables in the docker-compose.yml file and restart services.
We welcome contributions from the community! Please see our Contributing Guide for details on how to get started.
- 💬 Discord: Join our community chat
- 🐛 Issues: Report bugs and request features on GitHub Issues
- 📧 Email: Contact us at ml-oss@meesho.com
BharatMLStack is open-source software licensed under the BharatMLStack Business Source License 1.1.