Backend API for the ArgoCD GlueOps Extension. Provides dynamic links to Grafana dashboards, Loki logs, Tempo traces, Vault secrets, and deployment configurations.
- APM Overview: Application performance monitoring dashboards
- Namespace Overview: Kubernetes namespace dashboards
- Pod Overview: Individual pod metrics
- Logs: Loki log aggregation
- Traces: Tempo distributed tracing
- Vault Secrets: ExternalSecret vault paths
- IaaC: GitHub deployment configuration links
- Python 3.11+
- Docker & Docker Compose
- pipenv
- Local kubeconfig with access to cluster
1. Install dependencies:

   ```bash
   make install
   ```
2. Create environment file:

   ```bash
   cp .env.local.template .env.local
   # Edit .env.local with your actual values:
   # - GRAFANA_BASE_URL (e.g., https://grafana.nonprod.antoniostacos.onglueops.com)
   # - VAULT_BASE_URL (e.g., https://vault.nonprod.antoniostacos.onglueops.com)
   ```
3. Run locally:

   ```bash
   make run
   ```

   This will:
   - Start Valkey in Docker
   - Export environment variables from .env.local
   - Run the API on http://localhost:8000
4. Test the API:

   ```bash
   # Health check
   curl http://localhost:8000/api/v1/health

   # Readiness check
   curl http://localhost:8000/api/v1/ready | jq

   # Get links for an application
   curl http://localhost:8000/api/v1/applications/taco-backend-prod/links \
     -H "Argocd-Application-Name: nonprod:taco-backend-prod" | jq
   ```
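The links request can also be issued programmatically. Below is a minimal standard-library sketch; the `links_request` helper name is illustrative, not part of the API's codebase.

```python
from urllib.request import Request


def links_request(base_url: str, namespace: str, app_name: str) -> Request:
    """Build the GET request for an application's links.

    The Argocd-Application-Name header carries "{namespace}:{app_name}",
    which the API uses to locate the ArgoCD Application.
    """
    req = Request(f"{base_url}/api/v1/applications/{app_name}/links")
    req.add_header("Argocd-Application-Name", f"{namespace}:{app_name}")
    return req


# Passing the Request to urllib.request.urlopen() would perform the call.
req = links_request("http://localhost:8000", "nonprod", "taco-backend-prod")
```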
Test your ArgoCD extension UI without requiring a real Kubernetes cluster:
```bash
# All categories with data (200 OK)
curl http://localhost:8000/api/v1/fixtures/all-ok | jq

# Various error states
curl http://localhost:8000/api/v1/fixtures/errors | jq

# Partial data (some categories empty)
curl http://localhost:8000/api/v1/fixtures/partial | jq

# Slow response (3 second delay)
curl http://localhost:8000/api/v1/fixtures/slow | jq

# Mock endpoint (alias for all-ok)
curl http://localhost:8000/api/v1/mock/applications/test-app/links | jq
```

Use these endpoints to:
- Test UI rendering of different states
- Verify error handling
- Demo the extension without cluster access
- Develop offline
Test the Docker image with your local kubeconfig:
```bash
# Run the test script (handles volume mounts correctly)
./test-with-kubeconfig.sh
```

This script:
- Copies your `~/.kube/config` to `.kubeconfig-test`
- Mounts it into the container at `/app/kubeconfig`
- Starts the container on port 8001
- Tests health, readiness, and mock endpoints
Note: Kubernetes clusters on 127.0.0.1 (like local k3d) won't be accessible from inside the Docker container. For real K8s testing:
- Use a remote cluster, OR
- Run with `--network host`, OR
- Deploy to the actual K8s cluster
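You can check for the loopback limitation up front by inspecting the `server:` URL in your kubeconfig. A small sketch (the helper and the host list are ours, not part of the project):

```python
from urllib.parse import urlparse

# Hosts that resolve to the host machine's loopback interface; a
# bridge-networked container cannot reach them.
LOOPBACK_HOSTS = {"127.0.0.1", "localhost", "0.0.0.0"}


def reachable_from_container(server_url: str) -> bool:
    """Return False when the kubeconfig server points at the host loopback."""
    return urlparse(server_url).hostname not in LOOPBACK_HOSTS
```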
Manual Docker run example:
```bash
# Copy kubeconfig
cp ~/.kube/config .kubeconfig-test
chmod 644 .kubeconfig-test

# Run container with kubeconfig and env mounted
docker run -d --name gluelinks-test \
  -p 8001:8000 \
  --mount type=bind,source="$(pwd)/.kubeconfig-test",target=/app/kubeconfig,readonly \
  --mount type=bind,source="$(pwd)/.env.local",target=/app/.env.local,readonly \
  -e KUBECONFIG=/app/kubeconfig \
  ghcr.io/glueops/gluelinks-api:latest

# Test it
curl http://localhost:8001/api/v1/health | jq
curl http://localhost:8001/api/v1/fixtures/all-ok | jq

# Cleanup
docker stop gluelinks-test && docker rm gluelinks-test
rm .kubeconfig-test
```

Important: Always use `--mount type=bind` instead of `-v` for file mounts. The `-v` flag creates directories if the target doesn't exist, which breaks kubeconfig mounting.
If you need to restart or test again after cleanup:
```bash
# 1. Ensure Valkey is running
docker-compose up -d valkey

# 2. Verify Valkey is healthy
docker ps | grep valkey

# 3. Start the API (uses .env.local for environment variables)
./run-local.sh
# OR manually:
# pipenv run uvicorn app.main:app --host 0.0.0.0 --port 8000

# 4. In another terminal, test endpoints
curl -s http://localhost:8000/api/v1/health | jq

# 5. Test with real application
curl -s http://localhost:8000/api/v1/applications/taco-backend-prod/links \
  -H "Argocd-Application-Name: nonprod:taco-backend-prod" | jq '.categories[] | {id, label, status, link_count: (.links | length)}'

# 6. Test caching (should return same timestamp twice)
echo "First request:" && curl -s http://localhost:8000/api/v1/applications/taco-backend-prod/links \
  -H "Argocd-Application-Name: nonprod:taco-backend-prod" | jq -r '.last_updated'
sleep 1
echo "Second request (cached):" && curl -s http://localhost:8000/api/v1/applications/taco-backend-prod/links \
  -H "Argocd-Application-Name: nonprod:taco-backend-prod" | jq -r '.last_updated'

# 7. Stop when done
make stop
```

Once running, visit:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
```bash
make install  # Install dependencies
make run      # Start API locally
make stop     # Stop containers
make clean    # Clean up containers and volumes
make test     # Test with curl
make logs     # View Valkey logs
```

API won't start:
- Ensure `.env.local` exists with proper values
- Check if port 8000 is already in use: `lsof -i :8000`
- View logs: check terminal output or container logs
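The port check can also be scripted with a quick TCP probe; this hypothetical helper is equivalent to the `lsof` check:

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Try a TCP connect; success means something is already listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0
```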
Cache connection issues:
- Verify Valkey is running: `docker ps | grep valkey`
- Check Valkey health: `docker exec gluelinks-valkey redis-cli ping`
Kubernetes connection issues:
- Verify kubeconfig is valid: `kubectl cluster-info`
- Check if you have access: `kubectl get pods -n nonprod`
- `GRAFANA_BASE_URL`: Base URL for Grafana (e.g., `https://grafana.nonprod.example.com`)
- `VAULT_BASE_URL`: Base URL for Vault (e.g., `https://vault.nonprod.example.com`)
- `VALKEY_URL`: Valkey/Redis connection URL (e.g., `redis://localhost:6379`)
- `CACHE_TTL_SECONDS`: Cache TTL in seconds (default: `30`)
- `LOG_LEVEL`: Logging level (default: `INFO`; options: `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`)
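The variables above map to settings like the sketch below. This is illustrative only (the actual application may load them differently, e.g. via Pydantic settings); the names and defaults come from the list above.

```python
def load_settings(env: dict) -> dict:
    """Collect the documented settings from an environment mapping.

    The first three are required; CACHE_TTL_SECONDS and LOG_LEVEL
    fall back to their documented defaults.
    """
    return {
        "grafana_base_url": env["GRAFANA_BASE_URL"],
        "vault_base_url": env["VAULT_BASE_URL"],
        "valkey_url": env["VALKEY_URL"],
        "cache_ttl_seconds": int(env.get("CACHE_TTL_SECONDS", "30")),
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }


settings = load_settings({
    "GRAFANA_BASE_URL": "https://grafana.nonprod.example.com",
    "VAULT_BASE_URL": "https://vault.nonprod.example.com",
    "VALKEY_URL": "redis://localhost:6379",
})
```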
- `GET /api/v1/health` - Health check
- `GET /api/v1/ready` - Readiness check (validates K8s and cache connectivity)
- `GET /api/v1/applications/{app_name}/links` - Get links for an application
  - Header: `Argocd-Application-Name: {namespace}:{app_name}`
1. Build the Docker image:

   ```bash
   docker build -t gluelinks-api:latest .
   ```

2. Update the manifest: edit `k8s/manifests.yaml` and update:
   - Image: `gluelinks-api:latest`
   - Secret values (`GRAFANA_BASE_URL`, `VAULT_BASE_URL`)

3. Apply manifests:

   ```bash
   kubectl apply -f k8s/manifests.yaml
   ```
This creates:
- Namespace: `gluelinks-api`
- ServiceAccount with ClusterRole for read-only access
- Valkey deployment and service
- API deployment (2 replicas) and service
- ConfigMap and Secret for configuration
The API requires read-only access to:
- ArgoCD Applications (`argoproj.io/applications`)
- Deployments (`apps/deployments`)
- Pods (`core/pods`)
- ExternalSecrets (`external-secrets.io/externalsecrets`)
Note: No access to actual Secrets is required.
Application names are parsed using regex to extract the service name:
```
^(?P<service_prefix>.+?)(-[0-9a-f]{9,}(-[a-z0-9]{4,})?|-[0-9]+|-[a-z0-9]{4,6})?$
```

Examples:
- `taco-backend-prod` → `taco-backend-prod`
- `taco-backend-prod-677bfb55b7-942nr` → `taco-backend-prod`
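Applying the pattern with Python's `re` module looks roughly like this; the `service_name` helper is illustrative, not the project's actual function:

```python
import re

# The pattern from the section above: a lazy service prefix followed by an
# optional ReplicaSet-hash / pod-suffix group.
SERVICE_RE = re.compile(
    r"^(?P<service_prefix>.+?)"
    r"(-[0-9a-f]{9,}(-[a-z0-9]{4,})?|-[0-9]+|-[a-z0-9]{4,6})?$"
)


def service_name(resource_name: str) -> str:
    """Strip Kubernetes-generated suffixes from a pod or ReplicaSet name."""
    m = SERVICE_RE.match(resource_name)
    return m.group("service_prefix") if m else resource_name


service_name("taco-backend-prod-677bfb55b7-942nr")  # -> "taco-backend-prod"
```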
Responses are cached in Valkey with:
- Key format: `gluelinks:v1:{namespace}:{app_name}`
- TTL: configurable via `CACHE_TTL_SECONDS`
- `lastUpdated` field reflects cache timestamp
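The key scheme is straightforward to reproduce; a sketch (the helper name is ours, and the commented-out write is only an example of how a TTL'd set with a redis-py-style client could look):

```python
def cache_key(namespace: str, app_name: str) -> str:
    """Build the Valkey key under which a links response is cached."""
    return f"gluelinks:v1:{namespace}:{app_name}"


# A cached write would then be something like:
#   client.setex(cache_key("nonprod", "taco-backend-prod"), ttl_seconds, payload_json)
```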
Resources are discovered using ArgoCD tracking IDs:
- Deployments: `{app}:apps/Deployment:{namespace}/{name}`
- ExternalSecrets: `{app}:external-secrets.io/ExternalSecret:{namespace}/{name}`
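The tracking-ID format above can be composed like this (hypothetical helper; only the string layout comes from the source):

```python
def tracking_id(app: str, group: str, kind: str, namespace: str, name: str) -> str:
    """Compose an ArgoCD tracking ID: {app}:{group}/{Kind}:{namespace}/{name}."""
    return f"{app}:{group}/{kind}:{namespace}/{name}"
```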
All logs are structured JSON with proper log levels:
- DEBUG: Request details, K8s queries
- INFO: Successful operations
- WARNING: Missing resources
- ERROR: API failures, parsing errors
- CRITICAL: Fatal errors
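The service uses structlog; as a rough stand-in, the same one-JSON-object-per-line shape can be produced with the standard library. The formatter below is a sketch, not the project's actual logging configuration:

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as a single JSON line with level and event."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "event": record.getMessage(),
            "logger": record.name,
        })


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
log = logging.getLogger("gluelinks")
log.addHandler(handler)
log.setLevel(logging.DEBUG)

log.warning("deployment not found")  # emits a JSON line at WARNING level
```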
```json
{
  "appName": "taco-backend-prod",
  "namespace": "nonprod",
  "serviceName": "taco-backend-prod",
  "lastUpdated": "2025-12-14T00:00:00Z",
  "categories": [
    {
      "id": "apm",
      "label": "APM Overview",
      "icon": "📊",
      "status": "ok",
      "message": null,
      "links": [
        {
          "label": "Application Performance Monitoring",
          "url": "https://..."
        }
      ]
    }
  ],
  "metadata": {
    "generatedAt": "2025-12-14T00:00:00Z",
    "version": "v1",
    "resources": {
      "argocdApp": true,
      "deployment": true,
      "podsFound": 2,
      "externalSecretsFound": 1
    }
  }
}
```

Status values:
- `ok`: Links present, everything worked
- `empty`: No resources found (legitimate; the `message` field explains why)
- `error`: Something went wrong (the `message` field contains error details)
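A UI consuming this payload can branch on those three statuses. A minimal sketch (the function name and the returned strings are illustrative, not what the extension actually renders):

```python
def render_state(category: dict) -> str:
    """Map a category's status to what the extension UI should show."""
    status = category.get("status")
    if status == "ok":
        return f"{len(category.get('links', []))} link(s)"
    if status == "empty":
        return category.get("message") or "No resources found"
    return f"error: {category.get('message') or 'unknown'}"
```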
The project uses Python 3.13 and the latest stable versions of all dependencies (as of December 2024).
To update dependencies:
```bash
# Edit Pipfile with new version constraints
nano Pipfile

# Update lock file
pipenv lock

# Install updated dependencies
pipenv install

# Test locally
pipenv run uvicorn app.main:app --host 0.0.0.0 --port 8000

# Test Docker build
docker build -t test-image .

# If all tests pass, commit Pipfile and Pipfile.lock
git add Pipfile Pipfile.lock
git commit -m "chore: update dependencies"
```

Current versions:
- Python: 3.13
- FastAPI: 0.115.6
- Uvicorn: 0.34.0
- Kubernetes: 31.0.0
- Pydantic: 2.10.5
- Structlog: 24.4.0
- Valkey: 6.0.2
Problem: "IsADirectoryError: [Errno 21] Is a directory: '/app/kubeconfig'"
Solution: Use --mount type=bind instead of -v for file mounts:
```bash
# ❌ Wrong - creates directory
-v ~/.kube/config:/app/kubeconfig:ro

# ✅ Correct - mounts file
--mount type=bind,source=$HOME/.kube/config,target=/app/kubeconfig,readonly
```

Problem: "Invalid kube-config file. No configuration found."
Solutions:
- Ensure the kubeconfig file exists before `docker run`
- Use absolute paths in the mount source
- Verify file permissions (must be readable)
- Check KUBECONFIG environment variable is set correctly
Problem: "Max retries exceeded with url: /apis/... Connection refused"
Cause: Kubeconfig uses 127.0.0.1:6443 which isn't accessible from inside Docker container.
Solutions:
- Use remote cluster (not localhost)
- Run with `--network host` (Linux only)
- Deploy the API to the K8s cluster instead of Docker
- Update kubeconfig to use Docker-accessible IP
Problem: Tempo links don't load traces in Grafana
Solutions:
- Set `TEMPO_DATASOURCE_UID` in `.env.local`
- Find the UID in Grafana: Settings → Data Sources → Tempo → Copy UID
- Verify service names match exactly (use full app name, not base name)
Example:
```bash
# In .env.local
TEMPO_DATASOURCE_UID=de7lydl3hl9fkd
```

See CLAUDE.md for comprehensive project context and architecture documentation for AI assistants.
[Add your license here]