Troubleshooting & FAQ
Common issues and their solutions, organized by category.
Installation & Setup
Docker containers fail to start
Symptoms: Containers exit immediately or keep restarting.
Solutions:
- Check logs: `docker compose logs api`
- Ensure Docker has enough memory (minimum 4GB)
- Verify that ports 5432, 6379, 8000, and 3000 are not in use by other apps
- Run `docker compose down -v` and restart fresh
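Before restarting the stack, it can help to confirm none of those ports are already taken. A minimal sketch using only the standard library (the host and port list are assumptions based on the defaults above):

```python
import socket

def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        return s.connect_ex((host, port)) == 0

# Probe the ports the stack needs before running `docker compose up`
for port in (5432, 6379, 8000, 3000):
    print(f"port {port}: {'IN USE' if port_in_use(port) else 'free'}")
```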
Database migration fails
Symptoms: Error when running migrations, tables missing.
Solutions:
- Wait for TimescaleDB to be fully ready: `docker compose exec timescaledb pg_isready`
- Check that DATABASE_URL in .env matches the docker-compose service name
- Run migrations manually: `docker compose exec api python -m shared.migrations.run`
- For a fresh start: `docker compose down -v && docker compose up -d`
Dashboard shows blank page
Symptoms: Browser shows white screen at localhost:3000.
Solutions:
- Check dashboard container logs: `docker compose logs dashboard`
- Verify the build completed: look for "Ready" in the logs
- Clear browser cache and hard refresh (Ctrl+Shift+R)
- Check that the API is reachable from the dashboard: inside Docker, the API URL should be `http://api:8000`
Agent Issues
Agent run returns "provider error"
Symptoms: Run fails with provider authentication or connection error.
Solutions:
- Verify your API key is set in Settings → Providers (or in .env for the corresponding provider)
- Check the provider is configured: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, or `GOOGLE_API_KEY`
- Test the API key directly: `curl https://api.openai.com/v1/models -H "Authorization: Bearer YOUR_KEY"`
- Check whether billing is enabled on your provider account
Agent stuck in "running" state
Symptoms: Agent run never completes, stays in "running" status.
Solutions:
- Check if the Celery workers are running: `docker compose logs runtime-worker`
- Verify Redis is accessible: `docker compose exec redis redis-cli ping`
- Cancel the stuck run via the API: `POST /api/v1/agents/runs/RUN_ID/cancel`
- Restart the runtime worker: `docker compose restart runtime-worker`
- Add a governance policy with token limits to prevent future hangs
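The cancel call above can be scripted. This is a sketch, not a definitive client: the base URL and bearer-token auth are assumptions based on the defaults in this guide, and only the cancel path itself comes from the endpoint shown above.

```python
import urllib.request

API_BASE = "http://localhost:8000"  # assumed local API address
API_KEY = "YOUR_API_KEY"            # placeholder — use your real key

def cancel_run_url(base: str, run_id: str) -> str:
    """Build the cancel endpoint path shown in this guide."""
    return f"{base}/api/v1/agents/runs/{run_id}/cancel"

def cancel_run(run_id: str) -> int:
    """POST the cancel request and return the HTTP status code."""
    req = urllib.request.Request(
        cancel_run_url(API_BASE, run_id),
        method="POST",
        headers={"Authorization": f"Bearer {API_KEY}"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status
```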
Agent gives wrong or hallucinated answers
Symptoms: Agent responses are inaccurate or make up information.
Solutions:
- Add a knowledge base so the agent has real data to reference
- Improve the system prompt: be specific about what the agent should and should not do
- Use a more capable model (e.g., GPT-4o instead of GPT-3.5)
- Add instructions like "If you don't know, say so" to the system prompt
- Review run steps in the dashboard to see where the reasoning went wrong
API Issues
401 Unauthorized errors
Symptoms: All API calls return 401.
Solutions:
- Include the Authorization header: `-H "Authorization: Bearer YOUR_API_KEY"`
- Generate a new API key in Settings → API Keys
- Check that the key has the required scopes for the endpoint
- Verify that SECRET_KEY in .env hasn't changed (changing it invalidates all tokens)
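In client code, the same header looks like this. A minimal standard-library sketch; the URL you call and the bearer scheme are assumptions based on the curl example above:

```python
import urllib.error
import urllib.request

API_KEY = "YOUR_API_KEY"  # placeholder — generate one in Settings → API Keys

def auth_headers(api_key: str) -> dict:
    """The Authorization header every API call needs."""
    return {"Authorization": f"Bearer {api_key}"}

def get_status(url: str) -> int:
    """Return the HTTP status of an authenticated GET (401 = bad/missing key)."""
    req = urllib.request.Request(url, headers=auth_headers(API_KEY))
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code
```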
429 Rate limit exceeded
Symptoms: Getting 429 errors on API calls.
Solutions:
- The default rate limit is 60 requests/minute per API key
- Increase the limit in .env: `RATE_LIMIT_PER_MINUTE=120`
- Add retry logic with exponential backoff to your client code
- Use separate API keys for different services
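Exponential backoff doubles the wait after each 429 and adds jitter so clients don't retry in lockstep. A self-contained sketch; `RateLimitError` is a stand-in for however your client signals a 429:

```python
import random
import time

class RateLimitError(Exception):
    """Stand-in for however your client reports an HTTP 429."""

def backoff_delays(retries: int, base: float = 1.0, cap: float = 30.0) -> list:
    """Exponential backoff schedule: base * 2^attempt, capped at `cap` seconds."""
    return [min(cap, base * (2 ** i)) for i in range(retries)]

def call_with_retry(fn, retries: int = 5):
    """Call fn(); on RateLimitError, sleep with jitter and retry."""
    for delay in backoff_delays(retries):
        try:
            return fn()
        except RateLimitError:
            time.sleep(delay + random.uniform(0, 0.5))  # jitter
    return fn()  # final attempt: let the error propagate
```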
CORS errors in browser console
Symptoms: Browser blocks API requests with CORS errors.
Solutions:
- Set `CORS_ORIGINS` in .env to include your frontend URL
- For local development: `CORS_ORIGINS=http://localhost:3000`
- For production: `CORS_ORIGINS=https://yourdomain.com`
- Never use `CORS_ORIGINS=*` in production
Cost & Budgets
Agent blocked by budget limit
Symptoms: Agent run fails with "budget exceeded" message.
Solutions:
- Check current usage: `GET /api/v1/budgets/BUDGET_ID/usage`
- Increase the limit: `PUT /api/v1/budgets/BUDGET_ID` with a new `limit_usd`
- Temporarily change the action to `alert_only` while you adjust limits
- Wait for the budget period to reset (daily budgets reset at midnight UTC)
- Use the Cost Optimizer to reduce spending with model routing
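Once you've fetched usage, the headroom math is simple. The field names (`spent_usd`, `limit_usd`) are assumptions, since the usage response shape isn't documented here, but the logic is generic:

```python
def remaining_budget(spent_usd: float, limit_usd: float) -> float:
    """How much budget is left before runs get blocked."""
    return max(0.0, limit_usd - spent_usd)

def should_alert(spent_usd: float, limit_usd: float, threshold: float = 0.8) -> bool:
    """True once spending crosses `threshold` of the limit (default 80%)."""
    return limit_usd > 0 and spent_usd / limit_usd >= threshold

print(remaining_budget(18.0, 20.0))  # 2.0 left
print(should_alert(18.0, 20.0))      # True — at 90% of the limit
```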
Unexpected high costs
Symptoms: Spending is much higher than expected.
Solutions:
- Check which agent is spending the most in the Analytics tab
- Look for agent loops: an agent calling tools that trigger more runs
- Add token limits via governance policies to cap individual runs
- Switch expensive agents to cheaper models (GPT-4o-mini, Claude Haiku)
- Enable cost optimizer rules for automatic model routing
- Set strict daily budgets with `action_on_exceed: "block"`
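As a sketch, a strict daily budget update might look like the payload below. `limit_usd` and `action_on_exceed` come from this guide; the `period` field name is an assumption, so check it against your API reference before using it:

```python
import json

# Hypothetical request body for PUT /api/v1/budgets/BUDGET_ID.
strict_daily_budget = {
    "limit_usd": 10.0,           # daily cap in USD
    "period": "daily",           # assumed field name — verify in the API docs
    "action_on_exceed": "block", # hard-stop runs instead of alert_only
}

print(json.dumps(strict_daily_budget, indent=2))
```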
Knowledge Base
Documents stuck in "processing" state
Symptoms: Uploaded documents never finish indexing.
Solutions:
- Check runtime worker logs for errors during embedding
- Verify your embedding API key is set (default: OpenAI text-embedding-3-small)
- Large files may take time; check progress in the Knowledge Base tab
- Ensure the pgvector extension is installed: `CREATE EXTENSION IF NOT EXISTS vector;`
- Re-upload the document if it has been stuck for more than 10 minutes
Search returns irrelevant results
Symptoms: Knowledge base search doesn't return the right documents.
Solutions:
- Reduce `chunk_size` for more precise retrieval (try 256-512 tokens)
- Increase `chunk_overlap` to preserve context across chunk boundaries
- Increase `top_k` in search to get more results
- Add metadata to documents for better filtering
- Try different embedding models if available
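To build intuition for how `chunk_size` and `chunk_overlap` interact, here is a simplified sketch of overlapping chunking (the real indexer's behavior may differ; this just shows that each chunk starts `chunk_size - chunk_overlap` tokens after the previous one):

```python
def chunk_tokens(tokens: list, chunk_size: int = 256, chunk_overlap: int = 32) -> list:
    """Split a token list into overlapping chunks."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), step)]

# A 1000-token document with the defaults yields 5 chunks of up to 256 tokens
chunks = chunk_tokens(list(range(1000)), chunk_size=256, chunk_overlap=32)
print(len(chunks), len(chunks[0]))  # 5 256
```

Smaller chunks make each retrieved passage more focused; more overlap costs extra embedding tokens but keeps sentences that straddle a boundary retrievable from either side.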
Deployment
Kubernetes pods keep crashing
Symptoms: Pods in CrashLoopBackOff state.
Solutions:
- Check pod logs: `kubectl logs pod/api-xxx -n acp`
- Verify ConfigMaps and Secrets are applied: `kubectl get configmap -n acp`
- Check resource limits: pods may be OOM-killed (increase memory limits)
- Ensure database is accessible from the cluster
- Check readiness and liveness probes match your service endpoints
SSL certificate issues
Symptoms: HTTPS not working, certificate warnings.
Solutions:
- Use Caddy for automatic SSL: `caddy reverse-proxy --from yourdomain.com --to localhost:3000`
- Ensure the DNS A record points to your server's IP
- For Let's Encrypt: port 80 must be open for the HTTP-01 challenge
- Check certificate expiration with: `echo | openssl s_client -connect yourdomain.com:443 2>/dev/null | openssl x509 -noout -enddate`
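Piping `openssl s_client` output through `openssl x509 -noout -enddate` prints a line like `notAfter=Jun  1 12:00:00 2030 GMT`. A small sketch that turns that line into days remaining, handy for a cron alert (the parsing format is standard openssl output, but verify against your own certificate):

```python
from datetime import datetime, timezone

def days_until_expiry(enddate_line: str, now=None) -> int:
    """Parse an `openssl x509 -noout -enddate` line and return days left."""
    stamp = enddate_line.strip().split("=", 1)[1]
    expires = datetime.strptime(stamp, "%b %d %H:%M:%S %Y %Z").replace(
        tzinfo=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    return (expires - now).days

print(days_until_expiry("notAfter=Jun  1 12:00:00 2030 GMT"))
```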
Frequently Asked Questions
What LLM providers are supported?
ACP supports OpenAI (GPT-4o, GPT-4o-mini, GPT-3.5-turbo), Anthropic (Claude 3.5 Sonnet, Claude 3 Haiku, Claude 3 Opus), and Google (Gemini Pro, Gemini Flash). You can configure multiple providers simultaneously and use different models for different agents.
Can I use my own API keys (BYOK)?
Yes! Fluxgate is designed for Bring Your Own Key (BYOK). Set your API keys in Settings → Providers or in the environment variables. All LLM calls are made directly to the providers using your keys.
How do I back up my data?
Back up the TimescaleDB database using standard PostgreSQL tools:
# Full backup
docker compose exec timescaledb pg_dump -U acp acp > backup.sql
# Restore
docker compose exec -T timescaledb psql -U acp acp < backup.sql
What are the minimum system requirements?
Local development: 2 CPU cores, 4GB RAM, 10GB disk, Docker installed.
Production: 4+ CPU cores, 8GB+ RAM, 20GB+ SSD, Ubuntu 22.04/Debian 12.
Kubernetes: 3+ node cluster with 4GB+ RAM per node.
How do I update to a new version?
Pull the latest code and rebuild:
git pull origin main
docker compose down
docker compose build
docker compose up -d
docker compose exec api python -m shared.migrations.run
Need More Help?
If your issue isn't covered here, check the GitHub Issues page or open a new issue with your error logs and environment details.