AISBF (AI Service Broker Framework): AI Should Be Free
Detailed step-by-step docs for installation, dashboard usage, network graph, clusters, rotations, autoselect, and CoderAI plus RunPod routing.
This is the expanded operator path for AISBF: install the gateway, put it behind a TLS proxy, add providers, build rotations, put autoselect above them, and understand how a clustered deployment shares state. The route examples use the configuration shapes implemented in the source: providers have id, type, endpoint, api_key_required, optional provider defaults, and a model list; rotations contain weighted provider/model entries; autoselect contains a fallback plus described candidate models.
sudo apt update
sudo apt install -y python3 python3-venv python3-pip git nginx redis-server mysql-server
sudo useradd --system --home /opt/aisbf --shell /usr/sbin/nologin aisbf || true
sudo mkdir -p /opt/aisbf /etc/aisbf /var/lib/aisbf
sudo chown -R aisbf:aisbf /opt/aisbf /etc/aisbf /var/lib/aisbf

# Source checkout keeps the dashboard templates and extension assets available.
sudo -u aisbf git clone https://git.nexlab.net/nexlab/aisbf /opt/aisbf/app
sudo -u aisbf python3 -m venv /opt/aisbf/venv
sudo -u aisbf /opt/aisbf/venv/bin/pip install -U pip
sudo -u aisbf /opt/aisbf/venv/bin/pip install -e /opt/aisbf/app
sudo cp /opt/aisbf/app/config/*.json /etc/aisbf/
sudo chown -R aisbf:aisbf /etc/aisbf

sudo mysql <<'SQL'
CREATE DATABASE IF NOT EXISTS aisbf CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE DATABASE IF NOT EXISTS aisbf_response_cache CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER IF NOT EXISTS 'aisbf'@'localhost' IDENTIFIED BY 'change-this-password';
GRANT ALL PRIVILEGES ON aisbf.* TO 'aisbf'@'localhost';
GRANT ALL PRIVILEGES ON aisbf_response_cache.* TO 'aisbf'@'localhost';
FLUSH PRIVILEGES;
SQL

{
  "database": {"type":"mysql","mysql_host":"localhost","mysql_port":3306,"mysql_user":"aisbf","mysql_password":"change-this-password","mysql_database":"aisbf"},
  "cache": {"type":"redis","redis_host":"localhost","redis_port":6379,"redis_db":0,"redis_key_prefix":"aisbf:prod:"},
  "response_cache": {"enabled":true,"backend":"redis","ttl":600,"redis_host":"localhost","redis_port":6379,"redis_key_prefix":"aisbf:response:"},
  "server": {"host":"127.0.0.1","port":17765,"protocol":"http"},
  "dashboard": {"enabled":true,"username":"admin","password":"REPLACE-WITH-SHA256-HASH"},
  "auth": {"enabled":true,"tokens":["global-admin-token-only-for-admin-scripts"]}
}

The default source config uses SQLite and memory cache; for clustering, switch config DB to MySQL and shared cache/broker state to Redis.
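The dashboard password field above holds a placeholder for a SHA-256 hash. A minimal sketch for producing one follows. It assumes the dashboard expects the plain, unsalted, lowercase hex digest of the UTF-8 password, which is not confirmed here, so check your AISBF build before relying on it. The function name `dashboard_password_hash` is hypothetical.

```python
import hashlib


def dashboard_password_hash(plaintext: str) -> str:
    """Return the lowercase hex SHA-256 digest of a UTF-8 encoded password.

    Assumption: AISBF compares the dashboard password against an unsalted
    hex digest. Verify this against your build before relying on it.
    """
    return hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
```

Call `dashboard_password_hash("your-password")` from a Python shell and paste the 64-character result into the `password` field in place of `REPLACE-WITH-SHA256-HASH`.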
[Unit]
Description=AISBF API Gateway
After=network-online.target mysql.service redis-server.service
Wants=network-online.target
[Service]
User=aisbf
Group=aisbf
WorkingDirectory=/opt/aisbf/app
Environment=AISBF_CONFIG_DIR=/etc/aisbf
ExecStart=/opt/aisbf/venv/bin/uvicorn main:app --host 127.0.0.1 --port 17765
Restart=always
RestartSec=3
[Install]
WantedBy=multi-user.target

location / {
proxy_pass http://127.0.0.1:17765;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Proto https;
proxy_set_header X-Forwarded-Host $host;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_buffering off;
}

Add one upstream at a time. Use type=openai for OpenAI-compatible APIs; other supported types include google, anthropic, ollama, and OAuth-backed providers where configured. Give each provider a stable ID such as openai-prod, runpod-gpu-a, or ollama-private.
Create reliability units. A rotation is what apps should call when several providers can satisfy the same class of work. Weights bias traffic; error cooldowns stop flapping pods or rate-limited APIs from getting hammered.
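To make the interaction between weights and error cooldowns concrete, here is a small illustrative sketch. It is not AISBF's rotation code: the `RotationEntry` class and the `mark_error` and `pick_entry` helpers are hypothetical names, and weighted random choice is only one plausible way to bias traffic. The field names mirror the rotation entries used in this guide.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class RotationEntry:
    provider_id: str
    model: str
    weight: int
    error_cooldown: float       # seconds to skip this entry after an error
    cooling_until: float = 0.0  # timestamp until which the entry is skipped


def mark_error(entry, now=None):
    """Put an entry into cooldown after an upstream error."""
    now = time.monotonic() if now is None else now
    entry.cooling_until = now + entry.error_cooldown


def pick_entry(entries, now=None, rng=random):
    """Pick one healthy entry at random, biased by weight.

    Returns None when every entry is cooling down, so the caller can
    fail fast instead of hammering a sick upstream.
    """
    now = time.monotonic() if now is None else now
    healthy = [e for e in entries if e.cooling_until <= now]
    if not healthy:
        return None
    return rng.choices(healthy, weights=[e.weight for e in healthy], k=1)[0]
```

With weights 6, 3, and 1, the first entry receives roughly 60% of requests while all three are healthy. Once an entry errors, it drops out of the pool until its own cooldown expires.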
Create policy units. Autoselect chooses among described model routes using fallback, privacy/NSFW flags, semantic classification, and capability hints. Keep descriptions concrete; vague text produces vague routing.
For apps, prefer user-scoped tokens and routes such as /api/user/chat/completions or /api/u/{username}/chat/completions. Avoid embedding global admin tokens in customer-facing services.
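Because rotations and autoselect entries refer to providers and rotations by name, a typo only surfaces at request time. The sketch below is a hypothetical pre-deployment check, not part of AISBF. It assumes the `user-rotation/<name>` and `user-provider/<provider>/<model>` meanings suggested by the route IDs in this guide, and it skips any other prefix instead of guessing.

```python
def check_references(providers, rotations, autoselects):
    """Return a list of problems; an empty list means every reference resolves."""
    problems = []

    def has_model(provider_id, model):
        provider = providers.get(provider_id)
        if provider is None:
            return False
        return any(m.get("name") == model for m in provider.get("models", []))

    # Every rotation entry must point at a configured provider/model pair.
    for rot_name, rotation in rotations.items():
        for entry in rotation.get("providers", []):
            if not has_model(entry["provider_id"], entry["model"]):
                problems.append(
                    f"rotation {rot_name}: unknown {entry['provider_id']}/{entry['model']}"
                )

    # Every autoselect fallback and candidate must resolve as well.
    for auto_name, auto in autoselects.items():
        fallback = auto.get("fallback")
        if not fallback:
            problems.append(f"autoselect {auto_name}: no fallback configured")
        targets = [fallback] if fallback else []
        targets += [m["model_id"] for m in auto.get("available_models", [])]
        for target in targets:
            kind, _, rest = target.partition("/")
            if kind == "user-rotation":
                ok = rest in rotations
            elif kind == "user-provider":
                provider_id, _, model = rest.partition("/")
                ok = has_model(provider_id, model)
            else:
                ok = True  # Assumption: other prefixes are out of scope here.
            if not ok:
                problems.append(f"autoselect {auto_name}: unresolved {target}")
    return problems
```

Load the `providers`, `rotations`, and autoselect objects from your JSON files, pass them in, and treat a non-empty result as a failed deployment check.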
{
  "providers": {
    "runpod-gpu-a": {
      "id": "runpod-gpu-a",
      "name": "RunPod GPU A",
      "endpoint": "https://YOUR-POD-ID-8000.proxy.runpod.net/v1",
      "type": "openai",
      "api_key_required": true,
      "api_key": "$RUNPOD_API_KEY",
      "default_context_size": 32768,
      "default_error_cooldown": 120,
      "models": [
        {"name": "deepseek-coder-v2", "context_size": 32768, "max_request_tokens": 24000}
      ]
    }
  }
}

{
  "rotations": {
    "coding-production": {
      "model_name": "coding-production",
      "providers": [
        {"provider_id": "ollama-private", "model": "qwen2.5-coder:32b", "weight": 6, "error_cooldown": 60},
        {"provider_id": "runpod-gpu-a", "model": "deepseek-coder-v2", "weight": 3, "error_cooldown": 120},
        {"provider_id": "openai-prod", "model": "gpt-4.1-mini", "weight": 1, "error_cooldown": 300}
      ],
      "privacy": true,
      "capabilities": ["chat", "coding", "tool_use"],
      "default_max_request_tokens": 24000
    }
  }
}

{
  "engineering": {
    "model_name": "engineering",
    "description": "Selects the best engineering route by privacy, task length, and provider availability.",
    "selection_model": "general",
    "fallback": "user-rotation/coding-production",
    "classify_privacy": true,
    "classify_semantic": true,
    "capabilities": ["chat", "coding", "tool_use"],
    "available_models": [
      {"model_id": "user-rotation/coding-production", "description": "Private code, logs, secrets, customer data, or normal coding work.", "privacy": true},
      {"model_id": "user-provider/runpod-gpu-a/deepseek-coder-v2", "description": "GPU-heavy long refactors when data is not secret."},
      {"model_id": "user-provider/openai-prod/gpt-4.1-mini", "description": "Fast hosted fallback for non-private tasks."}
    ]
  }
}

AISBF's practical cluster rule is simple: every node can be stateless at the HTTP edge, but route inventory, users, tokens, analytics, and broker/cache state must be shared. Use MySQL for durable configuration and Redis for cache/coordination. Put a load balancer in front of identical AISBF nodes, and keep provider endpoints reachable from all nodes or broker them through Redis/websocket metadata.
| Part | Cluster decision |
|---|---|
| FastAPI nodes | Run two or more identical AISBF processes behind nginx/HAProxy/cloud LB. |
| Database | Use MySQL for users, user providers, user rotations, user autoselects, tokens, tiers, usage analytics. |
| Redis | Use shared Redis for response cache, model/cache metadata, and broker-style worker state if the CoderAI extension is deployed. |
| Provider secrets | Prefer user-scoped DB-backed providers. For file-backed OAuth credentials, ensure every node can read the right credential file or route those providers to nodes that can. |
| Failure handling | Use model/provider error_cooldown and token rate limits so retries don't collapse onto a sick upstream. |
Use RunPod as an OpenAI-compatible GPU endpoint when pods are directly reachable. Use a CoderAI broker pattern when workers may connect from dynamic networks and should be discovered by AISBF rather than exposed publicly. The CoderAI provider shape below reflects the existing static-site operational pattern; the core source tree inspected for this guide does not contain a built-in coderai provider handler, so treat the shape as extension- or deployment-specific unless your AISBF build includes that handler.
{
  "type": "coderai",
  "endpoint": "http://127.0.0.1:11437",
  "api_key_required": false,
  "coderai_config": {
    "transport": "broker",
    "broker_enabled": true,
    "broker_mode": true,
    "broker_preferred": true,
    "client_id": "gpu-worker-a",
    "registration_path": "/coderai/register",
    "bridge_path": "/coderai/ws"
  },
  "models": [
    {"name": "qwen2.5-coder:32b", "context_size": 32768}
  ]
}

Then place CoderAI and RunPod in the same rotation only when they are equivalent from the caller's perspective. If one route is private and the other is public, encode that difference in autoselect descriptions and privacy flags instead of hiding it behind one ambiguous model name.
curl -fsS https://aisbf.example.com/health
curl -fsS https://aisbf.example.com/api/v1/models -H "Authorization: Bearer $AISBF_API_TOKEN"
curl -fsS https://aisbf.example.com/api/runpod-gpu-a/chat/completions -H "Authorization: Bearer $AISBF_API_TOKEN" -H "Content-Type: application/json" -d '{"model":"deepseek-coder-v2","messages":[{"role":"user","content":"say pong"}],"max_tokens":20}'
curl -fsS https://aisbf.example.com/api/rotations/chat/completions -H "Authorization: Bearer $AISBF_API_TOKEN" -H "Content-Type: application/json" -d '{"model":"coding-production","messages":[{"role":"user","content":"summarize this stack trace"}]}'
curl -fsS https://aisbf.example.com/api/autoselect/chat/completions -H "Authorization: Bearer $AISBF_API_TOKEN" -H "Content-Type: application/json" -d '{"model":"engineering","messages":[{"role":"user","content":"private repo refactor: ..."}]}'
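For application code, the same smoke test can be issued from Python with only the standard library. This is a sketch under assumptions: `build_chat_request` and `send` are hypothetical helper names, and the default route is the rotation route used in the curl checks above. Passing `route="/api/user/chat/completions"` with a user-scoped token should work the same way if your build accepts the same payload on that route, which is assumed here.

```python
import json
import urllib.request


def build_chat_request(base_url, token, model, content,
                       route="/api/rotations/chat/completions"):
    """Build (but do not send) a chat completion request for an AISBF route."""
    payload = {"model": model, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        base_url.rstrip("/") + route,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return the decoded JSON body.

    urllib raises HTTPError on 4xx/5xx responses, which mirrors the
    fail-on-error behaviour of `curl -f`.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A typical call reads the token from the `AISBF_API_TOKEN` environment variable, builds a request for the `coding-production` rotation, and passes it to `send`. Keeping construction separate from sending lets the request be inspected or unit-tested without network access.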