Complete AISBF Operator Guide

Detailed step-by-step docs for installation, dashboard usage, network graph, clusters, rotations, autoselect, and CoderAI plus RunPod routing.

What this guide builds

This is the expanded operator path for AISBF: install the gateway, put it behind a TLS proxy, add providers, build rotations, put autoselect above them, and understand how a clustered deployment shares state. The route examples use the configuration shapes implemented in the source: providers have id, type, endpoint, api_key_required, optional provider defaults, and a model list; rotations contain weighted provider/model entries; autoselect contains a fallback plus described candidate models.

1. Install and expose AISBF
2. Use the dashboard safely
3. Build provider → rotation → autoselect
4. Cluster topology
5. CoderAI + RunPod coupling
6. Verification checklist

Step-by-step production install

Create the service account and install packages

sudo apt update
sudo apt install -y python3 python3-venv python3-pip git nginx redis-server mysql-server
sudo useradd --system --home /opt/aisbf --shell /usr/sbin/nologin aisbf || true
sudo mkdir -p /opt/aisbf /etc/aisbf /var/lib/aisbf
sudo chown -R aisbf:aisbf /opt/aisbf /etc/aisbf /var/lib/aisbf

Install AISBF from source or PyPI

# Source checkout keeps the dashboard templates and extension assets available.
sudo -u aisbf git clone https://git.nexlab.net/nexlab/aisbf /opt/aisbf/app
sudo -u aisbf python3 -m venv /opt/aisbf/venv
sudo -u aisbf /opt/aisbf/venv/bin/pip install -U pip
sudo -u aisbf /opt/aisbf/venv/bin/pip install -e /opt/aisbf/app
sudo cp /opt/aisbf/app/config/*.json /etc/aisbf/
sudo chown -R aisbf:aisbf /etc/aisbf

Create the database

sudo mysql <<'SQL'
CREATE DATABASE IF NOT EXISTS aisbf CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE DATABASE IF NOT EXISTS aisbf_response_cache CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER IF NOT EXISTS 'aisbf'@'localhost' IDENTIFIED BY 'change-this-password';
GRANT ALL PRIVILEGES ON aisbf.* TO 'aisbf'@'localhost';
GRANT ALL PRIVILEGES ON aisbf_response_cache.* TO 'aisbf'@'localhost';
FLUSH PRIVILEGES;
SQL

Set production config

{
  "database": {"type":"mysql","mysql_host":"localhost","mysql_port":3306,"mysql_user":"aisbf","mysql_password":"change-this-password","mysql_database":"aisbf"},
  "cache": {"type":"redis","redis_host":"localhost","redis_port":6379,"redis_db":0,"redis_key_prefix":"aisbf:prod:"},
  "response_cache": {"enabled":true,"backend":"redis","ttl":600,"redis_host":"localhost","redis_port":6379,"redis_key_prefix":"aisbf:response:"},
  "server": {"host":"127.0.0.1","port":17765,"protocol":"http"},
  "dashboard": {"enabled":true,"username":"admin","password":"REPLACE-WITH-SHA256-HASH"},
  "auth": {"enabled":true,"tokens":["global-admin-token-only-for-admin-scripts"]}
}

The default source config uses SQLite and an in-memory cache. For clustering, move the configuration database to MySQL and the shared cache and broker state to Redis, as in the config above.
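The dashboard password field in the config above holds a SHA-256 hash rather than the plain password. A minimal sketch for producing one follows. It assumes the expected format is the unsalted, lowercase hex digest of the UTF-8 encoded password; that format is an assumption, not something this guide confirms, so check it against your AISBF build. The helper name hash_dashboard_password is hypothetical.

```python
import hashlib


def hash_dashboard_password(password: str) -> str:
    """Return the SHA-256 hex digest of a password.

    Assumption: AISBF compares the dashboard password against an unsalted,
    lowercase hex digest of the UTF-8 encoded password. Verify this against
    your build before relying on it.
    """
    return hashlib.sha256(password.encode("utf-8")).hexdigest()
```

Call the function from a Python prompt rather than passing the password on a command line, so the plain text does not end up in shell history, then paste the 64-character result into the dashboard "password" field.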

Create systemd service

[Unit]
Description=AISBF API Gateway
After=network-online.target mysql.service redis-server.service
Wants=network-online.target

[Service]
User=aisbf
Group=aisbf
WorkingDirectory=/opt/aisbf/app
Environment=AISBF_CONFIG_DIR=/etc/aisbf
ExecStart=/opt/aisbf/venv/bin/uvicorn main:app --host 127.0.0.1 --port 17765
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

Put nginx in front

location / {
  proxy_pass http://127.0.0.1:17765;
  proxy_set_header Host $host;
  proxy_set_header X-Forwarded-Proto https;
  proxy_set_header X-Forwarded-Host $host;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_buffering off;
}

Dashboard usage: the safe operating loop

Providers

Add one upstream at a time. Use type=openai for OpenAI-compatible APIs; the other types are google, anthropic, ollama, and OAuth-backed providers where configured. Give each provider a stable ID, for example openai-prod, runpod-gpu-a, or ollama-private.

Rotations

Create reliability units. A rotation is what apps should call when several providers can satisfy the same class of work. Weights bias traffic; error cooldowns stop flapping pods or rate-limited APIs from getting hammered.
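To make the interaction between weights and error cooldowns concrete, here is a minimal sketch of weighted selection that skips entries in cooldown. It illustrates the idea only and is not AISBF's actual selection code; the function names pick_entry and record_error and the cooling_until dictionary are hypothetical.

```python
import random
import time


def pick_entry(entries, cooling_until, now=None, rng=random):
    """Pick one rotation entry by weight, skipping entries in cooldown.

    entries: list of dicts with "provider_id" and "weight".
    cooling_until: dict mapping provider_id to the unix time its cooldown ends.
    Returns None when every entry is cooling down.
    """
    now = time.time() if now is None else now
    healthy = [e for e in entries if cooling_until.get(e["provider_id"], 0) <= now]
    if not healthy:
        return None
    weights = [e["weight"] for e in healthy]
    return rng.choices(healthy, weights=weights, k=1)[0]


def record_error(entry, cooling_until, now=None):
    """Put an entry into cooldown for its error_cooldown seconds."""
    now = time.time() if now is None else now
    cooling_until[entry["provider_id"]] = now + entry.get("error_cooldown", 0)
```

In this sketch a failing upstream drops out of the candidate set until its cooldown expires, so retries land on the remaining entries instead of the sick one.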

Autoselect

Create policy units. Autoselect chooses among described model routes using fallback, privacy/NSFW flags, semantic classification, and capability hints. Keep descriptions concrete; vague text produces vague routing.

Tokens and users

For apps, prefer user-scoped tokens and routes such as /api/user/chat/completions or /api/u/{username}/chat/completions. Avoid embedding global admin tokens in customer-facing services.

Provider → rotation → autoselect, with concrete JSON

Provider: one directly callable upstream

{
  "providers": {
    "runpod-gpu-a": {
      "id": "runpod-gpu-a",
      "name": "RunPod GPU A",
      "endpoint": "https://YOUR-POD-ID-8000.proxy.runpod.net/v1",
      "type": "openai",
      "api_key_required": true,
      "api_key": "$RUNPOD_API_KEY",
      "default_context_size": 32768,
      "default_error_cooldown": 120,
      "models": [
        {"name": "deepseek-coder-v2", "context_size": 32768, "max_request_tokens": 24000}
      ]
    }
  }
}
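Before loading a provider entry like the one above, a quick structural check can catch typos early. The sketch below checks only the fields this guide names (id, type, endpoint, api_key_required, and a model list). Two of its rules are assumptions drawn from the example rather than confirmed AISBF behavior: that "id" matches the dictionary key, and that a provider with api_key_required set also carries an "api_key". The function name provider_problems is hypothetical.

```python
REQUIRED_PROVIDER_FIELDS = ("id", "type", "endpoint", "api_key_required", "models")


def provider_problems(key, provider):
    """Return a list of human-readable problems for one provider entry."""
    problems = []
    for field in REQUIRED_PROVIDER_FIELDS:
        if field not in provider:
            problems.append(f"{key}: missing '{field}'")
    # Assumption: the "id" field repeats the dictionary key, as in the example.
    if provider.get("id", key) != key:
        problems.append(f"{key}: 'id' does not match its key")
    # Assumption: a required key is stored inline under "api_key".
    if provider.get("api_key_required") and not provider.get("api_key"):
        problems.append(f"{key}: api_key_required is true but no api_key is set")
    for model in provider.get("models", []):
        if "name" not in model:
            problems.append(f"{key}: model entry without 'name'")
    return problems
```

An empty list means the entry passed these checks; it does not prove the upstream is reachable.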

Rotation: reliability and load distribution

{
  "rotations": {
    "coding-production": {
      "model_name": "coding-production",
      "providers": [
        {"provider_id": "ollama-private", "model": "qwen2.5-coder:32b", "weight": 6, "error_cooldown": 60},
        {"provider_id": "runpod-gpu-a", "model": "deepseek-coder-v2", "weight": 3, "error_cooldown": 120},
        {"provider_id": "openai-prod", "model": "gpt-4.1-mini", "weight": 1, "error_cooldown": 300}
      ],
      "privacy": true,
      "capabilities": ["chat", "coding", "tool_use"],
      "default_max_request_tokens": 24000
    }
  }
}
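To sanity-check a rotation's weights, it helps to convert them into expected traffic shares. The sketch below assumes weights act as simple proportions while every entry is healthy; AISBF's real distribution may differ, especially during cooldowns. The function name traffic_shares is hypothetical.

```python
def traffic_shares(rotation):
    """Map each "provider_id/model" entry to its share of total weight.

    Assumption: weights are proportional and all entries are healthy.
    """
    entries = rotation["providers"]
    total = sum(entry["weight"] for entry in entries)
    if total <= 0:
        raise ValueError("rotation weights must sum to a positive number")
    return {
        f'{entry["provider_id"]}/{entry["model"]}': entry["weight"] / total
        for entry in entries
    }
```

Under that assumption, the weights 6, 3, and 1 in the coding-production rotation work out to roughly 60%, 30%, and 10% of requests.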

Autoselect: policy and classification

{
  "engineering": {
    "model_name": "engineering",
    "description": "Selects the best engineering route by privacy, task length, and provider availability.",
    "selection_model": "general",
    "fallback": "user-rotation/coding-production",
    "classify_privacy": true,
    "classify_semantic": true,
    "capabilities": ["chat", "coding", "tool_use"],
    "available_models": [
      {
        "model_id": "user-rotation/coding-production",
        "description": "Private code, logs, secrets, customer data, or normal coding work.",
        "privacy": true
      },
      {
        "model_id": "user-provider/runpod-gpu-a/deepseek-coder-v2",
        "description": "GPU-heavy long refactors when data is not secret."
      },
      {
        "model_id": "user-provider/openai-prod/gpt-4.1-mini",
        "description": "Fast hosted fallback for non-private tasks."
      }
    ]
  }
}
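Because autoselect routing depends on the quality of the candidate descriptions, a small lint pass over an entry like the one above can flag weak spots before deployment. This is a sketch, not part of AISBF: the function name autoselect_problems and the minimum word count are hypothetical, and the word count is only a rough stand-in for "concrete enough".

```python
MIN_DESCRIPTION_WORDS = 4  # hypothetical threshold; tune to your own routes


def autoselect_problems(name, cfg):
    """Return human-readable problems found in one autoselect entry."""
    problems = []
    if not cfg.get("fallback"):
        problems.append(f"{name}: no fallback route")
    candidates = cfg.get("available_models", [])
    if not candidates:
        problems.append(f"{name}: no candidate models")
    seen = set()
    for candidate in candidates:
        model_id = candidate.get("model_id", "<missing model_id>")
        if model_id in seen:
            problems.append(f"{name}: duplicate candidate {model_id}")
        seen.add(model_id)
        # Very short descriptions give the classifier little to work with.
        words = candidate.get("description", "").split()
        if len(words) < MIN_DESCRIPTION_WORDS:
            problems.append(f"{name}: description for {model_id} is too short")
    return problems
```

The check is deliberately shallow: it cannot tell whether two descriptions overlap in meaning, only whether they are present and unique.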

Cluster operating model

AISBF's practical cluster rule is simple: every node can be stateless at the HTTP edge, but route inventory, users, tokens, analytics, and broker/cache state must be shared. Use MySQL for durable configuration and Redis for cache/coordination. Put a load balancer in front of identical AISBF nodes, and keep provider endpoints reachable from all nodes or broker them through Redis/websocket metadata.

Part	Cluster decision
FastAPI nodes	Run two or more identical AISBF processes behind nginx/HAProxy/cloud LB.
Database	Use MySQL for users, user providers, user rotations, user autoselects, tokens, tiers, usage analytics.
Redis	Use shared Redis for response cache, model/cache metadata, and broker-style worker state if the CoderAI extension is deployed.
Provider secrets	Prefer user-scoped DB-backed providers. For file-backed OAuth credentials, ensure every node can read the right credential file or route those providers to nodes that can.
Failure handling	Use model/provider error_cooldown and token rate limits so retries don't collapse onto a sick upstream.
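The table's database and Redis rows can be turned into a pre-flight check on each node's config. The sketch below uses the key names from the production config shown earlier in this guide (database.type, mysql_host, cache.type, redis_host, response_cache.backend). The function name cluster_problems and the rule that a loopback host is wrong for a cluster are assumptions for illustration; a single-node install legitimately uses localhost.

```python
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def cluster_problems(config):
    """Return reasons why a node config would not share state in a cluster."""
    problems = []
    database = config.get("database", {})
    cache = config.get("cache", {})
    response_cache = config.get("response_cache", {})

    if database.get("type") != "mysql":
        problems.append("database.type is not 'mysql'")
    elif database.get("mysql_host") in LOCAL_HOSTS:
        problems.append("database.mysql_host is local to this node")

    if cache.get("type") != "redis":
        problems.append("cache.type is not 'redis'")
    elif cache.get("redis_host") in LOCAL_HOSTS:
        problems.append("cache.redis_host is local to this node")

    # Only check the response cache when it is switched on.
    if response_cache.get("enabled") and response_cache.get("backend") != "redis":
        problems.append("response_cache.backend is not 'redis'")
    return problems
```

Running this against every node's config before adding it to the load balancer catches the common mistake of one node still pointing at its own local SQLite or Redis.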

CoderAI + RunPod coupling

Use RunPod as an OpenAI-compatible GPU endpoint when pods are directly reachable. Use a CoderAI broker pattern when workers may connect from dynamic networks and should be discovered by AISBF rather than exposed publicly. The CoderAI provider shape below reflects the existing static-site operational pattern. The core source tree inspected for this guide does not contain a built-in coderai provider handler, so treat the shape as extension- or deployment-specific unless your AISBF build includes that handler.

{
  "type": "coderai",
  "endpoint": "http://127.0.0.1:11437",
  "api_key_required": false,
  "coderai_config": {
    "transport": "broker",
    "broker_enabled": true,
    "broker_mode": true,
    "broker_preferred": true,
    "client_id": "gpu-worker-a",
    "registration_path": "/coderai/register",
    "bridge_path": "/coderai/ws"
  },
  "models": [
    {"name": "qwen2.5-coder:32b", "context_size": 32768}
  ]
}

Then place CoderAI and RunPod in the same rotation only when they are equivalent from the caller's perspective. If one route is private and the other is public, encode that difference in autoselect descriptions and privacy flags instead of hiding it behind one ambiguous model name.

Verification checklist

Health and model list

curl -fsS https://aisbf.example.com/health
curl -fsS https://aisbf.example.com/api/v1/models -H "Authorization: Bearer $AISBF_API_TOKEN"

Direct provider call

curl -fsS https://aisbf.example.com/api/runpod-gpu-a/chat/completions \
  -H "Authorization: Bearer $AISBF_API_TOKEN" -H "Content-Type: application/json" \
  -d '{"model":"deepseek-coder-v2","messages":[{"role":"user","content":"say pong"}],"max_tokens":20}'

Rotation/autoselect calls

curl -fsS https://aisbf.example.com/api/rotations/chat/completions \
  -H "Authorization: Bearer $AISBF_API_TOKEN" -H "Content-Type: application/json" \
  -d '{"model":"coding-production","messages":[{"role":"user","content":"summarize this stack trace"}]}'

curl -fsS https://aisbf.example.com/api/autoselect/chat/completions \
  -H "Authorization: Bearer $AISBF_API_TOKEN" -H "Content-Type: application/json" \
  -d '{"model":"engineering","messages":[{"role":"user","content":"private repo refactor: ..."}]}'
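The same checks can be scripted so they run after every deploy. The sketch below mirrors the curl calls above using only the Python standard library. The route layout (/api/<route>/chat/completions) and the bearer header come from those commands; the function names build_chat_request and run_check are hypothetical, and the response is assumed to be a JSON body.

```python
import json
import urllib.request


def build_chat_request(base_url, route, token, model, prompt, max_tokens=20):
    """Build a POST request for /api/<route>/chat/completions."""
    url = f"{base_url.rstrip('/')}/api/{route}/chat/completions"
    body = json.dumps(
        {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def run_check(request, timeout=30):
    """Send the request and return (HTTP status, decoded JSON body).

    urlopen raises urllib.error.HTTPError on 4xx/5xx responses, which is
    the scripted equivalent of curl's -f flag.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, json.loads(response.read().decode("utf-8"))
```

For example, build_chat_request("https://aisbf.example.com", "rotations", token, "coding-production", "say pong") targets the rotation route; pass "runpod-gpu-a" or "autoselect" as the route to cover the other two calls.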

Sharp edge: route names are API contracts. Rename providers, rotations, or autoselect IDs only after checking clients, dashboards, examples, and saved tokens.