Operator connects to LLM providers through the model_list in ~/.operator/config.json. Each entry specifies a model field using a protocol/model-id format. The protocol prefix determines which provider backend is used.
Provider table
| Protocol prefix | Provider | Default api_base | Auth | Notes |
|---|---|---|---|---|
| openai/ | OpenAI | https://api.openai.com/v1 | API key | Default protocol when no prefix is given. |
| anthropic/ | Anthropic | https://api.anthropic.com/v1 | API key or OAuth | Native Anthropic SDK. Supports auth_method: "oauth" via Claude Max. |
| gemini/ | Google Gemini | https://generativelanguage.googleapis.com/v1beta | API key | OpenAI-compatible Gemini endpoint. |
| antigravity/ | Google Cloud Code Assist | (OAuth endpoint) | OAuth only | Access Gemini and Claude models via a free Google account. See Antigravity. |
| groq/ | Groq | https://api.groq.com/openai/v1 | API key | OpenAI-compatible. Fast inference on Groq LPU hardware. |
| deepseek/ | DeepSeek | https://api.deepseek.com/v1 | API key | OpenAI-compatible. Supports deepseek-chat and deepseek-reasoner. |
| openrouter/ | OpenRouter | https://openrouter.ai/api/v1 | API key | Routes to 100+ models. Use openrouter/provider/model-id format. |
| zhipu/ | Zhipu AI (智谱) | https://open.bigmodel.cn/api/paas/v4 | API key | GLM series models. Also accepts glm/ alias. |
| qwen/ | Alibaba Qwen (通义千问) | https://dashscope.aliyuncs.com/compatible-mode/v1 | API key | OpenAI-compatible DashScope endpoint. |
| moonshot/ | Moonshot AI (月之暗面) | https://api.moonshot.cn/v1 | API key | Kimi models. Also matched by kimi in model name. |
| nvidia/ | NVIDIA NIM | https://integrate.api.nvidia.com/v1 | API key | NVIDIA-hosted inference for Llama, Nemotron, and other models. |
| cerebras/ | Cerebras | https://api.cerebras.ai/v1 | API key | OpenAI-compatible. Fast inference on Cerebras hardware. |
| volcengine/ | Volcengine (火山引擎) | https://ark.cn-beijing.volces.com/api/v3 | API key | Doubao and other ByteDance models. |
| mistral/ | Mistral AI | https://api.mistral.ai/v1 | API key | OpenAI-compatible Mistral endpoint. |
| ollama/ | Ollama | http://localhost:11434/v1 | None (use "ollama") | Local model serving. Set api_key to "ollama" as a placeholder. |
| vllm/ | vLLM | http://localhost:8000/v1 | Optional | Self-hosted OpenAI-compatible inference. |
| shengsuanyun/ | ShengSuanYun (神算云) | https://router.shengsuanyun.com/api/v1 | API key | OpenAI-compatible routing provider. |
| litellm/ | LiteLLM proxy | http://localhost:4000/v1 | Optional | Self-hosted LiteLLM proxy server. |
| claude-cli/ | Claude CLI | (local process) | Token/session | Delegates to local claude CLI binary. Requires Claude Max subscription. |
| codex-cli/ | Codex CLI | (local process) | Token/session | Delegates to local codex CLI binary. |
| github-copilot/ | GitHub Copilot | http://localhost:4321 | OAuth | Connects via gRPC or stdio to the Copilot extension. |
Configuration snippets
OpenAI
Common model IDs: gpt-5.2, gpt-4o, gpt-4o-mini, o3, o3-mini.
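A minimal model_list entry for OpenAI might look like the following sketch (the model_name alias and key placeholder are illustrative; the openai/ prefix may also be omitted, since it is the default protocol):

```json
{
  "model_list": [
    {
      "model_name": "gpt4o",
      "model": "openai/gpt-4o",
      "api_key": "YOUR_OPENAI_API_KEY"
    }
  ]
}
```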
Anthropic
Common model IDs: claude-opus-4-6, claude-sonnet-4.6, claude-haiku-3-5.
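An Anthropic entry can authenticate with an API key or, per the provider table above, via Claude Max using auth_method: "oauth". A sketch of the OAuth variant (inside model_list; the alias is illustrative):

```json
{
  "model_name": "sonnet",
  "model": "anthropic/claude-sonnet-4.6",
  "auth_method": "oauth"
}
```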
Gemini (Google AI Studio)
Common model IDs: gemini-2.5-pro, gemini-2.0-flash-exp, gemini-1.5-flash.
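A Gemini entry sketch (inside model_list; alias and key placeholder illustrative):

```json
{
  "model_name": "gemini-pro",
  "model": "gemini/gemini-2.5-pro",
  "api_key": "YOUR_GOOGLE_AI_STUDIO_KEY"
}
```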
Antigravity (Google Cloud Code Assist)
Antigravity provides access to Gemini and Claude models through Google's Cloud Code Assist infrastructure. Authentication uses OAuth 2.0 with PKCE; no API key is required. During login, approve access in your browser, then copy the redirect URL (http://localhost:51121/...) and paste it back into the terminal.
Reliable model IDs (from operator auth models):
| Model ID | Description |
|---|---|
| gemini-3-flash | Fast, highly available |
| gemini-2.5-flash-lite | Lightweight |
| claude-opus-4-6-thinking | Powerful, includes reasoning |
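Because Antigravity authenticates via OAuth only, an entry needs no api_key. A sketch (inside model_list; the alias is illustrative):

```json
{
  "model_name": "gemini-flash",
  "model": "antigravity/gemini-3-flash"
}
```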
Available models depend on your Google Cloud project. Run
operator auth models to list what your project has access to. Credentials are stored in ~/.operator/auth.json.

Groq
Common model IDs: llama-3.3-70b-versatile, llama-3.1-8b-instant, mixtral-8x7b-32768.
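A Groq entry sketch (inside model_list; alias and key placeholder illustrative):

```json
{
  "model_name": "llama70b",
  "model": "groq/llama-3.3-70b-versatile",
  "api_key": "YOUR_GROQ_API_KEY"
}
```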
DeepSeek
Model IDs: deepseek-chat, deepseek-reasoner.
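A DeepSeek entry sketch (inside model_list; alias and key placeholder illustrative):

```json
{
  "model_name": "deepseek",
  "model": "deepseek/deepseek-reasoner",
  "api_key": "YOUR_DEEPSEEK_API_KEY"
}
```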
OpenRouter
OpenRouter provides a single endpoint for 100+ models. Use the openrouter/provider/model-id format:
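A sketch showing the three-part model field (inside model_list; the upstream model ID, alias, and key placeholder are illustrative):

```json
{
  "model_name": "sonnet-via-or",
  "model": "openrouter/anthropic/claude-sonnet-4.6",
  "api_key": "YOUR_OPENROUTER_API_KEY"
}
```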
Ollama (local)
Ollama serves models locally with an OpenAI-compatible API. No real API key is needed; use the placeholder string "ollama":
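A local Ollama entry sketch (inside model_list; the alias is illustrative):

```json
{
  "model_name": "llama3-local",
  "model": "ollama/llama3",
  "api_key": "ollama"
}
```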
Run ollama pull llama3 before use.
vLLM (local/self-hosted)
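vLLM exposes an OpenAI-compatible server (default api_base http://localhost:8000/v1, per the provider table above), and authentication is optional. A sketch (inside model_list), where the model ID is an illustrative placeholder that should match whatever the server is actually serving:

```json
{
  "model_name": "local-vllm",
  "model": "vllm/meta-llama/Llama-3.1-8B-Instruct",
  "api_base": "http://localhost:8000/v1"
}
```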
Qwen (Alibaba)
Common model IDs: qwen-plus, qwen-max, qwen-turbo.
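A Qwen entry sketch (inside model_list; alias and key placeholder illustrative):

```json
{
  "model_name": "qwen",
  "model": "qwen/qwen-plus",
  "api_key": "YOUR_DASHSCOPE_API_KEY"
}
```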
Zhipu AI (GLM)
Moonshot AI (Kimi)
NVIDIA NIM
Cerebras
Volcengine (火山引擎 / Doubao)
Mistral AI
Common model IDs: mistral-large-latest, mistral-small-latest, codestral-latest.
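A Mistral entry sketch (inside model_list; alias and key placeholder illustrative):

```json
{
  "model_name": "mistral-large",
  "model": "mistral/mistral-large-latest",
  "api_key": "YOUR_MISTRAL_API_KEY"
}
```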
GitHub Copilot
Connects to the Copilot extension over gRPC or stdio; choose grpc (default) or stdio using the connect_mode field.
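A sketch using the connect_mode field (inside model_list; the model ID and alias are placeholders, and available models depend on your Copilot subscription):

```json
{
  "model_name": "copilot",
  "model": "github-copilot/gpt-4o",
  "connect_mode": "stdio"
}
```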
Claude CLI
Delegates requests to the local claude CLI binary. Requires a Claude Max subscription.
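Since authentication is handled by the claude binary's own session, an entry needs no API key. A sketch (inside model_list; the model ID and alias are illustrative):

```json
{
  "model_name": "claude-max",
  "model": "claude-cli/claude-sonnet-4.6"
}
```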
Codex CLI
Delegates requests to the local codex CLI binary.
Load balancing
Operator uses round-robin load balancing when multiple model_list entries share the same model_name. This lets you distribute load across multiple API keys, regions, or provider accounts:
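For example, two entries sharing the model_name "gpt4" but holding different API keys (placeholders shown) are rotated through in turn:

```json
{
  "model_list": [
    {
      "model_name": "gpt4",
      "model": "openai/gpt-4o",
      "api_key": "KEY_FOR_ACCOUNT_A"
    },
    {
      "model_name": "gpt4",
      "model": "openai/gpt-4o",
      "api_key": "KEY_FOR_ACCOUNT_B"
    }
  ]
}
```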
Each request to gpt4 cycles to the next endpoint in sequence.
Adding a custom OpenAI-compatible provider
Any provider that implements the OpenAI Chat Completions API can be added with no code changes: use openai/ as the protocol prefix (or omit it) and point api_base at your provider's endpoint.
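A sketch (inside model_list) with a hypothetical endpoint and model ID:

```json
{
  "model_name": "my-model",
  "model": "openai/my-model-id",
  "api_base": "https://inference.example.com/v1",
  "api_key": "YOUR_PROVIDER_API_KEY"
}
```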
api_base override
Every provider’s default api_base can be overridden in the model_list entry. This is useful for:
- Self-hosted deployments — point at your own inference server.
- Regional endpoints — use a geographically closer API endpoint.
- LiteLLM proxy — route through a local proxy that handles multiple backends.
- Vendor proxy services — corporate proxy gateways.