Day 0 Support: Claude Opus 4.7
LiteLLM now supports Claude Opus 4.7 on Day 0. Use it across Anthropic, Azure, Vertex AI, and Bedrock through the LiteLLM AI Gateway.
Docker Image
docker pull ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7
Usage - Anthropic
- LiteLLM Proxy
1. Set up config.yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: anthropic/claude-opus-4-7
      api_key: os.environ/ANTHROPIC_API_KEY
2. Start the proxy
docker run -d \
  -p 4000:4000 \
  -e ANTHROPIC_API_KEY="$ANTHROPIC_API_KEY" \
  -v "$(pwd)/config.yaml:/app/config.yaml" \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7 \
  --config /app/config.yaml
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
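The same smoke test can be sketched in Python with only the standard library. This is a minimal sketch, assuming the proxy listens on http://0.0.0.0:4000 as in the curl example and that the key is exported as LITELLM_KEY. The helper names build_chat_request and send_chat_request are hypothetical and not part of LiteLLM.

```python
import json
import os
import urllib.request

PROXY_URL = "http://0.0.0.0:4000/chat/completions"  # same address as the curl example


def build_chat_request(prompt, api_key, model="claude-opus-4-7"):
    """Build (but do not send) the POST request that the curl command issues."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        PROXY_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the assistant text.

    Assumes the proxy is running and answers with an OpenAI-style
    chat completion object.
    """
    with urllib.request.urlopen(request) as response:
        reply = json.load(response)
    return reply["choices"][0]["message"]["content"]
```

With the proxy running, send_chat_request(build_chat_request("what llm are you", os.environ["LITELLM_KEY"])) should return the model's reply. Reading the key from the environment in Python avoids shell quoting problems in the Authorization header.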
Usage - Azure
- LiteLLM Proxy
1. Set up config.yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: azure_ai/claude-opus-4-7
      api_key: os.environ/AZURE_AI_API_KEY
      api_base: os.environ/AZURE_AI_API_BASE # https://<resource>.services.ai.azure.com
2. Start the proxy
docker run -d \
  -p 4000:4000 \
  -e AZURE_AI_API_KEY="$AZURE_AI_API_KEY" \
  -e AZURE_AI_API_BASE="$AZURE_AI_API_BASE" \
  -v "$(pwd)/config.yaml:/app/config.yaml" \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7 \
  --config /app/config.yaml
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
Usage - Vertex AI
- LiteLLM Proxy
1. Set up config.yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: vertex_ai/claude-opus-4-7
      vertex_project: os.environ/VERTEX_PROJECT
      vertex_location: us-east5
2. Start the proxy
docker run -d \
  -p 4000:4000 \
  -e VERTEX_PROJECT="$VERTEX_PROJECT" \
  -e GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json \
  -v "$(pwd)/config.yaml:/app/config.yaml" \
  -v "$(pwd)/credentials.json:/app/credentials.json" \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7 \
  --config /app/config.yaml
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
Usage - Bedrock
- LiteLLM Proxy
1. Set up config.yaml
model_list:
  - model_name: claude-opus-4-7
    litellm_params:
      model: bedrock/anthropic.claude-opus-4-7
      aws_access_key_id: os.environ/AWS_ACCESS_KEY_ID
      aws_secret_access_key: os.environ/AWS_SECRET_ACCESS_KEY
      aws_region_name: us-east-1
2. Start the proxy
docker run -d \
  -p 4000:4000 \
  -e AWS_ACCESS_KEY_ID="$AWS_ACCESS_KEY_ID" \
  -e AWS_SECRET_ACCESS_KEY="$AWS_SECRET_ACCESS_KEY" \
  -v "$(pwd)/config.yaml:/app/config.yaml" \
  ghcr.io/berriai/litellm:litellm_stable_release_branch-v1.83.3-stable.opus-4.7 \
  --config /app/config.yaml
3. Test it!
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "what llm are you"
      }
    ]
  }'
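Every config above exposes the same model_name, so client requests stay identical across providers and only the provider-prefixed model string in litellm_params changes. The sketch below collects the four strings from the configs above in one place. PROVIDER_MODELS, model_for, and model_list_entry are hypothetical names used for illustration, and the dictionary keys are labels chosen for this sketch.

```python
# Provider-prefixed model strings, copied from the config.yaml examples above.
PROVIDER_MODELS = {
    "anthropic": "anthropic/claude-opus-4-7",
    "azure": "azure_ai/claude-opus-4-7",
    "vertex_ai": "vertex_ai/claude-opus-4-7",
    "bedrock": "bedrock/anthropic.claude-opus-4-7",
}


def model_for(provider):
    """Return the litellm_params model string for a provider label."""
    try:
        return PROVIDER_MODELS[provider]
    except KeyError:
        supported = ", ".join(sorted(PROVIDER_MODELS))
        raise ValueError(
            f"unknown provider {provider!r}; expected one of: {supported}"
        ) from None


def model_list_entry(provider, alias="claude-opus-4-7"):
    """Build one model_list entry as a plain dict (credentials omitted)."""
    return {
        "model_name": alias,
        "litellm_params": {"model": model_for(provider)},
    }
```

Credentials are left out on purpose: each provider needs its own keys in litellm_params, as shown in the configs above.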
Advanced Features
Adaptive Thinking
When using reasoning_effort with Claude Opus 4.7, all values (low, medium, high, xhigh) are mapped to thinking: {type: "adaptive"}. To use explicit thinking budgets with type: "enabled", pass the native thinking parameter directly.
- /chat/completions
- /v1/messages
On /chat/completions, LiteLLM supports adaptive thinking through the reasoning_effort parameter:
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "Solve this complex problem: What is the optimal strategy for..."
      }
    ],
    "reasoning_effort": "high"
  }'
On /v1/messages, use the thinking parameter with type: "adaptive" to enable adaptive thinking mode:
curl --location 'http://0.0.0.0:4000/v1/messages' \
  --header 'x-api-key: sk-12345' \
  --header 'content-type: application/json' \
  --data '{
    "model": "claude-opus-4-7",
    "max_tokens": 16000,
    "thinking": {
      "type": "adaptive"
    },
    "messages": [
      {
        "role": "user",
        "content": "Explain why the sum of two even numbers is always even."
      }
    ]
  }'
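The two ways of requesting thinking described in this section can be compared side by side as request bodies. This is a minimal sketch with hypothetical helper names. The budget_tokens field and the check that the budget stays below max_tokens follow the usual Anthropic Messages API conventions for type: "enabled"; that part is an assumption here, not something this page specifies.

```python
def adaptive_chat_body(prompt, reasoning_effort="high", model="claude-opus-4-7"):
    """Body for /chat/completions.

    Per the note above, every reasoning_effort value is mapped to
    thinking: {type: "adaptive"} for this model.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "reasoning_effort": reasoning_effort,
    }


def budgeted_messages_body(prompt, budget_tokens, max_tokens=16000,
                           model="claude-opus-4-7"):
    """Body for /v1/messages with an explicit budget via the native parameter."""
    # Assumption: the thinking budget must leave room for the visible answer.
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }
```

The first body leaves the amount of thinking to the model, while the second fixes an upper bound chosen by the caller.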
Effort Levels
Claude Opus 4.7 supports four effort levels: low, medium, high (default), and xhigh. These give you finer-grained control over how much reasoning the model applies to a task. Pass the effort level via the output_config parameter.
xhigh is a new effort level introduced with Opus 4.7 that sits above high. The max effort level is available only on Claude Opus 4.6, not on Opus 4.7.
- /chat/completions
- /v1/messages
curl --location 'http://0.0.0.0:4000/chat/completions' \
  --header 'Content-Type: application/json' \
  --header "Authorization: Bearer $LITELLM_KEY" \
  --data '{
    "model": "claude-opus-4-7",
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing"
      }
    ],
    "output_config": {
      "effort": "xhigh"
    }
  }'
Using OpenAI SDK:
import openai

client = openai.OpenAI(
    api_key="your-litellm-key",
    base_url="http://0.0.0.0:4000",
)
response = client.chat.completions.create(
    model="claude-opus-4-7",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    # output_config is not a standard OpenAI parameter, so it goes in extra_body
    extra_body={"output_config": {"effort": "xhigh"}},
)
print(response.choices[0].message.content)
Using LiteLLM SDK:
from litellm import completion

response = completion(
    model="anthropic/claude-opus-4-7",
    messages=[{"role": "user", "content": "Explain quantum computing"}],
    output_config={"effort": "xhigh"},
)
print(response.choices[0].message.content)
You can combine reasoning_effort with output_config for finer-grained control over the model's behavior.
On /v1/messages, pass output_config in the request body:
curl --location 'http://0.0.0.0:4000/v1/messages' \
  --header 'x-api-key: sk-12345' \
  --header 'content-type: application/json' \
  --data '{
    "model": "claude-opus-4-7",
    "max_tokens": 4096,
    "messages": [
      {
        "role": "user",
        "content": "Explain quantum computing"
      }
    ],
    "output_config": {
      "effort": "xhigh"
    }
  }'
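Because max is not accepted on Opus 4.7, it can help to validate the effort level before sending a request. The sketch below builds a /chat/completions body that sets output_config and, optionally, reasoning_effort together. The helper and constant names are hypothetical; only the parameter names and the four levels come from this page.

```python
# Effort levels listed for Claude Opus 4.7 on this page ("max" is excluded).
OPUS_4_7_EFFORT_LEVELS = ("low", "medium", "high", "xhigh")


def effort_body(prompt, effort="high", reasoning_effort=None,
                model="claude-opus-4-7"):
    """Build a /chat/completions body with a validated effort level."""
    if effort not in OPUS_4_7_EFFORT_LEVELS:
        allowed = ", ".join(OPUS_4_7_EFFORT_LEVELS)
        raise ValueError(f"unsupported effort {effort!r}; use one of: {allowed}")
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "output_config": {"effort": effort},
    }
    # Optionally combine with reasoning_effort, as described above.
    if reasoning_effort is not None:
        body["reasoning_effort"] = reasoning_effort
    return body
```

Failing early on an unsupported level such as max gives a clearer error than waiting for the provider to reject the request.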
Effort level guide:
| Effort | When to use |
|---|---|
| low | Short, fast responses: simple lookups, formatting, classification |
| medium | Balanced tradeoff for everyday Q&A and light reasoning |
| high (default) | Complex reasoning, code generation, analysis |
| xhigh | Hardest problems: multi-step math, deep research, agentic planning |
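The guide above can be encoded as a small lookup so that callers pick an effort level by task category instead of hard-coding strings. The category names below are hypothetical labels chosen for this sketch; the mapping to levels follows the table, and unknown categories fall back to the documented default of high.

```python
# Task categories (hypothetical labels) mapped to the effort levels in the guide.
EFFORT_BY_TASK = {
    "lookup": "low",
    "formatting": "low",
    "classification": "low",
    "everyday_qa": "medium",
    "light_reasoning": "medium",
    "complex_reasoning": "high",
    "code_generation": "high",
    "analysis": "high",
    "multi_step_math": "xhigh",
    "deep_research": "xhigh",
    "agentic_planning": "xhigh",
}
DEFAULT_EFFORT = "high"  # the default level named in the guide


def output_config_for(task):
    """Return an output_config dict for a task category."""
    return {"effort": EFFORT_BY_TASK.get(task, DEFAULT_EFFORT)}
```

For example, output_config_for("classification") yields {"effort": "low"}, which can be passed as output_config in any of the requests shown earlier.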


