3 posts tagged with "thinking"
v1.63.2-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

These are the changes since v1.61.20-stable.

This release is primarily focused on:

  • LLM Translation improvements (more thinking content improvements)
  • UI improvements (Error logs now shown on UI)

Info: This release will be live on 03/09/2025.

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  1. Add supports_pdf_input for specific Bedrock Claude models PR
  2. Add pricing for Amazon EU models PR
  3. Fix Azure O1 mini pricing PR

LLM Translation

  1. Support /openai/ passthrough for Assistant endpoints. Get Started
  2. Bedrock Claude - fix tool calling transformation on invoke route. Get Started
  3. Bedrock Claude - response_format support for claude on invoke route. Get Started
  4. Bedrock - pass description if set in response_format. Get Started
  5. Bedrock - Fix passing response_format: {"type": "text"}. PR
  6. OpenAI - Handle sending image_url as str to openai. Get Started
  7. Deepseek - fix 'reasoning_content' missing on streaming. Get Started
  8. Caching - Support caching on reasoning content. Get Started
  9. Bedrock - handle thinking blocks in assistant message. Get Started
  10. Anthropic - Return signature on streaming. Get Started
    • Note: We've also migrated from signature_delta to signature. Read more
  11. Support format param for specifying image type. Get Started
  12. Anthropic - /v1/messages endpoint - thinking param support. Get Started
    • Note: this refactors the [BETA] unified /v1/messages endpoint to work only with the Anthropic API.
  13. Vertex AI - handle $id in response schema when calling Vertex AI. Get Started

Spend Tracking Improvements

  1. Batches API - Fix cost calculation to run on retrieve_batch. Get Started
  2. Batches API - Log batch models in spend logs / standard logging payload. Get Started

Management Endpoints / UI

  1. Virtual Keys Page
    • Allow team/org filters to be searchable on the Create Key Page
    • Add created_by and updated_by fields to Keys table
    • Show 'user_email' on key table
    • Show 100 keys per page, use full height, increase width of key alias
  2. Logs Page
    • Show Error Logs on LiteLLM UI
    • Allow Internal Users to View their own logs
  3. Internal Users Page
    • Allow admin to control default model access for internal users
  4. Fix session handling with cookies

Logging / Guardrail Integrations

  1. Fix Prometheus metrics with custom metrics when keys containing team_id make requests. PR

Performance / Loadbalancing / Reliability improvements

  1. Cooldowns - Support cooldowns on models called with client side credentials. Get Started
  2. Tag-based Routing - ensures tag-based routing across all endpoints (/embeddings, /image_generation, etc.). Get Started
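Tag-based routing selects only deployments whose tags cover the tags on the incoming request. A minimal sketch of that filtering idea (hypothetical data shapes; not the router's actual implementation):

```python
# Minimal sketch of tag-based deployment filtering. Data shapes are
# illustrative assumptions, not LiteLLM's actual router implementation.

def filter_by_tags(deployments, request_tags):
    """Return deployments whose tags cover every requested tag."""
    wanted = set(request_tags)
    return [d for d in deployments if wanted <= set(d.get("tags", []))]

deployments = [
    {"model_name": "gpt-4o", "tags": ["paid", "us"]},
    {"model_name": "gpt-4o-mini", "tags": ["free"]},
]
eligible = filter_by_tags(deployments, ["free"])
print([d["model_name"] for d in eligible])  # ['gpt-4o-mini']
```

The same filter applies uniformly whatever the endpoint (/embeddings, /image_generation, etc.), which is the point of this release item.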

General Proxy Improvements

  1. Raise BadRequestError when unknown model passed in request
  2. Enforce model access restrictions on Azure OpenAI proxy route
  3. Reliability fix - Handle emojis in text - fix orjson error
  4. Model Access Patch - don't overwrite litellm.anthropic_models when running auth checks
  5. Enable setting timezone information in docker image

Complete Git Diff

Here's the complete git diff

v1.63.0 - Anthropic 'thinking' response update

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

v1.63.0 fixes the Anthropic 'thinking' response on streaming to return the signature block. GitHub Issue

It also renames the field from signature_delta to signature to match Anthropic's response format. Anthropic Docs

Diff

```diff
"message": {
  ...
  "reasoning_content": "The capital of France is Paris.",
  "thinking_blocks": [
    {
      "type": "thinking",
      "thinking": "The capital of France is Paris.",
-     "signature_delta": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 OLD FORMAT
+     "signature": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 KEY CHANGE
    }
  ]
}
```
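Client code written against the old field can bridge the rename with a small compatibility shim that reads either key. This is a hedged sketch; the helper name is ours, not part of LiteLLM:

```python
# Compatibility sketch: read the signature from a thinking block whether it
# uses the new 'signature' key (v1.63.0+) or the legacy 'signature_delta'.
# The helper name is illustrative, not a LiteLLM API.

def get_signature(thinking_block):
    if "signature" in thinking_block:
        return thinking_block["signature"]
    return thinking_block.get("signature_delta")  # pre-v1.63.0 responses

new_block = {"type": "thinking", "signature": "EqoBCkgI..."}
old_block = {"type": "thinking", "signature_delta": "EqoBCkgI..."}
print(get_signature(new_block) == get_signature(old_block))  # True
```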

v1.61.20-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

These are the changes since v1.61.13-stable.

This release is primarily focused on:

  • LLM Translation improvements (claude-3-7-sonnet + 'thinking'/'reasoning_content' support)
  • UI improvements (add model flow, user management, etc)

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  1. Anthropic claude-3-7-sonnet support + cost tracking (Anthropic API + Bedrock + Vertex AI + OpenRouter)
    1. Anthropic API Start here
    2. Bedrock API Start here
    3. Vertex AI API See here
    4. OpenRouter See here
  2. GPT-4.5-preview support + cost tracking See here
  3. Azure AI - Phi-4 cost tracking See here
  4. Claude-3.5-sonnet - vision support updated on Anthropic API See here
  5. Bedrock Llama vision support See here
  6. Cerebras llama3.3-70b pricing See here

LLM Translation

  1. Infinity Rerank - support returning documents when return_documents=True Start here
  2. Amazon Deepseek - <think> param extraction into 'reasoning_content' Start here
  3. Amazon Titan Embeddings - filter out 'aws_' params from request body Start here
  4. Anthropic 'thinking' + 'reasoning_content' translation support (Anthropic API, Bedrock, Vertex AI) Start here
  5. VLLM - support 'video_url' Start here
  6. Call proxy via litellm SDK: Support litellm_proxy/ for embedding, image_generation, transcription, speech, rerank Start here
  7. OpenAI Pass-through - allow using Assistants GET, DELETE on /openai pass-through routes Start here
  8. Message Translation - fix OpenAI message for assistant messages when role is missing (OpenAI allows this)
  9. O1/O3 - support 'drop_params' for the parallel_tool_calls param on o3-mini and o1 (not currently supported) See here
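The 'thinking' parameter in item 4 follows Anthropic's documented extended-thinking request shape. A sketch of the request payload (model name and budget value are illustrative):

```python
# Sketch of a chat request enabling Anthropic extended thinking.
# The 'thinking' dict follows Anthropic's documented shape; the model
# name and budget value here are illustrative.

request = {
    "model": "claude-3-7-sonnet-20250219",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "thinking": {
        "type": "enabled",
        "budget_tokens": 1024,  # max tokens the model may spend on reasoning
    },
}
print(request["thinking"]["type"])  # enabled
```

When thinking is enabled, the response carries the reasoning in 'reasoning_content' / thinking blocks, as described in these notes.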

Spend Tracking Improvements

  1. Cost tracking for rerank via Bedrock See PR
  2. Anthropic pass-through - fix race condition causing cost to not be tracked See PR
  3. Anthropic pass-through: Ensure accurate token counting See PR

Management Endpoints / UI

  1. Models Page - Allow sorting models by 'created at'
  2. Models Page - Edit Model Flow Improvements
  3. Models Page - Fix Adding Azure, Azure AI Studio models on UI
  4. Internal Users Page - Allow Bulk Adding Internal Users on UI
  5. Internal Users Page - Allow sorting users by 'created at'
  6. Virtual Keys Page - Allow searching for UserIDs on the dropdown when assigning a user to a team See PR
  7. Virtual Keys Page - allow creating a user when assigning keys to users See PR
  8. Model Hub Page - fix text overflow issue See PR
  9. Admin Settings Page - Allow adding MSFT SSO on UI
  10. Backend - don't allow creating duplicate internal users in DB

Helm

  1. Support ttlSecondsAfterFinished on the migration job - See PR
  2. Enhance migrations job with additional configurable properties - See PR

Logging / Guardrail Integrations

  1. Arize Phoenix support
  2. 'No-log' - fix 'no-log' param support on embedding calls

Performance / Loadbalancing / Reliability improvements

  1. Single Deployment Cooldown logic - Use allowed_fails or allowed_fail_policy if set Start here

General Proxy Improvements

  1. Hypercorn - fix reading / parsing request body
  2. Windows - fix running the proxy on Windows
  3. DD-Trace - fix dd-trace enablement on proxy

Complete Git Diff

View the complete git diff here.