3 posts tagged with "thinking"
v1.63.2-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

These are the changes since v1.61.20-stable.

This release is primarily focused on:

  • LLM Translation improvements (more thinking content improvements)
  • UI improvements (Error logs now shown on UI)

Info: This release will be live on 03/09/2025.

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  1. Add supports_pdf_input for specific Bedrock Claude models PR
  2. Add pricing for Amazon EU models PR
  3. Fix Azure O1 mini pricing PR

LLM Translation

  1. Support /openai/ passthrough for Assistant endpoints. Get Started
  2. Bedrock Claude - fix tool calling transformation on invoke route. Get Started
  3. Bedrock Claude - response_format support for claude on invoke route. Get Started
  4. Bedrock - pass description if set in response_format. Get Started
  5. Bedrock - Fix passing response_format: {"type": "text"}. PR
  6. OpenAI - Handle sending image_url as str to openai. Get Started
  7. Deepseek - fix 'reasoning_content' missing on streaming. Get Started
  8. Caching - Support caching on reasoning content. Get Started
  9. Bedrock - handle thinking blocks in assistant message. Get Started
  10. Anthropic - Return signature on streaming. Get Started
    • Note: We've also migrated from signature_delta to signature. Read more
  11. Support format param for specifying image type. Get Started
  12. Anthropic - /v1/messages endpoint - thinking param support. Get Started
    • Note: this refactors the [BETA] unified /v1/messages endpoint to work only with the Anthropic API.
  13. Vertex AI - handle $id in response schema when calling Vertex AI. Get Started

Spend Tracking Improvements

  1. Batches API - Fix cost calculation to run on retrieve_batch. Get Started
  2. Batches API - Log batch models in spend logs / standard logging payload. Get Started

Management Endpoints / UI

  1. Virtual Keys Page
    • Allow team/org filters to be searchable on the Create Key Page
    • Add created_by and updated_by fields to Keys table
    • Show 'user_email' on key table
    • Show 100 keys per page, use full height, increase width of key alias
  2. Logs Page
    • Show Error Logs on LiteLLM UI
    • Allow Internal Users to View their own logs
  3. Internal Users Page
    • Allow admin to control default model access for internal users
  4. Fix session handling with cookies

Logging / Guardrail Integrations

  1. Fix Prometheus metrics with custom metrics when keys containing team_id make requests. PR

Performance / Loadbalancing / Reliability improvements

  1. Cooldowns - Support cooldowns on models called with client side credentials. Get Started
  2. Tag-based Routing - ensures tag-based routing across all endpoints (/embeddings, /image_generation, etc.). Get Started
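Tag-based routing selects only deployments whose tags cover the tags on the incoming request. A minimal sketch of that filtering idea (hypothetical data shapes; not the router's actual implementation):

```python
# Minimal sketch of tag-based deployment filtering. Data shapes are
# illustrative assumptions, not LiteLLM's actual router implementation.

def filter_by_tags(deployments, request_tags):
    """Return deployments whose tags cover every requested tag."""
    wanted = set(request_tags)
    return [d for d in deployments if wanted <= set(d.get("tags", []))]

deployments = [
    {"model_name": "gpt-4o", "tags": ["paid", "us"]},
    {"model_name": "gpt-4o-mini", "tags": ["free"]},
]
eligible = filter_by_tags(deployments, ["free"])
print([d["model_name"] for d in eligible])  # ['gpt-4o-mini']
```

The same filter applies uniformly whatever the endpoint (/embeddings, /image_generation, etc.), which is the point of this release item.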

General Proxy Improvements

  1. Raise BadRequestError when unknown model passed in request
  2. Enforce model access restrictions on Azure OpenAI proxy route
  3. Reliability fix - Handle emojis in text - fix orjson error
  4. Model Access Patch - don't overwrite litellm.anthropic_models when running auth checks
  5. Enable setting timezone information in docker image

Complete Git Diff

Here's the complete git diff

v1.63.0 - Anthropic 'thinking' response update

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

v1.63.0 fixes the Anthropic 'thinking' response on streaming to return the signature block. GitHub Issue

It also renames the field from signature_delta to signature to match Anthropic's response format. Anthropic Docs

Diff

```diff
"message": {
  ...
  "reasoning_content": "The capital of France is Paris.",
  "thinking_blocks": [
    {
      "type": "thinking",
      "thinking": "The capital of France is Paris.",
-     "signature_delta": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 OLD FORMAT
+     "signature": "EqoBCkgIARABGAIiQL2UoU0b1OHYi+..." # 👈 KEY CHANGE
    }
  ]
}
```
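Client code written against the old field can bridge the rename with a small compatibility shim that reads either key. This is a hedged sketch; the helper name is ours, not part of LiteLLM:

```python
# Compatibility sketch: read the signature from a thinking block whether it
# uses the new 'signature' key (v1.63.0+) or the legacy 'signature_delta'.
# The helper name is illustrative, not a LiteLLM API.

def get_signature(thinking_block):
    if "signature" in thinking_block:
        return thinking_block["signature"]
    return thinking_block.get("signature_delta")  # pre-v1.63.0 responses

new_block = {"type": "thinking", "signature": "EqoBCkgI..."}
old_block = {"type": "thinking", "signature_delta": "EqoBCkgI..."}
print(get_signature(new_block) == get_signature(old_block))  # True
```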

v1.61.20-stable

Krrish Dholakia
CEO, LiteLLM
Ishaan Jaffer
CTO, LiteLLM

These are the changes since v1.61.13-stable.

This release is primarily focused on:

  • LLM Translation improvements (claude-3-7-sonnet + 'thinking'/'reasoning_content' support)
  • UI improvements (add model flow, user management, etc)

Demo Instance

Here's a Demo Instance to test changes:

New Models / Updated Models

  1. Anthropic claude-3-7-sonnet support + cost tracking (Anthropic API + Bedrock + Vertex AI + OpenRouter)
    1. Anthropic API Start here
    2. Bedrock API Start here
    3. Vertex AI API See here
    4. OpenRouter See here
  2. GPT-4.5-preview support + cost tracking See here
  3. Azure AI - Phi-4 cost tracking See here
  4. Claude-3.5-sonnet - vision support updated on Anthropic API See here
  5. Bedrock Llama vision support See here
  6. Cerebras llama3.3-70b pricing See here

LLM Translation

  1. Infinity Rerank - support returning documents when return_documents=True Start here
  2. Amazon Deepseek - <think> param extraction into 'reasoning_content' Start here
  3. Amazon Titan Embeddings - filter out 'aws_' params from request body Start here
  4. Anthropic 'thinking' + 'reasoning_content' translation support (Anthropic API, Bedrock, Vertex AI) Start here
  5. VLLM - support 'video_url' Start here
  6. Call proxy via litellm SDK: Support litellm_proxy/ for embedding, image_generation, transcription, speech, rerank Start here
  7. OpenAI Pass-through - allow using Assistants GET, DELETE on /openai pass-through routes Start here
  8. Message Translation - fix OpenAI message for assistant messages when role is missing (OpenAI allows this)
  9. O1/O3 - support 'drop_params' for the parallel_tool_calls param on o3-mini and o1 (not currently supported) See here
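The 'thinking' parameter in item 4 follows Anthropic's documented extended-thinking request shape. A sketch of the request payload (model name and budget value are illustrative):

```python
# Sketch of a chat request enabling Anthropic extended thinking.
# The 'thinking' dict follows Anthropic's documented shape; the model
# name and budget value here are illustrative.

request = {
    "model": "claude-3-7-sonnet-20250219",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "thinking": {
        "type": "enabled",
        "budget_tokens": 1024,  # max tokens the model may spend on reasoning
    },
}
print(request["thinking"]["type"])  # enabled
```

When thinking is enabled, the response carries the reasoning in 'reasoning_content' / thinking blocks, as described in these notes.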

Spend Tracking Improvements

  1. Cost tracking for rerank via Bedrock See PR
  2. Anthropic pass-through - fix race condition causing cost to not be tracked See PR
  3. Anthropic pass-through: Ensure accurate token counting See PR

Management Endpoints / UI

  1. Models Page - Allow sorting models by 'created at'
  2. Models Page - Edit Model Flow Improvements
  3. Models Page - Fix Adding Azure, Azure AI Studio models on UI
  4. Internal Users Page - Allow Bulk Adding Internal Users on UI
  5. Internal Users Page - Allow sorting users by 'created at'
  6. Virtual Keys Page - Allow searching for UserIDs on the dropdown when assigning a user to a team See PR
  7. Virtual Keys Page - allow creating a user when assigning keys to users See PR
  8. Model Hub Page - fix text overflow issue See PR
  9. Admin Settings Page - Allow adding MSFT SSO on UI
  10. Backend - don't allow creating duplicate internal users in DB

Helm

  1. Support ttlSecondsAfterFinished on the migration job - See PR
  2. Enhance migrations job with additional configurable properties - See PR

Logging / Guardrail Integrations

  1. Arize Phoenix support
  2. 'No-log' - fix 'no-log' param support on embedding calls

Performance / Loadbalancing / Reliability improvements

  1. Single Deployment Cooldown logic - Use allowed_fails or allowed_fail_policy if set Start here

General Proxy Improvements

  1. Hypercorn - fix reading / parsing request body
  2. Windows - fix running the proxy on Windows
  3. DD-Trace - fix dd-trace enablement on proxy

Complete Git Diff

View the complete git diff here.