Azure Model Router

Azure Model Router is a feature in Azure AI Foundry that automatically routes each request to the best available model for that request. You call a single endpoint, and the router selects the optimal model behind it.

Key Features​

  • Automatic Model Selection: Azure Model Router dynamically selects the best model for your request
  • Cost Tracking: LiteLLM automatically tracks costs based on the actual model used (e.g., gpt-4.1-nano), plus the Model Router infrastructure fee
  • Streaming Support: Full support for streaming responses with accurate cost calculation
  • Simple Configuration: Easy to set up via UI or config file

Model Naming Pattern​

Use the pattern: azure_ai/model_router/<deployment-name>

Components:

  • azure_ai - The provider identifier
  • model_router - Indicates this is a Model Router deployment
  • <deployment-name> - Your actual deployment name from Azure AI Foundry (e.g., azure-model-router)

Example: azure_ai/model_router/azure-model-router

How it works:

  • LiteLLM automatically strips the model_router/ prefix when sending requests to Azure
  • Only your deployment name (e.g., azure-model-router) is sent to the Azure API
  • The full path is preserved in responses and logs for proper cost tracking
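
For illustration, the prefix handling amounts to roughly the following (a simplified sketch of the behavior described above, not LiteLLM's actual implementation):

# Illustrative sketch only - not LiteLLM's internal code.
model = "azure_ai/model_router/azure-model-router"

# Drop the provider prefix ("azure_ai/").
model_path = model.split("/", 1)[1]                          # "model_router/azure-model-router"

# Strip the "model_router/" marker; only the deployment name is sent to Azure.
deployment_name = model_path.removeprefix("model_router/")   # "azure-model-router"

print(deployment_name)  # what goes to the Azure API
print(model)            # what is preserved in responses/logs for cost tracking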

LiteLLM Python SDK​

Basic Usage​

Use the pattern azure_ai/model_router/<deployment-name> where <deployment-name> is your Azure deployment name:

import litellm
import os

response = litellm.completion(
    model="azure_ai/model_router/azure-model-router",  # Use your deployment name
    messages=[{"role": "user", "content": "Hello!"}],
    api_base="https://your-endpoint.cognitiveservices.azure.com/openai/v1/",
    api_key=os.getenv("AZURE_MODEL_ROUTER_API_KEY"),
)

print(response)
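
The model field on the response shows which underlying model the router actually selected for this call (the value in the comment is just an example):

print(response.model)  # e.g., "azure_ai/gpt-4.1-nano-2025-04-14" - the model the router picked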

Pattern Explanation:

  • azure_ai - The provider
  • model_router - Indicates this is a model router deployment
  • azure-model-router - Your actual deployment name from Azure AI Foundry

LiteLLM will automatically strip the model_router/ prefix when sending the request to Azure, so only azure-model-router is sent to the API.

Streaming with Usage Tracking​

import asyncio
import litellm
import os

async def main():
    response = await litellm.acompletion(
        model="azure_ai/model_router/azure-model-router",  # Use your deployment name
        messages=[{"role": "user", "content": "hi"}],
        api_base="https://your-endpoint.cognitiveservices.azure.com/openai/v1/",
        api_key=os.getenv("AZURE_MODEL_ROUTER_API_KEY"),
        stream=True,
        stream_options={"include_usage": True},
    )

    async for chunk in response:
        print(chunk)

asyncio.run(main())
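
With include_usage set, the token counts typically arrive on the final streamed chunk. A minimal variation of the example above that prints the streamed text and then the usage (assuming the OpenAI-style usage field on the last chunk):

import asyncio
import litellm
import os

async def main():
    response = await litellm.acompletion(
        model="azure_ai/model_router/azure-model-router",
        messages=[{"role": "user", "content": "hi"}],
        api_base="https://your-endpoint.cognitiveservices.azure.com/openai/v1/",
        api_key=os.getenv("AZURE_MODEL_ROUTER_API_KEY"),
        stream=True,
        stream_options={"include_usage": True},
    )

    usage = None
    async for chunk in response:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
        if getattr(chunk, "usage", None):
            usage = chunk.usage  # usually populated only on the final chunk

    if usage:
        print(f"\nprompt_tokens={usage.prompt_tokens}, completion_tokens={usage.completion_tokens}")

asyncio.run(main())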

LiteLLM Proxy (AI Gateway)​

config.yaml​

model_list:
  - model_name: azure-model-router # Public name for your users
    litellm_params:
      model: azure_ai/model_router/azure-model-router # Use your deployment name
      api_base: https://your-endpoint.cognitiveservices.azure.com/openai/v1/
      api_key: os.environ/AZURE_MODEL_ROUTER_API_KEY

Note: Replace azure-model-router in the model path with your actual deployment name from Azure AI Foundry.

Start Proxy​

litellm --config config.yaml

Test Request​

curl -X POST http://localhost:4000/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-1234" \
  -d '{
    "model": "azure-model-router",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
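
Because the proxy exposes an OpenAI-compatible /chat/completions endpoint, you can also call it with the openai Python SDK. The base URL and key below are the same placeholders used in the curl example:

from openai import OpenAI

# Point the OpenAI client at the LiteLLM proxy (placeholder URL and key).
client = OpenAI(api_key="sk-1234", base_url="http://localhost:4000")

response = client.chat.completions.create(
    model="azure-model-router",  # the public model_name from config.yaml
    messages=[{"role": "user", "content": "Hello!"}],
)

print(response.choices[0].message.content)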

Add Azure Model Router via LiteLLM UI​

This walkthrough shows how to add an Azure Model Router endpoint to LiteLLM using the Admin Dashboard.

Quick Start​

  1. Navigate to the Models page in the LiteLLM UI
  2. Select "Azure AI Foundry (Studio)" as the provider
  3. Enter your deployment name (e.g., azure-model-router)
  4. LiteLLM will automatically format it as azure_ai/model_router/azure-model-router
  5. Add your API base URL and API key
  6. Test and save

Detailed Walkthrough​

Step 1: Select Provider​

Navigate to the Models page and select "Azure AI Foundry (Studio)" as the provider.

Click Provider Dropdown​

Choose Azure AI Foundry​

Step 2: Enter Deployment Name​

New Simplified Method: Just enter your deployment name directly in the text field. If your deployment name contains "model-router" or "model_router", LiteLLM will automatically format it as azure_ai/model_router/<deployment-name>.

Example:

  • Enter: azure-model-router
  • LiteLLM creates: azure_ai/model_router/azure-model-router
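
Behind the scenes, this detection amounts to a substring check on the name you enter; roughly (an illustrative sketch, not the UI's actual code):

def format_model_router_name(deployment_name: str) -> str:
    """Illustrative only: mirrors how the UI builds the full model path."""
    if "model-router" in deployment_name or "model_router" in deployment_name:
        return f"azure_ai/model_router/{deployment_name}"
    return f"azure_ai/{deployment_name}"

print(format_model_router_name("azure-model-router"))  # azure_ai/model_router/azure-model-router
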
Copy Deployment Name from Azure Portal​

Switch to Azure AI Foundry and copy your model router deployment name.

Enter Deployment Name in LiteLLM​

Paste your deployment name (e.g., azure-model-router) directly into the text field.

What happens behind the scenes:

  • You enter: azure-model-router
  • LiteLLM automatically detects this is a model router deployment
  • The full model path becomes: azure_ai/model_router/azure-model-router
  • When making API calls, only azure-model-router is sent to Azure

Step 3: Configure API Base and Key​

Copy the endpoint URL and API key from the Azure portal.

Copy API Base URL from Azure​

Enter API Base in LiteLLM​

Copy API Key from Azure​

Enter API Key in LiteLLM​

Step 4: Test and Add Model​

Verify your configuration works and save the model.

Test Connection​

Close Test Dialog​

Add Model​

Step 5: Verify in Playground​

Test your model and verify cost tracking is working.

Open Playground​

Select Model​

Send Test Message​

View Logs​

Verify Cost Tracking​

Cost is tracked based on the actual model used (e.g., gpt-4.1-nano), plus a flat infrastructure cost of $0.14 per million input tokens for using the Model Router.

Cost Tracking​

LiteLLM automatically handles cost tracking for Azure Model Router by:

  1. Detecting the actual model: When Azure Model Router routes your request to a specific model (e.g., gpt-4.1-nano-2025-04-14), LiteLLM extracts this from the response
  2. Calculating accurate costs: Costs are calculated based on:
    • The actual model used (e.g., gpt-4.1-nano token costs)
    • Plus a flat infrastructure cost of $0.14 per million input tokens for using the Model Router
  3. Streaming support: Cost tracking works correctly for both streaming and non-streaming requests

Cost Breakdown​

When you use Azure Model Router, the total cost includes:

  • Model Cost: Based on the actual model that handled your request (e.g., gpt-4.1-nano)
  • Router Flat Cost: $0.14 per million input tokens (Azure AI Foundry infrastructure fee)
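
As a rough worked example, suppose a request uses 1,000 input tokens and 500 output tokens, and the router selects a model priced at $0.10 per million input tokens and $0.40 per million output tokens (placeholder rates; substitute the actual prices of the model that handled your request):

# Hypothetical token counts and placeholder model prices.
input_tokens, output_tokens = 1_000, 500

model_cost = (input_tokens / 1_000_000) * 0.10 + (output_tokens / 1_000_000) * 0.40
router_flat_cost = (input_tokens / 1_000_000) * 0.14  # Azure Model Router infrastructure fee

total_cost = model_cost + router_flat_cost
print(f"model=${model_cost:.6f} router=${router_flat_cost:.6f} total=${total_cost:.6f}")
# model=$0.000300 router=$0.000140 total=$0.000440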

Example Response with Cost​

import litellm

response = litellm.completion(
    model="azure_ai/model_router/azure-model-router",
    messages=[{"role": "user", "content": "Hello!"}],
    api_base="https://your-endpoint.cognitiveservices.azure.com/openai/v1/",
    api_key="your-api-key",
)

# The response will show the actual model used
print(f"Model used: {response.model}") # e.g., "azure_ai/gpt-4.1-nano-2025-04-14"

# Get cost (includes both model cost and router flat cost)
from litellm import completion_cost
cost = completion_cost(completion_response=response)
print(f"Total cost: ${cost}")

# Access detailed cost breakdown
if hasattr(response, '_hidden_params') and 'response_cost' in response._hidden_params:
    print(f"Response cost: ${response._hidden_params['response_cost']}")

Viewing Cost Breakdown in UI​

When viewing logs in the LiteLLM UI, you'll see:

  • Model Cost: The cost for the actual model used
  • Azure Model Router Flat Cost: The $0.14/M input tokens infrastructure fee
  • Total Cost: Sum of both costs

This breakdown helps you understand exactly what you're paying for when using the Model Router.