
Bedrock (boto3) SDK

Pass-through endpoints for Bedrock - call provider-specific endpoints in their native format (no translation).

| Feature | Supported | Notes |
|---|---|---|
| Cost Tracking | ✅ | For /invoke and /converse endpoints |
| Load Balancing | ✅ | Load balance /invoke and /converse routes across multiple deployments |
| End-user Tracking | ❌ | Tell us if you need this |
| Streaming | ✅ | |

Just replace https://bedrock-runtime.{aws_region_name}.amazonaws.com with LITELLM_PROXY_BASE_URL/bedrock 🚀

Overview

LiteLLM supports two ways to call Bedrock endpoints:

1. Config-based routing (For model endpoints)

Define your Bedrock models in config.yaml and reference them by name. The proxy handles authentication and routing.

Use for: /converse, /converse-stream, /invoke, /invoke-with-response-stream

model_list:
  - model_name: my-bedrock-model
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-west-2
      custom_llm_provider: bedrock
curl -X POST 'http://0.0.0.0:4000/bedrock/model/my-bedrock-model/converse' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{"messages": [{"role": "user", "content": [{"text": "Hello"}]}]}'

2. Direct passthrough (For non-model endpoints)

Set AWS credentials via environment variables and call Bedrock endpoints directly.

Use for: Guardrails, Knowledge Bases, Agents, and other non-model endpoints

export AWS_ACCESS_KEY_ID=""
export AWS_SECRET_ACCESS_KEY=""
export AWS_REGION_NAME="us-west-2"
curl "http://0.0.0.0:4000/bedrock/guardrail/my-guardrail-id/version/1/apply" \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{"contents": [{"text": {"text": "Hello"}}], "source": "INPUT"}'

Supports ALL Bedrock Endpoints (including streaming).

See All Bedrock Endpoints

Quick Start

Let's call the Bedrock /converse endpoint

1. Create a config.yaml file with your Bedrock model

model_list:
  - model_name: my-bedrock-model
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-west-2
      custom_llm_provider: bedrock

Set your AWS credentials:

export AWS_ACCESS_KEY_ID=""  # Access key
export AWS_SECRET_ACCESS_KEY="" # Secret access key
2. Start LiteLLM Proxy
litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000
3. Test it!

Let's call the Bedrock converse endpoint using the model name from config:

curl -X POST 'http://0.0.0.0:4000/bedrock/model/my-bedrock-model/converse' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [{"text": "Hello, how are you?"}]
      }
    ],
    "inferenceConfig": {
      "maxTokens": 100
    }
  }'
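
Prefer Python? Below is a hedged boto3 sketch of the same Quick Start call, using the dummy-credentials + AWS_BEARER_TOKEN_BEDROCK pattern shown later in this doc:

import boto3
import os

# Dummy AWS credentials (required by boto3, but not used by the LiteLLM proxy)
os.environ["AWS_ACCESS_KEY_ID"] = "dummy"
os.environ["AWS_SECRET_ACCESS_KEY"] = "dummy"
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "sk-1234"  # your litellm proxy api key

# Point boto3 at the LiteLLM proxy
bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name="us-west-2",
    endpoint_url="http://0.0.0.0:4000/bedrock",
)

response = bedrock_runtime.converse(
    modelId="my-bedrock-model",  # model_name from config.yaml
    messages=[{"role": "user", "content": [{"text": "Hello, how are you?"}]}],
    inferenceConfig={"maxTokens": 100},
)
print(response["output"]["message"]["content"][0]["text"])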

Setup with config.yaml

Use config.yaml to define Bedrock models and use them via passthrough endpoints.

1. Define models in config.yaml

model_list:
  - model_name: my-claude-model
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-west-2
      custom_llm_provider: bedrock

  - model_name: my-cohere-model
    litellm_params:
      model: bedrock/cohere.command-r-v1:0
      aws_region_name: us-east-1
      custom_llm_provider: bedrock

2. Start proxy with config

litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000

3. Call Bedrock Converse endpoint

Use the model_name from config in the URL path:

curl -X POST 'http://0.0.0.0:4000/bedrock/model/my-claude-model/converse' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [{"text": "Hello, how are you?"}]
      }
    ],
    "inferenceConfig": {
      "temperature": 0.5,
      "maxTokens": 100
    }
  }'

4. Call Bedrock Converse Stream endpoint

For streaming responses, use the /converse-stream endpoint:

curl -X POST 'http://0.0.0.0:4000/bedrock/model/my-claude-model/converse-stream' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [{"text": "Tell me a short story"}]
      }
    ],
    "inferenceConfig": {
      "temperature": 0.7,
      "maxTokens": 200
    }
  }'
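
To consume the stream from Python, here is a hedged boto3 sketch. converse_stream returns an event stream; the chunk shape below follows the standard Bedrock Converse stream format, and the model name assumes the config above:

import boto3
import os

# Dummy AWS credentials (required by boto3, but not used by the LiteLLM proxy)
os.environ["AWS_ACCESS_KEY_ID"] = "dummy"
os.environ["AWS_SECRET_ACCESS_KEY"] = "dummy"
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "sk-1234"  # your litellm proxy api key

bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name="us-west-2",
    endpoint_url="http://0.0.0.0:4000/bedrock",
)

response = bedrock_runtime.converse_stream(
    modelId="my-claude-model",  # model_name from config.yaml
    messages=[{"role": "user", "content": [{"text": "Tell me a short story"}]}],
    inferenceConfig={"temperature": 0.7, "maxTokens": 200},
)

# Print text deltas as they arrive
for event in response["stream"]:
    if "contentBlockDelta" in event:
        print(event["contentBlockDelta"]["delta"]["text"], end="", flush=True)
print()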

Supported Bedrock Endpoints with config.yaml

When using models from config.yaml, you can call any Bedrock endpoint:

| Endpoint | Description | Example |
|---|---|---|
| /model/{model_name}/converse | Converse API | http://0.0.0.0:4000/bedrock/model/my-claude-model/converse |
| /model/{model_name}/converse-stream | Streaming Converse | http://0.0.0.0:4000/bedrock/model/my-claude-model/converse-stream |
| /model/{model_name}/invoke | Legacy Invoke API | http://0.0.0.0:4000/bedrock/model/my-claude-model/invoke |
| /model/{model_name}/invoke-with-response-stream | Legacy Streaming | http://0.0.0.0:4000/bedrock/model/my-claude-model/invoke-with-response-stream |

The proxy automatically resolves the model_name to the actual Bedrock model ID and region configured in your config.yaml.

Load Balancing Across Multiple Deployments

Define multiple Bedrock deployments with the same model_name to enable automatic load balancing.

1. Define multiple deployments in config.yaml

model_list:
  # First deployment - us-west-2
  - model_name: my-claude-model
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-west-2
      custom_llm_provider: bedrock

  # Second deployment - us-east-1 (load balanced)
  - model_name: my-claude-model
    litellm_params:
      model: bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0
      aws_region_name: us-east-1
      custom_llm_provider: bedrock

2. Start proxy with config

litellm --config config.yaml

# RUNNING on http://0.0.0.0:4000

3. Call the endpoint - requests are automatically load balanced

curl -X POST 'http://0.0.0.0:4000/bedrock/model/my-claude-model/invoke' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{
    "max_tokens": 100,
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ],
    "anthropic_version": "bedrock-2023-05-31"
  }'

The proxy will automatically distribute requests across both us-west-2 and us-east-1 deployments. This works for all Bedrock endpoints: /invoke, /invoke-with-response-stream, /converse, and /converse-stream.

Using boto3 SDK with load balancing

You can also call the load-balanced endpoint using the boto3 SDK:

import boto3
import json
import os

# Set dummy AWS credentials (required by boto3, but not used by LiteLLM proxy)
os.environ['AWS_ACCESS_KEY_ID'] = 'dummy'
os.environ['AWS_SECRET_ACCESS_KEY'] = 'dummy'
os.environ['AWS_BEARER_TOKEN_BEDROCK'] = "sk-1234"  # your litellm proxy api key

# Point boto3 to the LiteLLM proxy
bedrock_runtime = boto3.client(
    service_name='bedrock-runtime',
    region_name='us-west-2',
    endpoint_url='http://0.0.0.0:4000/bedrock'
)

# Call the load-balanced model
response = bedrock_runtime.invoke_model(
    modelId='my-claude-model',  # Your model_name from config.yaml
    contentType='application/json',
    accept='application/json',
    body=json.dumps({
        "max_tokens": 100,
        "messages": [
            {
                "role": "user",
                "content": "Hello, how are you?"
            }
        ],
        "anthropic_version": "bedrock-2023-05-31"
    })
)

# Parse response
response_body = json.loads(response['body'].read())
print(response_body['content'][0]['text'])

The proxy will automatically load balance your boto3 requests across all configured deployments.

Examples

Anything after http://0.0.0.0:4000/bedrock is treated as a provider-specific route, and handled accordingly.

Key Changes:

| Original Endpoint | Replace With |
|---|---|
| https://bedrock-runtime.{aws_region_name}.amazonaws.com | http://0.0.0.0:4000/bedrock (LITELLM_PROXY_BASE_URL="http://0.0.0.0:4000") |
| AWS4-HMAC-SHA256.. | Bearer anything (use Bearer LITELLM_VIRTUAL_KEY if Virtual Keys are set up on the proxy) |
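
To see both changes in one place, here is a minimal Python sketch using the requests library (not from the original docs; the model ID and key are illustrative):

import requests

# Change 1: the base URL points at the LiteLLM proxy instead of
# https://bedrock-runtime.{aws_region_name}.amazonaws.com
LITELLM_PROXY_BASE_URL = "http://0.0.0.0:4000"

response = requests.post(
    f"{LITELLM_PROXY_BASE_URL}/bedrock/model/cohere.command-r-v1:0/converse",
    headers={
        # Change 2: a Bearer token replaces AWS SigV4 signing
        "Authorization": "Bearer sk-anything",
        "Content-Type": "application/json",
    },
    json={"messages": [{"role": "user", "content": [{"text": "Hello"}]}]},
)
print(response.json())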

Example 1: Converse API

LiteLLM Proxy Call

curl -X POST 'http://0.0.0.0:4000/bedrock/model/cohere.command-r-v1:0/converse' \
  -H 'Authorization: Bearer sk-anything' \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [{"text": "Hello"}]
      }
    ]
  }'

Direct Bedrock API Call

curl -X POST 'https://bedrock-runtime.us-west-2.amazonaws.com/model/cohere.command-r-v1:0/converse' \
  -H 'Authorization: AWS4-HMAC-SHA256..' \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [{"text": "Hello"}]
      }
    ]
  }'

Example 2: Apply Guardrail

Setup: Set AWS credentials for direct passthrough

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION_NAME="us-west-2"

Start proxy:

litellm

# RUNNING on http://0.0.0.0:4000

LiteLLM Proxy Call

curl "http://0.0.0.0:4000/bedrock/guardrail/guardrailIdentifier/version/guardrailVersion/apply" \
-H 'Authorization: Bearer sk-anything' \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"text": {"text": "Hello world"}}],
"source": "INPUT"
}'

Direct Bedrock API Call

curl "https://bedrock-runtime.us-west-2.amazonaws.com/guardrail/guardrailIdentifier/version/guardrailVersion/apply" \
-H 'Authorization: AWS4-HMAC-SHA256..' \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"text": {"text": "Hello world"}}],
"source": "INPUT"
}'

Example 3: Query Knowledge Base

Setup: Set AWS credentials for direct passthrough

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION_NAME="us-west-2"

Start proxy:

litellm

# RUNNING on http://0.0.0.0:4000

LiteLLM Proxy Call

curl -X POST "http://0.0.0.0:4000/bedrock/knowledgebases/{knowledgeBaseId}/retrieve" \
  -H 'Authorization: Bearer sk-anything' \
  -H 'Content-Type: application/json' \
  -d '{
    "nextToken": "string",
    "retrievalConfiguration": {
      "vectorSearchConfiguration": {
        "filter": { ... },
        "numberOfResults": number,
        "overrideSearchType": "string"
      }
    },
    "retrievalQuery": {
      "text": "string"
    }
  }'

Direct Bedrock API Call

curl -X POST "https://bedrock-agent-runtime.us-west-2.amazonaws.com/knowledgebases/{knowledgeBaseId}/retrieve" \
  -H 'Authorization: AWS4-HMAC-SHA256..' \
  -H 'Content-Type: application/json' \
  -d '{
    "nextToken": "string",
    "retrievalConfiguration": {
      "vectorSearchConfiguration": {
        "filter": { ... },
        "numberOfResults": number,
        "overrideSearchType": "string"
      }
    },
    "retrievalQuery": {
      "text": "string"
    }
  }'
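
A hedged boto3 equivalent, reusing the bedrock-agent-runtime client setup from the Agents section below; the knowledge base ID and query are placeholders:

import boto3
import os

# Dummy AWS credentials (required by boto3, but not used by the LiteLLM proxy)
os.environ["AWS_ACCESS_KEY_ID"] = "dummy"
os.environ["AWS_SECRET_ACCESS_KEY"] = "dummy"
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "sk-1234"  # your litellm proxy api key

agent_runtime = boto3.client(
    service_name="bedrock-agent-runtime",
    region_name="us-west-2",
    endpoint_url="http://0.0.0.0:4000/bedrock",
)

response = agent_runtime.retrieve(
    knowledgeBaseId="my-knowledge-base-id",  # placeholder
    retrievalQuery={"text": "What is LiteLLM?"},
    retrievalConfiguration={
        "vectorSearchConfiguration": {"numberOfResults": 3}
    },
)
for result in response["retrievalResults"]:
    print(result["content"]["text"])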

Advanced - Use with Virtual Keys

Pre-requisites: a LiteLLM Proxy set up with a database (Virtual Keys require DATABASE_URL).

Use this to avoid handing developers raw AWS keys while still letting them use AWS Bedrock endpoints.

Usage

1. Setup environment

export DATABASE_URL=""
export LITELLM_MASTER_KEY=""
export AWS_ACCESS_KEY_ID=""      # Access key
export AWS_SECRET_ACCESS_KEY=""  # Secret access key
export AWS_REGION_NAME=""        # us-east-1, us-east-2, us-west-1, us-west-2

litellm

# RUNNING on http://0.0.0.0:4000

2. Generate virtual key

curl -X POST 'http://0.0.0.0:4000/key/generate' \
  -H 'Authorization: Bearer sk-1234' \
  -H 'Content-Type: application/json' \
  -d '{}'

Expected Response

{
...
"key": "sk-1234ewknldferwedojwojw"
}
3. Test it!

curl -X POST 'http://0.0.0.0:4000/bedrock/model/cohere.command-r-v1:0/converse' \
  -H 'Authorization: Bearer sk-1234ewknldferwedojwojw' \
  -H 'Content-Type: application/json' \
  -d '{
    "messages": [
      {
        "role": "user",
        "content": [{"text": "Hello"}]
      }
    ]
  }'
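
The virtual key also works with boto3 via the AWS_BEARER_TOKEN_BEDROCK pattern shown earlier; a minimal sketch, assuming the key generated above:

import boto3
import os

# Dummy AWS credentials (required by boto3, but not used by the LiteLLM proxy)
os.environ["AWS_ACCESS_KEY_ID"] = "dummy"
os.environ["AWS_SECRET_ACCESS_KEY"] = "dummy"
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "sk-1234ewknldferwedojwojw"  # virtual key from /key/generate

bedrock_runtime = boto3.client(
    service_name="bedrock-runtime",
    region_name="us-west-2",
    endpoint_url="http://0.0.0.0:4000/bedrock",
)

response = bedrock_runtime.converse(
    modelId="cohere.command-r-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello"}]}],
)
print(response["output"]["message"]["content"][0]["text"])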

Advanced - Bedrock Agents

Call Bedrock Agents via LiteLLM proxy

Setup: Set AWS credentials on your LiteLLM proxy server

export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION_NAME="us-west-2"

Start proxy:

litellm

# RUNNING on http://0.0.0.0:4000

Usage from Python:

import os
import boto3

# Set dummy AWS credentials (required by boto3, but not used by LiteLLM proxy)
os.environ["AWS_ACCESS_KEY_ID"] = "dummy"
os.environ["AWS_SECRET_ACCESS_KEY"] = "dummy"
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "sk-1234"  # your litellm proxy api key

# Create the client
runtime_client = boto3.client(
    service_name="bedrock-agent-runtime",
    region_name="us-west-2",
    endpoint_url="http://0.0.0.0:4000/bedrock"
)

response = runtime_client.invoke_agent(
    agentId="L1RT58GYRW",
    agentAliasId="MFPSBCXYTW",
    sessionId="12345",
    inputText="Who do you know?"
)

completion = ""

# The agent returns an event stream; concatenate the chunk bytes
for event in response.get("completion"):
    chunk = event["chunk"]
    completion += chunk["bytes"].decode()

print(completion)