Overview

Amazon Bedrock provides access to foundation models from AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon through a single API. Visca AI Gateway integrates seamlessly with Bedrock.

Prerequisites

1. AWS Account: You need an AWS account with Bedrock access enabled.
2. Model Access: Request access to models in AWS Console → Bedrock → Model Access.
3. IAM Credentials: Create an IAM user or role with the bedrock:InvokeModel permission. (Steps 1-3 can be verified with the sketch after this list.)
4. Gateway Configuration: Configure the Bedrock provider in the Visca gateway.
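
To confirm the first three steps, you can query Bedrock directly with boto3 before involving the gateway. This is a minimal sketch; it assumes boto3 is installed and your IAM credentials are available in the environment:

import boto3

# Confirm which IAM identity your credentials resolve to
print(boto3.client("sts").get_caller_identity()["Arn"])

# List the foundation models visible in this region; models you have not
# been granted access to will still fail at invocation time
bedrock = boto3.client("bedrock", region_name="us-east-1")
for model in bedrock.list_foundation_models()["modelSummaries"]:
    print(model["modelId"])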

Configuration

Add Bedrock to your gateway configuration:
services:
  gateway:
    image: viscaai/gateway:latest
    environment:
      AWS_ACCESS_KEY_ID: ${AWS_ACCESS_KEY_ID}
      AWS_SECRET_ACCESS_KEY: ${AWS_SECRET_ACCESS_KEY}
      AWS_REGION: us-east-1
      PROVIDERS: "openai,anthropic,aws-bedrock"
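
The compose snippet above reads the AWS credentials from the environment. If you use static IAM user keys, a .env file alongside the compose file might look like this (placeholder values shown):

AWS_ACCESS_KEY_ID=AKIA...your-key-id...
AWS_SECRET_ACCESS_KEY=...your-secret-key...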

Available Models

Anthropic
  • anthropic.claude-3-5-sonnet-20241022-v2:0 (latest)
  • anthropic.claude-3-5-sonnet-20240620-v1:0
  • anthropic.claude-3-opus-20240229-v1:0
  • anthropic.claude-3-sonnet-20240229-v1:0
  • anthropic.claude-3-haiku-20240307-v1:0

Meta
  • meta.llama3-2-90b-instruct-v1:0
  • meta.llama3-1-405b-instruct-v1:0
  • meta.llama3-1-70b-instruct-v1:0
  • meta.llama3-70b-instruct-v1:0

Amazon
  • amazon.titan-text-premier-v1:0
  • amazon.titan-text-express-v1
  • amazon.titan-embed-text-v1
  • amazon.titan-image-generator-v1

AI21 Labs
  • ai21.jamba-instruct-v1:0
  • ai21.j2-ultra-v1
  • ai21.j2-mid-v1

Cohere
  • cohere.command-r-plus-v1:0
  • cohere.command-r-v1:0
  • cohere.embed-multilingual-v3

Basic Usage

Call Bedrock models through the gateway's OpenAI-compatible API:
import openai

client = openai.OpenAI(
    base_url="https://gateway.visca.ai/v1",
    api_key="your-visca-api-key"
)

response = client.chat.completions.create(
    model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{
        "role": "user",
        "content": "Hello from Bedrock!"
    }]
)

print(response.choices[0].message.content)

Region Selection

Specify the AWS region in the model name or in request metadata:
# Method 1: In model name
response = client.chat.completions.create(
    model="bedrock/us-west-2/anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "Hello"}]
)

# Method 2: In metadata
response = client.chat.completions.create(
    model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "metadata": {
            "aws_region": "us-west-2"
        }
    }
)

Cross-Region Routing

Route requests to the closest or cheapest region automatically:
response = client.chat.completions.create(
    model="anthropic.claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "metadata": {
            "routing": {
                "strategy": "latency-optimized",
                "providers": ["aws-bedrock"],
                "regions": ["us-east-1", "us-west-2", "eu-west-1"]
            }
        }
    }
)

IAM Authentication

The gateway supports three AWS credential sources:

  • IAM User
  • IAM Role (ECS/EKS)
  • Instance Profile (EC2)

Whichever you use, the identity needs permission to invoke Bedrock models:
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "bedrock:InvokeModel",
      "bedrock:InvokeModelWithResponseStream"
    ],
    "Resource": "arn:aws:bedrock:*::foundation-model/*"
  }]
}
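
For the IAM Role (ECS/EKS) option, the role additionally needs a trust policy so the container service can assume it. The following is the standard AWS trust policy for ECS task roles, not anything Visca-specific:

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ecs-tasks.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}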

Streaming

Stream responses for real-time output:
stream = client.chat.completions.create(
    model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "Write a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)

Cost Comparison

Model               Provider          Input (1M tokens)   Output (1M tokens)
Claude 3.5 Sonnet   Bedrock           $3.00               $15.00
Claude 3.5 Sonnet   Anthropic Direct  $3.00               $15.00
Llama 3.1 405B      Bedrock           $5.32               $16.00
Llama 3.1 405B      Together AI       $3.50               $4.00
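
To see how the table translates into a bill, here is a quick estimate using the Bedrock Claude 3.5 Sonnet row; the traffic numbers are hypothetical:

# Prices from the table above, per 1M tokens (Bedrock Claude 3.5 Sonnet)
INPUT_PRICE, OUTPUT_PRICE = 3.00, 15.00

# Hypothetical workload: 100,000 requests averaging 500 input / 300 output tokens
requests = 100_000
input_cost = requests * 500 / 1_000_000 * INPUT_PRICE     # $150.00
output_cost = requests * 300 / 1_000_000 * OUTPUT_PRICE   # $450.00
print(f"Estimated monthly cost: ${input_cost + output_cost:,.2f}")  # $600.00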

Use Cases

Compliance Workloads

Use AWS infrastructure for regulated industries

Multi-Model Access

Access Claude, Llama, Titan from one endpoint

Existing AWS Apps

Integrate AI into existing AWS infrastructure

Private VPC

Keep all traffic within your AWS VPC

Advanced Configuration

Guardrails

Use Bedrock Guardrails for content moderation:
response = client.chat.completions.create(
    model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "Hello"}],
    extra_body={
        "metadata": {
            "guardrail_id": "guardrail-123",
            "guardrail_version": "1"
        }
    }
)

Custom Prompting

Override default system prompts:
response = client.chat.completions.create(
    model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello"}
    ]
)

Model Inference Parameters

Pass standard inference parameters; the gateway maps them to their Bedrock equivalents:
response = client.chat.completions.create(
    model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
    messages=[{"role": "user", "content": "Hello"}],
    temperature=0.7,
    max_tokens=2000,
    top_p=0.9
)

Troubleshooting

Cause: Model access not requested in AWS Console Solution: Go to AWS Console → Bedrock → Model Access and request access
Cause: IAM user lacks bedrock:InvokeModel permission Solution: Add required IAM policy to user/role
Cause: Exceeded quota for model requests Solution: Request quota increase in AWS Service Quotas console
Cause: Invalid model ID or region Solution: Verify model ID matches Bedrock format exactly
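
Errors from Bedrock surface through the gateway as standard OpenAI SDK exceptions, so the HTTP status usually identifies which cause above applies. A minimal sketch, assuming the gateway forwards Bedrock's status codes:

import openai

client = openai.OpenAI(
    base_url="https://gateway.visca.ai/v1",
    api_key="your-visca-api-key"
)

try:
    client.chat.completions.create(
        model="bedrock/anthropic.claude-3-5-sonnet-20241022-v2:0",
        messages=[{"role": "user", "content": "Hello"}],
    )
except openai.APIStatusError as e:
    # 403: missing model access or IAM permission
    # 429: model request quota exceeded
    # 400/404: invalid model ID or region
    print(f"Bedrock call failed ({e.status_code}): {e.message}")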

Best Practices

Use IAM Roles

Prefer IAM roles over access keys for better security

Enable CloudTrail

Log all Bedrock API calls for audit compliance

Request Quotas Early

Default quotas are low; request increases proactively

Multi-Region Setup

Deploy in multiple regions for redundancy

Next Steps