Overview

Visca AI Gateway provides native support for the Anthropic Messages API, allowing you to use Claude models with their native format while benefiting from gateway features like routing, caching, and analytics.

Endpoint

POST https://gateway.visca.ai/v1/messages
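If you are not using the SDK, the endpoint can be called directly over HTTP. A minimal sketch of building such a request with the standard library; the header names (`x-api-key`, `anthropic-version`) follow Anthropic's native convention, and it is an assumption here that the gateway accepts the same headers:

```python
import json
import urllib.request

def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build a raw Messages API request (not yet sent)."""
    return urllib.request.Request(
        "https://gateway.visca.ai/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "content-type": "application/json",
            "x-api-key": api_key,                # Visca gateway key, not Anthropic
            "anthropic-version": "2023-06-01",   # assumed to be passed through
        },
        method="POST",
    )

payload = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, Claude!"}],
}
req = build_request("your-visca-api-key", payload)
# urllib.request.urlopen(req) would send it and return the JSON response.
```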

Basic Usage

import anthropic

client = anthropic.Anthropic(
    base_url="https://gateway.visca.ai/v1",
    api_key="your-visca-api-key"  # Not your Anthropic key
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Hello, Claude!"
    }]
)

print(message.content[0].text)

Supported Models

All Claude models are available:
  • claude-3-5-sonnet-20241022 (latest, recommended)
  • claude-3-5-haiku-20241022
  • claude-3-opus-20240229
  • claude-3-sonnet-20240229
  • claude-3-haiku-20240307

Streaming

Stream responses for real-time output:
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a story"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

System Messages

Claude uses a separate system parameter:
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system="You are a helpful AI assistant specializing in Python programming.",
    messages=[{
        "role": "user",
        "content": "How do I read a CSV file?"
    }]
)

Vision

Send images with Claude 3 models:
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {
                    "type": "url",
                    "url": "https://example.com/image.jpg"
                }
            },
            {
                "type": "text",
                "text": "What's in this image?"
            }
        ]
    }]
)
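Besides URL sources, the Messages API also accepts base64-encoded image data, which is useful for local files. A small sketch; the helper name `image_block_from_bytes` is ours, but the block shape (`type: "base64"`, `media_type`, `data`) follows the API's base64 source format:

```python
import base64

def image_block_from_bytes(data: bytes, media_type: str = "image/jpeg") -> dict:
    """Build a base64 image content block for the Messages API."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.standard_b64encode(data).decode("ascii"),
        },
    }

block = image_block_from_bytes(b"\xff\xd8\xff")  # e.g. open("photo.jpg", "rb").read()
# Drop it in place of the URL source above:
# messages=[{"role": "user", "content": [block, {"type": "text", "text": "What's in this image?"}]}]
```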

Prompt Caching

Enable prompt caching to save costs on repeated content:
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": "You are an AI assistant...",
        "cache_control": {"type": "ephemeral"}
    }],
    messages=[{
        "role": "user",
        "content": "Hello!"
    }]
)

# Check cache usage
print(f"Cache read tokens: {message.usage.cache_read_input_tokens}")
print(f"Cache creation tokens: {message.usage.cache_creation_input_tokens}")

Tool Use (Function Calling)

Define tools for Claude to use:
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[{
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name"
                }
            },
            "required": ["location"]
        }
    }],
    messages=[{
        "role": "user",
        "content": "What's the weather in San Francisco?"
    }]
)

# Handle tool use
if message.stop_reason == "tool_use":
    tool_use = next(block for block in message.content if block.type == "tool_use")
    print(f"Claude wants to call: {tool_use.name}")
    print(f"With arguments: {tool_use.input}")
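In a real workflow, the tool result is sent back so Claude can finish its answer. A sketch of such a loop, assuming the client and `get_weather` tool definition above; `run_tool` and its canned weather reply are placeholders for your own tool implementations:

```python
def run_tool(name: str, tool_input: dict) -> str:
    """Dispatch a tool call to a local implementation (placeholder logic)."""
    if name == "get_weather":
        return f"Sunny, 18C in {tool_input['location']}"
    raise ValueError(f"Unknown tool: {name}")

def tool_use_loop(client, tools, messages, model="claude-3-5-sonnet-20241022"):
    """Call the model until it stops requesting tools, feeding results back."""
    while True:
        message = client.messages.create(
            model=model, max_tokens=1024, tools=tools, messages=messages)
        if message.stop_reason != "tool_use":
            return message
        # Echo the assistant turn, then answer each tool_use with a tool_result
        messages.append({"role": "assistant", "content": message.content})
        results = [{
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": run_tool(block.name, block.input),
        } for block in message.content if block.type == "tool_use"]
        messages.append({"role": "user", "content": results})
```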

Gateway Features

The Anthropic endpoint supports all gateway features:

Routing

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "routing": {
            "strategy": "latency-optimized",
            "fallback": True
        }
    }
)

Request Metadata

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "user_id": "user-123",
        "session_id": "sess-456"
    }
)

Cost Tracking

All requests through the Anthropic endpoint are tracked in analytics with accurate cost attribution.
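The same token counts are also returned on every response, so you can log usage client-side alongside the dashboard. A minimal sketch; the field names match the SDK's `usage` object, while the `usage_summary` helper is ours:

```python
def usage_summary(usage) -> dict:
    """Collect token counts from a Messages API response's usage object."""
    return {
        "input_tokens": usage.input_tokens,
        "output_tokens": usage.output_tokens,
        # Cache fields may be absent when caching is not in play
        "cache_read_input_tokens": getattr(usage, "cache_read_input_tokens", 0) or 0,
        "cache_creation_input_tokens": getattr(usage, "cache_creation_input_tokens", 0) or 0,
    }

# After a call: usage_summary(message.usage)
```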

Differences from Native Anthropic API

Feature      | Native Anthropic           | Via Gateway
Endpoint     | https://api.anthropic.com  | https://gateway.visca.ai
API Key      | Anthropic API key          | Visca gateway key
Analytics    | Limited                    | Full dashboard
Routing      | Single provider            | Multi-provider fallback
Caching      | Prompt caching             | Prompt + response caching
Rate Limits  | Anthropic limits           | Configurable limits

Best Practices

Use System Messages

Define behavior with the system parameter, not in the first user message

Enable Caching

Cache large system prompts and documents; cache reads are billed at a fraction of the normal input rate and can cut input costs by up to 90%

Stream When Possible

Use streaming for better user experience on long responses

Handle Tool Use

Implement proper tool use loops for agentic workflows

Error Handling

from anthropic import APIStatusError, RateLimitError

try:
    message = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
except RateLimitError:
    print("Rate limit exceeded")
except APIStatusError as e:
    print(f"API error: {e.status_code} - {e.message}")

Migration from Direct Anthropic

1. Change Base URL: update base_url to https://gateway.visca.ai/v1
2. Update API Key: use your Visca gateway API key instead of your Anthropic key
3. Test Requests: verify all requests work as expected
4. Enable Features: add routing, metadata, and analytics as needed

Next Steps