## Overview
Visca AI Gateway provides native support for the Anthropic Messages API, allowing you to use Claude models with their native format while benefiting from gateway features like routing, caching, and analytics.
## Endpoint

```
POST https://gateway.visca.ai/v1/messages
```
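If you are not using the SDK, the same endpoint can be called with a plain HTTP POST. A minimal sketch of assembling the request by hand; the header names follow the standard Anthropic convention and should be confirmed against your gateway configuration:

```python
import json

# Request pieces for a direct HTTP call to the gateway's Messages endpoint.
# Header names follow the Anthropic convention (x-api-key, anthropic-version);
# verify them against your gateway configuration before relying on this.
url = "https://gateway.visca.ai/v1/messages"
headers = {
    "x-api-key": "your-visca-api-key",   # gateway key, not an Anthropic key
    "anthropic-version": "2023-06-01",
    "content-type": "application/json",
}
payload = {
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, Claude!"}],
}

body = json.dumps(payload)
print(body)
```

Send it with any HTTP client, e.g. `requests.post(url, headers=headers, data=body)`.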
## Basic Usage

```python
import anthropic

client = anthropic.Anthropic(
    base_url="https://gateway.visca.ai/v1",
    api_key="your-visca-api-key",  # Not your Anthropic key
)

message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello, Claude!"}],
)

print(message.content[0].text)
```
## Supported Models

All Claude models are available:

- `claude-3-5-sonnet-20241022` (latest, recommended)
- `claude-3-5-haiku-20241022`
- `claude-3-opus-20240229`
- `claude-3-sonnet-20240229`
- `claude-3-haiku-20240307`
## Streaming

Stream responses for real-time output:

```python
with client.messages.stream(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a story"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
## System Messages

Claude uses a separate `system` parameter rather than a system role inside `messages`:

```python
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system="You are a helpful AI assistant specializing in Python programming.",
    messages=[{"role": "user", "content": "How do I read a CSV file?"}],
)
```
## Vision

Send images with Claude 3 models:

```python
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "url", "url": "https://example.com/image.jpg"},
            },
            {"type": "text", "text": "What's in this image?"},
        ],
    }],
)
```
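Besides URL sources, the Messages API also accepts images inline as base64. A small sketch of building such a content block; the placeholder bytes below stand in for a real file read with `open(path, "rb").read()`:

```python
import base64

def image_block_from_bytes(data: bytes, media_type: str = "image/jpeg") -> dict:
    """Build an inline (base64) image content block for the Messages API."""
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(data).decode("ascii"),
        },
    }

# Placeholder bytes stand in for real image data from disk.
block = image_block_from_bytes(b"\xff\xd8\xff\xe0fake-jpeg-bytes")
print(block["source"]["type"])  # base64
```

The resulting dict drops into the `content` list exactly where the URL-source block sits in the example above.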
## Prompt Caching

Enable prompt caching to save costs on repeated content:

```python
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": "You are an AI assistant...",
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "Hello!"}],
)

# Check cache usage
print(f"Cache read tokens: {message.usage.cache_read_input_tokens}")
print(f"Cache creation tokens: {message.usage.cache_creation_input_tokens}")
```
## Tool Use

Define tools for Claude to use:

```python
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    tools=[{
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"},
            },
            "required": ["location"],
        },
    }],
    messages=[{"role": "user", "content": "What's the weather in San Francisco?"}],
)

# Handle tool use
if message.stop_reason == "tool_use":
    tool_use = next(block for block in message.content if block.type == "tool_use")
    print(f"Claude wants to call: {tool_use.name}")
    print(f"With arguments: {tool_use.input}")
```
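To complete the loop, run the tool yourself and return the output in a `tool_result` block on a follow-up request. A minimal sketch using plain dicts; the tool-use values below are stand-ins for what the first response actually returned, and the weather string stands in for a real `get_weather` implementation:

```python
def tool_result_messages(prior_messages, assistant_content, tool_use_id, result):
    """Extend the conversation with the assistant's tool call and our result."""
    return prior_messages + [
        {"role": "assistant", "content": assistant_content},
        {
            "role": "user",
            "content": [{
                "type": "tool_result",
                "tool_use_id": tool_use_id,
                "content": result,
            }],
        },
    ]

# Stand-in for message.content from the first response.
assistant_content = [{
    "type": "tool_use",
    "id": "toolu_123",
    "name": "get_weather",
    "input": {"location": "San Francisco"},
}]

followup = tool_result_messages(
    [{"role": "user", "content": "What's the weather in San Francisco?"}],
    assistant_content,
    "toolu_123",
    "65°F and sunny",  # whatever your real tool returned
)
print(len(followup))  # 3 messages: user, assistant, tool result
```

Pass `followup` as `messages` (with the same `tools` list) in a second `client.messages.create(...)` call so Claude can compose its final answer.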
## Gateway Features
The Anthropic endpoint supports all gateway features:
### Routing

```python
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "routing": {
            "strategy": "latency-optimized",
            "fallback": True,
        },
    },
)
```
### Metadata

Attach user and session metadata for analytics:

```python
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello"}],
    metadata={
        "user_id": "user-123",
        "session_id": "sess-456",
    },
)
```
### Cost Tracking
All requests through the Anthropic endpoint are tracked in analytics with accurate cost attribution.
## Differences from Native Anthropic API

| Feature | Native Anthropic | Via Gateway |
| --- | --- | --- |
| Endpoint | `https://api.anthropic.com` | `https://gateway.visca.ai` |
| API Key | Anthropic API key | Visca gateway key |
| Analytics | Limited | Full dashboard |
| Routing | Single provider | Multi-provider fallback |
| Caching | Prompt caching | Prompt caching + response caching |
| Rate Limits | Anthropic limits | Configurable limits |
## Best Practices

- **Use system messages**: define behavior with the `system` parameter, not in the first user message.
- **Enable caching**: cache large system prompts and documents to cut their input cost by up to 90%.
- **Stream when possible**: streaming gives a better user experience on long responses.
- **Handle tool use**: implement proper tool-use loops for agentic workflows.
## Error Handling

```python
from anthropic import APIError, RateLimitError

try:
    message = client.messages.create(
        model="claude-3-5-sonnet-20241022",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}],
    )
except RateLimitError:
    print("Rate limit exceeded")
except APIError as e:
    print(f"API error: {e.status_code} - {e.message}")
```
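Rate-limit errors are usually transient, so a retry with exponential backoff is often all that's needed. A hedged sketch, written generically over exception types so you can pass in `RateLimitError` from the `anthropic` package:

```python
import time

def with_backoff(fn, retriable, max_attempts=5, base_delay=1.0):
    """Call fn(), retrying on `retriable` exceptions with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retriable:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error
            time.sleep(base_delay * (2 ** attempt))

# Usage against the gateway (RateLimitError imported from anthropic):
# message = with_backoff(
#     lambda: client.messages.create(
#         model="claude-3-5-sonnet-20241022",
#         max_tokens=1024,
#         messages=[{"role": "user", "content": "Hello"}],
#     ),
#     retriable=RateLimitError,
# )
```

Tune `max_attempts` and `base_delay` to your gateway's configured rate limits.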
## Migration from Direct Anthropic

1. **Change base URL**: update `base_url` to `https://gateway.visca.ai/v1`.
2. **Update API key**: use your Visca gateway API key instead of your Anthropic key.
3. **Test requests**: verify all requests work as expected.
4. **Enable features**: add routing, metadata, and analytics as needed.
## Next Steps

- **OpenAI Endpoint**: use the OpenAI format with cross-provider routing.
- **Model Routing**: route between Claude and other models.
- **Prompt Caching**: advanced caching strategies.
- **Tool Use Guide**: build agents with Claude's tool use.