Welcome

This guide will help you make your first API request to Visca AI Gateway. You’ll learn how to:
  • Obtain your API key
  • Make your first request
  • Use streaming responses
  • Handle errors effectively
Prerequisites: You’ll need an account at gateway.visca.ai or a self-hosted instance. See the Self-Host Guide for deployment options.

Step 1: Get Your API Key

1. Sign up or log in

Visit gateway.visca.ai and create an account or log in to your existing account.

2. Navigate to API Keys

Go to the API Keys section in your dashboard.

3. Generate a new key

Click Create API Key, give it a descriptive name, and optionally set usage limits.

Copy your API key immediately; it won't be shown again. Store it securely in your environment variables or secrets manager.

4. Set environment variable

Add your API key to your environment:

export VISCA_API_KEY="vsk_your_api_key_here"
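To verify the key before writing any code, you can call the models endpoint directly (see Available Models below). Since the gateway is OpenAI-compatible, this sketch assumes standard Bearer authentication:

curl https://api.visca.ai/v1/models \
  -H "Authorization: Bearer $VISCA_API_KEY"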

Step 2: Install SDK (Optional)

While Visca AI Gateway is fully compatible with OpenAI's SDKs, you can use any HTTP client. OpenAI SDKs are available for Python, JavaScript/TypeScript, Go, and Ruby; the examples in this guide use Python. To install the OpenAI Python SDK:

pip install openai

Step 3: Make Your First Request

Choose your preferred language and make your first request:
from openai import OpenAI

# Initialize the client
client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key="vsk_your_api_key_here"  # Use environment variable in production
)

# Create a chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=150
)

# Print the response
print(response.choices[0].message.content)
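Because responses follow the OpenAI schema, token usage is returned on the same object, which is useful for keeping an eye on costs:

# Inspect token usage for this request
usage = response.usage
print(f"Prompt tokens: {usage.prompt_tokens}")
print(f"Completion tokens: {usage.completion_tokens}")
print(f"Total tokens: {usage.total_tokens}")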

Step 4: Try Streaming Responses

For real-time applications, use streaming to receive responses as they’re generated:
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key=os.environ.get("VISCA_API_KEY")
)

# Stream the response
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem about AI."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
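The same pattern works in async code with the SDK's AsyncOpenAI client; a minimal sketch:

import asyncio
import os

from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(
        base_url="https://api.visca.ai/v1",
        api_key=os.environ.get("VISCA_API_KEY")
    )
    # stream=True yields chunks as they arrive, same as the sync client
    stream = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Write a short poem about AI."}],
        stream=True
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")

asyncio.run(main())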

Available Models

Visca AI Gateway supports 50+ models across multiple providers, including OpenAI, Anthropic, Google, and popular open-source models. Some popular OpenAI options:
  • gpt-4o - Most capable, multimodal model
  • gpt-4o-mini - Fast and affordable
  • gpt-4-turbo - Previous generation flagship
  • gpt-3.5-turbo - Fast and cost-effective
  • dall-e-3 - Image generation
To see all available models, make a request to the /v1/models endpoint or check your dashboard.
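With the OpenAI SDK, the same endpoint is exposed as client.models.list(); reusing the client from Step 3:

# Print the ID of every model the gateway exposes
for model in client.models.list():
    print(model.id)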

Using Different Providers

Simply change the model name to use a different provider:
# OpenAI
response = client.chat.completions.create(model="gpt-4o", ...)

# Anthropic
response = client.chat.completions.create(model="claude-3-5-sonnet-20241022", ...)

# Google
response = client.chat.completions.create(model="gemini-2.0-flash-exp", ...)

# Open source via Groq (ultra-fast)
response = client.chat.completions.create(model="llama-3.1-70b", ...)
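Because every provider sits behind the same API, client-side fallback is just a loop over model names. The helper below is a hypothetical sketch, not a gateway feature; the built-in auto-failover and routing strategies mentioned later are the more robust option:

from openai import APIError

def complete_with_fallback(client, messages,
                           models=("gpt-4o", "claude-3-5-sonnet-20241022", "gemini-2.0-flash-exp")):
    # Try each model in order; APIError is the SDK's base exception class
    last_error = None
    for model in models:
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except APIError as e:
            last_error = e
    raise last_error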

Request Metadata

Track requests with custom metadata for analytics and cost allocation:
import json

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={
        "X-Visca-Metadata": json.dumps({
            "user_id": "user_123",
            "app_name": "my_app",
            "environment": "production"
        })
    }
)
Learn more about metadata tracking.

Error Handling

Always implement proper error handling for production applications:
import os

from openai import OpenAI, APIError, RateLimitError, APIConnectionError

client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key=os.environ.get("VISCA_API_KEY")
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")

except APIConnectionError as e:
    print(f"Connection error: {e}")

except APIError as e:
    print(f"API error: {e}")

Common Error Codes

| Status Code | Meaning | Solution |
| --- | --- | --- |
| 400 | Bad Request | Check your request parameters |
| 401 | Unauthorized | Verify your API key is correct |
| 403 | Forbidden | Check API key permissions and rate limits |
| 429 | Rate Limit | Implement exponential backoff and retry |
| 500 | Server Error | Retry with exponential backoff |
| 503 | Service Unavailable | Provider is down, will auto-failover if configured |
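In the Python SDK, these statuses surface as subclasses of APIStatusError, whose status_code attribute lets you branch on the table above. A minimal sketch, reusing the client from the previous example:

from openai import APIStatusError

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except APIStatusError as e:
    # e.status_code carries the HTTP status from the table above
    if e.status_code in (429, 500, 503):
        print(f"Transient error {e.status_code}: retry with backoff")
    else:
        print(f"Client error {e.status_code}: {e.message}")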

Best Practices

API Key Security

  • Never hardcode API keys in your source code
  • Use environment variables or secrets management
  • Rotate keys regularly
  • Set up usage limits per key
  • Use different keys for development and production

Retry Logic

Wrap requests in retry logic with exponential backoff so transient failures don't bubble up to users:
import time
from openai import OpenAI, APIError

def make_request_with_retry(client, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello!"}]
            )
        except APIError as e:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt  # Exponential backoff
            time.sleep(wait_time)
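Calling the helper is identical to a plain request; for example, with the client configured above:

response = make_request_with_retry(client)
print(response.choices[0].message.content)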
Monitoring

  • Use the dashboard to track requests, costs, and latency
  • Set up alerts for unusual spending patterns
  • Add metadata to requests for detailed analytics
  • Review cost reports regularly
Performance

  • Use streaming for real-time applications
  • Set appropriate max_tokens to control costs
  • Choose the right model for your use case (cost vs. capability)
  • Enable caching for repeated queries
  • Use routing strategies for optimal latency/cost
