Welcome

This guide will help you make your first API request to Visca AI Gateway. You’ll learn how to:
  • Obtain your API key
  • Make your first request
  • Use streaming responses
  • Handle errors effectively
Prerequisites: You’ll need an account at gateway.visca.ai or a self-hosted instance. See the Self-Host Guide for deployment options.

Step 1: Get Your API Key

1. Sign up or log in

Visit gateway.visca.ai and create an account or log in to your existing account.

2. Navigate to API Keys

Go to the API Keys section in your dashboard.

3. Generate a new key

Click Create API Key, give it a descriptive name, and optionally set usage limits.

Copy your API key immediately; it won't be shown again. Store it securely in your environment variables or secrets manager.

4. Set environment variable

Add your API key to your environment:

export VISCA_API_KEY="vsk_your_api_key_here"
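To verify the key before writing any code, you can call the models endpoint directly (see Available Models below). Since the gateway is OpenAI-compatible, this sketch assumes standard Bearer authentication:

curl https://api.visca.ai/v1/models \
  -H "Authorization: Bearer $VISCA_API_KEY"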

Step 2: Install SDK (Optional)

While Visca AI Gateway is fully compatible with OpenAI's SDKs, you can use any HTTP client. OpenAI SDKs are available for Python, JavaScript/TypeScript, Go, and Ruby; the examples in this guide use Python. To install the OpenAI Python SDK:

pip install openai

Step 3: Make Your First Request

Choose your preferred language and make your first request:
from openai import OpenAI

# Initialize the client
client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key="vsk_your_api_key_here"  # Use environment variable in production
)

# Create a chat completion
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing in simple terms."}
    ],
    max_tokens=150
)

# Print the response
print(response.choices[0].message.content)
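Because responses follow the OpenAI schema, token usage is returned on the same object, which is useful for keeping an eye on costs:

# Inspect token usage for this request
usage = response.usage
print(f"Prompt tokens: {usage.prompt_tokens}")
print(f"Completion tokens: {usage.completion_tokens}")
print(f"Total tokens: {usage.total_tokens}")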

Step 4: Try Streaming Responses

For real-time applications, use streaming to receive responses as they’re generated:
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key=os.environ.get("VISCA_API_KEY")
)

# Stream the response
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a short poem about AI."}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="")
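The same pattern works in async code with the SDK's AsyncOpenAI client; a minimal sketch:

import asyncio
import os

from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI(
        base_url="https://api.visca.ai/v1",
        api_key=os.environ.get("VISCA_API_KEY")
    )
    # stream=True yields chunks as they arrive, same as the sync client
    stream = await client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Write a short poem about AI."}],
        stream=True
    )
    async for chunk in stream:
        if chunk.choices[0].delta.content is not None:
            print(chunk.choices[0].delta.content, end="")

asyncio.run(main())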

Available Models

Visca AI Gateway supports 50+ models across multiple providers, including OpenAI, Anthropic, Google, and popular open-source models. Some popular OpenAI options:
  • gpt-4o - Most capable, multimodal model
  • gpt-4o-mini - Fast and affordable
  • gpt-4-turbo - Previous generation flagship
  • gpt-3.5-turbo - Fast and cost-effective
  • dall-e-3 - Image generation
To see all available models, make a request to the /v1/models endpoint or check your dashboard.
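With the OpenAI SDK, the same endpoint is exposed as client.models.list(); reusing the client from Step 3:

# Print the ID of every model the gateway exposes
for model in client.models.list():
    print(model.id)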

Using Different Providers

Simply change the model name to use a different provider:
# OpenAI
response = client.chat.completions.create(model="gpt-4o", ...)

# Anthropic
response = client.chat.completions.create(model="claude-3-5-sonnet-20241022", ...)

# Google
response = client.chat.completions.create(model="gemini-2.0-flash-exp", ...)

# Open source via Groq (ultra-fast)
response = client.chat.completions.create(model="llama-3.1-70b", ...)
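Because every provider sits behind the same API, client-side fallback is just a loop over model names. The helper below is a hypothetical sketch, not a gateway feature; the built-in auto-failover and routing strategies mentioned later are the more robust option:

from openai import APIError

def complete_with_fallback(client, messages,
                           models=("gpt-4o", "claude-3-5-sonnet-20241022", "gemini-2.0-flash-exp")):
    # Try each model in order; APIError is the SDK's base exception class
    last_error = None
    for model in models:
        try:
            return client.chat.completions.create(model=model, messages=messages)
        except APIError as e:
            last_error = e
    raise last_error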

Request Metadata

Track requests with custom metadata for analytics and cost allocation:
import json

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    extra_headers={
        "X-Visca-Metadata": json.dumps({
            "user_id": "user_123",
            "app_name": "my_app",
            "environment": "production"
        })
    }
)
Learn more about metadata tracking.

Error Handling

Always implement proper error handling for production applications:
import os

from openai import OpenAI, APIError, RateLimitError, APIConnectionError

client = OpenAI(
    base_url="https://api.visca.ai/v1",
    api_key=os.environ.get("VISCA_API_KEY")
)

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
    print(response.choices[0].message.content)

except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")

except APIConnectionError as e:
    print(f"Connection error: {e}")

except APIError as e:
    print(f"API error: {e}")

Common Error Codes

| Status Code | Meaning | Solution |
| --- | --- | --- |
| 400 | Bad Request | Check your request parameters |
| 401 | Unauthorized | Verify your API key is correct |
| 403 | Forbidden | Check API key permissions and rate limits |
| 429 | Rate Limit | Implement exponential backoff and retry |
| 500 | Server Error | Retry with exponential backoff |
| 503 | Service Unavailable | Provider is down, will auto-failover if configured |
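In the Python SDK, these statuses surface as subclasses of APIStatusError, whose status_code attribute lets you branch on the table above. A minimal sketch, reusing the client from the previous example:

from openai import APIStatusError

try:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Hello!"}]
    )
except APIStatusError as e:
    # e.status_code carries the HTTP status from the table above
    if e.status_code in (429, 500, 503):
        print(f"Transient error {e.status_code}: retry with backoff")
    else:
        print(f"Client error {e.status_code}: {e.message}")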

Best Practices

API Key Security

  • Never hardcode API keys in your source code
  • Use environment variables or secrets management
  • Rotate keys regularly
  • Set up usage limits per key
  • Use different keys for development and production

Retry Logic

Wrap requests in retry logic with exponential backoff so transient failures don't bubble up to users:
import time
from openai import OpenAI, APIError

def make_request_with_retry(client, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=[{"role": "user", "content": "Hello!"}]
            )
        except APIError as e:
            if attempt == max_retries - 1:
                raise
            wait_time = 2 ** attempt  # Exponential backoff
            time.sleep(wait_time)
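Calling the helper is identical to a plain request; for example, with the client configured above:

response = make_request_with_retry(client)
print(response.choices[0].message.content)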
Monitoring

  • Use the dashboard to track requests, costs, and latency
  • Set up alerts for unusual spending patterns
  • Add metadata to requests for detailed analytics
  • Review cost reports regularly
Performance

  • Use streaming for real-time applications
  • Set appropriate max_tokens to control costs
  • Choose the right model for your use case (cost vs. capability)
  • Enable caching for repeated queries
  • Use routing strategies for optimal latency/cost
