
Overview

Reasoning models like OpenAI’s o1 and o3 use extended thinking time to solve complex problems through chain-of-thought processing, making them ideal for mathematics, coding, scientific reasoning, and strategic planning.

Extended Thinking

Models think step-by-step before answering

Complex Problem Solving

Excel at math, logic, code, and science

Transparent Reasoning

See the model’s thinking process

Higher Accuracy

Fewer errors on complex tasks

Supported Models

Model: o3 - Latest reasoning model
  • Highest accuracy on complex tasks
  • Extended context window (128k tokens)
  • $15 / 1M input tokens
  • Best for: Research, advanced mathematics, complex code generation

Basic Usage

import openai

client = openai.OpenAI(
    base_url="https://gateway.visca.ai/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="o1",
    messages=[{
        "role": "user",
        "content": """Solve this problem step by step:
A train travels from City A to City B at 60 mph,
then returns at 40 mph. What is the average speed for the entire trip?"""
    }]
)

print(response.choices[0].message.content)

Reasoning Tokens

Reasoning models use additional “reasoning tokens” for internal thinking:

response = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": "Complex math problem..."}]
)

# Check token usage
print(f"Input tokens: {response.usage.prompt_tokens}")
print(f"Reasoning tokens: {response.usage.completion_tokens_details.reasoning_tokens}")
print(f"Output tokens: {response.usage.completion_tokens}")
Reasoning tokens are counted separately and billed at a lower rate than output tokens.
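Because reasoning tokens are metered separately, you can estimate per-request cost directly from the usage object. The rates below are illustrative placeholders rather than published prices, and the sketch assumes the three token counts are reported as disjoint buckets; check the gateway’s pricing page before relying on the numbers:

```python
def estimate_cost(prompt_tokens, reasoning_tokens, completion_tokens,
                  input_rate=15.0, reasoning_rate=30.0, output_rate=60.0):
    """Estimate request cost in USD from token counts.

    Rates are per 1M tokens and are placeholder values, not actual
    gateway pricing. Assumes the three counts do not overlap.
    """
    return (
        prompt_tokens * input_rate
        + reasoning_tokens * reasoning_rate
        + completion_tokens * output_rate
    ) / 1_000_000


usage = {"prompt_tokens": 1000, "reasoning_tokens": 2000, "completion_tokens": 500}
print(f"Estimated cost: ${estimate_cost(**usage):.4f}")
```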

Use Cases

Mathematical Problem Solving

response = client.chat.completions.create(
    model="o1",
    messages=[{
        "role": "user",
        "content": """Solve this physics problem with full explanation:
        
        A 2kg object is thrown upward with initial velocity of 20 m/s.
        Ignoring air resistance, calculate:
        1. Maximum height reached
        2. Time to reach maximum height
        3. Total time in air
        4. Velocity when it returns to starting point
        
        Show all work and formulas used."""
    }]
)

Code Review and Debugging

response = client.chat.completions.create(
    model="o1-mini",
    messages=[{
        "role": "user",
        "content": """Review this Python code and fix any bugs:
        
        def quicksort(arr):
            if len(arr) <= 1:
                return arr
            pivot = arr[0]
            left = [x for x in arr if x < pivot]
            right = [x for x in arr if x > pivot]
            return quicksort(left) + [pivot] + quicksort(right)
        
        What's wrong and how can it be optimized?"""
    }]
)

Strategic Analysis

response = client.chat.completions.create(
    model="o3",
    messages=[{
        "role": "user",
        "content": """Analyze this chess position and suggest the best move:
        
        Position (FEN): rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq e6 0 2
        
        Provide:
        1. Best move in algebraic notation
        2. Strategic reasoning
        3. Alternative moves and why they're inferior
        4. Evaluation of the position"""
    }]
)

Data Analysis

response = client.chat.completions.create(
    model="o1",
    messages=[{
        "role": "user",
        "content": """Analyze this dataset and provide insights:
        
        Sales data:
        Q1: $1.2M, Q2: $1.5M, Q3: $1.1M, Q4: $2.3M
        
        Marketing spend:
        Q1: $200K, Q2: $250K, Q3: $180K, Q4: $400K
        
        Calculate:
        1. ROI for each quarter
        2. Correlation between marketing and sales
        3. Forecast for Q1 next year
        4. Recommendations for marketing budget allocation"""
    }]
)

Best Practices

1. Be Explicit About Reasoning: ask the model to “think step by step” or “show your work”.
2. Provide Context: include all relevant information and constraints.
3. Structured Output: request structured responses (numbered lists, tables).
4. Verify Results: cross-check critical calculations and logic.
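These practices can be baked into a small prompt builder so every request follows them. This helper and its defaults are our own convenience sketch, not part of the gateway API:

```python
def build_reasoning_prompt(problem, context="", output_format="numbered list"):
    """Assemble a prompt applying the practices above: explicit
    step-by-step reasoning, full context, and a structured output request.

    A convenience sketch; the function name and defaults are assumptions,
    not a gateway feature.
    """
    parts = ["Solve this problem step by step, showing your work."]
    if context:
        parts.append(f"Context and constraints:\n{context}")
    parts.append(f"Problem:\n{problem}")
    parts.append(f"Format the final answer as a {output_format}.")
    return "\n\n".join(parts)


prompt = build_reasoning_prompt(
    "What is the average speed for a 60 mph / 40 mph round trip?",
    context="Equal distances each way; answer in mph.",
)
```

Pass the returned string as the `content` of the user message.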

Limitations

Reasoning models:
  • Cannot use system messages (user/assistant roles only)
  • Don’t support streaming
  • Don’t support function calling
  • Have higher latency (5-30 seconds typical)
  • Cost more per token than standard models
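Because system messages are rejected, instructions that would normally go in a system message have to ride along in the user prompt. One way to adapt an existing message list, sketched as a hypothetical helper:

```python
def fold_system_messages(messages):
    """Merge any system messages into the first user turn so the
    message list contains only user/assistant roles.

    A sketch of one workaround; not a gateway-provided function.
    """
    system_text = "\n".join(
        m["content"] for m in messages if m["role"] == "system"
    )
    rest = [m for m in messages if m["role"] != "system"]
    if system_text and rest and rest[0]["role"] == "user":
        # Prepend the system instructions to the first user message.
        rest[0] = {
            "role": "user",
            "content": f"{system_text}\n\n{rest[0]['content']}",
        }
    return rest


messages = fold_system_messages([
    {"role": "system", "content": "Answer concisely."},
    {"role": "user", "content": "Explain quicksort."},
])
```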

Model Selection

Choose the right reasoning model:
| Task | Recommended Model | Why |
|------|-------------------|-----|
| Advanced research | o3 | Highest accuracy, worth the cost |
| General math/coding | o1 | Balanced performance and cost |
| Quick calculations | o1-mini | Fast and affordable |
| Production APIs | o1-mini | Lower latency and cost |
| One-off analysis | o3 | Best results for important decisions |
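If you route requests programmatically, the recommendations above can be encoded as a simple lookup. The task labels here are our own invention; only the model names come from this page:

```python
# Hypothetical routing table based on the recommendations above;
# the category keys are assumptions, not a gateway API.
TASK_TO_MODEL = {
    "advanced-research": "o3",
    "general-math-coding": "o1",
    "quick-calculation": "o1-mini",
    "production-api": "o1-mini",
    "one-off-analysis": "o3",
}


def choose_reasoning_model(task, default="o1"):
    """Return the recommended model for a task category,
    falling back to a balanced default for unknown tasks."""
    return TASK_TO_MODEL.get(task, default)
```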

Cost Optimization

# Use O1-Mini for simpler reasoning tasks
response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "Simple math problem"}]
)

# Use O1 for complex tasks
response = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": "Complex multi-step analysis"}]
)

# Enable automatic model selection
response = client.chat.completions.create(
    model="reasoning-auto",  # Gateway selects o1-mini, o1, or o3
    messages=[{"role": "user", "content": "Your problem"}],
    extra_body={
        "routing": {
            "strategy": "cost-optimized"
        }
    }
)

Monitoring Reasoning

Track reasoning performance:

response = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": "Complex problem"}]
)

# Log reasoning metrics
print(f"Reasoning tokens: {response.usage.completion_tokens_details.reasoning_tokens}")
print(f"Reasoning time: {response.usage.completion_tokens_details.reasoning_time_ms}ms")
print(f"Finish reason: {response.choices[0].finish_reason}")

Next Steps

Function Calling

Use standard models with function calling for structured tasks

Custom Providers

Integrate your own reasoning models

Analytics

Track reasoning model performance and costs

Rate Limits

Understanding reasoning model rate limits