
Overview

Reasoning models like OpenAI's o1 and o3 use extended thinking time to solve complex problems through chain-of-thought processing, making them ideal for mathematics, coding, scientific reasoning, and strategic planning.

Key capabilities:

  • Extended Thinking: models think step by step before answering
  • Complex Problem Solving: excel at math, logic, code, and science
  • Transparent Reasoning: see the model's thinking process
  • Higher Accuracy: fewer errors on complex tasks

Supported Models

  • o3: latest reasoning model with the highest accuracy on complex tasks; extended context window (128k tokens); $15 / 1M input tokens. Best for: research, advanced mathematics, complex code generation.
  • o1
  • o1-mini

Basic Usage

import openai

client = openai.OpenAI(
    base_url="https://gateway.visca.ai/v1",
    api_key="your-api-key"
)

response = client.chat.completions.create(
    model="o1",
    messages=[{
        "role": "user",
        "content": """Solve this problem step by step:
A train travels from City A to City B at 60 mph,
then returns at 40 mph. What is the average speed for the entire trip?"""
    }]
)

print(response.choices[0].message.content)

Reasoning Tokens

Reasoning models use additional “reasoning tokens” for internal thinking:
response = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": "Complex math problem..."}]
)

# Check token usage
print(f"Input tokens: {response.usage.prompt_tokens}")
print(f"Reasoning tokens: {response.usage.completion_tokens_details.reasoning_tokens}")
print(f"Output tokens: {response.usage.completion_tokens}")
Reasoning tokens are counted separately and billed at a lower rate than output tokens.
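To budget for this, you can turn the usage fields above into a rough cost estimate. The rates below are placeholders for illustration (only the $15 / 1M o3 input rate appears earlier on this page; the output and reasoning rates are assumed), so substitute the real numbers from your pricing page:

```python
# Hypothetical per-token rates for illustration only; replace with the
# real rates from the gateway pricing page.
INPUT_RATE = 15.00 / 1_000_000      # $ per input token (o3 example rate)
OUTPUT_RATE = 60.00 / 1_000_000     # assumed $ per output token
REASONING_RATE = 30.00 / 1_000_000  # assumed discounted reasoning rate

def estimate_cost(prompt_tokens, reasoning_tokens, output_tokens):
    """Rough request cost from the three usage counts printed above."""
    return (prompt_tokens * INPUT_RATE
            + reasoning_tokens * REASONING_RATE
            + output_tokens * OUTPUT_RATE)

# 1,000 input + 4,000 reasoning + 500 output tokens
print(round(estimate_cost(1_000, 4_000, 500), 3))  # 0.165
```

Note that reasoning-heavy prompts can make the reasoning term dominate the total, which is why the token breakdown is worth logging.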

Use Cases

Mathematical and scientific problems:
response = client.chat.completions.create(
    model="o1",
    messages=[{
        "role": "user",
        "content": """Solve this physics problem with full explanation:
        
        A 2kg object is thrown upward with initial velocity of 20 m/s.
        Ignoring air resistance, calculate:
        1. Maximum height reached
        2. Time to reach maximum height
        3. Total time in air
        4. Velocity when it returns to starting point
        
        Show all work and formulas used."""
    }]
)

Code review and debugging:
response = client.chat.completions.create(
    model="o1-mini",
    messages=[{
        "role": "user",
        "content": """Review this Python code and fix any bugs:
        
        def quicksort(arr):
            if len(arr) <= 1:
                return arr
            pivot = arr[0]
            left = [x for x in arr if x < pivot]
            right = [x for x in arr if x > pivot]
            return quicksort(left) + [pivot] + quicksort(right)
        
        What's wrong and how can it be optimized?"""
    }]
)

Strategic analysis:
response = client.chat.completions.create(
    model="o3",
    messages=[{
        "role": "user",
        "content": """Analyze this chess position and suggest the best move:
        
        Position (FEN): rnbqkbnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq e6 0 2
        
        Provide:
        1. Best move in algebraic notation
        2. Strategic reasoning
        3. Alternative moves and why they're inferior
        4. Evaluation of the position"""
    }]
)

Data analysis:
response = client.chat.completions.create(
    model="o1",
    messages=[{
        "role": "user",
        "content": """Analyze this dataset and provide insights:
        
        Sales data:
        Q1: $1.2M, Q2: $1.5M, Q3: $1.1M, Q4: $2.3M
        
        Marketing spend:
        Q1: $200K, Q2: $250K, Q3: $180K, Q4: $400K
        
        Calculate:
        1. ROI for each quarter
        2. Correlation between marketing and sales
        3. Forecast for Q1 next year
        4. Recommendations for marketing budget allocation"""
    }]
)

Best Practices

1. Be Explicit About Reasoning: ask the model to "think step by step" or "show your work".
2. Provide Context: include all relevant information and constraints.
3. Structured Output: request structured responses (numbered lists, tables).
4. Verify Results: cross-check critical calculations and logic.
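These practices can be folded into a small prompt-building helper. This is purely illustrative; the function and its parameters are our own names, not part of any API:

```python
# Illustrative helper that applies the practices above when composing
# a prompt. The function name and parameters are our own conventions.
def build_reasoning_prompt(task, context, output_format="numbered list"):
    """Compose a prompt with explicit reasoning and format instructions."""
    return (
        f"{task}\n\n"
        f"Context and constraints:\n{context}\n\n"
        "Think step by step and show your work.\n"
        f"Present the final answer as a {output_format}."
    )

prompt = build_reasoning_prompt(
    task="Compute the average speed for the round trip.",
    context="Outbound leg: 60 mph. Return leg: 40 mph. Same route both ways.",
)
print(prompt)
```

The output can then be passed as the `content` of a user message, as in the examples above.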

Limitations

Reasoning models:

  • Cannot use system messages (user/assistant roles only)
  • Don't support streaming
  • Don't support function calling
  • Have higher latency (5-30 seconds typical)
  • Cost more per token than standard models
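Since system messages are not accepted, one common workaround is to fold would-be system instructions into the first user message. A minimal sketch (the helper name is our own):

```python
# Fold system-style instructions into the first user message, since
# reasoning models accept only user/assistant roles.
def fold_system_prompt(system_text, messages):
    """Return a copy of messages with system_text prepended to the
    first user message."""
    merged = [dict(m) for m in messages]
    for m in merged:
        if m["role"] == "user":
            m["content"] = f"{system_text}\n\n{m['content']}"
            break
    return merged

messages = fold_system_prompt(
    "You are a careful math tutor. Show all work.",
    [{"role": "user", "content": "What is the derivative of x**3?"}],
)
```

The merged list can then be passed directly as the `messages` argument.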

Model Selection

Choose the right reasoning model:
Task                  Recommended Model   Why
Advanced research     o3                  Highest accuracy, worth the cost
General math/coding   o1                  Balanced performance and cost
Quick calculations    o1-mini             Fast and affordable
Production APIs       o1-mini             Lower latency and cost
One-off analysis      o3                  Best results for important decisions
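The selection table can also be encoded in application code. The task labels below are our own shorthand, not gateway parameters:

```python
# Map task categories from the table above to model IDs. The category
# names are our own shorthand, not gateway parameters.
MODEL_BY_TASK = {
    "research": "o3",
    "math": "o1",
    "coding": "o1",
    "quick": "o1-mini",
    "production": "o1-mini",
}

def pick_model(task, default="o1"):
    """Return the recommended model for a task category, falling back
    to a balanced default for unknown categories."""
    return MODEL_BY_TASK.get(task, default)

print(pick_model("quick"))       # o1-mini
print(pick_model("unlabeled"))   # o1 (default)
```

A lookup like this keeps model choice in one place, so pricing or model changes need only a single edit.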

Cost Optimization

# Use O1-Mini for simpler reasoning tasks
response = client.chat.completions.create(
    model="o1-mini",
    messages=[{"role": "user", "content": "Simple math problem"}]
)

# Use O1 for complex tasks
response = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": "Complex multi-step analysis"}]
)

# Enable automatic model selection
response = client.chat.completions.create(
    model="reasoning-auto",  # Gateway selects o1-mini, o1, or o3
    messages=[{"role": "user", "content": "Your problem"}],
    extra_body={
        "routing": {
            "strategy": "cost-optimized"
        }
    }
)

Monitoring Reasoning

Track reasoning performance:
response = client.chat.completions.create(
    model="o1",
    messages=[{"role": "user", "content": "Complex problem"}]
)

# Log reasoning metrics
print(f"Reasoning tokens: {response.usage.completion_tokens_details.reasoning_tokens}")
print(f"Reasoning time: {response.usage.completion_tokens_details.reasoning_time_ms}ms")
print(f"Accuracy confidence: {response.choices[0].finish_reason}")

Next Steps