Thinking mode
Thinking mode is a feature of the mira-pro and mira-max models: the model works through an internal chain of thought before generating the final response. This improves accuracy on tasks that require logical reasoning, mathematical computation, or multi-step analysis.
How it works
When you send a request to mira-pro or mira-max, the model goes through two stages:
- Thinking stage: the model generates an internal reasoning trace, breaking the problem into sub-tasks, verifying intermediate results, and exploring alternative approaches
- Response stage: the model uses that reasoning to produce the final response, keeping only the conclusions it reached
Thinking tokens are included in usage.completion_tokens and billed as output tokens. You can see the reasoning trace in the API response via the thinking_content field.
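As a minimal sketch, the fields named above can be pulled out of a parsed response as follows. The helper name `readThinkingResponse` and the fallback values for missing fields are assumptions made for illustration, not part of the API; the sample object simply mirrors the field names used on this page.

```javascript
// Split a parsed chat completion response into answer, reasoning trace, and token usage.
// Hypothetical helper: the field names come from this page, the fallbacks are assumptions.
function readThinkingResponse(data) {
  const message = data?.choices?.[0]?.message ?? {};
  return {
    answer: message.content ?? "",
    thinking: message.thinking_content ?? null, // null when no trace is present
    completionTokens: data?.usage?.completion_tokens ?? 0, // includes thinking tokens
  };
}

// Sample payload shaped like the fields described above.
const sample = {
  choices: [{
    message: {
      role: "assistant",
      content: "The sum of angles in a triangle is 180°.",
      thinking_content: "Let me consider the parallel lines approach...",
    },
  }],
  usage: { prompt_tokens: 28, completion_tokens: 1450, total_tokens: 1478 },
};

const result = readThinkingResponse(sample);
console.log(result.answer);
console.log(result.completionTokens); // 1450
```

Keeping the trace separate from the answer makes it easy to log the reasoning for debugging while showing users only the final response.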
When to use thinking mode
Using via API
Thinking mode is automatically available when using the mira-pro and mira-max models. No additional parameters are required.
const response = await fetch("https://api.vmira.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-mira-YOUR_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "mira-pro",
    messages: [{
      role: "user",
      content: "Prove that the sum of angles in a triangle is 180 degrees"
    }],
  }),
});
Reading thinking output
The API response contains both the final answer (content) and the reasoning chain (thinking_content):
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The sum of angles in a triangle is 180°. Here is a formal proof...",
      "thinking_content": "I need to prove this rigorously. Let me consider the parallel lines approach..."
    }
  }],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 1450,
    "total_tokens": 1478
  }
}
Comparison: with and without thinking
Without thinking (mira)
User: What is 17 × 23 + 156 ÷ 12 - 89?
mira response: 17 × 23 = 391, 156 ÷ 12 = 13, 391 + 13 - 89 = 315
With thinking (mira-pro)
[Thinking]
Let me break this expression into parts following order of operations (PEMDAS):
1. Multiplication: 17 × 23 = 391
2. Division: 156 ÷ 12 = 13
3. Addition: 391 + 13 = 404
4. Subtraction: 404 - 89 = 315
Let me verify:
17×23 = 17×20 + 17×3 = 340 + 51 = 391 ✓
156÷12 = 13 ✓ (since 12×13 = 156)
391 + 13 = 404, 404 - 89 = 315 ✓

[Final answer]
17 × 23 + 156 ÷ 12 - 89 = 315
Computation steps:
• 17 × 23 = 391
• 156 ÷ 12 = 13
• 391 + 13 - 89 = 315
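The arithmetic in both transcripts can be checked independently of either model. The snippet below is only a verification aid and is not part of the API; it recomputes the expression in the same order the reasoning trace uses.

```javascript
// Recompute 17 × 23 + 156 ÷ 12 - 89 step by step, following operator precedence.
const product = 17 * 23;         // multiplication first: 391
const quotient = 156 / 12;       // then division: 13
const sum = product + quotient;  // 391 + 13 = 404
const total = sum - 89;          // 404 - 89 = 315

console.log(total); // 315
```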
Thinking token costs
Thinking tokens are available in mira-pro and mira-max and are billed at the output token rate of the respective model (300 ₽ / 1M for mira-pro, 750 ₽ / 1M for mira-max). A typical thinking request uses between 200 and 2,000 additional tokens for the reasoning chain.
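To make the pricing concrete, here is a rough sketch of estimating the output cost of a request from `usage.completion_tokens`, using the rates quoted above. The function name `estimateOutputCostRub` and the rate table constant are hypothetical names chosen for illustration; the estimate covers output tokens only and ignores prompt tokens.

```javascript
// Output token rates in rubles per 1M tokens, as listed on this page.
const OUTPUT_RATE_RUB_PER_MILLION = {
  "mira-pro": 300,
  "mira-max": 750,
};

// Hypothetical helper: estimate the output cost (thinking + answer tokens) in rubles.
function estimateOutputCostRub(model, completionTokens) {
  const rate = OUTPUT_RATE_RUB_PER_MILLION[model];
  if (rate === undefined) {
    throw new Error(`No output rate listed for model: ${model}`);
  }
  return (completionTokens * rate) / 1_000_000;
}

// Upper end of the typical reasoning overhead: 2,000 extra tokens.
console.log(estimateOutputCostRub("mira-pro", 2000)); // 0.6
console.log(estimateOutputCostRub("mira-max", 2000)); // 1.5
```

At these rates, even the upper end of the typical reasoning overhead adds well under 2 ₽ per request.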
Limitations
- Latency — response time is 2-5x longer than the standard mira model due to the additional reasoning stage
- Context — 32K context window — for large documents use mira-pro or mira-max
- Streaming — when streaming, thinking_content is sent before the main response
- Not for every task — thinking mode doesn't improve knowledge-recall tasks (facts, translation)
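The streaming note above can be illustrated with a small accumulator that keeps the reasoning trace and the answer apart as chunks arrive. This page does not specify the chunk format, so the shape used here (`choices[0].delta` carrying either `thinking_content` or `content`) is purely an assumption, as is the helper name `accumulateStream`; check it against real stream output before relying on it.

```javascript
// Accumulate already-parsed stream chunks into separate reasoning and answer strings.
// ASSUMPTION: each chunk looks like { choices: [{ delta: { thinking_content?, content? } }] }.
// That shape is not documented on this page.
function accumulateStream(chunks) {
  let thinking = "";
  let answer = "";
  for (const chunk of chunks) {
    const delta = chunk?.choices?.[0]?.delta ?? {};
    if (typeof delta.thinking_content === "string") thinking += delta.thinking_content;
    if (typeof delta.content === "string") answer += delta.content;
  }
  return { thinking, answer };
}

// Hypothetical chunks: reasoning arrives first, then the main response.
const exampleChunks = [
  { choices: [{ delta: { thinking_content: "Consider parallel " } }] },
  { choices: [{ delta: { thinking_content: "lines." } }] },
  { choices: [{ delta: { content: "The sum is " } }] },
  { choices: [{ delta: { content: "180°." } }] },
];

const streamed = accumulateStream(exampleChunks);
console.log(streamed.thinking); // Consider parallel lines.
console.log(streamed.answer);   // The sum is 180°.
```

Because the reasoning arrives first, a client could show a "thinking" indicator until the first `content` delta appears.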