Thinking mode
Thinking mode is a feature of the mira-thinking model: before generating its final response, the model works through an internal "chain of thought". This improves accuracy on tasks that require logical reasoning, mathematical computation, or multi-step analysis.
How it works
When you send a request to mira-thinking, the model goes through two stages:
- Thinking stage: the model generates an internal reasoning trace, breaking the problem into sub-tasks, verifying intermediate results, and exploring alternative approaches.
- Response stage: drawing on that reasoning, the model produces a polished final response that incorporates the conclusions it reached.
Thinking tokens are included in usage.completion_tokens and billed as output tokens. You can see the reasoning trace in the API response via the thinking_content field.
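As a minimal sketch of how these fields might be read together, the helper below takes an already-parsed response body and returns the final answer, the reasoning trace, and the output token count. The name `splitThinkingResponse` is hypothetical and not part of any SDK, and the response shape is assumed to match the example response shown on this page (`choices[0].message` plus `usage`).

```javascript
// Hypothetical helper, not part of any official SDK.
// Assumes the response shape shown in this page's example response.
function splitThinkingResponse(body) {
  const message = body?.choices?.[0]?.message ?? {};
  return {
    answer: message.content ?? "",
    // Assumption: the field may be missing, so fall back to null.
    thinking: message.thinking_content ?? null,
    // Per the text above, this count already includes thinking tokens.
    completionTokens: body?.usage?.completion_tokens ?? 0,
  };
}

const sample = {
  choices: [{
    message: {
      role: "assistant",
      content: "The sum of angles in a triangle is 180°.",
      thinking_content: "Let me consider the parallel lines approach...",
    },
  }],
  usage: { prompt_tokens: 28, completion_tokens: 1450, total_tokens: 1478 },
};

const result = splitThinkingResponse(sample);
console.log(result.answer);
console.log(result.completionTokens);
```

Returning `null` for a missing trace lets callers distinguish "no reasoning was returned" from an empty string.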
When to use thinking mode
Using via API
Thinking mode is automatically available when using the mira-thinking model. No additional parameters are required.
const response = await fetch("https://api.vmira.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-mira-YOUR_KEY", // replace with your own API key
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "mira-thinking",
    messages: [{
      role: "user",
      content: "Prove that the sum of angles in a triangle is 180 degrees"
    }],
  }),
});
// Parse the JSON body; the answer and the reasoning trace are under choices[0].message
const data = await response.json();

Reading thinking output
The API response contains both the final answer (content) and the reasoning chain (thinking_content):
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": "The sum of angles in a triangle is 180°. Here is a formal proof...",
      "thinking_content": "I need to prove this rigorously. Let me consider the parallel lines approach..."
    }
  }],
  "usage": {
    "prompt_tokens": 28,
    "completion_tokens": 1450,
    "total_tokens": 1478
  }
}

Comparison: with and without thinking
Without thinking (mira)
User: What is 17 × 23 + 156 ÷ 12 - 89?
mira response: 17 × 23 = 391, 156 ÷ 12 = 13, 391 + 13 - 89 = 315

With thinking (mira-thinking)
[Thinking]
Let me break this expression into parts following order of operations (PEMDAS):
1. Multiplication: 17 × 23 = 391
2. Division: 156 ÷ 12 = 13
3. Addition: 391 + 13 = 404
4. Subtraction: 404 - 89 = 315
Let me verify: 17×23 = 17×20 + 17×3 = 340 + 51 = 391 ✓
156÷12 = 13 ✓ (since 12×13 = 156)
391 + 13 = 404, 404 - 89 = 315 ✓
[Final answer]
17 × 23 + 156 ÷ 12 - 89 = 315
Computation steps:
• 17 × 23 = 391
• 156 ÷ 12 = 13
• 391 + 13 - 89 = 315

Thinking token costs
Only the mira-thinking model produces thinking tokens, and they are billed at the output token rate. A typical request adds between 200 and 2,000 tokens for the reasoning chain. For current rates, see platform.vmira.ai/pricing.
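To make the billing rule concrete, here is a rough sketch of an output-cost estimate. The function names and the rate of 10 per million output tokens are placeholders invented for illustration; take the real rate from the pricing page. The 200 to 2,000 token range is the typical overhead quoted above.

```javascript
// Hypothetical helper: estimates the output-side cost of one request.
// ratePerMillion is a placeholder; read the real value from the pricing page.
function estimateOutputCost(usage, ratePerMillion) {
  // completion_tokens already includes thinking tokens, so there is no separate term.
  return (usage.completion_tokens / 1_000_000) * ratePerMillion;
}

// Hypothetical helper: cost range of the typical 200 to 2,000 token thinking overhead.
function thinkingOverheadRange(ratePerMillion) {
  return {
    min: (200 / 1_000_000) * ratePerMillion,
    max: (2000 / 1_000_000) * ratePerMillion,
  };
}

// Usage with the token counts from the example response and a made-up rate.
const exampleUsage = { prompt_tokens: 28, completion_tokens: 1450, total_tokens: 1478 };
console.log(estimateOutputCost(exampleUsage, 10));
console.log(thinkingOverheadRange(10));
```

Prompt tokens are left out on purpose: this sketch covers only the output side, which is where thinking tokens are counted.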
Limitations
- Latency: responses take 2-5x longer than with the standard mira model because of the additional reasoning stage.
- Context: the context window is 1M tokens, which is sufficient for most documents.
- Streaming: when streaming, thinking_content is sent before the main response.
- Not for every task: thinking mode doesn't improve knowledge-recall tasks such as fact lookup or translation.
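The streaming note can be illustrated with a small accumulator. This is only a sketch under a loud assumption: the wire format of stream chunks is not described here, so the code assumes each parsed chunk carries a `choices[0].delta` object with optional `thinking_content` and `content` strings. The function name `accumulateStream` is hypothetical as well.

```javascript
// Assumption (not confirmed by this page): each parsed stream chunk looks like
// { choices: [{ delta: { thinking_content?: string, content?: string } }] }.
// The page states only that thinking_content is sent before the main response.
function accumulateStream(chunks) {
  let thinking = "";
  let answer = "";
  for (const chunk of chunks) {
    const delta = chunk?.choices?.[0]?.delta ?? {};
    if (typeof delta.thinking_content === "string") thinking += delta.thinking_content;
    if (typeof delta.content === "string") answer += delta.content;
  }
  return { thinking, answer };
}

// Usage with hand-written chunks in the assumed shape.
const exampleChunks = [
  { choices: [{ delta: { thinking_content: "Order of operations: " } }] },
  { choices: [{ delta: { thinking_content: "391 + 13 - 89." } }] },
  { choices: [{ delta: { content: "The result is " } }] },
  { choices: [{ delta: { content: "315." } }] },
];
const streamed = accumulateStream(exampleChunks);
console.log(streamed.thinking);
console.log(streamed.answer);
```

Keeping the two buffers separate lets a UI show the reasoning trace in a collapsible area while the answer streams in afterwards.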