Choosing the right Mira model
Mira offers two models, mira and mira-thinking, each optimized for different workloads. This guide helps you choose between them.
Model comparison matrix
When to use each model
mira (Fast)
The fast model for everyday tasks such as chatbots, question answering, text summarization, translation, and code generation. It is the faster and more cost-effective of the two models.
- Best for — simple Q&A, summarization, translation, short code snippets
- Context — 1M tokens (~1500 pages of text)
- Typical latency — < 1 second to first token
mira-thinking
A model with a built-in thinking mode for complex tasks. It performs internal chain-of-thought reasoning before answering and is designed for math problems, code debugging, logical analysis, and other tasks where accuracy matters most.
- Best for — math, code debugging, logic puzzles, deep analysis
- Context — 1M tokens (~1500 pages of text)
- Feature — thinking mode — model reasons step-by-step before answering
Decision flowchart
Use this text-based flowchart to quickly identify the right model:
Do you need deep reasoning / max accuracy?
├─ Yes → mira-thinking (includes thinking mode)
└─ No → mira (best cost-performance ratio)
Task-to-model matrix
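As a rough sketch, the "Best for" lists and the flowchart above can be encoded as a small lookup. The task labels, the `TASK_TO_MODEL` map, and the `pickModel` helper are illustrative assumptions, not part of the Mira API; only the two model names come from this guide.

```javascript
// Hypothetical task labels, copied from the "Best for" lists in this guide.
const TASK_TO_MODEL = new Map([
  ["simple Q&A", "mira"],
  ["summarization", "mira"],
  ["translation", "mira"],
  ["short code snippets", "mira"],
  ["math", "mira-thinking"],
  ["code debugging", "mira-thinking"],
  ["logic puzzles", "mira-thinking"],
  ["deep analysis", "mira-thinking"],
]);

// Returns the model name for a task label. Unknown tasks follow the
// flowchart's "No" branch and fall back to "mira".
function pickModel(task) {
  return TASK_TO_MODEL.get(task) ?? "mira";
}
```

Defaulting to mira for unlisted tasks mirrors the flowchart: mira-thinking is chosen only when deep reasoning is explicitly needed.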
Cost vs performance trade-offs
mira-thinking costs more due to additional thinking tokens but gives significantly more accurate results for complex tasks. For most applications, start with mira and upgrade to mira-thinking only when higher accuracy is needed.
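One way to apply the "start with mira, upgrade when needed" advice at request time is to try the fast model first and escalate only when the answer fails a check you define. This is a minimal sketch under stated assumptions: `callModel` and `isGoodEnough` are hypothetical functions you supply yourself, and nothing here is a documented Mira feature.

```javascript
// Sketch: try "mira" first and fall back to "mira-thinking" only if needed.
// callModel(model, prompt) -> Promise<string>  (hypothetical, caller-supplied)
// isGoodEnough(answer)     -> boolean          (hypothetical, caller-supplied)
async function answerWithEscalation(prompt, callModel, isGoodEnough) {
  const fastAnswer = await callModel("mira", prompt);
  if (isGoodEnough(fastAnswer)) {
    return { model: "mira", answer: fastAnswer };
  }
  // The fast answer failed the check, so pay for the thinking model.
  const carefulAnswer = await callModel("mira-thinking", prompt);
  return { model: "mira-thinking", answer: carefulAnswer };
}
```

Note that an escalated request pays for both calls, so this pattern only saves money when most requests pass the check on the first attempt.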
Migrating between models
Both Mira models share the same API format. To switch between models, simply change the model parameter in your request. Prompts, tools, and system messages remain compatible.
// Simply change the model parameter
const response = await fetch("https://api.vmira.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-mira-YOUR_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "mira-thinking", // was "mira"; just change this line
    messages: [{ role: "user", content: "Analyze this document..." }],
  }),
});
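To keep the switch to a single argument, the request can be built by a small helper that takes the model name as a parameter. The sketch below is illustrative: `buildChatRequest` and `MIRA_CHAT_URL` are hypothetical names, while the endpoint, headers, and body fields are the ones used in the request above.

```javascript
const MIRA_CHAT_URL = "https://api.vmira.ai/v1/chat/completions";

// Hypothetical helper: builds the arguments for fetch(url, init).
// Only the `model` field differs between the two Mira models.
function buildChatRequest(model, messages, apiKey) {
  return {
    url: MIRA_CHAT_URL,
    init: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}

// The same messages, sent to either model.
const messages = [{ role: "user", content: "Analyze this document..." }];
const fastRequest = buildChatRequest("mira", messages, "sk-mira-YOUR_KEY");
const thinkingRequest = buildChatRequest("mira-thinking", messages, "sk-mira-YOUR_KEY");
```

Either request can then be sent with `fetch(thinkingRequest.url, thinkingRequest.init)`.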