Models
Mira ships a small, deliberate lineup — two models, both backed by the same DeepSeek-based inference stack. Pick the faster one by default; reach for the deeper one when correctness matters more than latency.
Current lineup
| Model | Best for | Context | Max output |
| --- | --- | --- | --- |
| mira | Chat, generation, classification, light coding, agents | 1M tokens | 32K tokens |
| mira-thinking | Long-document reasoning, complex coding, math/logic | 1M tokens | 32K tokens |
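The output limit in the table can be enforced client-side before a request is sent. The following is a minimal sketch, not part of any Mira SDK: `MODEL_LIMITS` and `clamp_max_output` are hypothetical names, and the numeric values assume that "1M" and "32K" mean 1,000,000 and 32,768, as reported by the model listing endpoint.

```python
# Limits copied from the lineup table (exact values as reported by /v1/models).
MODEL_LIMITS = {
    "mira": {"context_window": 1_000_000, "max_output_tokens": 32_768},
    "mira-thinking": {"context_window": 1_000_000, "max_output_tokens": 32_768},
}


def clamp_max_output(model: str, requested: int) -> int:
    """Return the requested output budget, capped at the model's maximum.

    Hypothetical helper for illustration only.
    """
    if model not in MODEL_LIMITS:
        raise ValueError(f"unknown model: {model!r}")
    if requested < 1:
        raise ValueError("requested output tokens must be positive")
    return min(requested, MODEL_LIMITS[model]["max_output_tokens"])


print(clamp_max_output("mira", 50_000))  # prints 32768
```

Rejecting unknown model names up front keeps a typo such as `mira-pro` from reaching the API.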
The lineup is intentionally small. There is no mira-pro, mira-max, or per-task SKU; pick mira by default and reach for mira-thinking when correctness matters more than latency.
Picking a model
- Start with mira. It handles 90%+ of real workloads and is materially faster, which makes it the default for chatbots, content generation, and agents.
- Upgrade to mira-thinking when you see hallucinations on hard reasoning, when you're processing long documents (>100K tokens of context), or when a tool-using agent keeps getting stuck.
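The guidance above can be encoded as a small routing function. This is a sketch under stated assumptions: `pick_model` and its flag names are hypothetical, and how an application detects "hallucinations on hard reasoning" or a "stuck" agent is left to the caller. Only the 100K-token threshold comes from the text.

```python
LONG_CONTEXT_TOKENS = 100_000  # threshold taken from the guidance above


def pick_model(context_tokens: int,
               hard_reasoning_errors: bool = False,
               agent_stuck: bool = False) -> str:
    """Return "mira" unless one of the upgrade signals is present.

    Hypothetical helper: the boolean flags are supplied by the caller,
    since detecting those conditions is application-specific.
    """
    if context_tokens > LONG_CONTEXT_TOKENS:
        return "mira-thinking"
    if hard_reasoning_errors or agent_stuck:
        return "mira-thinking"
    return "mira"
```

For example, `pick_model(5_000)` returns `"mira"`, while `pick_model(150_000)` or `pick_model(5_000, agent_stuck=True)` returns `"mira-thinking"`.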
List models programmatically
The endpoint mirrors OpenAI's /v1/models:
bash
curl https://api.vmira.ai/v1/models \
  -H "Authorization: Bearer $MIRA_API_KEY"
json
{
  "object": "list",
  "data": [
    {
      "id": "mira",
      "object": "model",
      "context_window": 1000000,
      "max_output_tokens": 32768
    },
    {
      "id": "mira-thinking",
      "object": "model",
      "context_window": 1000000,
      "max_output_tokens": 32768
    }
  ]
}
Pricing
Per-token rates live on the public pricing page: platform.vmira.ai/pricing.
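For scripts that prefer to read limits from the API rather than hard-code them, the model listing endpoint described earlier can be called from Python using only the standard library. This is a minimal sketch: `fetch_models` and `limits_by_model` are hypothetical helper names, and the response shape is assumed to match the sample JSON shown in this page.

```python
import json
import urllib.request

MODELS_URL = "https://api.vmira.ai/v1/models"  # same URL as the curl example


def fetch_models(api_key: str, url: str = MODELS_URL) -> dict:
    """GET the model list and return the decoded JSON payload."""
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def limits_by_model(payload: dict) -> dict:
    """Map each model id to its (context_window, max_output_tokens) pair.

    Assumes each entry carries the fields shown in the sample response.
    """
    return {
        entry["id"]: (entry["context_window"], entry["max_output_tokens"])
        for entry in payload.get("data", [])
    }
```

With the key in the `MIRA_API_KEY` environment variable, `limits_by_model(fetch_models(os.environ["MIRA_API_KEY"]))` (after `import os`) would return a dictionary keyed by model id. Keeping the parsing separate from the network call makes it easy to test against a saved response.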