Models

Mira ships a small, deliberate lineup: two models, both backed by the same DeepSeek-based inference stack. Use mira, the faster model, by default; reach for mira-thinking when correctness matters more than latency.

Current lineup

| Model | Best for | Context | Max output |
| --- | --- | --- | --- |
| mira | Chat, generation, classification, light coding, agents | 1M tokens | 32K tokens |
| mira-thinking | Long-document reasoning, complex coding, math/logic | 1M tokens | 32K tokens |

The lineup is intentionally small: there is no mira-pro, mira-max, or per-task SKU.

Picking a model

Start with mira: it handles 90%+ of real workloads and is materially faster. It is the default for chatbots, content generation, and agents.
Upgrade to mira-thinking when you see hallucinations on hard reasoning, when you're processing long documents (>100K tokens of context), or when a tool-using agent keeps getting stuck.
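
The guidance above can be condensed into a small routing helper. This is a minimal sketch, not part of any Mira SDK: the function name pick_model, its parameters, and the way the signals are combined are assumptions made for illustration. Only the two model IDs and the 100K-token threshold come from this page.

```python
# Hypothetical routing helper; not part of any Mira SDK.
LONG_CONTEXT_THRESHOLD = 100_000  # tokens; the ">100K tokens of context" rule above


def pick_model(
    context_tokens: int,
    hard_reasoning: bool = False,
    agent_stuck: bool = False,
) -> str:
    """Return "mira" by default, or "mira-thinking" when any upgrade signal fires."""
    if context_tokens < 0:
        raise ValueError("context_tokens must be non-negative")
    # Any single signal is enough to justify the slower, deeper model.
    if hard_reasoning or agent_stuck or context_tokens > LONG_CONTEXT_THRESHOLD:
        return "mira-thinking"
    return "mira"
```

With these assumptions, pick_model(2_000) returns "mira" and pick_model(250_000) returns "mira-thinking". How you detect hard reasoning or a stuck agent is up to your application; the flags here simply carry that decision into the routing step.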

List models programmatically

The endpoint mirrors OpenAI's /v1/models:

bash

curl https://api.vmira.ai/v1/models \
  -H "Authorization: Bearer $MIRA_API_KEY"

json

{
  "object": "list",
  "data": [
    {
      "id": "mira",
      "object": "model",
      "context_window": 1000000,
      "max_output_tokens": 32768
    },
    {
      "id": "mira-thinking",
      "object": "model",
      "context_window": 1000000,
      "max_output_tokens": 32768
    }
  ]
}
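
The same request can be made from Python using only the standard library. The sketch below assumes the response has exactly the shape shown above; the helper names fetch_models and model_limits are illustrative and do not belong to an official client.

```python
import json
import urllib.request

MODELS_URL = "https://api.vmira.ai/v1/models"  # same endpoint as the curl example


def fetch_models(api_key: str) -> dict:
    """GET /v1/models and return the decoded JSON body."""
    request = urllib.request.Request(
        MODELS_URL,
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def model_limits(payload: dict) -> dict[str, tuple[int, int]]:
    """Map each model ID to (context_window, max_output_tokens)."""
    return {
        entry["id"]: (entry["context_window"], entry["max_output_tokens"])
        for entry in payload["data"]
    }
```

Calling model_limits(fetch_models(api_key)) with a valid key would give a dictionary keyed by model ID, which is convenient for checking a request against a model's limits before sending it. Reading the limits from the endpoint rather than hard-coding them keeps client code in step with the lineup.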

Pricing

Per-token rates live on the public pricing page: platform.vmira.ai/pricing.

Next steps

Choosing a model

Decision tree with concrete examples per use case.

Extended thinking

How mira-thinking allocates extra compute for hard problems.