Long context windows

Mira models support context windows from 32K to 128K tokens, enabling you to process large documents, codebases, and multi-turn conversations in a single request.

Context windows by model

| Model | Context | Max output | Effective input |
|---|---|---|---|
| mira | 32K | 4K | 28K |
| mira-pro | 64K | 8K | 56K |
| mira-max | 128K | 16K | 112K |

Effective input = context - max output. This is the maximum data volume you can send while leaving room for a full response.
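
The subtraction above can be sketched as a small lookup. The figures are taken from the table; the `LIMITS_K` dictionary and the `effective_input_k` helper are illustrative names, not part of any Mira SDK, and values are kept in thousands of tokens to match the table.

```python
# Context window and max output per model, in thousands of tokens (from the table above).
# LIMITS_K and effective_input_k are hypothetical names used only for this sketch.
LIMITS_K = {
    "mira": (32, 4),
    "mira-pro": (64, 8),
    "mira-max": (128, 16),
}


def effective_input_k(model: str) -> int:
    """Return the input budget (thousands of tokens) that leaves room for a full response."""
    context_k, max_output_k = LIMITS_K[model]
    return context_k - max_output_k


for name in LIMITS_K:
    print(f"{name}: {effective_input_k(name)}K effective input")
```

If a request caps the response at less than the model's maximum output, the input budget grows by the same amount, since only the total has to fit in the context window.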

What fits in each model

Approximate text volumes each model can hold (1 token ~ 4 characters in English, ~2 characters in Russian):

| Content | mira (32K) | mira-pro (64K) | mira-max (128K) |
|---|---|---|---|
| Text pages (English) | ~50 | ~100 | ~200 |
| Text pages (Russian) | ~30 | ~60 | ~120 |
| Lines of code | ~2,000 | ~4,000 | ~8,000 |
| Code files (avg) | ~15-20 | ~30-40 | ~60-80 |
| Emails | ~100 | ~200 | ~400 |
| Book (pages) | ~80 | ~160 | ~320 |
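
To check whether a specific text is likely to fit, the character-per-token ratios quoted above can be turned into a rough estimator. This is a sketch only: real tokenizers vary with content, the helper names are hypothetical, and the budgets assume that 1K means 1,000 tokens.

```python
import math

# Rough characters-per-token ratios quoted above; actual tokenization varies.
CHARS_PER_TOKEN = {"en": 4, "ru": 2}

# Effective input budgets (context minus max output), assuming 1K = 1,000 tokens.
EFFECTIVE_INPUT = [("mira", 28_000), ("mira-pro", 56_000), ("mira-max", 112_000)]


def estimate_tokens(text: str, language: str = "en") -> int:
    """Estimate the token count of a text from its character length."""
    return math.ceil(len(text) / CHARS_PER_TOKEN[language])


def smallest_model_for(text: str, language: str = "en"):
    """Return the smallest model whose effective input fits the text, or None."""
    needed = estimate_tokens(text, language)
    for model, budget in EFFECTIVE_INPUT:
        if needed <= budget:
            return model
    return None  # too large for any single request
```

Because the estimate is approximate, it is safer to leave some headroom than to fill the budget exactly.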

Best practices for long documents

1. Put instructions first

Place the system message and instructions before the long document. The model follows instructions more reliably when it sees them before the main content.

Correct ordering
{
  "model": "mira-max",
  "messages": [
    {
      "role": "system",
      "content": "Analyze the following contract and highlight key risks."
    },
    {
      "role": "user",
      "content": "[50,000-token contract text here...]"
    }
  ]
}

2. Use section markers

For multi-part documents, separate the sections with XML tags or similar markers. This helps the model navigate long text.

Section markers
<document>
  <section id="terms">Terms of service...</section>
  <section id="privacy">Privacy policy...</section>
  <section id="sla">Service level agreement...</section>
</document>

Question: What does the SLA section say about uptime?
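
A prompt in this shape can be assembled programmatically. The sketch below assumes the sections are available as a mapping from section id to text; `wrap_sections` is a hypothetical helper, and escaping the text is a design choice rather than a documented requirement.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_sections(sections: dict[str, str], question: str) -> str:
    """Wrap each section in a <section id="..."> tag and append the question."""
    parts = ["<document>"]
    for section_id, text in sections.items():
        # escape() replaces &, < and > so stray markup in the text cannot
        # be mistaken for a section boundary.
        parts.append(f"  <section id={quoteattr(section_id)}>{escape(text)}</section>")
    parts.append("</document>")
    parts.append("")
    parts.append(f"Question: {question}")
    return "\n".join(parts)


prompt = wrap_sections(
    {"terms": "Terms of service...", "sla": "Service level agreement..."},
    "What does the SLA section say about uptime?",
)
print(prompt)
```

Escaping means the model sees `&lt;` where the source had `<`. If the document must be passed through verbatim (source code, for example), it may be preferable to skip escaping and choose tag names that cannot occur in the content.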

3. Ask for source references

When analyzing long documents, ask the model to indicate which part of the document each piece of information comes from. This makes responses easier to verify.
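
One way to make such references checkable is to request them in a fixed format and then extract them from the answer. The instruction wording and the `[section-id]` citation format below are assumptions made for illustration, not a documented Mira feature, and the helper names are hypothetical.

```python
import re

# Hypothetical instruction; adjust the wording and citation format to your documents.
CITATION_INSTRUCTION = (
    "Answer using only the document provided. After each claim, cite the id of "
    "the section it comes from in square brackets, for example [sla]. "
    "If the document does not contain the answer, say so."
)


def build_messages(document: str, question: str) -> list[dict[str, str]]:
    """Put the citation instruction first, then the document and the question."""
    return [
        {"role": "system", "content": CITATION_INSTRUCTION},
        {"role": "user", "content": f"{document}\n\nQuestion: {question}"},
    ]


def cited_section_ids(answer: str) -> set[str]:
    """Collect the section ids cited in square brackets in the model's answer."""
    return set(re.findall(r"\[([A-Za-z0-9_-]+)\]", answer))
```

Comparing the result of `cited_section_ids` with the ids actually present in the document gives a quick check for citations that point at sections that do not exist.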

RAG vs long context

There are two main approaches to working with large data volumes: Retrieval-Augmented Generation (RAG) and loading the data directly into a long context. Each has its advantages:

| Criterion | RAG | Long context |
|---|---|---|
| Data volume | Unlimited (GB+) | Up to 128K tokens |
| Accuracy | Depends on retrieval | Sees full context |
| Cost | Lower per request | Higher per request |
| Setup | Requires index, embeddings | No setup needed |
| Cross-document relations | Weak | Strong |
| Freshness | Requires reindexing | Instant |

Recommendation: use long context (mira-max) when you need to see the full document or analyze relationships between parts. Use RAG when data exceeds context limits or when searching a large knowledge base.

Performance characteristics

  • Response time: increases linearly with input size; expect ~2-5 seconds per 10K input tokens
  • Attention to detail: Mira models use improved attention mechanisms for long context, but information in the middle of a document can be less salient (the "lost in the middle" effect)
  • Tip: place the most important information at the beginning and end of the document for best results
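
The response-time rule of thumb above can be turned into a quick planning estimate. This sketch simply scales the quoted 2-5 seconds per 10K input tokens linearly; `estimate_latency_seconds` is a hypothetical helper, and actual latency will depend on load and output length.

```python
def estimate_latency_seconds(input_tokens: int) -> tuple[float, float]:
    """Return a (low, high) latency estimate, scaling 2-5 s per 10K input tokens linearly."""
    units = input_tokens / 10_000  # number of 10K-token blocks in the input
    return (2.0 * units, 5.0 * units)


low, high = estimate_latency_seconds(50_000)
print(f"Expect roughly {low:.0f}-{high:.0f} seconds")  # 50K tokens -> roughly 10-25 seconds
```

An estimate like this can help when setting client timeouts for requests that carry large inputs.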

Example: codebase analysis

Multi-file analysis
{
  "model": "mira-max",
  "messages": [
    {
      "role": "system",
      "content": "You are an expert code reviewer. Analyze the codebase and find potential issues."
    },
    {
      "role": "user",
      "content": "<file path=\"src/auth.ts\">\n// auth.ts content...\n</file>\n<file path=\"src/db.ts\">\n// db.ts content...\n</file>\n<file path=\"src/api.ts\">\n// api.ts content...\n</file>\n\nAnalyze these three files for security and find vulnerabilities."
    }
  ]
}
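
A request body like the one above can be assembled from file contents rather than written by hand. The sketch below takes a mapping from path to source text and reproduces the same `<file path="...">` layout; `build_review_request` is a hypothetical helper, and how the body is sent to the API is left out because the transport is not described here.

```python
import json

SYSTEM_PROMPT = (
    "You are an expert code reviewer. "
    "Analyze the codebase and find potential issues."
)


def build_review_request(files: dict[str, str], task: str) -> dict:
    """Build a request body that wraps each file in a <file path="..."> tag."""
    blocks = [
        f'<file path="{path}">\n{source}\n</file>' for path, source in files.items()
    ]
    # Files first, then a blank line, then the task, as in the example above.
    user_content = "\n".join(blocks) + "\n\n" + task
    return {
        "model": "mira-max",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_content},
        ],
    }


request = build_review_request(
    {"src/auth.ts": "// auth.ts content...", "src/db.ts": "// db.ts content..."},
    "Analyze these files for security and find vulnerabilities.",
)
print(json.dumps(request, indent=2))
```

In practice the mapping could be filled by reading files from disk, for example with `pathlib.Path.read_text`, while checking that the combined size stays within the model's effective input.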