Vision (Image Input)
Vision through the public API (/v1/chat/completions) is still under development. In the meantime, the feature is available through the Mira Chat interface at platform.vmira.ai.
Mira models can analyze images included in your request. You can send images as Base64-encoded strings or as URL references, using the OpenAI-compatible content blocks format.
Vision is available for all models: mira, mira-pro, and mira-max. With thinking mode enabled, mira-pro and mira-max may respond more slowly because of the reasoning step.
Supported Formats
Format | MIME Type  | Max Size
JPEG   | image/jpeg | 20 MB
PNG    | image/png  | 20 MB
GIF    | image/gif  | 20 MB
WebP   | image/webp | 20 MB
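The limits in the table can be checked on the client before a request is sent. The following is a minimal Python sketch, not part of any Mira SDK: the name check_image, the extension-to-MIME mapping, and the reading of "20 MB" as 20 × 1024 × 1024 bytes are all assumptions made for illustration.

```python
import os

# Extension-to-MIME mapping derived from the Supported Formats table.
# The mapping itself is an illustrative assumption; the API may inspect
# the image content rather than the file name.
EXTENSION_TO_MIME = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".gif": "image/gif",
    ".webp": "image/webp",
}

# Assumption: "20 MB" is interpreted as 20 * 1024 * 1024 bytes.
MAX_IMAGE_BYTES = 20 * 1024 * 1024


def check_image(path: str) -> str:
    """Return the MIME type for `path`, or raise ValueError if the file
    has an unsupported extension or exceeds the size limit."""
    extension = os.path.splitext(path)[1].lower()
    mime_type = EXTENSION_TO_MIME.get(extension)
    if mime_type is None:
        raise ValueError(f"Unsupported image format: {extension or '(none)'}")

    size = os.path.getsize(path)
    if size > MAX_IMAGE_BYTES:
        raise ValueError(f"Image is {size} bytes; limit is {MAX_IMAGE_BYTES}")

    return mime_type
```

Calling check_image("photo.png") returns "image/png" for a file within the limit, which can then be used when building a data URI.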
Sending an Image via URL
The simplest approach is to pass a publicly accessible image URL inside a content block with type image_url.
cURL
curl https://api.vmira.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-mira-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mira",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "What is in this image?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://example.com/photo.jpg"
            }
          }
        ]
      }
    ],
    "max_tokens": 1024
  }'
Sending a Base64 Image
If the image is on disk or generated dynamically, encode it to Base64 and pass it with a data URI.
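A data URI has the form data:<MIME type>;base64,<payload>, and the MIME type should match the actual image format. The encoding step can be factored into small helpers, sketched here in Python; the names to_data_uri and image_block are hypothetical and not part of the API.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str) -> str:
    """Encode raw image bytes as a Base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_block(url: str) -> dict:
    """Wrap a URL (http(s) or data URI) in an image_url content block."""
    return {"type": "image_url", "image_url": {"url": url}}
```

For example, image_block(to_data_uri(raw_bytes, "image/png")) produces a block that can be placed in the content array of a message. Base64 encoding enlarges the payload by roughly one third, which is worth keeping in mind for large images.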
Python
import base64
import requests

# Read the image from disk and Base64-encode it
with open("photo.png", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "https://api.vmira.ai/v1/chat/completions",
    headers={"Authorization": "Bearer sk-mira-YOUR_KEY"},
    json={
        "model": "mira",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Describe this image"},
                    {
                        "type": "image_url",
                        "image_url": {
                            # The MIME type in the data URI must match the file format
                            "url": f"data:image/png;base64,{img_b64}"
                        },
                    },
                ],
            }
        ],
        "max_tokens": 1024,
    },
)
# Print the model's reply
print(response.json()["choices"][0]["message"]["content"])
JavaScript
import fs from "fs";

const imgBuffer = fs.readFileSync("photo.png");
const imgB64 = imgBuffer.toString("base64");

const response = await fetch("https://api.vmira.ai/v1/chat/completions", {
  method: "POST",
  headers: {
    "Authorization": "Bearer sk-mira-YOUR_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "mira",
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: "What is in this image?" },
          {
            type: "image_url",
            image_url: {
              url: `data:image/png;base64,${imgB64}`,
            },
          },
        ],
      },
    ],
    max_tokens: 1024,
  }),
});

const data = await response.json();
console.log(data.choices[0].message.content);
Multiple Images
You can send multiple images in a single request by adding several image_url blocks to the content array. The model analyzes all of the images together.
JSON body
{
  "model": "mira-pro",
  "messages": [
    {
      "role": "user",
      "content": [
        { "type": "text", "text": "Compare these two images" },
        { "type": "image_url", "image_url": { "url": "https://example.com/image1.jpg" } },
        { "type": "image_url", "image_url": { "url": "https://example.com/image2.jpg" } }
      ]
    }
  ],
  "max_tokens": 2048
}
You can send up to 10 images per request. Note that each image consumes tokens: approximately 85 tokens per 512x512 pixel tile.
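To make the image limit and the token figure concrete, here is a rough Python sketch. The helper names are hypothetical. The estimate assumes the image is split into 512x512 tiles with partial tiles rounded up and no server-side resizing; the source only gives the approximate per-tile cost, so treat the result as a ballpark figure rather than a billing calculation.

```python
import math

MAX_IMAGES_PER_REQUEST = 10  # limit stated in this section
TOKENS_PER_TILE = 85         # approximate figure stated in this section
TILE_SIZE = 512              # tile edge length in pixels


def build_content(prompt: str, image_urls: list) -> list:
    """Build a content array with one text block and one image_url block per URL."""
    if len(image_urls) > MAX_IMAGES_PER_REQUEST:
        raise ValueError(
            f"{len(image_urls)} images given; at most {MAX_IMAGES_PER_REQUEST} allowed"
        )
    content = [{"type": "text", "text": prompt}]
    for url in image_urls:
        content.append({"type": "image_url", "image_url": {"url": url}})
    return content


def estimate_image_tokens(width: int, height: int) -> int:
    """Rough token estimate, assuming partial tiles count as full tiles."""
    tiles = math.ceil(width / TILE_SIZE) * math.ceil(height / TILE_SIZE)
    return tiles * TOKENS_PER_TILE
```

Under these assumptions, a 1024x768 image covers 2 x 2 = 4 tiles, so estimate_image_tokens(1024, 768) gives 340.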
Image Quality Best Practices
- Resolution — For fine details, use images at least 768px on the long edge. Very small images reduce accuracy.
- Clarity — Avoid blurry, heavily compressed, or very dark photos.
- Cropping — Crop the image to the region of interest so the model focuses on the relevant content.
- Text in images — The model reads printed text well. Handwritten text is recognized less reliably.
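The resolution guideline above can be checked before upload. The sketch below reads the width and height from a PNG header using only the Python standard library and compares the long edge against 768px. The function names are hypothetical, and the check covers PNG only; other formats would need their own parsing or an imaging library.

```python
import struct
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MIN_LONG_EDGE = 768  # guideline from the Resolution bullet above


def png_dimensions(data: bytes) -> Tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of a PNG file."""
    # Layout: 8-byte signature, 4-byte chunk length, 4-byte chunk type "IHDR",
    # then width and height as big-endian 32-bit unsigned integers.
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("Not a valid PNG header")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def meets_detail_guideline(width: int, height: int) -> bool:
    """True if the long edge is at least MIN_LONG_EDGE pixels."""
    return max(width, height) >= MIN_LONG_EDGE
```

Only the first 24 bytes of the file are needed, so png_dimensions(open("photo.png", "rb").read(24)) is enough to obtain the size.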
Limitations
- People identification — The model does not identify specific people by face. It can describe appearance but will not name individuals.
- Spatial reasoning — Precise measurements, counting small objects, and determining exact spatial positions may be inaccurate.
- Medical / specialized images — The model is not a diagnostic tool. Do not use it for medical diagnosis.
Do not send images containing sensitive information (documents, passwords, personal data) unless your application is appropriately secured.