Chat completions

The chat completions endpoint is the main way to interact with Mira models. It accepts a conversation (a list of messages) and returns the model's next reply. Request and response shapes follow the OpenAI Chat Completions API, so any OpenAI SDK works as a drop-in client: only the base URL and API key change.

Endpoint

`POST /v1/chat/completions`: Create a chat completion

Request parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `model` | string | Yes | Mira model identifier: `mira` or `mira-thinking`. List available models via `GET /v1/models`. |
| `messages` | array | Yes | Conversation array. Each message has `role` (system / user / assistant) and `content`. |
| `temperature` | float | No | Sampling temperature (0 to 2). Lower is more deterministic. Defaults to about 0.7. |
| `max_tokens` | integer | No | Cap on completion tokens. Defaults to the model max. |
| `stream` | boolean | No | Stream the response as Server-Sent Events. Defaults to false. |
| `tools` | array | No | Function-calling schema. See Tool use. |
| `tool_choice` | string or object | No | Force a specific tool, or pass `auto` or `none`. Mirrors OpenAI's shape. |
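
The constraints in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Mira API: `build_chat_request` and `ALLOWED_ROLES` are hypothetical names, and the role check covers only the three roles listed above (tool-related messages are out of scope here).

```python
import json

# Roles listed in the parameter table; anything else is rejected by this sketch.
ALLOWED_ROLES = {"system", "user", "assistant"}


def build_chat_request(model, messages, temperature=None, max_tokens=None, stream=False):
    """Assemble a request body, checking the documented constraints (hypothetical helper)."""
    for message in messages:
        if message.get("role") not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {message.get('role')!r}")
        if "content" not in message:
            raise ValueError("each message needs a 'content' field")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    body = {"model": model, "messages": list(messages)}
    # Optional fields are left out when unset so the server-side defaults apply.
    if temperature is not None:
        body["temperature"] = temperature
    if max_tokens is not None:
        body["max_tokens"] = max_tokens
    if stream:
        body["stream"] = True
    return body


payload = build_chat_request(
    "mira",
    [{"role": "user", "content": "Explain recursion in simple terms"}],
    temperature=0.2,
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be sent as the JSON body of the request, or unpacked into `client.chat.completions.create(**payload)` when using the OpenAI SDK.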

Message format

  • `system`: sets the model's behavior. Conventionally first; optional.
  • `user`: the prompt the model should respond to.
  • `assistant`: prior model responses; include them when continuing a conversation.
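
To illustrate how the three roles combine across turns, here is a small sketch of keeping conversation history on the client. The helper names (`new_history`, `with_user_turn`, `with_assistant_reply`) are hypothetical and only manipulate plain lists; no request is made.

```python
def new_history(system_prompt=None):
    """Start a conversation, optionally with a leading system message."""
    return [{"role": "system", "content": system_prompt}] if system_prompt else []


def with_user_turn(history, user_text):
    """Return the messages to send: the prior turns plus the new user prompt."""
    return history + [{"role": "user", "content": user_text}]


def with_assistant_reply(messages, reply_text):
    """Return the history to keep once the model has answered."""
    return messages + [{"role": "assistant", "content": reply_text}]


history = new_history("You are a helpful assistant.")
first_request = with_user_turn(history, "Explain recursion in simple terms")
# In real use, reply_text would come from resp.choices[0].message.content.
history = with_assistant_reply(first_request, "Recursion is when a function calls itself...")
second_request = with_user_turn(history, "Show a short example.")

print([m["role"] for m in second_request])
# → ['system', 'user', 'assistant', 'user']
```

Each request carries the full list, so the second call includes the earlier assistant reply as context.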

Example request

bash
curl https://api.vmira.ai/v1/chat/completions \
  -H "Authorization: Bearer $MIRA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mira",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user",   "content": "Explain recursion in simple terms"}
    ],
    "max_tokens": 1024,
    "temperature": 0.7
  }'

python
from openai import OpenAI

client = OpenAI(
    api_key="sk-mira-...",
    base_url="https://api.vmira.ai/v1",
)

resp = client.chat.completions.create(
    model="mira",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user",   "content": "Explain recursion in simple terms"},
    ],
    max_tokens=1024,
    temperature=0.7,
)
print(resp.choices[0].message.content)

typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.MIRA_API_KEY,
  baseURL: "https://api.vmira.ai/v1",
});

const resp = await client.chat.completions.create({
  model: "mira",
  messages: [
    { role: "system", content: "You are a helpful assistant." },
    { role: "user",   content: "Explain recursion in simple terms" },
  ],
  max_tokens: 1024,
  temperature: 0.7,
});
console.log(resp.choices[0].message.content);

Response shape

json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1779800000,
  "model": "mira",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Recursion is when a function calls itself..." },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 150,
    "total_tokens": 175
  }
}
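
When working with the raw JSON rather than an SDK object, the fields of interest can be pulled out as in the sketch below. `summarize_completion` is a hypothetical helper, and treating a `finish_reason` of `"length"` as a truncated reply is an assumption carried over from the OpenAI API, which this endpoint is described as following.

```python
import json

# The sample response from this page, used as offline input.
SAMPLE_RESPONSE = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1779800000,
  "model": "mira",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Recursion is when a function calls itself..."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 150, "total_tokens": 175}
}
"""


def summarize_completion(raw_json):
    """Extract the reply text, finish reason, and token usage (hypothetical helper)."""
    data = json.loads(raw_json)
    choice = data["choices"][0]
    usage = data.get("usage", {})
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        # Assumption: "length" signals the max_tokens cap was hit, as in OpenAI's API.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_completion(SAMPLE_RESPONSE)
print(summary["finish_reason"], summary["total_tokens"])
# → stop 175
```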

Streaming

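When `stream` is `true`, the response arrives as a series of chunks delivered via Server-Sent Events. Each chunk carries a `delta` with the newly generated content; the last chunk has an empty `delta` and a non-null `finish_reason`. The stream terminates with a literal `data: [DONE]` sentinel.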

sse
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1779800000,"model":"mira","choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1779800000,"model":"mira","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1779800000,"model":"mira","choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1779800000,"model":"mira","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
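
python
from openai import OpenAI

client = OpenAI(api_key="sk-mira-...", base_url="https://api.vmira.ai/v1")

# stream=True makes the SDK return an iterator of chunks instead of one response.
stream = client.chat.completions.create(
    model="mira",
    messages=[{"role": "user", "content": "Stream me a haiku about TLS."}],
    stream=True,
)

for chunk in stream:
    # Skip chunks with no choices and deltas with no text (the first and last chunks).
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()  # end the output with a newline once the stream is done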
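
For clients that read the event stream directly instead of going through an SDK, the chunk format above can be reassembled by hand. This is a rough sketch under the assumption that each event fits on a single `data:` line, as in the sample; `collect_stream_text` is a hypothetical helper and the input is a trimmed copy of the sample events rather than a live connection.

```python
import json


def collect_stream_text(lines):
    """Join delta content from raw SSE lines until the [DONE] sentinel."""
    parts = []
    finish_reason = None
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # blank separator lines and any non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # literal sentinel, not JSON
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


# Trimmed versions of the sample events (id, object, created, model omitted).
SAMPLE_STREAM = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":" world"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    "",
    "data: [DONE]",
]

text, reason = collect_stream_text(SAMPLE_STREAM)
print(text, reason)
# → Hello world stop
```

Checking for the sentinel before calling `json.loads` matters because `[DONE]` parses as a JSON array, not as a chunk object.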

Next steps