Streaming & tool calling

Streaming

Streaming returns the model's output as it is generated, which suits typewriter-style chat interfaces and long responses.

API	How to enable
OpenAI-compatible `/v1/chat/completions`	add `"stream": true`
Claude `/v1/messages`	add `"stream": true`
Gemini	use the `:streamGenerateContent` endpoint

OpenAI-compatible streaming returns standard SSE (`text/event-stream`): the server pushes `data: {...}` chunks and ends the stream with `data: [DONE]`. The SDKs handle this parsing for you; see Python and Node.js.
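As a rough illustration of what consuming a stream looks like through the OpenAI Python SDK, here is a minimal sketch. It assumes `client` is an `openai.OpenAI` instance already pointed at the gateway; the helper names `print_stream` and `stream_chat` are made up for this example and are not part of any SDK.

```python
from typing import Iterable


def print_stream(stream: Iterable) -> str:
    """Print each text fragment as it arrives and return the full reply."""
    parts = []
    for chunk in stream:
        if not chunk.choices:
            continue  # some chunks may carry no choices (for example, usage-only chunks)
        text = chunk.choices[0].delta.content
        if text:
            print(text, end="", flush=True)
            parts.append(text)
    print()
    return "".join(parts)


def stream_chat(client, prompt: str, model: str = "gpt-4o-mini") -> str:
    """Request a streamed completion; `client` is assumed to be an openai.OpenAI instance."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return print_stream(stream)
```

Calling `stream_chat(client, "Count to 5")` would print the reply incrementally and return the assembled text once the stream ends.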

Streaming and billing

If a stream is cut off midway (the client disconnects, a timeout occurs, or the service errors), the platform settles, on a best-effort basis, on the actual usage produced up to that point. When actual usage is unavailable, it falls back to the estimate, and a background safety net releases any excess hold. See Recharge & Billing.

Raw SSE parsing (cURL)

```bash
curl -N https://gateway.mindproxy.ai/v1/chat/completions \
  -H "Authorization: Bearer $TT_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o-mini","stream":true,
       "messages":[{"role":"user","content":"Count to 5"}]}'
```

`-N` disables curl's output buffering, so each `data:` chunk prints as soon as it arrives.
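If you read the raw stream yourself instead of using an SDK, the parsing can be sketched as below. This is an illustrative sketch, not the platform's reference parser: the function names are hypothetical, the sample lines are hand-written rather than captured from the gateway, and it assumes each event fits on a single `data:` line in the OpenAI chunk shape (`choices[].delta.content`).

```python
import json
from typing import Iterable, Iterator


def iter_sse_payloads(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the decoded JSON payload of each `data:` line until `[DONE]`."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return  # end-of-stream sentinel
        yield json.loads(data)


def collect_text(lines: Iterable[str]) -> str:
    """Concatenate the `delta.content` fragments carried by the chunks."""
    parts = []
    for chunk in iter_sse_payloads(lines):
        for choice in chunk.get("choices", []):
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts)


# Hand-written sample lines for illustration only.
sample = [
    'data: {"choices":[{"delta":{"role":"assistant"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"1, 2"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":", 3"}}]}',
    "",
    "data: [DONE]",
]
print(collect_text(sample))  # prints: 1, 2, 3
```

In practice the `lines` argument could be any line iterator over the response body, such as the decoded lines of a streaming HTTP response.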

Tool calling (function calling / tools)

The platform supports `tools` / `tool_choice` on the OpenAI-compatible API. With `tool_choice="auto"`, the model decides whether to call a tool.

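```python
import os

from openai import OpenAI

# Base URL and key variable mirror the cURL example above.
client = OpenAI(
    base_url="https://gateway.mindproxy.ai/v1",
    api_key=os.environ["TT_API_KEY"],
)

# One function tool, described with a JSON Schema for its arguments.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Beijing?"}],
    tools=tools,
    tool_choice="auto",
)

# tool_calls is None when the model answers directly without calling a tool.
tool_calls = resp.choices[0].message.tool_calls
if tool_calls:
    # Read tool_calls[0].function.name and parse .arguments (a JSON string),
    # run the tool, then send the result back as a role="tool" message and
    # request again for the final answer.
    ...
```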
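The round trip described in the comment above can be sketched as follows. This is a sketch under assumptions rather than a definitive implementation: `get_weather` is a hypothetical stand-in that returns canned data, `TOOL_REGISTRY`, `run_tool_calls`, and `answer_with_tools` are names invented for this example, and `client` is assumed to be an `openai.OpenAI` instance.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical stand-in tool; a real one would query a weather service."""
    return json.dumps({"city": city, "forecast": "sunny"})


# Maps tool names declared in `tools` to local Python callables.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(tool_calls) -> list:
    """Execute each requested tool and build the role="tool" reply messages."""
    replies = []
    for call in tool_calls:
        func = TOOL_REGISTRY.get(call.function.name)
        if func is None:
            content = json.dumps({"error": f"unknown tool {call.function.name}"})
        else:
            args = json.loads(call.function.arguments)  # arguments arrive as a JSON string
            content = func(**args)
        replies.append({"role": "tool", "tool_call_id": call.id, "content": content})
    return replies


def answer_with_tools(client, messages, tools, model="gpt-4o-mini"):
    """One tool round trip: request, run tools if asked, then request again."""
    resp = client.chat.completions.create(
        model=model, messages=messages, tools=tools, tool_choice="auto"
    )
    msg = resp.choices[0].message
    if not msg.tool_calls:
        return msg.content  # the model answered directly
    # Echo the assistant message that requested the tools, then the tool results.
    follow_up = messages + [msg] + run_tool_calls(msg.tool_calls)
    final = client.chat.completions.create(model=model, messages=follow_up, tools=tools)
    return final.choices[0].message.content
```

Each `role="tool"` message carries the `tool_call_id` of the call it answers, which is how the model matches results to requests when it asks for several tools at once.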

Model capability differences

Whether tool calling, JSON mode, multimodal (image input), thinking, and other features work depends on the specific model you choose. The capability matrix is on Models & plans.