Recharge & Billing
The platform uses a prepaid wallet: recharge USDT into your organization wallet first, and calls are then billed against that balance based on actual usage.
Recharge (USDT)
- In the console, open Recharge and create an order.
- The system returns a unique payment amount and a payment entry point (on-chain USDT, hosted by Infini).
- Complete the on-chain payment as prompted.
- After on-chain confirmation, the balance is credited automatically; check order status under "Recharge Records".
Pricing basis
Currently 1 USDT = 1 USD, with no FX conversion. You choose the chain on the payment page; the receiving chain recorded for the order is the one reported in the payment callback.
Order status generally progresses: created → processing → completed. Overpayments are handled according to platform policy (see the recharge page and the order detail).
Billing model
Each call is charged in three steps:
- Hold: before the call, an estimated amount is frozen from available balance to prevent concurrent overspend.
- Settle: after the call completes, you are charged for the actual tokens or images consumed.
- Release: the frozen excess over actual cost is released back to available balance.
As a result, "available balance" may briefly dip below "total balance" while a request is in flight, and it recovers once the call settles. If settlement fails abnormally (for example, a network drop or client disconnect), a background safety net releases the hold automatically after a delay.
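The hold / settle / release steps above can be sketched as a toy ledger. This is only an illustration of the arithmetic, not the platform's implementation: the `Wallet` class, its method names, and the amounts are all hypothetical.

```python
from decimal import Decimal


class Wallet:
    """Toy model of the hold / settle / release flow (hypothetical, for illustration)."""

    def __init__(self, total):
        self.total = Decimal(total)   # "total balance"
        self.held = Decimal("0")      # sum of amounts currently frozen

    @property
    def available(self):
        # "available balance" is the total minus everything on hold
        return self.total - self.held

    def hold(self, estimate):
        """Freeze an estimated amount before the call starts."""
        estimate = Decimal(estimate)
        if estimate > self.available:
            raise ValueError("insufficient available balance")
        self.held += estimate
        return estimate

    def settle(self, held_amount, actual_cost):
        """Charge the actual cost, then unfreeze the whole hold.

        Whatever part of the hold exceeded the actual cost returns
        to the available balance.
        """
        self.total -= Decimal(actual_cost)
        self.held -= held_amount

    def release(self, held_amount):
        """Unfreeze a hold without charging (failed or abandoned call)."""
        self.held -= held_amount


wallet = Wallet("10.00")
h = wallet.hold("0.50")       # available dips to 9.50, total stays 10.00
wallet.settle(h, "0.32")      # total and available both become 9.68
```

While the hold is active, the available balance is lower than the total balance; after settlement the two match again.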
Pricing units
| Unit | Applies to | How it's billed |
|---|---|---|
| Per token | Chat / text / embeddings | Input and output tokens, each multiplied by its price per 1M tokens |
| Per image | Image generation | By the actual number of images returned (not the request n) |
| Task billing | Video / Midjourney / async image | Settled per task; failed / cancelled / expired tasks are not charged and the hold is released |
Per-model input / output / cache prices and multipliers are listed on the console "Model Prices" page and in Models & plans.
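As a rough illustration of the per-token and per-image units, the arithmetic might look like the sketch below. The function names and the prices in the example are made up for illustration, and per-model multipliers are ignored; take real prices from the "Model Prices" page.

```python
from decimal import Decimal

PER_MILLION = Decimal(1_000_000)


def token_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost of a per-token call; prices are quoted per 1M tokens."""
    raw = (Decimal(input_tokens) * Decimal(input_price)
           + Decimal(output_tokens) * Decimal(output_price))
    return raw / PER_MILLION


def image_cost(images_returned, price_per_image):
    """Cost of an image call, based on images actually returned (not the request n)."""
    return Decimal(images_returned) * Decimal(price_per_image)


# Hypothetical prices: 2.00 per 1M input tokens, 6.00 per 1M output tokens.
chat = token_cost(1200, 300, "2.00", "6.00")

# Requested n=4 but only 3 images came back, so 3 are billed.
images = image_cost(3, "0.04")
```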
Cache billing
On a prompt cache or platform response-cache hit, the cache-read portion is billed at cache_read_price, usually much lower than the input price. Response headers carry cache-hit info:
| Header | Meaning |
|---|---|
| X-TT-Cache-Status | Cache hit status (HIT / MISS, etc.) |
| X-TT-Cache-Savings | Amount saved this call due to cache |
| X-TT-Cache-Read-Tokens | Tokens billed at the cache price |
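A client could collect these headers for logging with something like the sketch below. The header names come from the table above; the value formats are assumptions (a decimal string for savings, an integer string for read tokens), and `read_cache_headers` is a hypothetical helper, not part of any SDK.

```python
from decimal import Decimal, InvalidOperation


def _to_decimal(raw):
    """Parse a decimal string; return None if missing or malformed."""
    if raw is None:
        return None
    try:
        return Decimal(raw.strip())
    except InvalidOperation:
        return None


def _to_int(raw):
    """Parse an integer string; return None if missing or malformed."""
    if raw is None:
        return None
    try:
        return int(raw.strip())
    except ValueError:
        return None


def read_cache_headers(headers):
    """Extract cache info from a mapping of response headers.

    Header names are matched case-insensitively. Value formats are
    assumed, so unparseable values come back as None.
    """
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "status": lowered.get("x-tt-cache-status"),
        "savings": _to_decimal(lowered.get("x-tt-cache-savings")),
        "read_tokens": _to_int(lowered.get("x-tt-cache-read-tokens")),
    }


# Example with made-up header values.
info = read_cache_headers({
    "X-TT-Cache-Status": "HIT",
    "X-TT-Cache-Savings": "0.0012",
    "X-TT-Cache-Read-Tokens": "800",
})
```

Missing or unparseable headers come back as None instead of raising, so a call that bypasses the cache does not break logging.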
Balance and usage
| Console page | What to look at |
|---|---|
| Recharge | Current balance, recharge orders, ledger |
| Usage | Token usage and charges by time / model, CSV export |
| Requests | One call's billing detail, error detail (by request id) |
When balance runs out
Calls return 402 billing_shortfall. Recommendations:
- Set monthly limits on keys for critical apps so one app can't drain the balance.
- Watch the balance and recharge ahead of time.
See Errors & rate limits.
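On the client side, a 402 is worth treating differently from transient failures, because retrying cannot succeed until the wallet is recharged. The sketch below assumes a hypothetical `send` callable that returns a `(status, body)` pair; the exception name and the retry policy are illustrative only.

```python
class BillingShortfall(Exception):
    """Raised when a call is rejected with 402 (balance exhausted)."""


def call_once(send):
    """Run one request; raise immediately on 402 instead of returning it."""
    status, body = send()
    if status == 402:
        raise BillingShortfall(body)
    return status, body


def call_with_retries(send, max_attempts=3):
    """Retry 5xx responses only; a 402 propagates on the first attempt."""
    last = None
    for _ in range(max_attempts):
        status, body = call_once(send)
        if status < 500:
            return status, body
        last = (status, body)
    return last
```

A caller can catch `BillingShortfall` to pause the job and alert whoever manages recharges, rather than looping on a request that will keep failing.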