Realtime TTS

Realtime TTS is for low-latency preview, conversational playback, and applications that submit text in smaller chunks. On the overseas Open API, realtime availability depends on account capability and rollout status. Confirm availability with your account team and check the current GET /api/openapi.json schema before building production traffic around it.

Request

Realtime integrations use a WebSocket session when enabled for the account. The session still authenticates with the same bearer token model:

Authorization: Bearer YOUR_API_TOKEN

Client events usually include session start, text chunks, flush, and stop semantics. Keep the token on your server or trusted backend component; do not place it in a public browser client.
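The client side of a session can be sketched as follows. This is a minimal illustration, not the documented wire format: the event type names ("session.start", "text", "flush", "stop"), the "text" field, and the TTS_API_TOKEN environment variable are all assumptions made for the example. Check the schema enabled for your account for the real event names.

```python
import json
import os


def auth_headers(token: str) -> dict:
    """Build the bearer header used when opening the session."""
    return {"Authorization": f"Bearer {token}"}


def session_events(chunks: list) -> list:
    """Serialize a start / text / flush / stop sequence as JSON strings.

    The "type" values and the "text" field are hypothetical placeholders.
    Empty chunks are skipped so no empty text event is sent.
    """
    events = [{"type": "session.start"}]
    events += [{"type": "text", "text": chunk} for chunk in chunks if chunk]
    events += [{"type": "flush"}, {"type": "stop"}]
    return [json.dumps(event) for event in events]


if __name__ == "__main__":
    # Read the token from the server environment, never from browser code.
    token = os.environ.get("TTS_API_TOKEN", "YOUR_API_TOKEN")
    print(auth_headers(token))
    for message in session_events(["Hello, ", "world."]):
        print(message)
```

Building the messages in a pure function keeps the event sequence testable without an open connection; the resulting strings can be sent over whichever WebSocket client your backend already uses.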

Response

Realtime responses are streamed events rather than one JSON document. Your client should handle audio chunks, partial status events, terminal completion, and terminal error events. Persist enough local state to reconnect or fall back to Async Jobs when a session fails.
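One way to organize the handling described above is a small state object updated per event. The sketch below assumes events arrive as dictionaries with a "type" of "audio", "status", "completed", or "error"; these names and the "data" and "message" fields are hypothetical and stand in for whatever the real stream delivers.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionState:
    """Local state kept per session so the client can recover after a failure."""
    audio: bytearray = field(default_factory=bytearray)
    statuses: list = field(default_factory=list)
    completed: bool = False
    error: Optional[str] = None


def handle_event(state: SessionState, event: dict) -> SessionState:
    """Apply one streamed event; events after a terminal event are ignored."""
    if state.completed or state.error is not None:
        return state
    kind = event.get("type")
    if kind == "audio":
        state.audio.extend(event["data"])  # raw audio bytes (assumed field)
    elif kind == "status":
        state.statuses.append(event.get("message", ""))
    elif kind == "completed":
        state.completed = True
    elif kind == "error":
        state.error = event.get("message", "unknown error")
    return state


def should_fall_back(state: SessionState, stream_closed: bool) -> bool:
    """True when the session errored or closed before a completion event."""
    return state.error is not None or (stream_closed and not state.completed)
```

When should_fall_back returns True, the audio already collected in state.audio shows how much was delivered before the failure, which helps decide whether to reconnect or resubmit the text through Async Jobs.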

Billing And Credits

Realtime generation may be billed by generated content, session duration, or enabled model. Because partial audio may already have been produced before a disconnect, do not assume that an interrupted session is free. Reconcile important sessions with Profile.

Errors

Authentication errors occur during connection setup. Session errors occur inside the event stream. When realtime is not enabled for the account, use sync HTTP or Async Jobs as the supported overseas path.
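The distinction between the two error phases can be expressed as a small decision function. This is only a sketch: the phase labels ("connect", "stream") and the returned action names are invented for illustration, and a failure during connection setup is treated here as a credential problem, which is an assumption rather than documented behavior.

```python
from typing import Optional


def next_action(realtime_enabled: bool, failed_phase: Optional[str]) -> str:
    """Pick a follow-up step based on where a failure happened.

    failed_phase is None when nothing failed, "connect" for a failure
    during connection setup, or "stream" for a failure inside the
    event stream. All labels are hypothetical.
    """
    if not realtime_enabled:
        # Realtime is unavailable: use the supported non-realtime path.
        return "use_sync_http_or_async_jobs"
    if failed_phase is None:
        return "continue"
    if failed_phase == "connect":
        # Authentication errors surface before any events are exchanged.
        return "check_token"
    if failed_phase == "stream":
        return "reconnect_or_fall_back"
    raise ValueError(f"unknown phase: {failed_phase!r}")
```

Keeping this choice in one place means the connection code and the event loop report only where they failed, and the policy can change without touching either.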
