Realtime TTS
Realtime text-to-speech availability and integration notes.
Realtime TTS is for low-latency preview, conversational playback, and applications that submit text in smaller chunks. On the overseas Open API, realtime availability is controlled by account capability and rollout status. Confirm availability with your account team and the current GET /api/openapi.json schema before building production traffic around it.
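One way to run the schema check is to scan the parsed schema for realtime-related paths. This is a minimal sketch: it assumes the realtime endpoint, when enabled, is listed under the schema's `paths` object and has "realtime" in its path, neither of which is confirmed here. The helper name `find_realtime_paths` and the example paths are hypothetical.

```python
from typing import Any


def find_realtime_paths(schema: dict[str, Any], keyword: str = "realtime") -> list[str]:
    """Return OpenAPI path keys whose name contains `keyword` (case-insensitive)."""
    paths = schema.get("paths", {})
    return sorted(p for p in paths if keyword.lower() in p.lower())


# Stand-in for the parsed JSON body of GET /api/openapi.json.
# Both path names below are invented for illustration only.
example_schema = {
    "paths": {
        "/api/tts/sync": {},
        "/api/tts/realtime": {},
    }
}

matches = find_realtime_paths(example_schema)
print(matches if matches else "no realtime paths listed; confirm with your account team")
```

An empty result is not proof that realtime is unavailable, since a WebSocket endpoint may be documented outside the schema, so treat this as a first check rather than a gate.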
Request
Realtime integrations use a WebSocket session when enabled for the account. The session still authenticates with the same bearer token model:
Authorization: Bearer YOUR_API_TOKEN
Client events usually include session start, text chunks, flush, and stop semantics. Keep the token on your server or trusted backend component; do not place it in a public browser client.
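The event order described above can be sketched as a small generator that a backend would feed into its WebSocket client. The event `type` strings (`session.start`, `text.chunk`, `flush`, `session.stop`) and the JSON shape are assumptions for illustration; take the real names from the current schema. Only the `Authorization: Bearer` header format comes from this page.

```python
import json
from typing import Iterator


def auth_headers(token: str) -> dict[str, str]:
    """Headers for the connection handshake; build these server-side only."""
    return {"Authorization": f"Bearer {token}"}


def split_text(text: str, max_chars: int = 80) -> list[str]:
    """Split on whitespace into chunks of about max_chars.

    A single word longer than max_chars is kept whole.
    """
    chunks: list[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def session_events(text: str, max_chars: int = 80) -> Iterator[str]:
    """Yield JSON client events in order: start, text chunks, flush, stop.

    All event type names here are hypothetical placeholders.
    """
    yield json.dumps({"type": "session.start"})
    for chunk in split_text(text, max_chars):
        yield json.dumps({"type": "text.chunk", "text": chunk})
    yield json.dumps({"type": "flush"})
    yield json.dumps({"type": "session.stop"})


for message in session_events("Hello there. This is a short preview.", max_chars=20):
    print(message)
```

Keeping event construction separate from the socket code makes the sequence easy to unit test without a live session.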
Response
Realtime responses are streamed events rather than one JSON document. Your client should handle audio chunks, partial status events, terminal completion, and terminal error events. Persist enough local state to reconnect or fall back to Async Jobs when a session fails.
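A client-side handler for these four event kinds might look like the sketch below. The event names (`audio.chunk`, `status`, `completed`, `error`), the `message` field, and the base64 `data` field are assumptions; the real service may send audio as binary frames instead. `SessionState` is a hypothetical container that keeps the submitted text so it can be resubmitted through Async Jobs if the session fails.

```python
import base64
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionState:
    """Local state kept so a failed session can be retried or handed off."""
    text: str                                   # full text submitted in this session
    audio: bytearray = field(default_factory=bytearray)
    last_status: Optional[str] = None
    outcome: str = "open"                       # "open", "completed", or "failed"
    error: Optional[str] = None

    @property
    def needs_fallback(self) -> bool:
        return self.outcome == "failed"


def handle_event(state: SessionState, event: dict) -> SessionState:
    """Apply one streamed event to the state; event names are hypothetical."""
    if state.outcome != "open":
        return state                            # ignore anything after a terminal event
    kind = event.get("type")
    if kind == "audio.chunk":
        state.audio.extend(base64.b64decode(event["data"]))
    elif kind == "status":
        state.last_status = event.get("message")
    elif kind == "completed":
        state.outcome = "completed"
    elif kind == "error":
        state.outcome = "failed"
        state.error = event.get("message", "unknown error")
    return state


state = SessionState(text="Hello there.")
handle_event(state, {"type": "audio.chunk", "data": base64.b64encode(b"\x00\x01").decode()})
handle_event(state, {"type": "error", "message": "connection reset"})
if state.needs_fallback:
    print(f"session failed ({state.error}); resubmit {state.text!r} via Async Jobs")
```

Audio received before the failure stays in `state.audio`, which is useful both for playback continuity and for checking what was produced before the disconnect.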
Billing And Credits
Realtime generation may charge by generated content, session duration, or enabled model. Because partial audio may already have been produced before a disconnect, do not assume every interrupted session is free. Reconcile important sessions with Profile.
Errors
Authentication errors happen during connection setup. Session errors happen inside the event stream. When realtime is not enabled for the account, use sync HTTP or async jobs as the supported overseas path.
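The three failure cases above can be routed with a small decision function. This is a sketch under assumptions: the `Failure` categories and the returned action names are hypothetical, and treating an authentication failure as non-retryable is a design choice made here, not a documented rule.

```python
from enum import Enum


class Failure(Enum):
    NOT_ENABLED = "realtime not enabled for the account"
    AUTH = "authentication rejected during connection setup"
    SESSION = "error event inside the stream"


def next_step(failure: Failure, reconnects_left: int) -> str:
    """Pick a follow-up action for a failed realtime attempt."""
    if failure is Failure.NOT_ENABLED:
        return "use_sync_http_or_async_jobs"    # supported overseas path
    if failure is Failure.AUTH:
        return "fix_credentials"                # assumption: retrying the same token will not help
    if reconnects_left > 0:
        return "reconnect"
    return "fallback_async_jobs"


print(next_step(Failure.SESSION, reconnects_left=1))
print(next_step(Failure.SESSION, reconnects_left=0))
```

Capping reconnect attempts keeps a persistently failing session from looping before the text is handed to Async Jobs.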