Text-to-Speech (Read Aloud)

RealTimeX can read content aloud with a shared text-to-speech configuration.

Open these settings from Settings > Voice & Speech, then use the Text to Speech section.

Where read-aloud is used

The current product uses the shared TTS setting for surfaces such as:

The current Text to Speech selector includes PiperTTS, Supertonic-3, OpenAI, ElevenLabs, Groq, and OpenAI Compatible.

PiperTTS and Supertonic-3 run locally in the browser.

Use them when you want:

Supertonic-3 is the more advanced multilingual local option in the current product.

OpenAI, ElevenLabs, and Groq use hosted speech services.

Use them when you want:

Use OpenAI Compatible when your team runs a TTS service that exposes an OpenAI-style API.

This can point to a local or remote endpoint.
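As a rough illustration of what "OpenAI-style" means here, the sketch below builds the kind of request such a service typically accepts: a JSON POST to an `/audio/speech` route with `model`, `input`, and `voice` fields. The base URL, port, model name, and voice name are placeholders chosen for the example, not values taken from RealTimeX, and the helper `build_speech_request` is hypothetical.

```python
import json
import urllib.request
from typing import Optional


def build_speech_request(base_url: str, text: str, *, model: str, voice: str,
                         api_key: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST to an OpenAI-style /audio/speech route."""
    url = base_url.rstrip("/") + "/audio/speech"
    payload = {"model": model, "input": text, "voice": voice}
    headers = {"Content-Type": "application/json"}
    if api_key:  # assumption: some local servers accept requests without a key
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(
        url, data=json.dumps(payload).encode("utf-8"), headers=headers, method="POST"
    )


# Placeholder local endpoint; a remote one would only change base_url and api_key.
req = build_speech_request(
    "http://localhost:8880/v1", "Hello from RealTimeX.", model="tts-1", voice="alloy"
)
# Sending it with urllib.request.urlopen(req) would return the audio bytes.
```

Keeping the base URL as the only endpoint-specific value is what lets the same provider setting point at either a local or a remote service.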

Groq TTS uses a provider setting here, but the Groq API key itself is managed in Language Model settings.

PiperTTS and Supertonic-3 can cache local voice assets for faster repeat playback.

Older browser-native TTS references are no longer the main path in the current app. Existing native settings are migrated toward the local provider flow.

Read-aloud is designed to start progressively for longer content instead of waiting for the full response to finish processing.
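One way to picture progressive start is sentence-aligned chunking: the text is split into short pieces, and the first piece can be synthesized and played while later pieces are still waiting. The sketch below is an assumption about the general technique, not the app's actual implementation; `speech_chunks` and the `max_chars` limit are hypothetical.

```python
import re
from typing import Iterator


def speech_chunks(text: str, max_chars: int = 200) -> Iterator[str]:
    """Yield sentence-aligned chunks, packing sentences up to max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            yield current          # this chunk is ready to be spoken now
            current = sentence     # start the next chunk with this sentence
        else:
            current = candidate
    if current:
        yield current              # final, possibly short, chunk


chunks = speech_chunks("First sentence. Second sentence. Third one.", max_chars=32)
first = next(chunks)  # playback could begin here, before the rest is consumed
```

Because the function is a generator, nothing after the first chunk is processed until the caller asks for it. A single sentence longer than `max_chars` is emitted whole rather than cut mid-sentence.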

In practice, that means:

No audio plays: Check device output volume, active output device, and whether the tab or app is muted.
Playback fails for a cloud provider: Re-check the provider-specific configuration and any required API key.
Local voice is slow on first use: The provider may still be downloading or caching local voice assets.
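
The first-use delay for local voices behaves like a cache miss: the assets are fetched once, and later playback reuses them. The sketch below models that pattern in a few lines. `VoiceAssetCache` and `fake_fetch` are hypothetical names used only for illustration, and the real providers may store assets differently.

```python
from typing import Callable, Dict, List


class VoiceAssetCache:
    """Fetch each voice's assets once, then serve repeats from memory."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch                      # slow path, e.g. a download
        self._assets: Dict[str, bytes] = {}

    def get(self, voice_id: str) -> bytes:
        if voice_id not in self._assets:         # first use: pay the fetch cost
            self._assets[voice_id] = self._fetch(voice_id)
        return self._assets[voice_id]            # repeat use: served from cache


calls: List[str] = []


def fake_fetch(voice_id: str) -> bytes:
    """Stand-in for a slow download; records each time it is invoked."""
    calls.append(voice_id)
    return f"assets-for-{voice_id}".encode()


cache = VoiceAssetCache(fake_fetch)
cache.get("example-voice")   # triggers the fetch
cache.get("example-voice")   # reuses the cached assets
```

After both calls, `fake_fetch` has run only once, which is why the second playback of the same voice starts faster than the first.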