Speech‑to‑Text (Walkie‑Talkie)

Speech‑to‑Text lets you dictate into the chat composer instead of typing.

It uses the live speech provider configured in Settings > Voice & Speech.

Current live speech providers

The chat composer speech input currently supports:

  • Web Speech API
  • Whisper
  • Groq

If a different provider is configured elsewhere, the chat composer may warn that it is unsupported for prompt input.
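
The support check described above could be sketched roughly as follows. This is only an illustration: the provider identifiers ("web-speech", "whisper", "groq"), the function name, and the warning text are assumptions, not the values RealTimeX actually uses.

```typescript
// Hypothetical provider identifiers; the real setting values are not documented here.
const COMPOSER_SUPPORTED_PROVIDERS: readonly string[] = ["web-speech", "whisper", "groq"];

interface ProviderCheck {
  supported: boolean;
  warning?: string; // only set when the provider cannot be used for prompt input
}

// Returns whether the configured provider can drive dictation in the composer,
// plus a warning message when it cannot.
function checkComposerProvider(provider: string): ProviderCheck {
  if (COMPOSER_SUPPORTED_PROVIDERS.indexOf(provider) !== -1) {
    return { supported: true };
  }
  return {
    supported: false,
    warning: `Speech provider "${provider}" is not supported for prompt input.`,
  };
}
```

Keeping the supported list in one constant means that adding a provider later only requires a change in one place.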

How dictation works

When you start a Walkie‑Talkie session, RealTimeX listens to the microphone and inserts the transcript into the prompt input.

The current behavior supports:

  • inserting text at the current cursor position
  • stopping automatically after silence
  • keeping a language preference for speech recognition
  • optional auto-submit after the final transcript is produced
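
As a rough sketch of three of these behaviors (cursor insertion, silence detection, and auto-submit), the logic might look like the code below. All names, the state shape, and the idea of a millisecond timeout are assumptions made for illustration; the actual implementation and its silence threshold are not described here.

```typescript
// Hypothetical composer state: the prompt text and the caret index within it.
interface ComposerState {
  text: string;
  cursor: number;
}

// Inserts the transcript at the caret and moves the caret to the end of the inserted text.
function insertAtCursor(state: ComposerState, transcript: string): ComposerState {
  // Clamp the caret so an out-of-range index cannot corrupt the text.
  const cursor = Math.max(0, Math.min(state.cursor, state.text.length));
  const before = state.text.slice(0, cursor);
  const after = state.text.slice(cursor);
  return { text: before + transcript + after, cursor: cursor + transcript.length };
}

// True once no speech has been heard for at least timeoutMs (an assumed setting).
function silenceExceeded(lastSpeechAtMs: number, nowMs: number, timeoutMs: number): boolean {
  return nowMs - lastSpeechAtMs >= timeoutMs;
}

// Auto-submit only when the transcript is final, the option is on, and the prompt is not blank.
function shouldAutoSubmit(isFinal: boolean, autoSubmitEnabled: boolean, text: string): boolean {
  return isFinal && autoSubmitEnabled && text.trim().length > 0;
}
```

For example, inserting "big " into "Hello world" with the caret at index 6 yields "Hello big world" with the caret at index 10.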

You can also use the keyboard shortcut Ctrl+M to toggle dictation on or off.
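
A minimal sketch of how a Ctrl+M toggle could be detected is shown below. The `KeyLike` type and the `DictationToggle` class are hypothetical; they only mirror the fields a browser `KeyboardEvent` provides and are not taken from the app.

```typescript
// Minimal subset of KeyboardEvent fields needed to recognise the shortcut.
interface KeyLike {
  key: string;
  ctrlKey: boolean;
  altKey?: boolean;
  metaKey?: boolean;
}

// True only for Ctrl+M without Alt or Meta held (the key match is case-insensitive).
function isDictationShortcut(e: KeyLike): boolean {
  return e.ctrlKey && !e.altKey && !e.metaKey && e.key.toLowerCase() === "m";
}

// Tracks whether dictation is on and flips the state on each shortcut press.
class DictationToggle {
  private active = false;

  // Returns the dictation state after handling the key event.
  handleKey(e: KeyLike): boolean {
    if (isDictationShortcut(e)) {
      this.active = !this.active;
    }
    return this.active;
  }
}
```

In a browser, a `keydown` listener could pass its event straight to `handleKey`, since `KeyboardEvent` has all of the fields in `KeyLike`.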

Input language

The composer includes a language selector for speech recognition.

You can:

  • keep auto-detect enabled
  • choose a specific language manually
  • reuse your saved preference in later sessions

This is especially useful when browser recognition or local transcription keeps choosing the wrong language.
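
One way to model the saved language preference is sketched below. The storage key, the "auto" sentinel, and the `PreferenceStore` interface are assumptions made for illustration; how RealTimeX actually persists the preference is not documented here.

```typescript
const AUTO_DETECT = "auto"; // hypothetical sentinel meaning "let the provider detect the language"
const LANGUAGE_KEY = "speech.inputLanguage"; // hypothetical storage key

// Minimal key-value store abstraction so the sketch runs outside a browser.
interface PreferenceStore {
  get(key: string): string | null;
  set(key: string, value: string): void;
}

// In-memory store used for the example; a browser build might wrap localStorage instead.
class MemoryStore implements PreferenceStore {
  private data: Record<string, string> = {};
  get(key: string): string | null {
    return Object.prototype.hasOwnProperty.call(this.data, key) ? this.data[key] : null;
  }
  set(key: string, value: string): void {
    this.data[key] = value;
  }
}

// Persists the user's choice so later sessions can reuse it.
function saveLanguage(store: PreferenceStore, language: string): void {
  store.set(LANGUAGE_KEY, language);
}

// Loads the saved choice, falling back to auto-detect when nothing is stored.
function loadLanguage(store: PreferenceStore): string {
  const saved = store.get(LANGUAGE_KEY);
  return saved === null ? AUTO_DETECT : saved;
}

// Maps the preference to the value handed to a recogniser:
// undefined means "do not force a language".
function recognitionLanguage(preference: string): string | undefined {
  return preference === AUTO_DETECT ? undefined : preference;
}
```

Treating auto-detect as an explicit stored value, rather than as a missing key, makes it possible to tell "the user chose auto-detect" apart from "the user never chose".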

Provider behavior

Web Speech API

Use this for browser-native speech recognition when it is available.
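
Availability can be checked with ordinary feature detection. The sketch below looks for the standard `SpeechRecognition` constructor and the prefixed `webkitSpeechRecognition` variant that some browsers expose; the function names and the `scope` parameter are illustrative and are not part of the app.

```typescript
// Looks up the recognition constructor on a global-like object.
// Returns null when the browser offers neither the standard nor the prefixed name.
function getSpeechRecognitionCtor(scope: Record<string, unknown>): unknown {
  const standard = scope["SpeechRecognition"];
  if (typeof standard === "function") {
    return standard;
  }
  const prefixed = scope["webkitSpeechRecognition"];
  if (typeof prefixed === "function") {
    return prefixed;
  }
  return null;
}

// True when browser-native recognition can be used.
function isWebSpeechAvailable(scope: Record<string, unknown>): boolean {
  return getSpeechRecognitionCtor(scope) !== null;
}
```

In a browser, the `window` object (cast to `Record<string, unknown>`) would be passed as `scope`. Taking the scope as a parameter keeps the check testable outside a browser.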

Whisper

Use this for local speech-to-text. In the current app, Whisper can run either in the browser or natively on the desktop, depending on your environment and settings.

Groq

Use this for fast cloud speech-to-text.

Troubleshooting

  • Nothing appears while speaking: Check microphone permissions and confirm the selected speech provider is supported in chat input.
  • Web Speech is unavailable: Switch to Whisper or Groq in Settings > Voice & Speech.
  • Dictation uses the wrong language: Pick an explicit language instead of auto-detect.
  • Final text sends too early: Review whether auto-submit speech input is enabled in your appearance or chat preferences.