Large Language Models

RealTimeX lets you choose a system-wide language model provider, then override it per workspace when a specific chat flow needs something different.

Open the main selector from Settings > AI Providers > LLM.

How the system LLM setting works

The Language Model page controls the default provider used by the instance.

That default matters for:

  • workspace chats that inherit the system setting
  • features that rely on the shared system LLM preference
  • any workflow that does not explicitly choose a different provider

When you select a provider, RealTimeX shows that provider's configuration fields directly below the selector. Depending on the provider, that can include:

  • API keys
  • base URLs
  • deployment names
  • token limits
  • model selectors
  • provider-specific advanced options
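As a rough illustration of how provider-dependent fields might be checked before saving, the sketch below maps each provider to the fields it requires and reports which are still empty. The provider ids, field names, and the `missing_fields` helper are all hypothetical and are not taken from RealTimeX itself.

```python
# Hypothetical mapping from a provider id to the fields its settings form requires.
# Neither the ids nor the field names come from RealTimeX; they are placeholders.
REQUIRED_FIELDS = {
    "example-hosted": ["api_key", "model"],
    "example-deployment": ["api_key", "base_url", "deployment_name"],
    "example-local": ["base_url", "model", "token_limit"],
}


def missing_fields(provider, values):
    """Return the required fields that are absent or blank for the chosen provider."""
    required = REQUIRED_FIELDS.get(provider)
    if required is None:
        raise KeyError(f"unknown provider: {provider}")

    missing = []
    for name in required:
        value = values.get(name)
        # Treat None and whitespace-only strings as "not filled in".
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


if __name__ == "__main__":
    # base_url is blank and deployment_name is absent, so both are reported.
    print(missing_fields("example-deployment", {"api_key": "k", "base_url": " "}))
```

A table-driven check like this keeps the per-provider differences in data rather than in branching logic, which mirrors how the settings page shows a different set of fields for each provider.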

Provider types

The current product supports several kinds of language model provider.

RealTimeX-managed options

  • RealTimeX Cloud for hosted models without local inference setup
  • RealTimeX Local for GGUF models running on your machine through the managed llama-server runtime

Hosted cloud providers

Built-in hosted providers currently include:

  • OpenAI
  • Azure OpenAI
  • Anthropic
  • Gemini
  • DeepSeek
  • Groq
  • Mistral
  • Cohere
  • Perplexity
  • OpenRouter
  • Together AI
  • Fireworks AI
  • AWS Bedrock
  • xAI
  • Moonshot AI
  • Novita AI
  • PPIO
  • APIpie

Self-hosted and local endpoints

Current built-in local or self-hosted choices include:

  • Ollama
  • LM Studio
  • llama.cpp
  • LocalAI
  • KoboldCPP
  • Oobabooga Web UI
  • Dell Pro AI Studio
  • LiteLLM
  • Generic OpenAI

Plugin providers

Plugins can register additional LLM providers. When that happens, those providers appear in the same selector as the built-in options.
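One way to picture plugin providers sharing a selector with the built-in ones is a single registry that both sources write into. The sketch below is an assumption about the general pattern, not the RealTimeX plugin API; `ProviderRegistry`, `register`, and `selector_options` are hypothetical names, and "Example Plugin LLM" is a made-up provider.

```python
class ProviderRegistry:
    """Hypothetical registry that holds built-in and plugin providers together."""

    def __init__(self, built_in):
        # Dicts preserve insertion order, so built-ins are listed first.
        self._providers = {name: "built-in" for name in built_in}

    def register(self, name, source="plugin"):
        """Add a provider contributed by a plugin; reject duplicate names."""
        if name in self._providers:
            raise ValueError(f"provider already registered: {name}")
        self._providers[name] = source

    def selector_options(self):
        """Return every provider name, regardless of where it came from."""
        return list(self._providers)

    def source_of(self, name):
        return self._providers[name]


if __name__ == "__main__":
    registry = ProviderRegistry(["OpenAI", "Ollama"])
    registry.register("Example Plugin LLM")
    print(registry.selector_options())
```

Because the selector reads from one list, a plugin-provided entry needs no special handling in the UI once it is registered.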

Workspace overrides

Each workspace can either inherit the system default or choose its own provider.

In workspace chat settings, the current flow supports:

  • System default, which inherits the instance-wide provider
  • a workspace-specific provider selection
  • a workspace-specific model selection when that provider supports it

Some providers do not yet support full multi-model workspace selection. In those cases, the workspace can still point at that provider, but the actual model comes from the system-level configuration for that provider.
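The inheritance and fallback rules above can be sketched as a small resolution function. This is a minimal illustration under stated assumptions, not the product's implementation: the `SystemSettings` and `WorkspaceSettings` types, the `resolve_llm` function, and the placeholder provider and model names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class SystemSettings:
    provider: str  # instance-wide default provider
    # System-level model configured for each provider (hypothetical shape).
    provider_models: Dict[str, str] = field(default_factory=dict)


@dataclass
class WorkspaceSettings:
    provider: Optional[str] = None  # None stands for "System default"
    model: Optional[str] = None     # honored only if the provider allows it


def resolve_llm(
    system: SystemSettings,
    workspace: WorkspaceSettings,
    multi_model_providers: Set[str],
) -> Tuple[str, Optional[str]]:
    """Return the (provider, model) pair a workspace chat would use."""
    # No workspace provider chosen: inherit the instance-wide default.
    if workspace.provider is None:
        return system.provider, system.provider_models.get(system.provider)

    provider = workspace.provider
    # A workspace-level model applies only when the provider supports it.
    if workspace.model and provider in multi_model_providers:
        return provider, workspace.model

    # Otherwise the model comes from the system-level config for that provider.
    return provider, system.provider_models.get(provider)


if __name__ == "__main__":
    system = SystemSettings(
        provider="provider-a",
        provider_models={"provider-a": "model-a1", "provider-b": "model-b1"},
    )
    supports_multi_model = {"provider-a"}
    print(resolve_llm(system, WorkspaceSettings(), supports_multi_model))
    print(resolve_llm(system, WorkspaceSettings("provider-b", "model-b2"), supports_multi_model))
```

In the second call the workspace asks for `model-b2`, but because `provider-b` is not in the multi-model set, the sketch falls back to the system-level `model-b1`, matching the behavior described for providers without full workspace model selection.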

Local model management

If you use local inference, there are two related settings areas.

Settings > AI Providers > RealTimeX Local

Use this page when you want the built-in managed local workflow. The current product uses it to manage:

  • the llama-server runtime
  • runtime download and update status
  • hardware/backend availability
  • the default RealTimeX Local model

For the full workflow, see RealTimeX Local.

Settings > AI Providers > Local Model Management

Use this page when you want to inspect or manage model inventories across local backends.

The current UI covers:

  • Ollama
  • LocalAI
  • LM Studio
  • llama.cpp
  • RealTimeX Local

Depending on the backend, you can inspect status, review model counts, pull or download models, delete models, warm models into memory, and set defaults.

For the cross-provider guide, see Local Models.

Choosing the right option

  • Use RealTimeX Cloud when you want the simplest hosted setup.
  • Use RealTimeX Local when you want an integrated on-device GGUF workflow without running a separate model server yourself.
  • Use a hosted provider like OpenAI, Anthropic, or Gemini when your team has already standardized on that vendor.
  • Use Generic OpenAI or LiteLLM when you want a compatibility layer in front of multiple model backends.
  • Use Ollama, LM Studio, or llama.cpp when you already run local model infrastructure and want RealTimeX to connect to it.

Credentials and access

Many hosted providers require keys, endpoints, or deployment details. Those are configured inside the provider section on the Language Model page.

Use API Access & Keys when outside clients need to call into RealTimeX.

Use Credentials when RealTimeX agents or tools need reusable outbound secrets for other systems.