Large Language Models
RealTimeX lets you choose a system-wide language model provider, then override it per workspace when a specific chat flow needs something different.
Open the main selector from Settings > AI Providers > LLM.
How the system LLM setting works
The Language Model page controls the default provider used by the instance.
That default matters for:
- workspace chats that inherit the system setting
- features that rely on the shared system LLM preference
- any workflow that does not explicitly choose a different provider
When you select a provider, RealTimeX shows that provider's configuration fields directly below the selector. Depending on the provider, that can include:
- API keys
- base URLs
- deployment names
- token limits
- model selectors
- provider-specific advanced options
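As a rough illustration of how the visible fields vary by provider, the sketch below checks a settings dictionary against a per-provider list of required fields. The provider keys, the field names, and the `missing_fields` helper are all hypothetical and are not taken from RealTimeX; the actual fields are whatever the Language Model page shows for the selected provider.

```python
# Hypothetical required fields per provider (assumption for illustration only).
REQUIRED_FIELDS = {
    "openai": ("api_key", "model"),
    "azure-openai": ("api_key", "base_url", "deployment_name"),
    "ollama": ("base_url", "model", "token_limit"),
}


def missing_fields(provider: str, settings: dict[str, str]) -> list[str]:
    """Return the required fields that are absent or blank for a provider."""
    required = REQUIRED_FIELDS.get(provider, ())
    return [name for name in required if not settings.get(name, "").strip()]


# Example: an Azure-style provider with no deployment name filled in yet.
print(missing_fields("azure-openai", {"api_key": "k", "base_url": "https://example.invalid"}))
```

A check like this is only useful before saving: it reports which fields still need values, without saying anything about whether the values themselves are valid.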
Provider types
The current product supports several kinds of language model provider.
RealTimeX-managed options
- RealTimeX Cloud for hosted models without local inference setup
- RealTimeX Local for GGUF models running on your machine through the managed llama-server runtime
Hosted cloud providers
Current built-in cloud choices include:
- OpenAI
- Azure OpenAI
- Anthropic
- Gemini
- DeepSeek
- Groq
- Mistral
- Cohere
- Perplexity
- OpenRouter
- Together AI
- Fireworks AI
- AWS Bedrock
- xAI
- Moonshot AI
- Novita AI
- PPIO
- APIpie
Self-hosted and local endpoints
Current built-in local or self-hosted choices include:
- Ollama
- LM Studio
- llama.cpp
- Local AI
- KoboldCPP
- Oobabooga Web UI
- Dell Pro AI Studio
- LiteLLM
- Generic OpenAI
Plugin providers
Plugins can register additional LLM providers. When that happens, those providers appear in the same selector as the built-in options.
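One way to picture this is a single registry that feeds the selector, with plugin providers appended to the built-in entries. The sketch below is a minimal model of that idea; `ProviderRegistry`, `register_plugin_provider`, and the provider name "Acme LLM" are hypothetical names used for illustration, not the RealTimeX plugin API.

```python
class ProviderRegistry:
    """Hypothetical registry backing a single provider selector."""

    def __init__(self, built_in: list[str]) -> None:
        # Dicts keep insertion order, so built-in providers are listed first.
        self._providers = {name: "built-in" for name in built_in}

    def register_plugin_provider(self, name: str) -> None:
        """Add a plugin-supplied provider; reject clashes with existing names."""
        if name in self._providers:
            raise ValueError(f"provider already registered: {name}")
        self._providers[name] = "plugin"

    def selector_options(self) -> list[str]:
        """Return every provider name, built-in and plugin, in one list."""
        return list(self._providers)


registry = ProviderRegistry(["OpenAI", "Ollama"])
registry.register_plugin_provider("Acme LLM")  # hypothetical plugin provider
print(registry.selector_options())
```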
Workspace overrides
Each workspace can either inherit the system default or choose its own provider.
In workspace chat settings, the current flow supports:
- System default to inherit the instance-wide provider
- a workspace-specific provider selection
- a workspace-specific model selection when that provider supports it
Some providers do not yet support full multi-model workspace selection. In those cases, the workspace can still point at that provider, but the actual model comes from the system-level configuration for that provider.
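The inheritance rules above can be sketched as a small resolution function. This is an illustrative model only: the `SystemSettings` and `WorkspaceSettings` types, the `WORKSPACE_MODEL_PROVIDERS` set, and the provider and model names are assumptions, not the actual RealTimeX data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SystemSettings:
    provider: str            # instance-wide default provider
    models: dict[str, str]   # model configured at system level, per provider


@dataclass
class WorkspaceSettings:
    provider: Optional[str] = None  # None stands for "System default"
    model: Optional[str] = None     # only honored for some providers


# Hypothetical set of providers that allow a workspace-specific model.
WORKSPACE_MODEL_PROVIDERS = {"openai", "ollama"}


def resolve_llm(system: SystemSettings, workspace: WorkspaceSettings) -> tuple[str, Optional[str]]:
    """Return the (provider, model) pair a workspace chat would use."""
    provider = workspace.provider or system.provider
    if workspace.provider and workspace.model and provider in WORKSPACE_MODEL_PROVIDERS:
        return provider, workspace.model
    # Inherited provider, or a provider without workspace model selection:
    # the model comes from the system-level configuration for that provider.
    return provider, system.models.get(provider)


system = SystemSettings("openai", {"openai": "system-openai-model", "bedrock": "system-bedrock-model"})
print(resolve_llm(system, WorkspaceSettings()))
print(resolve_llm(system, WorkspaceSettings("bedrock", "workspace-model")))
```

In the second call the workspace points at a provider outside the hypothetical multi-model set, so its model choice is ignored and the system-level model for that provider is returned.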
Local model management
If you use local inference, there are two related settings areas.
Settings > AI Providers > RealTimeX Local
Use this page when you want the built-in managed local workflow. The current product uses it to manage:
- the llama-server runtime
- runtime download and update status
- hardware/backend availability
- the default RealTimeX Local model
For the full workflow, see RealTimeX Local.
Settings > AI Providers > Local Model Management
Use this page when you want to inspect or manage model inventories across local backends.
The current UI covers:
- Ollama
- LocalAI
- LM Studio
- llama.cpp
- RealTimeX Local
Depending on the backend, you can inspect status, review model counts, pull or download models, delete models, warm models into memory, and set defaults.
For the cross-provider guide, see Local Models.
Choosing the right option
- Use RealTimeX Cloud when you want the simplest hosted setup.
- Use RealTimeX Local when you want an integrated on-device GGUF workflow without running a separate model server yourself.
- Use a hosted provider like OpenAI, Anthropic, or Gemini when your team already operates around that vendor.
- Use Generic OpenAI or LiteLLM when you want a compatibility layer in front of multiple model backends.
- Use Ollama, LM Studio, or llama.cpp when you already run local model infrastructure and want RealTimeX to connect to it.
Credentials and access
Many hosted providers require keys, endpoints, or deployment details. Those are configured inside the provider section on the Language Model page.
Use API Access & Keys when outside clients need to call into RealTimeX.
Use Credentials when RealTimeX agents or tools need reusable outbound secrets for other systems.