LLM Setup

This section explains how to connect language model providers in RealTimeX.

Use it when you need to:

choose the default system LLM
connect a cloud or local provider
select the default model for that provider
find the provider-specific setup page for a backend you want to use

Open the main selector from Settings > AI Providers > LLM.

💡

Use Large Language Models when you want the behavior guide for system defaults, workspace overrides, and provider strategy. This section is for connection and setup.

What this setting controls

The LLM page is the instance-wide source of truth for the default chat model provider.

From that page, you configure:

the system default provider
provider credentials, endpoints, or base URLs
the default model used by that provider
plugin-provided LLM providers when plugins register them
the provider list used by workspace and agent override pickers

Common setup flow

Most providers follow the same setup pattern:

Open Settings > AI Providers > LLM.
Select the provider you want to use.
Enter the required connection details.
Choose a model, or type a model ID when the provider expects manual entry.
Save the configuration.
Test the result in a normal chat before using it in workspaces, agents, or automations.

Depending on the provider, RealTimeX may load the model list only after you enter a valid API key, endpoint, or base URL.

How provider selection layers work

System default

The system LLM is the fallback provider for the instance.

If nothing more specific is selected, RealTimeX uses this provider for:

regular chats
new workspaces that inherit the default
tools or automations that do not define their own provider

Workspace override

Workspaces can stay on System default or choose a different provider and model for that workspace only.

Use workspace overrides when:

one team needs a different vendor
one workspace needs a larger context window
one project should stay on local inference while the rest of the instance uses cloud models

Agent and runtime override

Some agent surfaces can override the inherited provider and model as well.

If an agent or runtime does not define its own LLM, it falls back to the workspace or system selection, depending on that feature's configuration.

Which provider type should you choose?

RealTimeX-managed options

RealTimeX Cloud is the simplest hosted path when you want a managed cloud provider inside RealTimeX.
RealTimeX Local is the integrated local GGUF workflow when you want on-device inference without running a separate model server yourself.

RealTimeX Cloud is configured directly from the same LLM selector and normally does not require a separate setup flow beyond choosing the provider and model.

External cloud providers

Use providers like OpenAI, Anthropic, Gemini, Azure OpenAI, Groq, or Mistral when your team already operates around that vendor or you want access to that model family directly.

Local or self-hosted endpoints

Use providers like Ollama, LM Studio, Local AI, KoboldCPP, llama.cpp, or Generic OpenAI when you already run local inference infrastructure or a self-hosted gateway.

Compatibility and proxy layers

Use Generic OpenAI or LiteLLM when you want RealTimeX to connect through an OpenAI-compatible proxy or compatibility layer instead of a vendor-specific integration.

Plugin providers

Plugins can register additional LLM providers. When that happens, they appear in the same selector as built-in providers and follow the same general setup flow.

Local and managed setup guides

RealTimeX Local

Managed on-device GGUF runtime, download flow, and local model execution.

Local Models

Model inventory, status, pulling, downloading, warming, and deletion across local backends.

Ollama

Connect to an Ollama server and choose one of your installed models.

LM Studio

Use LM Studio as a local LLM backend for RealTimeX.

Local AI

Connect to a Local AI deployment for self-hosted inference.

KoboldCPP

Use KoboldCPP for GGUF-based local model serving.

Cloud setup guides

OpenAI

Connect a standard OpenAI API key and choose an OpenAI chat model.

Azure OpenAI

Use Azure-hosted OpenAI deployments with endpoint and deployment configuration.

Anthropic

Connect Claude models directly through Anthropic.

Google Gemini

Use Gemini models through Google's hosted API.

Groq

Connect Groq-hosted models for low-latency inference.

Mistral AI

Use Mistral-hosted models through the Mistral API.

Cohere

Connect Cohere command models through a hosted API key.

OpenRouter

Route model access through OpenRouter's multi-provider catalog.

Together AI

Use Together-hosted open model APIs inside RealTimeX.

Perplexity AI

Connect hosted Perplexity models where that provider fits your workflow.

AWS Bedrock

Use foundation models through your AWS Bedrock account.

Hugging Face

Point RealTimeX at a Hugging Face Inference Endpoint.

Compatibility and gateway guides

Generic OpenAI

Connect any OpenAI-compatible service through a custom endpoint and model configuration.

Providers you may also see in the app

The current in-app selector can include additional built-in providers and runtimes that do not all have dedicated setup guides in this section yet, including:

DeepSeek
Fireworks AI
Novita AI
Moonshot AI
xAI
PPIO
NVIDIA NIM
LiteLLM
APIpie
llama.cpp
Oobabooga Web UI
Dell Pro AI Studio

Those still follow the same core pattern:

choose the provider
enter the required API key or endpoint
select or enter a model
save and test

Related guides

Cohere RealTimeX Local