RealTimeX Local
RealTimeX Local is the managed built-in local LLM provider in RealTimeX.
Use it when you want RealTimeX to run GGUF chat models on your own machine without connecting to a separate Ollama, LM Studio, or other external server.
The deeper operational guides now live at RealTimeX Local and Local Models. This page is the setup handoff from Settings > AI Providers > LLM.
Where it appears
- provider selection: Settings > AI Providers > LLM
- runtime and built-in model management: Settings > AI Providers > RealTimeX Local
- broader cross-provider inventory: Settings > AI Providers > Local Model Management
What this provider controls
Choosing RealTimeX Local in the LLM selector tells RealTimeX to use the managed built-in local runtime as the system LLM provider.
That selection depends on a second setup layer:
- the runtime must be installed
- a hardware backend must be available
- at least one local model must be downloaded
- one model must be selected as the preferred built-in model
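The four prerequisites above can be read as a simple readiness checklist. The following is a minimal sketch of that logic, not RealTimeX code: `LocalSetupState`, its fields, and `missing_prerequisites` are hypothetical names introduced only for illustration, and the check that the preferred model is one of the downloaded models is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LocalSetupState:
    """Hypothetical snapshot of the four prerequisites (not a RealTimeX API)."""
    runtime_installed: bool
    available_backends: List[str]
    downloaded_models: List[str]
    preferred_model: Optional[str]


def missing_prerequisites(state: LocalSetupState) -> List[str]:
    """Return the prerequisites that are not yet satisfied, in list order."""
    missing = []
    if not state.runtime_installed:
        missing.append("the runtime must be installed")
    if not state.available_backends:
        missing.append("a hardware backend must be available")
    if not state.downloaded_models:
        missing.append("at least one local model must be downloaded")
    # Assumption: the preferred model has to be one of the downloaded models.
    if state.preferred_model not in state.downloaded_models:
        missing.append("one model must be selected as the preferred built-in model")
    return missing


if __name__ == "__main__":
    fresh_install = LocalSetupState(False, [], [], None)
    for item in missing_prerequisites(fresh_install):
        print("missing:", item)
```

An empty result means the provider selection in the LLM selector has everything it depends on; any other result points to the step that still needs attention on the RealTimeX Local page.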
Current setup flow
- Open Settings > AI Providers > LLM.
- Choose RealTimeX Local.
- Open Settings > AI Providers > RealTimeX Local.
- Download the runtime if it is missing.
- Keep Autodetect for the backend unless you have a reason to pin Metal, CUDA, Vulkan, or CPU only.
- Download a recommended GGUF model, search Hugging Face, or enter a repository or file path manually.
- Set the model as the preferred built-in model and load it.
- Return to chat and test a normal conversation.
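If you enter a file path manually, it can help to confirm beforehand that the file really is a GGUF model. GGUF files begin with the four ASCII bytes `GGUF`, so a quick header check is enough for a first sanity test. This is a minimal sketch run outside RealTimeX; `looks_like_gguf` is a hypothetical helper name, and whether RealTimeX performs a similar check itself is not stated here.

```python
from pathlib import Path

# GGUF files start with this 4-byte magic value.
GGUF_MAGIC = b"GGUF"


def looks_like_gguf(path: str) -> bool:
    """Return True if the file exists and starts with the GGUF magic bytes."""
    candidate = Path(path)
    if not candidate.is_file():
        return False
    with candidate.open("rb") as handle:
        return handle.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    import sys

    for arg in sys.argv[1:]:
        status = "GGUF header found" if looks_like_gguf(arg) else "not a GGUF file"
        print(f"{arg}: {status}")
```

The check reads only the first four bytes, so it is fast even for multi-gigabyte models. It does not validate the rest of the file or confirm that the model is a chat model.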
What the current runtime page manages
The current RealTimeX Local page is more than a simple model dropdown.
It can manage:
- runtime installation state
- backend selection
- context size
- model download and load state
- warmup readiness
- unload, refresh, delete, and repair actions
If you are looking for those controls, use the dedicated page instead of expecting everything inside the LLM selector itself.
When to use this provider
- Use RealTimeX Local when you want the simplest built-in local inference flow.
- Use it when you do not want to run a separate local model server.
- Use it when on-device privacy or offline-capable chat matters more than cloud-only convenience.
If you want to connect RealTimeX to an external local provider instead, use pages such as Ollama, LM Studio, or llama.cpp.