If you want one interface for both local models and hosted frontier models, Open WebUI is one of the cleanest ways to do it.
The mistake is assuming you should wire everything through one generic connection path.
You usually should not.
The better setup is:
- Ollama for local models on your own machine or server
- OpenAI for hosted models that you do not want to run locally
- Open WebUI as the shared chat layer sitting above both
That gives you one workspace, one chat history surface, and two very different inference paths.
This guide is the practical version: how to install the stack, how to connect both providers, what breaks most often, and how to decide which model path to use by default.
TL;DR
Use this architecture:
- Run Ollama locally or on a reachable internal server.
- Run Open WebUI in Docker with persistent storage.
- Add your Ollama connection in Admin Settings > Connections > Ollama.
- Add your OpenAI connection in Admin Settings > Connections > OpenAI.
- Keep both enabled so users can switch between local and hosted models from one UI.
As of April 11, 2026:
- Open WebUI docs say Docker is the officially supported and recommended path for most users
- Open WebUI’s provider docs say the platform is protocol-centric, with OpenAI support mainly through the Chat Completions protocol and experimental support for Open Responses
- Ollama’s API docs say it supports parts of the OpenAI-compatible surface, including /v1/chat/completions and /v1/responses, but its /v1/responses support is non-stateful
That combination is why the clean default is:
- use Open WebUI’s native Ollama connection for Ollama
- use Open WebUI’s OpenAI connection for OpenAI
- only force everything through one OpenAI-compatible endpoint when you have a specific platform reason
What this setup is actually good at
This stack is strong when you want to mix:
- cheap or private local inference
- stronger hosted reasoning models
- one UI for individual or small-team use
It is especially useful for:
- researchers who want local document chat plus occasional hosted model quality
- developers who want one workspace for local debugging and cloud-heavy reasoning
- small internal teams that want a private UI first, not a custom app project
If your bigger decision is still “which self-hosted AI workspace should I choose?”, read Open WebUI vs LibreChat vs AnythingLLM (2026): Which Self-Hosted AI Workspace Should You Use?.
The architecture that usually works best
The simplest mental model is:
Browser
-> Open WebUI
-> Ollama for local models
-> OpenAI for hosted models
Open WebUI’s provider docs explicitly frame the product around connection protocols. In practice, that means:
- the Ollama connection is the best fit for an actual Ollama server
- the OpenAI connection is the best fit for https://api.openai.com/v1
- the OpenAI-compatible path is for other services that expose that protocol
That sounds obvious, but it matters.
A lot of users overcomplicate the stack by routing Ollama through the OpenAI connection first, even when the dedicated Ollama integration is already available.
Unless you need protocol unification for a special reason, keep the connections separate.
Step 1: Install Ollama first
Open WebUI is just the interface. You still need a model backend before the local side works.
The easiest official paths are:
- macOS: install the Ollama app and let it expose the CLI and local server
- Windows: install the native Windows app; the API is served on http://localhost:11434
- Linux: use the official install script
On Linux, the docs currently show:
curl -fsSL https://ollama.com/install.sh | sh
Then start or verify the service and pull at least one model. A small default like llama3.2 is a practical first test:
ollama pull llama3.2
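Before moving on, it is worth confirming the server actually answers. A minimal sketch, assuming Ollama's default port 11434: the /api/tags endpoint returns the installed model list as JSON when the server is up.

```shell
# Quick reachability check before touching Open WebUI.
# /api/tags lists installed models as JSON on Ollama's default port.
if curl -sf http://localhost:11434/api/tags > /dev/null; then
  echo "ollama: reachable"
else
  echo "ollama: not reachable on :11434"
fi
```

If this fails on the machine where Ollama is installed, fix that first; no amount of Open WebUI configuration will help.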
Why do this first?
Because Open WebUI can only auto-detect Ollama models if Ollama is already reachable.
If you are still deciding whether local models are worth the trouble, this comparison helps: Ollama vs LM Studio (2026): Which Should You Use to Run Local LLMs?.
Step 2: Run Open WebUI in Docker
Open WebUI’s docs say Docker is the recommended path for most users, and that is the right default here too.
Use a persistent volume from the start.
docker run -d \
-p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Then open:
http://localhost:3000
Why include --add-host=host.docker.internal:host-gateway?
Because Open WebUI’s environment reference uses host.docker.internal:11434 as the default Ollama target in the normal Docker case, and Linux users often hit avoidable networking issues if they skip the host mapping.
If you want the all-in-one route instead, Open WebUI also publishes an :ollama image that bundles Open WebUI and Ollama together in one container.
That is good for quick solo experiments.
It is usually not my default recommendation if you already know you want both Ollama and OpenAI in the same long-lived setup, because separate services are easier to reason about and easier to move later.
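If you do go with separate services, a single declarative file keeps them easy to move later. This is a sketch, not from the official docs: the service names, the shared volumes, and the port choices are my own, but the OLLAMA_BASE_URL variable and image names are the documented ones.

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    restart: always
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      # Point Open WebUI at the ollama service by its compose DNS name
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    ports:
      - "3000:8080"
    depends_on:
      - ollama
    restart: always

volumes:
  ollama:
  open-webui:
```

Because both containers share a compose network, Open WebUI reaches Ollama by service name and you avoid the host-gateway mapping entirely.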
Step 3: Confirm how Open WebUI expects to find Ollama
Open WebUI’s environment docs matter here.
As of April 11, 2026, the docs say:
- ENABLE_OLLAMA_API defaults to True
- OLLAMA_BASE_URL defaults to http://localhost:11434 when USE_OLLAMA_DOCKER=True
- otherwise OLLAMA_BASE_URL defaults to http://host.docker.internal:11434 in the normal Docker case
That leads to a practical rule:
- if Ollama runs on the host machine, let Open WebUI reach it through host.docker.internal
- if Ollama runs on a separate server, point Open WebUI at that server explicitly
- if you use the bundled :ollama image, the local default is already aligned to localhost:11434
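That rule can be sketched as a tiny helper; the function name is mine, but the three URLs mirror the documented defaults.

```shell
# Choose the OLLAMA_BASE_URL value based on where Ollama actually runs.
# Pass "bundled", "host", or the hostname of a separate Ollama server.
ollama_url() {
  case "$1" in
    bundled) echo "http://localhost:11434" ;;             # :ollama all-in-one image
    host)    echo "http://host.docker.internal:11434" ;;  # Ollama on the Docker host
    *)       echo "http://$1:11434" ;;                    # Ollama on a named server
  esac
}

ollama_url host    # -> http://host.docker.internal:11434
```

Whatever the answer is, that value is what belongs in the connection's base URL field (or in -e OLLAMA_BASE_URL=... on the docker run line).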
If the connection does not work automatically:
- Open Admin Settings > Connections > Ollama > Manage
- Check the base URL
- Make sure the Ollama server really answers on port 11434
The most common failure is not “Open WebUI is broken.”
It is “the container cannot see the Ollama host.”
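A quick way to separate those two failure modes is to run the same check twice: once from the host, once from inside the container. This assumes the container name from the docker run command above; adjust if you named it differently.

```shell
# If the first check succeeds and the second fails, the problem is
# Docker networking (missing host-gateway mapping), not Ollama itself.
curl -sf http://localhost:11434/api/version || echo "host check failed"
docker exec open-webui \
  curl -sf http://host.docker.internal:11434/api/version \
  || echo "container-to-host check failed"
```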
Step 4: Add the Ollama connection in Open WebUI
Once Open WebUI is running:
- Go to Admin Settings
- Open Connections > Ollama > Manage
- Verify the URL
- Confirm your local models appear
Open WebUI’s Ollama guide says the app will try to connect automatically, and if it does connect cleanly, you can manage models from the Ollama connection screen.
There is also a convenience shortcut worth remembering:
If you type a model name into the model selector and it is not installed yet, Open WebUI can prompt you to download it from Ollama directly.
That makes the first-run flow much faster than bouncing between terminals and settings pages.
Step 5: Add the OpenAI connection
Now wire in the hosted side.
Open WebUI’s OpenAI guide says the setup path is:
- Go to Admin Settings
- Open Connections > OpenAI > Manage
- Click Add New Connection
- Use:
  - URL: https://api.openai.com/v1
  - API key: your OpenAI key
The docs also call out an important design choice:
Open WebUI is protocol-centric, and its OpenAI support is mainly through the OpenAI Chat Completions protocol, with experimental support for Open Responses.
That matters for expectations.
If your goal is “use OpenAI models in Open WebUI,” that path is fine.
If your goal is “I need the newest stateful Responses-only behavior to be the center of my app architecture,” you should treat Open WebUI as a UI layer with evolving support, not as a full substitute for building directly on OpenAI’s newest primitives.
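One debugging loop worth skipping: verify the key itself works before pasting it into Open WebUI. A minimal sketch, assuming the key is exported as OPENAI_API_KEY in your shell:

```shell
# Print only the HTTP status of a models-list call.
# 200 means the key works; 401 means the key, not Open WebUI, is the problem.
curl -s -o /dev/null -w "%{http_code}\n" \
  -H "Authorization: Bearer ${OPENAI_API_KEY:-missing-key}" \
  https://api.openai.com/v1/models || true
```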
Step 6: Keep both providers visible in the same workspace
Once both connections are live, the best setup is usually not “pick one forever.”
It is to use both deliberately.
Use Ollama when:
- the data should stay local
- the task is cheap enough for a local model
- you want predictable cost
- you are testing prompts, format, or retrieval flow before sending anything to a hosted model
Use OpenAI when:
- you need stronger reasoning quality
- you need better multimodal or tool-use performance
- you want less operational work than managing bigger local models
- latency is acceptable and cloud calls are allowed
That split is the main reason to build this stack in the first place.
The important protocol caveat
This is where many tutorials get sloppy.
Open WebUI’s docs currently say its OpenAI path is mainly based on the Chat Completions protocol, while Open Responses support is experimental.
Ollama’s own API docs say:
- it supports parts of the OpenAI-compatible API
- it supports /v1/responses
- /v1/responses was added in Ollama v0.13.3
- stateful features like previous_response_id and conversation are not supported
So the safe conclusion is:
- Open WebUI can sit above both local and hosted model backends very effectively
- but you should not assume “OpenAI-compatible” means every new OpenAI behavior is portable across every provider and UI surface
That is exactly why this guide recommends the native Ollama connection plus the native OpenAI connection instead of flattening everything too early.
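To make the caveat concrete: the Chat Completions request shape really is portable, and only the base URL, model name, and auth change between providers. A sketch against Ollama's compatibility endpoint, assuming a local server with llama3.2 pulled:

```shell
# The same OpenAI-style request shape, sent to Ollama's compatibility
# endpoint. Targeting api.openai.com instead would mean swapping the
# base URL, using a hosted model name, and adding a real auth header.
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "llama3.2",
        "messages": [{"role": "user", "content": "Reply with one word."}]
      }' || echo "ollama is not running locally"
```

What is not portable is the stateful Responses behavior; that is the part you should not assume travels with the word "compatible".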
A setup pattern I would actually use
If I were deploying this for myself or a small internal team, I would do it in this order:
- Install Ollama and pull one reliable small model.
- Run Open WebUI in Docker with persistent storage.
- Verify the Ollama connection first.
- Add OpenAI second.
- Rename or pin the models people should actually use first.
- Keep the rest available, but not as the default decision surface.
Why?
Because too many visible models lead to worse usage, not better: people burn time picking instead of working.
A smaller, deliberate default set is easier for real work.
Common mistakes to avoid
1) Using single-user mode too casually
Open WebUI’s quick-start docs show a WEBUI_AUTH=False mode for no-login local usage, but they also warn that you cannot switch between single-user and multi-account mode after that change.
That is fine for a throwaway personal machine.
It is the wrong default for anything you may later expose to teammates.
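If you do want a no-login instance for a throwaway machine, keep it on its own volume so the irreversible choice stays contained. A sketch, with a distinct port and volume name of my own choosing:

```shell
# Throwaway no-login instance on its own volume and port. The docs warn
# this cannot be converted to multi-account mode later, so keep it
# separate from any workspace you might share with teammates.
docker run -d \
  -p 3001:8080 \
  -e WEBUI_AUTH=False \
  -v open-webui-solo:/app/backend/data \
  --name open-webui-solo \
  ghcr.io/open-webui/open-webui:main
```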
2) Skipping persistent storage
The Docker docs explicitly call the volume mount out as the thing that prevents data loss between restarts.
If you skip the volume, you are building a demo, not a workspace.
3) Treating networking problems like model problems
If Ollama is healthy but Open WebUI cannot see it, the issue is usually:
- wrong host name
- missing host gateway mapping
- wrong server URL
- container-to-host routing confusion
Check the connection path before you blame the model.
4) Assuming OpenAI-compatible means feature-identical
It does not.
Ollama’s compatibility layer is useful, but its own docs are clear about supported and unsupported behavior. The same general caution applies across local servers and proxies.
Should you use the bundled :ollama image or separate services?
Here is the short version:
| If your situation is… | Better default |
|---|---|
| solo test on one machine, local-first, minimal setup | open-webui:ollama |
| mixed local + hosted stack that you want to keep flexible | separate Ollama + Open WebUI |
| team use, future scaling, or clearer troubleshooting | separate Ollama + Open WebUI |
I would only choose the bundled image as the default when simplicity matters more than separation.
For the “Ollama plus OpenAI in one UI” use case, separate services are usually cleaner.
Final takeaway
The smart way to use Open WebUI with Ollama and OpenAI is not to pretend they are the same backend.
It is to let each side do the job it is best at:
- Ollama for local, private, lower-cost inference
- OpenAI for stronger hosted model capability
- Open WebUI as the shared interface above both
That gives you one workspace without forcing one model strategy.
And that is the whole point.
Sources
- Open WebUI docs home
- Open WebUI getting started
- Open WebUI quick start
- Open WebUI connect a provider
- Open WebUI starting with OpenAI
- Open WebUI starting with Ollama
- Open WebUI environment variable configuration
- Ollama documentation home
- Ollama macOS docs
- Ollama Windows docs
- Ollama Linux docs
- Ollama Docker docs
- Ollama OpenAI compatibility docs