Ollama Setup Guide
Running WriteWP with Ollama
A complete, practical guide to setting up Ollama on your own server so every article WriteWP processes stays completely private. Zero data leaves your infrastructure.
Last updated: February 27, 2026
Why Ollama
Every AI provider WriteWP supports sends your article content to an external API. OpenAI, Google Gemini, and MiniMax all receive your source articles on their servers. For most users this is fine. But if you handle sensitive content, client content, or simply prefer not to send your data to third-party servers, Ollama is the answer.
When WriteWP uses Ollama, the article content is processed entirely on your own hardware. The AI model runs locally. Nothing is sent externally. You get the full WriteWP workflow with complete data privacy.
Privacy guarantee: With Ollama configured and WriteWP set to use it, your article content never leaves your server. This is verified by the architecture — WriteWP sends the request to localhost, and Ollama processes it locally. No external network calls are made.
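If you want to verify this yourself, one quick check on a default install is that Ollama is listening only on the loopback interface:

```shell
# List listening TCP sockets on Ollama's default port 11434.
# With the default configuration the bind address should be
# 127.0.0.1 (loopback only), not 0.0.0.0 (all interfaces).
ss -tlnp | grep 11434
```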
Hardware Requirements
The model you choose determines the hardware you need. Ollama runs on Linux, macOS, and Windows. For a production WriteWP setup, a Linux VPS is the most practical choice.
| Model | RAM Required | Best For | Quality |
|---|---|---|---|
| Llama 3.1 8B | 8GB VRAM or RAM | Fast drafting, high volume | Good |
| Llama 3.1 70B | 64GB RAM | High quality, complex content | Excellent |
| Mistral 7B | 8GB VRAM or RAM | Balanced speed and quality | Very Good |
| Gemma 2 27B | 32GB RAM | Instruction following, SEO writing | Very Good |
| Qwen 2.5 32B | 32GB RAM | Multilingual, long-form | Very Good |
For most WordPress publishers, Llama 3.1 8B or Mistral 7B provides the best balance of speed and quality for article rewriting. If you are processing technical or research-heavy content, the larger models are worth the hardware cost.
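Before pulling a model, it is worth confirming the server actually has the headroom the table calls for, especially if WordPress and PHP share the machine:

```shell
# Total and available memory in human-readable units;
# leave headroom for WordPress/PHP if they share this server
free -h

# CPU core count matters for generation speed on CPU-only servers
nproc
```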
Install Ollama
Ubuntu / Debian (Recommended)
SSH into your server and run:
curl -fsSL https://ollama.com/install.sh | sh
Verify the Installation
ollama --version
# Should output: ollama version 0.5.x or higher
Start Ollama as a Background Service
For production use, run Ollama as a systemd service so it survives server restarts:
sudo systemctl enable ollama
sudo systemctl start ollama
sudo systemctl status ollama
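Two optional follow-up checks: confirm the service will start at boot, and watch its logs while it comes up:

```shell
# Should print "enabled" if the service starts on boot
systemctl is-enabled ollama

# Follow the service logs live (Ctrl+C to stop)
journalctl -u ollama -f
```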
Run Without systemd (Alternative)
If you do not have systemd access (some shared VPS environments):
OLLAMA_HOST=0.0.0.0 ollama serve &
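Be aware that a plain `&` background job may be terminated when your SSH session closes. Wrapping the command in nohup keeps it alive after logout; this is a sketch, with log output going to a file of your choosing:

```shell
# nohup detaches the process from the terminal so it survives logout;
# stdout and stderr are captured in ollama.log for later inspection
OLLAMA_HOST=0.0.0.0 nohup ollama serve > ollama.log 2>&1 &
```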
Pull a Model
Once Ollama is installed, download a model. Each model is several gigabytes. This takes 5-20 minutes depending on your internet connection.
# For Llama 3.1 8B (recommended starting point)
ollama pull llama3.1:8b
# For Mistral 7B (excellent for article writing)
ollama pull mistral:7b
# For Gemma 2 27B (better instruction following)
ollama pull gemma2:27b
Verify the model is available:
ollama list
# Shows all downloaded models with their sizes
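As a quick smoke test before involving WriteWP, you can run a one-off prompt directly against the model you pulled:

```shell
# Loads the model and generates a short completion; the first run
# is slower because the model must be loaded into memory
ollama run llama3.1:8b "Reply with the single word: ready"
```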
Configure Network Access
By default, Ollama only listens on localhost (127.0.0.1). For WriteWP to reach it from your WordPress server, you need to bind it to a network-accessible address.
Option 1: Same Server (WordPress on same machine as Ollama)
If WordPress and Ollama run on the same server, no network changes are needed. WriteWP connects to localhost:11434.
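To confirm Ollama is actually up on that port, a request to the tags endpoint should return a JSON list of your installed models:

```shell
# GET /api/tags lists installed models; any JSON response
# confirms Ollama is reachable on the default port
curl http://localhost:11434/api/tags
```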
Option 2: Different Servers (Ollama on a separate VPS)
On the Ollama server, edit the systemd service to bind to all interfaces:
sudo systemctl edit ollama
# Add the following lines:
# [Service]
# Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama
Security note: Binding Ollama to 0.0.0.0 exposes it on all network interfaces. If your Ollama server is on the public internet, you must use a firewall (UFW or iptables) to restrict access to only your WordPress server IP. Ollama has no built-in authentication.
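A minimal UFW setup along those lines might look like this (YOUR_WORDPRESS_SERVER_IP is a placeholder for your real address; allow SSH first so you do not lock yourself out):

```shell
# Keep SSH reachable before enabling the firewall
sudo ufw allow OpenSSH

# Permit only the WordPress server to reach Ollama's port
sudo ufw allow from YOUR_WORDPRESS_SERVER_IP to any port 11434 proto tcp

# Enable the firewall and review the resulting rules
sudo ufw enable
sudo ufw status
```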
Test that Ollama is reachable from your WordPress server:
# From your WordPress server:
curl http://YOUR_OLLAMA_SERVER_IP:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Say hello in one word.",
  "stream": false
}'
Connect to WriteWP
1. Open WriteWP Settings. In your WordPress admin, go to WriteWP > Settings > AI Providers tab.
2. Select Ollama. Click the Ollama logo or radio button to set it as your active provider.
3. Enter Connection Details. Fill in the Ollama server URL: either http://localhost:11434 (same server) or http://YOUR_OLLAMA_SERVER_IP:11434 (separate server).
4. Select Your Model. Enter the exact model name you downloaded, for example llama3.1:8b or mistral:7b. The model must match exactly, including the tag.
5. Test Connection. Click Test Connection. WriteWP sends a small prompt to Ollama and reports the result. If it succeeds, you are ready to import and rewrite articles.
Tip: Start with Llama 3.1 8B for the best balance of speed and quality. If quality holds up but you need faster responses, try Mistral 7B next. Only move to the larger 70B models if quality is noticeably lacking on technical content.
Troubleshooting
Connection Refused
If WriteWP reports "Connection refused" when testing Ollama, the most common causes are:
- Ollama is not running. Check with: sudo systemctl status ollama
- A firewall is blocking port 11434. Allow access from your WordPress server only: sudo ufw allow from YOUR_WORDPRESS_SERVER_IP to any port 11434 proto tcp (do not open the port to the whole internet; Ollama has no authentication)
- OLLAMA_HOST is not set to 0.0.0.0. Check with: systemctl show ollama | grep Environment
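A quick triage sequence, run on the Ollama server itself, covers all three causes in order:

```shell
# 1. Is the service running? Should print "active"
systemctl is-active ollama

# 2. What address is it bound to? 127.0.0.1 means localhost-only
ss -tlnp | grep 11434

# 3. Does the API respond locally? Returns a small JSON version object
curl -s http://localhost:11434/api/version
```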
Model Not Found
If Ollama responds but says the model was not found, the model name in WriteWP settings does not match exactly. Run ollama list on the Ollama server and copy the exact model name, including the tag.
Slow Response Times
Ollama response speed depends on available RAM and the model size. If responses take more than 60 seconds, you likely do not have enough RAM for that model and the system is swapping to disk. Either switch to a smaller model or upgrade your server RAM.
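To see what is actually loaded and where it is running, check the running-models view:

```shell
# Shows loaded models, their memory footprint, and whether they run
# on CPU or GPU; "100% CPU" on a large model explains slow output
ollama ps
```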
Out of Memory (OOM)
If Ollama crashes with an OOM error, the model is too large for your available RAM. Options: switch to a smaller model (8B instead of 70B), add more RAM to your server, or add swap space.
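If adding swap space is the route you take, a sketch for an 8 GB swap file on Ubuntu looks like this (adjust the size to your model; swap is far slower than RAM, so treat it as a stopgap rather than a fix):

```shell
# Allocate, secure, format, and enable an 8 GB swap file
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Persist the swap file across reboots
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
```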
