Ollama Setup Guide

Running WriteWP with Ollama

A complete, practical guide to setting up Ollama on your own server so every article WriteWP processes stays completely private. Zero data leaves your infrastructure.

Last updated: February 27, 2026

Why Ollama

Every AI provider WriteWP supports sends your article content to an external API. OpenAI, Google Gemini, and MiniMax all receive your source articles on their servers. For most users this is fine. But if you handle sensitive content, client content, or simply prefer not to send your data to third-party servers, Ollama is the answer.

When WriteWP uses Ollama, the article content is processed entirely on your own hardware. The AI model runs locally. Nothing is sent externally. You get the full WriteWP workflow with complete data privacy.

Privacy guarantee: With Ollama configured and WriteWP set to use it, your article content never leaves your server. This is verified by the architecture — WriteWP sends the request to localhost, and Ollama processes it locally. No external network calls are made.

Hardware Requirements

The model you choose determines the hardware you need. Ollama runs on Linux, macOS, and Windows. For a production WriteWP setup, a Linux VPS is the most practical choice.

Model           RAM Required        Best For                              Quality
Llama 3.1 8B    8 GB VRAM or RAM    Fast drafting, high volume            Good
Llama 3.1 70B   64 GB RAM           High quality, complex content         Excellent
Mistral 7B      8 GB VRAM or RAM    Balanced speed and quality            Very Good
Gemma 2 27B     32 GB RAM           Instruction following, SEO writing    Very Good
Qwen 2.5 32B    32 GB RAM           Multilingual, long-form               Very Good

For most WordPress publishers, Llama 3.1 8B or Mistral 7B provides the best balance of speed and quality for article rewriting. If you are processing technical or research-heavy content, the larger models are worth the hardware cost.
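Before committing to a model, it helps to confirm how much memory the server actually has. A minimal sketch for Linux, reading /proc/meminfo:

```shell
# Report total RAM in GiB by parsing /proc/meminfo (Linux only).
# MemTotal is reported in KiB; 1048576 KiB = 1 GiB.
mem_gib=$(awk '/^MemTotal:/ { printf "%d", $2 / 1048576 }' /proc/meminfo)
echo "Total RAM: ${mem_gib} GiB"
```

Compare the result against the table above. Note that the figure excludes GPU VRAM; if you have a dedicated GPU, check it separately (for example with nvidia-smi).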

Install Ollama

Ubuntu / Debian (Recommended)

SSH into your server and run:

curl -fsSL https://ollama.com/install.sh | sh

Verify the Installation

ollama --version # Should output: ollama version 0.5.x or higher

Start Ollama as a Background Service

For production use, run Ollama as a systemd service so it survives server restarts:

sudo systemctl enable ollama
sudo systemctl start ollama
sudo systemctl status ollama

Run Without systemd (Alternative)

If you do not have systemd access (some shared VPS environments):

OLLAMA_HOST=0.0.0.0 ollama serve &

Pull a Model

Once Ollama is installed, download a model. Each model is several gigabytes. This takes 5-20 minutes depending on your internet connection.

# For Llama 3.1 8B (recommended starting point)
ollama pull llama3.1:8b

# For Mistral 7B (excellent for article writing)
ollama pull mistral:7b

# For Gemma 2 27B (better instruction following)
ollama pull gemma2:27b

Verify the model is available:

ollama list # Shows all downloaded models with their sizes
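If you script your server setup, you can check whether a model is already downloaded before pulling it again. A sketch, assuming the tabular output of ollama list (model name in the first column); the has_model helper and the sample listing below are illustrative:

```shell
# has_model: exits 0 if the listing's first column contains the exact model:tag.
has_model() {
  printf '%s\n' "$1" | awk -v m="$2" '$1 == m { found = 1 } END { exit !found }'
}

# Sample listing shaped like `ollama list` output (illustrative only);
# in a real script you would capture it with: listing=$(ollama list)
listing="NAME           ID      SIZE
llama3.1:8b    abc123  4.9 GB
mistral:7b     def456  4.1 GB"

has_model "$listing" "llama3.1:8b" && echo "llama3.1:8b already pulled"
has_model "$listing" "gemma2:27b"  || echo "gemma2:27b missing - run: ollama pull gemma2:27b"
```

The exact-match comparison matters: llama3.1 without the :8b tag is treated as a different model, which mirrors how WriteWP matches model names later.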

Configure Network Access

By default, Ollama only listens on localhost (127.0.0.1). For WriteWP to reach it from your WordPress server, you need to bind it to a network-accessible address.

Option 1: Same Server (WordPress on same machine as Ollama)

If WordPress and Ollama run on the same server, no network changes are needed. WriteWP connects to localhost:11434.

Option 2: Different Servers (Ollama on a separate VPS)

On the Ollama server, edit the systemd service to bind to all interfaces:

sudo systemctl edit ollama
# In the editor that opens, add the following lines:
# [Service]
# Environment="OLLAMA_HOST=0.0.0.0"
sudo systemctl restart ollama

Security note: Binding Ollama to 0.0.0.0 exposes it on all network interfaces. If your Ollama server is on the public internet, you must use a firewall (UFW or iptables) to restrict access to only your WordPress server IP. Ollama has no built-in authentication.
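One way to apply that restriction is with UFW. A sketch of the firewall rules, assuming UFW is installed; 203.0.113.10 is a placeholder that you must replace with your WordPress server's IP:

```shell
# Allow only the WordPress server (placeholder IP) to reach Ollama's port.
sudo ufw allow from 203.0.113.10 to any port 11434 proto tcp
# Block the port for everyone else.
sudo ufw deny 11434/tcp
# Verify the rules are in place.
sudo ufw status numbered
```

Rule order matters: the specific allow rule must come before the general deny so your WordPress server's traffic is matched first.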

Test that Ollama is reachable from your WordPress server:

# From your WordPress server:
curl http://YOUR_OLLAMA_SERVER_IP:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "prompt": "Say hello in one word.",
  "stream": false
}'
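With "stream": false, the /api/generate endpoint returns a single JSON object whose "response" field holds the generated text. A sketch of extracting that field, using a canned sample reply so it runs without a live server:

```shell
# Sample payload shaped like Ollama's /api/generate reply; a live call would be:
#   response=$(curl -s http://localhost:11434/api/generate -d '{...}')
response='{"model":"llama3.1:8b","response":"Hello","done":true}'

# Pull out the generated text with python3 (jq works just as well:
#   printf '%s' "$response" | jq -r .response)
text=$(printf '%s' "$response" | python3 -c 'import sys, json; print(json.load(sys.stdin)["response"])')
echo "$text"    # -> Hello
```

If the test call returns a JSON object like this, Ollama is reachable and the model is loaded; an empty reply or connection error points to the troubleshooting section below.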

Connect to WriteWP

  1. Open WriteWP Settings

    In your WordPress admin, go to WriteWP > Settings > AI Providers tab.

  2. Select Ollama

    Click the Ollama logo or radio button to set it as your active provider.

  3. Enter Connection Details

    Fill in the Ollama server URL — either http://localhost:11434 (same server) or http://YOUR_OLLAMA_SERVER_IP:11434.

  4. Select Your Model

    Enter the exact model name you downloaded — for example, llama3.1:8b or mistral:7b. The model must match exactly, including the tag.

  5. Test Connection

    Click Test Connection. WriteWP sends a small prompt to Ollama and reports the result. If it succeeds, you are ready to import and rewrite articles.

Tip: Start with Llama 3.1 8B for the best balance of speed and quality. If quality is good but responses are slower than you would like, try Mistral 7B next. Only move to the larger 70B models if quality is noticeably lacking on technical content.

Troubleshooting

Connection Refused

If WriteWP reports "Connection refused" when testing Ollama, the most common causes are:

  • Ollama is not running — run sudo systemctl status ollama to check
  • Firewall blocking port 11434 — run sudo ufw allow 11434/tcp
  • OLLAMA_HOST not set to 0.0.0.0 — check with systemctl show ollama | grep Environment

Model Not Found

If Ollama responds but says model not found, the model name in WriteWP settings does not match exactly. Run ollama list on the Ollama server and copy the exact model name including the tag.

Slow Response Times

Ollama response speed depends on available RAM and the model size. If responses take more than 60 seconds, you likely do not have enough RAM for that model and the system is swapping to disk. Either switch to a smaller model or upgrade your server RAM.
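To confirm that swapping is the culprit, check swap usage while Ollama is generating. A Linux-only sketch reading /proc/meminfo:

```shell
# Swap usage in KiB (SwapTotal minus SwapFree). A large and growing number
# while Ollama generates usually means the model does not fit in RAM.
swap_used_kib=$(awk '/^SwapTotal:/ {t=$2} /^SwapFree:/ {f=$2} END {print t - f}' /proc/meminfo)
echo "Swap in use: ${swap_used_kib} KiB"
```

Run it once before and once during a generation request; if the number jumps by gigabytes, the model is spilling to disk.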

Out of Memory (OOM)

If Ollama crashes with an OOM error, the model is too large for your available RAM. Options: switch to a smaller model (8B instead of 70B), add more RAM to your server, or add swap space.