QUICK REFERENCE
Everything from tonight on one page. Print this. Bookmark it. Tape it to your monitor.
Installation Commands
Install Ollama:
brew install --cask ollama
Install OpenWebUI (Docker Compose):
Create docker-compose.yml:
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "3000:8080"
    volumes:
      - open-webui:/app/backend/data
    restart: always

volumes:
  open-webui:
Then run:
docker compose up -d
Access: http://localhost:3000. The first account you create becomes the admin.
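Once both are running, a quick sanity check from Terminal confirms they're reachable (11434 is Ollama's default API port, 3000 is the port mapped in the compose file above):

```shell
# Ollama API: prints a small JSON version object if the server is up
curl -s http://localhost:11434/api/version

# OpenWebUI: prints the HTTP status code; 200 means it's serving
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:3000
```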
Model Commands
Pull models (run in Terminal):
ollama pull qwen3:30b-a3b        19GB    General, code, reasoning
ollama pull qwen3-next           50GB    Heavyweight, long context
ollama pull mistral-small3.2     15GB    Text + vision, multilingual
ollama pull qwen3-vl             6GB     Vision specialist
ollama pull qwen3-vl:4b          3.3GB   Light vision
Check installed models:
ollama list
Check running models:
ollama ps
RAM Guide
8GB Mac: qwen3-vl:4b only
16GB Mac: qwen3-vl (8B), one small model at a time
32GB Mac: qwen3:30b-a3b, mistral-small3.2, qwen3-vl
64GB+ Mac: All models including qwen3-next
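Not sure which tier you're in? On macOS, hw.memsize reports installed RAM in bytes:

```shell
# Print installed RAM in GB (macOS only; hw.memsize is in bytes)
bytes=$(sysctl -n hw.memsize)
echo "$((bytes / 1024 / 1024 / 1024)) GB"
```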
Creating Custom Models
Workspace > Models > Create a Model
- Name it
- Pick a base model
- Write a system prompt
- Add starter questions (optional)
- Set parameters (optional)
- Toggle capabilities (vision, web search, etc.)
- Save
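If you prefer Terminal to the workspace UI, the same idea works with an Ollama Modelfile. A sketch (the model name my-coder and the system prompt are just examples):

```shell
# Build a custom model from a Modelfile instead of the OpenWebUI workspace
cat > Modelfile <<'EOF'
FROM qwen3:30b-a3b
SYSTEM "You are a concise coding assistant. Prefer short, working examples."
PARAMETER temperature 0.2
EOF

ollama create my-coder -f Modelfile
# Then: ollama run my-coder
```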
Parameter Cheat Sheet
Temperature: Creativity dial
Low (0.2) = precise, High (1.0) = creative
Default: 0.7
Top P: Probability filter
Low (0.3) = focused, High (0.95) = varied
Default: 0.9
Top K: Fixed candidate count
Lower = more focused, Higher = more varied
Default: 40
Repeat Penalty: Prevents loops
1.0 = off, 1.1 = mild, 1.3+ = aggressive
Default: 1.1
Context Length: How much conversation the model remembers
Higher = more memory, more RAM used
Default: varies by model
Max Tokens: Response length limit
-1 = unlimited
Default: -1
Seed: Fixed output for testing
0 = random each time
Default: 0
Suggested Settings by Task
Coding: Temperature 0.2, Top P 0.9
Conversation: Temperature 0.7, Top P 0.9
Creative Writing: Temperature 0.9, Top P 0.95
Factual Q&A: Temperature 0.3, Top P 0.85
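These settings map directly onto Ollama's API options, so you can try a preset outside the UI too. A sketch of the coding preset against Ollama's /api/generate endpoint (the prompt is arbitrary; assumes Ollama on its default port):

```shell
# Test the "coding" preset (temperature 0.2, top_p 0.9) directly against Ollama
curl -s http://localhost:11434/api/generate -d '{
  "model": "qwen3:30b-a3b",
  "prompt": "Write a function that reverses a string.",
  "stream": false,
  "options": { "temperature": 0.2, "top_p": 0.9 }
}'
```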
Dynamic Variables for System Prompts
{{CURRENT_DATE}} Today's date
{{CURRENT_TIME}} Current time
{{USER_NAME}} Logged-in user's name
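For example, a system prompt that uses all three:

```
You are a scheduling assistant for {{USER_NAME}}.
Today is {{CURRENT_DATE}} and the current time is {{CURRENT_TIME}}.
Always state dates explicitly when proposing meeting times.
```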
Useful Keyboard Shortcuts
Enter: Send message
Shift+Enter: New line without sending
Ctrl+Shift+O: New chat
Ctrl+Shift+S: Toggle sidebar
Adding Tools
Workspace > Tools > Create a Tool
- Paste Python code for the tool
- Configure settings (API keys, etc.)
- Assign to models via Tool Bindings or enable globally
Perplexity Web Search: Gives models live web search
Needs API key from perplexity.ai
Uses sonar-pro model by default
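Under the hood, the tool calls Perplexity's chat-completions API. A hedged sketch of an equivalent request from Terminal (payload shape per Perplexity's public API docs; the key comes from your perplexity.ai account):

```shell
# Roughly what the web-search tool sends (sonar-pro, as noted above)
curl -s https://api.perplexity.ai/chat/completions \
  -H "Authorization: Bearer $PERPLEXITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "sonar-pro", "messages": [{"role": "user", "content": "What happened in tech news today?"}]}'
```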
Docker Compose Management
Run these from the directory with your docker-compose.yml:
Start: docker compose up -d
Stop: docker compose down
Logs: docker compose logs -f
Status: docker compose ps
Update: docker compose pull && docker compose up -d
Reset: docker compose down -v (deletes the containers and the data volume, including all chats and settings)
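Before a reset, you can snapshot the data volume with a throwaway container. A sketch (note: Compose usually prefixes the volume with your project folder name, so check docker volume ls for the exact name; the backup filename is arbitrary):

```shell
# Back up the OpenWebUI data volume to a tarball in the current directory.
# Replace open-webui with the actual volume name from: docker volume ls
docker run --rm \
  -v open-webui:/data \
  -v "$(pwd)":/backup \
  alpine tar czf /backup/open-webui-backup.tar.gz -C /data .
```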
Troubleshooting Checklist
Nothing loads? docker compose ps (is container running?)
No models? ollama list (did you pull any?)
Can't connect? Check Ollama URL in Admin > Connections
Slow responses? Activity Monitor > check memory pressure
Gibberish output? Lower temperature, check parameters
Forgetting context? Start fresh chat or increase num_ctx
Links
OpenWebUI: docs.openwebui.com
Ollama: ollama.com
Ollama Models: ollama.com/library
Tutorial: techalicious.academy