QUICK REFERENCE - CHEAT SHEET
Everything you need on one page. Print this. Bookmark this.
MODEL
RPMax 22B:
ollama pull hf.co/bartowski/Mistral-Small-22B-ArliAI-RPMax-v1.1-GGUF:Q6_K_L
RPMax 12B (lighter):
ollama pull hf.co/bartowski/Mistral-Nemo-12B-ArliAI-RPMax-v1.1-GGUF:Q6_K
PARAMETERS
Temperature: 1.0
Top K: 40
Top P: 0.95
Min P: 0.02
Repeat Penalty: 1.0 (DISABLED!)
Max Tokens: 2048
Context Window: 16384
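If you talk to Ollama over its REST API instead of a Modelfile, these same settings go in the request's "options" object. A minimal sketch of building that request body (the helper name `build_chat_payload` is illustrative, not part of Ollama; option names follow Ollama's API):

```python
# Sketch: build a request body for Ollama's /api/chat endpoint
# using the recommended RPMax sampling parameters.
# (The helper itself is illustrative, not part of Ollama.)

def build_chat_payload(model: str, messages: list) -> dict:
    return {
        "model": model,
        "messages": messages,
        "stream": False,
        "options": {
            "temperature": 1.0,
            "top_k": 40,
            "top_p": 0.95,
            "min_p": 0.02,
            "repeat_penalty": 1.0,   # 1.0 = disabled
            "num_predict": 2048,     # max tokens to generate
            "num_ctx": 16384,        # context window
            "stop": ["User:", "\nUser:"],
        },
    }

payload = build_chat_payload("mychar", [{"role": "user", "content": "Hi"}])
```

POST this as JSON to `http://localhost:11434/api/chat` on a running Ollama server.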
MODELFILE TEMPLATE
FROM hf.co/bartowski/Mistral-Small-22B-ArliAI-RPMax-v1.1-GGUF:Q6_K_L
PARAMETER temperature 1.0
PARAMETER top_k 40
PARAMETER top_p 0.95
PARAMETER repeat_penalty 1.0
PARAMETER num_ctx 16384
PARAMETER stop "User:"
PARAMETER stop "\nUser:"
SYSTEM """
[Your character card here]
"""
Create: ollama create mychar -f mychar.modelfile
Run:    ollama run mychar
CHARACTER CARD FORMAT
[Name: CharacterName]
[Personality= trait1, trait2, trait3, trait4, trait5]
[Speech= style1, style2, style3]
Brief background sentence if needed.
<START>
{{user}}: Example user message
{{char}}: Example character response showing personality
<END>
<START>
{{user}}: Different scenario
{{char}}: Character handling it in their voice
<END>
<START>
{{user}}: Third scenario
{{char}}: Third example response
<END>
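The card skeleton above is regular enough to generate programmatically, which helps when you maintain several characters. A sketch (the function name and signature are illustrative):

```python
# Sketch: assemble a character card in the bracket format above.
# Function name and signature are illustrative.

def build_card(name, personality, speech, background="", examples=()):
    lines = [
        f"[Name: {name}]",
        f"[Personality= {', '.join(personality)}]",
        f"[Speech= {', '.join(speech)}]",
    ]
    if background:
        lines.append(background)
    for user_msg, char_msg in examples:
        lines += [
            "<START>",
            f"{{{{user}}}}: {user_msg}",
            f"{{{{char}}}}: {char_msg}",
            "<END>",
        ]
    return "\n".join(lines)

card = build_card(
    "Mira",
    ["curious", "dry-witted", "stubborn"],
    ["terse", "sardonic"],
    examples=[("Hello?", '*glances up* "You again."')],
)
```

Paste the resulting text into the SYSTEM block of your Modelfile.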
SCENE PROMPT STRUCTURE
[Character card at top]
Scene context (brief).
User: What the user said
CharacterName:
STOP SEQUENCES
Essential: "User:", "\nUser:"
Multi-char: Add all character names with colons
Chat template: "</s>", "[INST]" (if using raw API)
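If your frontend doesn't apply stop sequences for you (e.g., when streaming from a raw API), the same cut can be made client-side. A minimal sketch:

```python
# Sketch: truncate generated text at the first stop sequence,
# mimicking what stop-sequence handling does server-side.

def apply_stops(text: str, stops: list) -> str:
    cut = len(text)
    for s in stops:
        i = text.find(s)
        if i != -1:
            cut = min(cut, i)
    return text[:cut]

out = apply_stops("Sure thing.\nUser: what next?", ["User:", "\nUser:"])
# out == "Sure thing."
```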
CHARACTER REFRESH
Place at depth 4 (4 messages from end):
[Remember: CharacterName is trait, trait, trait]
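In a message list, "depth 4" means the note sits four messages from the end. A sketch of inserting it (the helper name is illustrative; history is a list of role/content dicts):

```python
# Sketch: insert a character-refresh note N messages from the end
# of a chat history (list of {"role", "content"} dicts).

def insert_refresh(messages, note, depth=4):
    pos = max(0, len(messages) - depth)
    refresh = {"role": "system", "content": note}
    return messages[:pos] + [refresh] + messages[pos:]

history = [{"role": "user", "content": f"msg {i}"} for i in range(6)]
history = insert_refresh(
    history, "[Remember: Mira is curious, dry-witted, stubborn]"
)
```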
WHEN THINGS GO WRONG
Generic responses? Raise temp, improve examples
Too long? Shorter first message, lower max tokens
Too short? Longer examples, raise temp
Breaking character? Check model, add refresh, regenerate
Repetitive? Disable repeat penalty (set to 1.0)
Forgetting things? Prune context, save facts externally
Yes-man behavior? Add disagreement examples
POSITIVE FRAMING
BAD: "NEVER break character"
GOOD: "Stay fully in character"
BAD: "Don't be verbose"
GOOD: "Keep responses brief"
BAD: "Never give advice"
GOOD: "Ask questions instead of advising"
TOKEN ESTIMATES
1 token ≈ 4 characters or 0.75 words
100 tokens ≈ 75 words
1000 tokens ≈ 750 words
8K context ≈ 6,000 words
16K context ≈ 12,000 words
32K context ≈ 24,000 words
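These rules of thumb are easy to script when budgeting context. A sketch using the estimates above:

```python
# Sketch: rough token/word estimates from the rules of thumb above.

def est_tokens_from_chars(text: str) -> int:
    return round(len(text) / 4)      # 1 token ~ 4 characters

def est_words_from_tokens(tokens: int) -> int:
    return round(tokens * 0.75)      # 1 token ~ 0.75 words

# 16K context ~ 12,000 words
assert est_words_from_tokens(16_000) == 12_000
```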
TEMPERATURE GUIDE
0.5 = Focused, consistent, less creative
0.7 = Balanced (common default)
1.0 = Natural variation (recommended)
1.2 = More creative, occasional oddness
1.5+ = Wild, unpredictable
OOC COMMANDS
[Shorter responses please]
[Stay more in character]
[Remember detail X]
[Let's shift topics]
ACTION FORMAT
*asterisks for actions* "quotes for dialogue"
Example:
*tilts head* "That's interesting. Tell me more."
INTERFACES
OpenWebUI: docker run -d -p 3000:8080 ...
Ollama CLI: ollama run modelname
Bolt AI: Mac App Store
COMMANDS
ollama serve Start Ollama server
ollama list List installed models
ollama pull <model> Download a model
ollama run <model> Interactive chat
ollama create &lt;name&gt; -f &lt;Modelfile&gt; Create from Modelfile
ollama rm <model> Delete a model
ollama ps List running models
ollama stop <model> Stop a model
MAINTENANCE SCHEDULE
Every 30-50 messages:
- Prune irrelevant exchanges
- Update external fact list
- Add character refresh if drifting
Every few sessions:
- Review character card
- Improve weak examples
- Check for repetitive patterns
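Pruning can be as simple as keeping the system prompt plus the most recent exchanges. A sketch (helper name and keep-count are illustrative):

```python
# Sketch: keep system messages plus the last N chat messages,
# dropping older exchanges to free context space.

def prune(messages, keep_last=30):
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

history = [{"role": "system", "content": "card"}] + [
    {"role": "user", "content": f"msg {i}"} for i in range(50)
]
history = prune(history, keep_last=30)
```

Save any facts from the dropped messages to your external list first.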
DIAGNOSTIC CHECKLIST
[ ] Right model? (RPMax, not Instruct)
[ ] Repeat penalty off? (1.0)
[ ] Temperature good? (1.0)
[ ] Stop sequences set?
[ ] Context not overflowed?
[ ] Character card clear?
[ ] First message sets tone?
LINKS
Ollama: https://ollama.com
OpenWebUI: https://openwebui.com
RPMax Model: https://huggingface.co/ArliAI
Quantizations: https://huggingface.co/bartowski
Techalicious: https://techalicious.forum
REMEMBER
- Use roleplay models, not instruct models
- Disable repetition penalty for RPMax
- Show, don't tell - examples > rules
- Positive framing > negative rules
- First message sets the template
- Context management prevents drift
- Regenerate/edit bad responses immediately