THE RPMAX MODEL - WHY IT'S SPECIAL
Not all models are created equal. For companion chatbots, you want a model trained specifically for roleplay. That's RPMax.
What is RPMax?
RPMax (ArliAI-RPMax) is a fine-tuned version of Mistral-Small-22B created by ArliAI specifically for roleplay and character work. It's designed to:
- Maintain character voice consistently
- Understand subtext and emotional context
- Produce varied, non-repetitive responses
- Continue scenes naturally without breaking character
The model comes in several sizes:
- Mistral-Small-22B-ArliAI-RPMax (our choice)
- Mistral-Nemo-12B-ArliAI-RPMax (lighter alternative)
- Llama-3.3-70B-ArliAI-RPMax (if you have the hardware)
Why Not Just Use Regular Mistral or Llama?
Regular instruction-tuned models are trained to be helpful assistants. They WANT to break character and remind you they're an AI. It's in their training.
Roleplay models like RPMax are trained on collaborative fiction. They WANT to continue scenes and maintain character. Different training, different behavior.
Try this experiment:
Regular Mistral: "You are a grumpy pirate. How are you today?"
Response: "Arr! As a grumpy pirate, I'd say... *ahem* As an AI, I
should mention I'm not actually a pirate..."
RPMax: Same prompt
Response: "*scratches beard and squints at the horizon* How am I?
I'll tell ye how I am. The rum's gone, the crew's useless,
and some landlubber's askin' me obvious questions."
See the difference? RPMax stays in the scene.
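You can run this experiment yourself in script form against Ollama's local HTTP API (the /api/chat endpoint on the default port 11434). This is a minimal sketch; the persona wording and the commented-out model tag are just examples:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_chat_payload(model: str, system: str, user: str) -> dict:
    """Assemble a non-streaming chat request with a system prompt for the persona."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ],
        "stream": False,  # ask for one complete response instead of a token stream
    }


def ask(model: str, system: str, user: str) -> str:
    """Send the chat request and return the assistant's reply text."""
    payload = build_chat_payload(model, system, user)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]


# Example (requires a running Ollama server with the model pulled):
# print(ask(
#     "hf.co/bartowski/Mistral-Small-22B-ArliAI-RPMax-v1.1-GGUF:Q6_K_L",
#     "You are a grumpy pirate. Stay in character.",
#     "How are you today?",
# ))
```

Run the same system/user pair against a regular instruct model and against RPMax, and compare how often each one drops the persona.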
The Secret Sauce: Single-Epoch Training
Here's what makes RPMax technically interesting. Most fine-tunes train for multiple epochs (passes through the data), which pushes the model to memorize common phrases and patterns.
You know those annoying AI-isms?
- "A shiver ran down her spine..."
- "I understand your concern..."
- "Let me paint you a picture..."
- "The weight of the moment..."
Those come from repetition in training data. The model sees them so many times it defaults to them.
RPMax takes a different approach:
- Uses a MUCH smaller, carefully curated dataset
- Rigorously deduplicates it (no repeated scenarios or characters)
- Trains for only ONE epoch
The result: the model hasn't memorized tropes. It generates fresh responses because it never saw the same thing twice during training.
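To see why the deduplication step matters, here's an illustrative sketch of the kind of pass a curation pipeline might make (this is my own toy example, not ArliAI's actual tooling): normalize each sample, hash it, and keep only first occurrences, so one epoch really does mean the model sees each sample exactly once.

```python
import hashlib


def dedupe(samples):
    """Keep only the first occurrence of each normalized sample."""
    seen = set()
    unique = []
    for text in samples:
        # Normalize case and whitespace so near-identical copies collide
        key = hashlib.sha256(" ".join(text.lower().split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(text)
    return unique


corpus = [
    "A shiver ran down her spine.",
    "a shiver ran down her  spine.",  # duplicate after normalization
    "The rum's gone and the crew's useless.",
]
deduped = dedupe(corpus)
# One epoch over `deduped` = each unique sample seen exactly once
```

Real curation also deduplicates at the scenario and character level, which needs fuzzier matching than an exact hash, but the principle is the same: no repeats, no memorized tropes.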
Real-World Testing
Community testing suggests RPMax outperforms general-purpose alternatives:
- Better at maintaining character voice over long conversations
- Understands implicit meaning (subtext, sarcasm, hints)
- Produces more varied response lengths naturally
- Less likely to become a "yes-man" who agrees with everything
- Keeps female characters proactive (not deferring to user)
One detailed comparison found RPMax correctly interpreting implicit actions (like understanding that removing radio batteries meant avoiding tracking) while other models missed the subtext entirely.
Getting RPMax
The model is hosted on Hugging Face. We want the GGUF version quantized by bartowski (a trusted quantizer in the community).
Download command:
ollama pull hf.co/bartowski/Mistral-Small-22B-ArliAI-RPMax-v1.1-GGUF:Q6_K_L
Breaking that down:
- hf.co = Hugging Face
- bartowski = the quantizer
- Mistral-Small-22B-ArliAI-RPMax-v1.1 = the model
- GGUF = the format
- Q6_K_L = the quantization level (high quality, reasonable size)
This is about 16GB. Go make coffee.
Quantization Options
If you're tight on RAM/VRAM, you can use smaller quantizations:
Q8_0 = Best quality, ~24GB (need 32GB+ RAM)
Q6_K_L = Great quality, ~16GB (our recommendation)
Q5_K_M = Good quality, ~14GB
Q4_K_M = Acceptable quality, ~12GB
Q3_K_M = Noticeable quality loss, ~10GB
Don't go below Q4 unless you have to. Quality drops fast.
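A rough rule of thumb behind those sizes: file size ≈ parameter count × bits-per-weight ÷ 8. Here's a back-of-envelope calculator; the bits-per-weight figures are my approximations, and real GGUF files differ by a couple of GB depending on quant details:

```python
# Approximate bits-per-weight for common GGUF quantization levels.
# These are rough figures for estimation only; actual files vary.
BPW = {
    "Q8_0": 8.5,
    "Q6_K": 6.6,
    "Q5_K_M": 5.7,
    "Q4_K_M": 4.8,
    "Q3_K_M": 3.9,
}


def estimate_gb(params_billion: float, quant: str) -> float:
    """Estimate GGUF file size in GB: params (billions) * bits-per-weight / 8."""
    return round(params_billion * BPW[quant] / 8, 1)


# Sketch the full table for a 22B model:
for quant in BPW:
    print(quant, estimate_gb(22, quant), "GB (approx.)")
```

The same formula tells you why the 12B model lands around half the size of the 22B at the same quantization level.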
Smaller Alternative
If 22B is too big, try the 12B version:
ollama pull hf.co/bartowski/Mistral-Nemo-12B-ArliAI-RPMax-v1.1-GGUF:Q6_K
About 8GB. Still good for companions, just slightly less nuanced in complex scenarios.
Verify Installation
Check it downloaded:
ollama list
You should see the model in the list. Test it:
ollama run hf.co/bartowski/Mistral-Small-22B-ArliAI-RPMax-v1.1-GGUF:Q6_K_L
Try a simple prompt:
"Continue this scene: The detective lit a cigarette, staring at the
rain-soaked window. 'Another dead end,' she muttered."
If it continues the scene naturally without breaking character or mentioning it's an AI, you're good.
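If you'd rather script that check, a crude heuristic is to flag the disclaimer phrases assistant-tuned models tend to emit. The phrase list below is my own guess, not exhaustive; tune it for what you actually see:

```python
# Crude in-character check: flag common assistant-style disclaimers.
# The phrase list is a heuristic, not exhaustive.
BREAK_PHRASES = (
    "as an ai",
    "i'm an ai",
    "i am an ai",
    "language model",
    "i cannot roleplay",
)


def breaks_character(response: str) -> bool:
    """Return True if the response contains an obvious character break."""
    lowered = response.lower()
    return any(phrase in lowered for phrase in BREAK_PHRASES)


print(breaks_character("*ahem* As an AI, I should mention..."))   # True
print(breaks_character("The rain hadn't let up for three days.")) # False
```

Feed your test prompt's output through this and you get a quick pass/fail signal instead of eyeballing every response.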
Now let's configure it properly.