Techalicious Academy / 2026-03-19-chatbot


PARAMETERS THAT MATTER

Here's the counterintuitive truth: the default settings in most chat interfaces are wrong for roleplay. Worse, some "fixes" that work for other models actually hurt Magidonia's performance.

We're going to fix that.

THE BIG PRINCIPLE

Magidonia was fine-tuned for creative roleplay. It handles repetition naturally. It's trained to tell a story with you, not against you. It doesn't fight its own nature.

So the first rule is: don't fight it either.

Most general-purpose chat models have repetition problems. They say the same thing twice in three messages. So people add repetition penalties. They add DRY (Don't Repeat Yourself) samplers. They add XTC (Exclude Top Choices) samplers. This works for general chat.

For Magidonia? Those techniques suppress its natural creativity. The repetition penalty makes it awkwardly avoid normal words. The DRY and XTC samplers confuse its ability to stay in character.

So here's the first commandment: DISABLE repetition penalty for Magidonia. Set it to 1.0 (which means "do nothing"). Also disable DRY and XTC if your interface has them.

This is a sharp departure from what works for other models. But Magidonia isn't other models.

THE RECOMMENDED SETTINGS

Here's what we use. These are battle-tested across the BeaverAI community:

Temperature:       1.0
Top K:             40
Top P:             0.95
Min P:             0.02
Repeat Penalty:    1.0 (disabled)
Max Tokens:        2048
Context Window:    16384

Write these down. Save them. You're going to use them for every character you create with Magidonia.

Now let's break down what each one does.

TEMPERATURE: THE RANDOMNESS DIAL

Temperature controls how random the model is when choosing the next word.

At temperature 0, the model always picks the most likely next word. Deterministic. Boring. Repetitive. Good for technical writing. Bad for roleplay.

At temperature 1.0, the model is confident but willing to surprise you. It picks words that fit the story, but not always the absolute most obvious choice. The result feels natural. Creative. Spontaneous.

At temperature 1.5, the model gets weird. It starts picking words that don't quite make sense. Good for surreal fiction. Bad for grounded roleplay.

Our recommendation: start at 1.0. This is Magidonia's sweet spot.

If your character feels too wild and unpredictable, drop temperature to 0.8. The character will still be creative but more grounded.

If your character feels boring or repetitive, try 1.2. You get more chaos, more surprises.

But 1.0 is the starting point. Most of the time, it's the finishing point too.
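Under the hood, temperature divides the model's raw scores (logits) before they're turned into probabilities. A minimal Python sketch with made-up toy logits (illustrative only, not real model output):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by temperature, then apply softmax to get probabilities."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 3.0, 2.0, 1.0]  # four hypothetical candidate words
cold = softmax_with_temperature(logits, 0.5)  # sharper: the top word dominates
warm = softmax_with_temperature(logits, 1.5)  # flatter: more room for surprises
print(cold[0] > warm[0])  # True: lower temperature concentrates probability
```

Low temperature sharpens the distribution toward the single most likely word; high temperature flattens it, giving unlikely words a real chance.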

TOP K: CANDIDATE LIMITING

Top K limits how many of the most-likely next words the model considers.

At Top K 10, the model only looks at the 10 most likely words. Narrow. Focused. Predictable.

At Top K 40, the model considers the 40 most likely words. Wider range. More creativity.

At Top K 100, the model considers 100 words. Very open. Chaotic.

For Magidonia, 40 is the sweet spot. It's wide enough to let the model be creative. It's narrow enough that it stays focused on the story.

If the model is being too random or incoherent, drop it to 30. If it's being too predictable, try 50.
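The mechanism is simple to sketch in Python. The five-word distribution below is a made-up toy example:

```python
def top_k_filter(probs, k):
    """Keep only the k most likely candidates and renormalize to sum to 1."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:k])
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

probs = [0.5, 0.2, 0.15, 0.1, 0.05]  # toy distribution over five words
result = top_k_filter(probs, 2)      # only the top 2 words keep any probability
print(result)
```

Everything outside the top k is zeroed out; what remains is renormalized so it still forms a probability distribution.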

TOP P: NUCLEUS SAMPLING

Top P is also called nucleus sampling. It works differently from Top K.

Instead of limiting by count, Top P limits by cumulative probability. The model sums up probabilities of words in order of likelihood until it reaches the threshold.

At Top P 0.9, the model stops after reaching 90% cumulative probability. At Top P 0.95, it goes to 95%.

The advantage of Top P over Top K: it adapts. If the model is very confident about a word (like "the"), that one word can fill the nucleus almost by itself. If it's uncertain (like when a character can make multiple sensible choices), more words make the cut.

For Magidonia: 0.95 is our baseline. It works well with Top K 40. They complement each other.

Don't change this unless you have a specific reason. Temperature and Top K are more intuitive to tune.
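Here's the adaptive behavior in a short Python sketch. Both distributions are hypothetical toy examples:

```python
def top_p_filter(probs, p):
    """Keep the smallest set of top words whose cumulative probability reaches p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = set(), 0.0
    for i in ranked:
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    filtered = [pr if i in keep else 0.0 for i, pr in enumerate(probs)]
    total = sum(filtered)
    return [pr / total for pr in filtered]

confident = [0.97, 0.01, 0.01, 0.01]  # "the": one word clears 0.95 alone
uncertain = [0.40, 0.35, 0.22, 0.03]  # open choice: three words are needed
print(sum(1 for x in top_p_filter(confident, 0.95) if x > 0))  # 1 survivor
print(sum(1 for x in top_p_filter(uncertain, 0.95) if x > 0))  # 3 survivors
```

Same threshold, different nucleus size: the filter widens or narrows depending on how spread out the probabilities are.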

MIN P: THE TRASH FILTER

Min P filters out low-probability garbage words. The model's probability distribution usually has a long tail - tons of words that are technically possible but really bad.

At Min P 0.02, the model ignores any word with less than 2% probability relative to the most likely word. This filters the garbage while keeping creativity.

At Min P 0.05, it filters more aggressively. Better for technical output.

At Min P 0.01, it filters less. More chaos.

For Magidonia roleplay: 0.02 is perfect. It keeps the model coherent without being restrictive.

If you're getting nonsense or weird made-up words, try 0.03 or 0.04. If the model feels too constrained, try 0.01.

Tune this in 0.005 increments. The differences are subtle but real.
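The "relative to the most likely word" part is what makes Min P different from a fixed cutoff. A small Python sketch with a toy long-tailed distribution:

```python
def min_p_filter(probs, min_p):
    """Drop words below min_p times the top word's probability, renormalize."""
    threshold = min_p * max(probs)
    filtered = [p if p >= threshold else 0.0 for p in probs]
    total = sum(filtered)
    return [p / total for p in filtered]

probs = [0.60, 0.30, 0.08, 0.015, 0.005]  # toy distribution with a garbage tail
kept = min_p_filter(probs, 0.02)  # threshold = 0.02 * 0.60 = 0.012
print(sum(1 for x in kept if x > 0))  # the 0.005 tail word is filtered out
```

Because the threshold scales with the top word's probability, the filter is strict when the model is confident and lenient when many words are plausible.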

REPEAT PENALTY: THE CRITICAL SETTING

Here's the one that separates Magidonia from everything else.

Repeat penalty makes the model less likely to reuse words and phrases that have already appeared recently in the conversation.

For general models, this is essential. They repeat constantly.

For Magidonia, it's the opposite. The Cydonia line was specifically trained to avoid repetition. It's baked into the training process.

If you add a repeat penalty on top, the model now has two systems pushing it to avoid repetition. The result: it awkwardly avoids common words. "The" becomes "said object". "Look" becomes "gaze upon". The character starts sounding weird.

So here's the rule: Repeat Penalty = 1.0 (disabled)

Don't touch this for Magidonia. Ever. I mean it. The default might be 1.1 or 1.2 in your interface. Change it to 1.0.

If someone tells you "add a repeat penalty for better output," they don't understand Magidonia. Trust the training. The model knows what it's doing.

MAX TOKENS: ROOM TO BREATHE

Max Tokens controls how many words the model can generate in a single response.

At Max Tokens 256, the model gives you short, punchy responses. Good for quick dialogue.

At Max Tokens 2048, the model can write long, detailed paragraphs. Good for immersive roleplay where a single character response unfolds a scene.

For Magidonia, 2048 is the standard. It's long enough for the model to tell a story without feeling rushed. It's not so long that you're waiting forever for a response.

If you want shorter responses, drop it to 1024. If you want longer, go to 4096 (but this will increase generation time noticeably).

CONTEXT WINDOW: THE MEMORY LIMIT

Context Window is how many tokens the model can "see" in the conversation history.

At 16384 tokens (our recommendation), the model can see roughly 12,000 words of previous conversation, since a token averages about three-quarters of an English word. That's 30-40 messages of typical back-and-forth.

This is the sweet spot for character consistency. The model has enough memory to stay coherent over a long conversation, but not so much that it forgets what the character is supposed to be doing.

In practice, consistency drops at both extremes. At 32K tokens, the model's core personality traits start getting diluted by all that history. At 4K tokens, it forgets too quickly.

16384 is the goldilocks zone.

If you're doing short roleplay sessions, 8192 works fine. If you're doing very long campaigns, 24576 is worth trying.

Don't go below 8192. The character will lose coherence.

PUTTING IT TOGETHER

Here's what your parameter settings look like in different interfaces:

In the Ollama CLI (interactive /set commands after starting the model):

ollama run model_name
>>> /set parameter temperature 1.0
>>> /set parameter top_k 40
>>> /set parameter top_p 0.95
>>> /set parameter min_p 0.02
>>> /set parameter repeat_penalty 1.0
>>> /set parameter num_predict 2048
>>> /set parameter num_ctx 16384

In OpenWebUI (Settings tab):

Temperature: 1.0
Top K: 40
Top P: 0.95
Min P: 0.02
Repeat Penalty: 1.0
Max Tokens: 2048
Context: 16384

In an API call:

{
  "temperature": 1.0,
  "top_k": 40,
  "top_p": 0.95,
  "min_p": 0.02,
  "repeat_penalty": 1.0,
  "num_predict": 2048,
  "num_ctx": 16384
}
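If you're scripting against a local server, the same options can be sent programmatically. A minimal Python sketch, assuming an Ollama-style server at localhost:11434 and a model tagged "magidonia" (both are assumptions; substitute your own):

```python
import json
import urllib.request

# The recommended Magidonia settings as an options dict.
options = {
    "temperature": 1.0,
    "top_k": 40,
    "top_p": 0.95,
    "min_p": 0.02,
    "repeat_penalty": 1.0,  # disabled, per the rule above
    "num_predict": 2048,
    "num_ctx": 16384,
}

# Build the request body. Model name and prompt are placeholders.
payload = json.dumps({
    "model": "magidonia",
    "prompt": "Stay in character and greet the traveler.",
    "stream": False,
    "options": options,
}).encode("utf-8")

request = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=payload,
    headers={"Content-Type": "application/json"},
)

# Uncomment when a server is actually running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["response"])

print(json.loads(payload)["options"]["repeat_penalty"])  # 1.0: penalty disabled
```

The endpoint path and field names follow Ollama's generate API; other backends use different names (e.g. max_tokens instead of num_predict), so check your server's docs.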

THE SAMPLER ORDER

If your interface supports explicit sampler order, use this:

  1. Min P
  2. Top K
  3. Top P
  4. Temperature

This order is optimal for Magidonia because it filters progressively from strict to loose, ending with randomness applied to the final pool.
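The whole chain can be sketched end to end in Python. Everything here is a toy illustration: the logits are made up, and real backends implement these stages in optimized C, but the ordering logic is the same:

```python
import math
import random

def sample_next_word(logits, min_p=0.02, top_k=40, top_p=0.95, temperature=1.0):
    """Apply Min P -> Top K -> Top P -> Temperature, then draw a word index."""
    # Convert raw logits to probabilities.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # 1. Min P: drop words below min_p times the best word's probability.
    cutoff = min_p * max(probs)
    candidates = [(i, p) for i, p in enumerate(probs) if p >= cutoff]

    # 2. Top K: keep only the k most likely survivors.
    candidates = sorted(candidates, key=lambda x: x[1], reverse=True)[:top_k]

    # 3. Top P: keep the smallest prefix reaching cumulative probability top_p.
    kept, cumulative = [], 0.0
    for i, p in candidates:
        kept.append((i, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # 4. Temperature: reweight survivors (temperature must be > 0), then sample.
    weights = [p ** (1.0 / temperature) for _, p in kept]
    total_w = sum(weights)
    r, acc = random.random() * total_w, 0.0
    for (i, _), w in zip(kept, weights):
        acc += w
        if acc >= r:
            return i
    return kept[-1][0]

print(sample_next_word([3.0, 2.0, 1.0, -4.0]))  # index of the sampled word
```

Note how each stage only ever shrinks the candidate pool; temperature is applied last, adding randomness over whatever survived the three filters.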

TUNING ADVICE

Start with the baseline settings above. Test them. See how your character responds.

If you want to adjust:

  1. Temperature first. It's the most intuitive. Too wild? Drop to 0.8. Too boring? Try 1.2.
  2. Then Min P. Is the model saying weird things? Bump it to 0.03.
  3. Then Top K. Does it feel constrained? Try 50. Too chaotic? Try 30.

Never touch repeat penalty. It's a footgun for Magidonia.

Most of the time, the default settings work perfectly. You don't need to tune. The model is smart enough to figure it out.

QUICK REFERENCE TABLE

Parameter        Default  Range        What It Does
---------------  -------  -----------  ------------------------------
Temperature      1.0      0.1-2.0      Randomness. 1.0 is creative.
Top K            40       10-100       How many words to consider
Top P            0.95     0.8-0.99     Cumulative probability limit
Min P            0.02     0.01-0.05    Filter low-probability garbage
Repeat Penalty   1.0      1.0-1.2      Disable (1.0) for Magidonia!
Max Tokens       2048     256-4096     Response length
Context Window   16384    8192-24576   How much history to remember

ONE MORE TIME

The core principle: Magidonia is special. It's not a general-purpose model. It's a specialist. Don't apply general-purpose fixes to a specialist model.

The most critical setting: Repeat Penalty = 1.0. Everything else is tuning. This is the foundation.

Now go write a character. See how it feels. Adjust if needed. Most of the time, you won't need to.

Magidonia knows what it's doing. Trust it.