Techalicious Academy / 2026-02-06-claude-code-local


CLAUDE CODE + LOCAL OLLAMA - OVERVIEW

What We're Building

Claude Code is Anthropic's terminal-based agentic coding tool. By default, it phones home to Anthropic's cloud servers. You pay per token. Your code leaves your network.

But here's the thing: Ollama added Anthropic API compatibility in version 0.14.0 (January 2026). That means we can point Claude Code at our own local models instead.

Tonight we're setting that up. All the polish of Anthropic's CLI. None of the cloud dependency.
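The whole trick boils down to a few environment variables. Here's a minimal sketch of where we're headed, assuming Ollama is serving on its default port (11434) and you've already pulled qwen3-coder; `ANTHROPIC_BASE_URL`, `ANTHROPIC_AUTH_TOKEN`, and `ANTHROPIC_MODEL` are documented Claude Code settings, and the token value is just a placeholder since a local server doesn't check it:

```shell
# Point Claude Code at local Ollama instead of Anthropic's cloud.
export ANTHROPIC_BASE_URL="http://localhost:11434"   # Ollama's default port
export ANTHROPIC_AUTH_TOKEN="ollama"                 # placeholder; not verified locally
export ANTHROPIC_MODEL="qwen3-coder"                 # whatever model you pulled
claude                                               # launch the CLI as usual
```

We'll walk through each of these steps properly later; this is just the destination.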

What is "Vibe Coding"?

Vibe coding is when you describe what you want and working code appears. Instead of typing every character yourself, you tell the AI your intent:

"Create a Perl script that fetches RSS feeds and extracts titles"

And it writes it. Edits files directly. Runs tests. Iterates until it works.

This isn't autocomplete. This is an agent that plans a change, edits your files directly, runs the results, and iterates until the code works.

Claude Code is currently the best tool for this workflow. And now we can run it without cloud dependencies.

The Architecture

Here's what we're building:

+------------------+         +------------------+
|  Your Machine    |         |  AI Server       |
|                  |   LAN   |                  |
|  Claude Code     | ------> |  Ollama          |
|  (CLI client)    |         |  (qwen3-coder)   |
+------------------+         +------------------+

Or if you're running everything on one machine:

+------------------------------------------+
|  Your Machine                            |
|                                          |
|  Claude Code  ------>  Ollama            |
|  (localhost:11434)                       |
+------------------------------------------+

Both setups work. We'll cover configuring for a remote Ollama server since that's the more complex case. Localhost is just simpler.
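For the remote case, it's worth confirming the client machine can even reach Ollama over the LAN before touching Claude Code. A quick check; the hostname here is hypothetical, and `/api/version` is Ollama's standard version endpoint:

```shell
# From the client machine; substitute your AI server's LAN address.
curl -s http://ai-server.local:11434/api/version
# A small JSON blob with a version field means Ollama is listening.
```

If that hangs or refuses the connection, fix networking first; nothing downstream will work.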

What This Tutorial Covers

By the end of this tutorial, you'll understand:

- Why Claude Code can talk to Ollama at all (the Anthropic API compatibility added in Ollama 0.14.0)
- How to point Claude Code at a local or remote Ollama server
- What trade-offs to expect versus Anthropic's cloud (speed, privacy, cost)

What You'll Need

Hardware:

- One machine that can run both Claude Code and Ollama (we tested qwen3-coder on an Apple M4)

Or:

- A lighter client machine plus a separate AI server on your LAN running Ollama

Software:

- Ollama 0.14.0 or later (the version that added Anthropic API compatibility)
- The Claude Code CLI
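Once everything is installed, sanity-check the versions. A sketch assuming both tools support the standard `--version` flag:

```shell
ollama --version    # needs 0.14.0 or later for Anthropic API compatibility
claude --version    # confirms the Claude Code CLI is on your PATH
```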

What This Is NOT

This is a casual hobbyist demo, not a professional workshop. We're showing you something cool we figured out and sharing what works.

No certificates. No polished slides. No promises.

If you want hand-holding through every possible configuration issue, this isn't that. We'll cover the common cases and you'll figure out your edge cases.

A Note on Speed

Local inference is slower than cloud APIs. Way slower.

Anthropic's cloud: 20-50 tokens/second
Local qwen3-coder on M4: 5-15 tokens/second

Expect 30-120 seconds for responses depending on complexity. That's the trade-off for privacy and zero cost.
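Those wait times fall straight out of the token rates above. A typical response runs a few hundred tokens; assuming a ~600-token answer, a quick back-of-envelope in the shell:

```shell
# A ~600-token response at local speeds (5-15 tokens/second):
echo "best case:  $((600 / 15)) seconds"   # 40 seconds
echo "worst case: $((600 / 5)) seconds"    # 120 seconds
```

Longer answers scale linearly, which is why complex requests land at the high end of the range.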

If speed is your priority, pay for the cloud API. If privacy and independence matter more, welcome aboard.

Let's Get Started

Next chapter: Prerequisites. We'll make sure everything is in place before we start installing.