Pi + qwen3.6 as backup when Claude Code limits hit

revisedStarted May 8, 2026Finished May 10, 2026

I wanted to cancel Claude Code Max. What I got was a backup — and it turned out to be the more honest version of the same idea.

This started as a replacement project — local AI all the way down, $200/mo back in my pocket, sovereignty restored. After a real eight-hour run with the weekly limit hit, the goal shifted. The Pi stack still exists, still works — but it lives now as a second seat, not a primary. The reframe is more accurate, and more useful, than the original ambition.

The original plan vs what landed

ORIGINAL AMBITION

Replace Claude Code Max entirely.
Cancel the subscription, recover $200/mo.
Local AI all the way down.
Full sovereignty from frontier-model providers.

WHAT LANDED

Pi runs second seat when the weekly limit hits.
Subscription stays — $200/mo is cheap for what it does.
Frontier model upstream for spec; local model downstream for execution.
Real sovereignty in the moment that actually matters.

The eight-hour live test

Saturday night the weekly Claude Code limit hit. Reset wasn't until Sunday 7am. That left an entire offline workday to either lose or rescue with the Pi stack. I ran the rescue.

Pi + qwen3.6 carried real dev work for eight hours. Slow, but acceptable on review. Two parallel Pi terminals locked the box up — one terminal was the sustainable shape. Small, well-specified chunks completed cleanly; big open-ended asks didn't. The pattern that emerged is the pattern that's documented and reproducible now.

What worked

ONE TERMINAL

Two parallel sessions locked the box up.

One terminal was the sustainable shape.

SMALL CHUNKS

Big open-ended asks crashed.

Spec-driven small tasks completed cleanly, every time.

FRONTIER UPSTREAM

Frontier writes the spec, local executes.

The split that lets both tools do what they're good at.

What I learned

Local models today are good enough as a backup with discipline — one terminal, small chunks, frontier-model spec upstream. They're not yet good enough to replace a frontier-model primary on a time/quality basis. The speed tax is real.

The cost-cut math ($200/mo) was never the bottleneck. Time was. Building a parity stack would have cost more time than the savings would have bought back. The honest reframe — backup, not replacement — is strictly more accurate for where local models are in May 2026. Maybe by end of year the math flips. Until then, the subscription stays and the Pi takes the next weekly outage in stride.

Read along

The live test became a short post on this site — When the limits hit. The earlier kickoff (revised) lives at /posts/a-local-coding-agent-on-pi/.