mitkox posted an update 24 days ago
We’ve reached a point where on device AI coding that is free, offline, and capable isn’t just a theoretical possibility; it’s sitting on my lap, barely warming my thighs.
My local MacBook Air setup runs Qwen3 Coder Flash with a 1M context and Cline in the VS Code IDE. No internet, no cloud, no ID verification; this is the forbidden tech.
Current stats:
All agentic tools work great: local, sandboxed, and with MCP
OK model output precision
17 tokens/sec. Not great, not terrible
65K-token context; the model can do 1M, but let's be real, my MacBook Air would probably achieve fusion before hitting that smoothly
Standard backend and cache off for the test
All inference and function calling happen locally, offline, untethered. The cloud didn’t even get a memo.
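Setups like the one above usually work by pointing the IDE agent at an OpenAI-compatible endpoint served on localhost. A minimal sketch of the request such a client would send, assuming a llama.cpp/LM Studio-style local server (the model name and parameters here are illustrative, not a verified config):

```python
import json

def chat_request(model: str, prompt: str, max_tokens: int = 1024) -> str:
    # Build an OpenAI-compatible /v1/chat/completions payload. With a
    # local server (llama.cpp, LM Studio, Ollama, etc.) listening on
    # localhost, this JSON never leaves the machine.
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        # Context length (e.g. the 65K above) is configured server-side,
        # not per request.
    })

payload = chat_request("qwen3-coder-flash", "Write a function that reverses a string.")
```

Cline (and most agentic tools) only need the base URL of that endpoint, so swapping cloud for local is a one-line config change.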

The Transformer architecture has a lot of limitations; Google's upcoming Mixture-of-Recursions architecture should bring more throughput and lower memory requirements. So one thing is certain: this is only going to get better from here.

Not sure what to think. Some code I've seen works out of the box when I ask for something, and its approach makes sense at times. Other times it gets close but needs refinement.

It'd be kinda funny, though, if source code were generated from models using specific seeds that are tested and known to work, and then you 'buy' software that's 5k of prompts and seed data that generates the source and then compiles it into your completed application.

For some years now it has felt like we don't need much in the way of new code; most of 'coding' today is just gluing together stuff that's already written, much like writing Bash scripts.

Other than being skeptical and not trusting code blindly, this does seem promising for the future.

Calling it "forbidden tech" might be setting the bar too low. Watch 'The Expanse'; we have a long way to go.