Welcome back. Compute used to be the biggest bottleneck for engineering teams, but now it's the budget. Uber and Microsoft have both struggled to cap the number of tokens their engineers are burning through — even Sam Altman acknowledged the budget bottleneck yesterday.
Also: How to run your own AI agent for $15 a month, build an LLM from scratch, and which 5 skills an Amazon applied scientist says get you to $300K.
Today’s Insights
Powerful new updates and hacks for devs
Why AI is weaponizing your tech debt
How to keep coding while Cursor runs a long task
Trending social posts, top repos, and more

TODAY IN PROGRAMMING
Google drops a multimodal model small enough to fit on a laptop: The search giant just unveiled Gemma 4 12B, a model designed to run locally on a laptop with 16GB of RAM. It handles vision and audio natively by feeding inputs directly into the language backbone, which cuts down on latency. Google claims its reasoning is nearly as good as the larger 26B model. Plus, the open Apache 2.0 release means engineering teams can now run powerful agents on standard hardware without racking up cloud costs.
xAI brings image-to-video generation to its API: The Musk-led AI lab just previewed Grok Imagine 1.5, which can turn a single still frame into fluid, cinematic video using just plain-language prompts. The model animates camera movement, atmosphere, and physics while keeping the lighting and detail of the original shot intact. Developers can now stage frames and chain shots together to create longer, consistent scenes with only a few lines of code.
Perplexity moves the data center to your laptop: The answer engine just announced that hybrid agentic inference is coming to Perplexity Computer. It splits tasks between your local machine and the cloud. Sensitive data like health and finances stays on-device for privacy, while heavy lifting is routed to the server. The feature was developed with Intel and demoed on Intel's Core Ultra silicon, with support for NVIDIA's RTX Spark platform planned. It starts shipping in July.

PRESENTED BY COMP AI
You don’t have months to get audit-ready. With Comp AI, you can get SOC 2 or ISO 27001 audit-ready in just a few days, so you move fast and close enterprise customers.
Connects to your stack and automates evidence collection
Eliminates 90% of the work
Automate your compliance with Comp AI (The Code readers save $2,000).

INSIGHT
AI is weaponizing your tech debt. Here’s what you should know:

Source: The Code, Superhuman
How it began. This spring, your lockfile was the best security on npm. When the popular Axios library was hijacked, its poisoned releases hit any machine running a fresh install. The attacker corrupted both a new version and an old one, so "stay on stable" was no longer a solution.
The AI multiplier. Every new library adds technical debt. A recent study shows agents make it worse. They pick vulnerable versions more often than humans do. In one case, an npm worm even hid inside agent config files. It stuck around long after the malicious package was gone.
Can't we just freeze everything? HashiCorp co-founder Mitchell Hashimoto recommends devs and teams update only when something breaks. But AI is making that a risky proposition as well: Anthropic's Mythos just built working exploits for decade-old bugs, reportedly for under $2,000. Hidden flaws don't stay hidden anymore. Frozen code is now a sitting duck.
The patch is you. Auto-update everything, and you risk the next Axios. Freeze everything, and you sit on bugs AI can now crack. You're basically stuck between a rock and a hard place. The fix? Cut your technical debt. Be intentional about what you wire in. Treat prompt files and MCP servers like production code. This full breakdown is well worth a read.
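One way to start being intentional about what you wire in is to read your own lockfile. The sketch below is a minimal example, not a vetted security tool: it assumes an npm package-lock.json in the v2/v3 layout (a top-level "packages" map whose entries carry "resolved" and "integrity" fields). The function names and the single-registry allowlist are our own illustrative choices, so adjust them if you use a private registry.

```python
import json
from typing import Any

# Assumption: everything should come from the public npm registry.
# Swap this prefix (or extend it) if you use a private mirror.
NPM_REGISTRY_PREFIX = "https://registry.npmjs.org/"


def audit_lockfile(lock: dict[str, Any]) -> list[str]:
    """Return warnings for lockfile entries that deserve a second look."""
    warnings: list[str] = []
    for path, entry in lock.get("packages", {}).items():
        # Skip the root project, workspace symlinks, and bundled deps,
        # which legitimately have no "resolved" or "integrity" fields.
        if path == "" or entry.get("link") or entry.get("inBundle"):
            continue
        if not entry.get("integrity"):
            warnings.append(f"{path}: no integrity hash")
        resolved = entry.get("resolved")
        if resolved and not resolved.startswith(NPM_REGISTRY_PREFIX):
            warnings.append(f"{path}: resolved outside the npm registry ({resolved})")
    return warnings


def audit_lockfile_path(path: str) -> list[str]:
    """Load a package-lock.json from disk and audit it."""
    with open(path, encoding="utf-8") as handle:
        return audit_lockfile(json.load(handle))
```

Run audit_lockfile_path("package-lock.json") in CI and fail the build when the list is not empty. It won't catch a poisoned release published to the registry itself, but it does surface dependencies that bypass the integrity checks your lockfile is supposed to give you.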

IN THE KNOW
What’s trending on socials and headlines

Meme of the day.
Design Toolbox: This website curates a list of useful tools for web-focused design engineers in one place.
Agent Stack: Self-hosting your own AI agent sounds expensive. This tutorial runs the full Hermes setup for under $15 a month.
The Full Stack: Most engineers know how LLMs work in theory. This roadmap walks you through building one from scratch.
Bill Slasher: Opus is built to orchestrate, not to do the grunt work. This 15-prompt system splits the two layers and cuts AI spend dramatically.
Dynamic Workflows: A lot of Claude Code users still write 50 prompts when one workflow would do. This guide shares the 6 patterns Anthropic's own engineers use.
Code Landmines: OpenCode (open-source AI coding agent) creator Dax Raad made a rare, candid admission about what AI agents are doing to code quality.
The $300K Gap: An Amazon applied scientist breaks down the five skills that separate $300K AI engineers from everyone still grinding prompt tutorials.

AI CODING HACK
How to keep coding while Cursor runs a long task
A big refactor or a full test pass can tie up Cursor's agent for twenty minutes, leaving your terminal locked while you just sit there and watch. The CLI offers a simple one-character fix: just add “&” to the start of any message.
This runs the task on a cloud agent instead of your local machine, so you're free to keep coding. Just install the Cursor CLI and send the task from an agent chat like this:
& refactor the auth module and add tests for every route
It runs in Cursor's cloud, and you can pick it up later on the web or mobile. This way, nothing ties up your terminal while it works.
P.S. Get 50+ AI coding hacks for Claude Code, Cursor, and Codex here.

TOP & TRENDING RESOURCES
Top Tutorial
5 useful ways to build better Claude Code skills: This tutorial shows you how to build custom AI skills from the ground up using Claude. It walks through a five-step process for structuring personal context, setting clear triggers, creating automated pass/fail eval loops, and adding long-term memory. By the end, you'll be able to fine-tune your automation and move past generic AI responses.
Top Tool
Docusaurus: An optimized docs generator in React. It's designed to help you streamline your documentation writing process.
Top Repo
Headroom (11K ⭐): A context optimization layer for LLM apps. It compresses tool outputs, database results, file reads, and RAG data before they hit the model. You get the same quality answers for a fraction of the tokens.
Trending Paper
What Anthropic learned from a year of AI‑enabled cyberattacks: Researchers looked into whether current security frameworks can still keep up with cyber threats now that attackers are using AI. They found that AI-automated attacks are making traditional risk assessments and security protocols basically obsolete.

IN CASE YOU MISSED IT
Our most-clicked story from yesterday
Claude can now build a custom harness for any task you throw at it. An Anthropic engineer shares the patterns and prompts to unlock it (2.4M views).
Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 300K+ engineers and 150K+ followers on socials. Get in touch.
What did you think of today's newsletter?
You can also reply directly to this email if you have suggestions, feedback, or questions.
Until next time — The Code team



