Nvidia drops Nemotron 3 Nano Omni, AWS unveils Amazon Quick desktop AI

Welcome back. There’s a strange trend brewing in tech right now. CTOs from Workday, Instagram, Box, and several other billion-dollar giants are stepping down to join Anthropic as staff engineers. The Claude maker recently crossed $30 billion in annualized revenue run rate, and its valuation in private markets exceeded $1 trillion. It appears that everyone wants a seat on this rocket ship before it’s too late.

Today: Nvidia's new multimodal model, the Claude skills to ship UIs 10x faster, and what OpenAI's AWS pivot changes for devs building agents.

Today’s Insights

Powerful new updates and hacks for devs
Why open weights are overtaking closed models
How to make Claude react to errors in real time
Trending social posts, top repos, and more

TODAY IN PROGRAMMING

Click here to watch Nvidia's Nemotron 3 Nano Omni in action.

Nvidia's new model unifies vision, audio, and text for agents: The chip giant just dropped Nemotron 3 Nano Omni, an open model that processes vision, audio, and text simultaneously. This replaces the slow, fragmented perception models currently used by AI agents. With its 30B mixture-of-experts architecture and a 256K context window, it delivers nine times the throughput of other open omni models. Run the model locally.

Amazon launches a desktop AI built around your work: The cloud giant just shipped a new desktop app for its Quick assistant that builds a personal knowledge graph by learning from your files, calendar, and Slack threads. You can use natural language to generate live dashboards, custom apps, and professional slide decks directly from your data. The app also integrates seamlessly with Kiro CLI and Claude Code. Try it here.

Outages push a top open-source maintainer off GitHub after 18 years: HashiCorp founder Mitchell Hashimoto is moving his Ghostty terminal app away from GitHub, citing constant disruptions that have made the platform impossible to use. The move comes as GitHub’s CTO admitted that a spike in agentic coding workflows has overwhelmed their systems, forcing the company to prioritize uptime over new features.

PRESENTED BY EXE.DEV

Stop playing "Infrastructure Architect" and start shipping.

Cloud configs and booting VMs kill focus. exe.dev fixes that with instant VMs, persistent disks, and built-in HTTPS. Go from idea to live code in seconds, not minutes.

Don’t waste time with busywork: provisioning, networking, managing secrets. Our integrations inject credentials at request time so you can use external APIs without secrets on your VM. Whether you're running AI agents or dev APIs, get the isolation you need without the overhead. Cut the infra-tax at exe.dev.

Now available in regions around the globe!

INSIGHT

Open-weight models have virtually caught up to closed-source giants. What does that mean for dev teams?

Source: The Code, Superhuman

Two stories from last week. Last Thursday, the White House accused China of running industrial-scale distillation campaigns against US frontier models. By the following morning, DeepSeek had shipped V4 under the MIT license at a fraction of the cost of Opus 4.7 or GPT-5.5. That 24-hour window tells an important story.

The bet behind the wall. Frontier AI labs bet $630 billion in 2026 capex on the idea that superior capability would create a natural monopoly. But this strategy only works if they can maintain pricing power. With Chinese open-weight models like DeepSeek, Qwen, and Kimi rapidly closing the gap, that pricing power is starting to evaporate.

The gap is real but uneven. V4 trails top closed models by a few months in basic reasoning, but the gap widens for long-context retrieval and complex agentic work. Cheaper models work fine for coding assistants and RAG, but for autonomous agents on long-term tasks, your choice of model still determines whether the job actually gets done.

Build the exit, keep the front door. Treat closed-frontier APIs as replaceable for anything non-essential. Use them for heavy lifting, but you may want to build an open-weight fallback. With the regulatory window closing, the transition may get harder with each passing day. According to one top VC from a16z, about a quarter of startups are probably using open-source Chinese models.
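One way to picture the "exit" is a thin routing layer that tries the closed API first and drops to an open-weight model when that call fails. The sketch below is a minimal illustration under stated assumptions: `complete_with_fallback`, `closed_frontier_api`, and `open_weight_local` are hypothetical names, and the two provider functions are stand-ins rather than real SDK calls.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical type: any function that takes a prompt and returns text.
Provider = Callable[[str], str]


def complete_with_fallback(
    prompt: str, providers: Sequence[Tuple[str, Provider]]
) -> Tuple[str, str]:
    """Try each provider in order and return (provider_name, completion)."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except Exception as exc:  # real code should catch the client's specific errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stand-ins for real clients (assumptions for illustration only).
def closed_frontier_api(prompt: str) -> str:
    raise TimeoutError("rate limited")  # simulate an outage or quota error


def open_weight_local(prompt: str) -> str:
    return f"[open-weight] {prompt}"


name, text = complete_with_fallback(
    "Summarize this diff",
    [("closed", closed_frontier_api), ("open-weight", open_weight_local)],
)
print(name, text)
```

Keeping the provider list as plain data means the order can be flipped later, with the open-weight model first and the closed API as the backup, without touching call sites.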

IN THE KNOW

What’s trending on socials and headlines

Meme of the day.

Harness Decoded: "Agent harness" gets thrown around a lot these days. This AI founder pins down what Claude Code, Cursor, and Codex really share under the hood.
(1,400 bookmarks)

RL Decoded: Top AI labs are training smarter agents using a method from Karpathy. This guide walks through the technique with working code you can use.
(61,000 views)

Agent Anatomy: An ex-Google senior engineer breaks down the 5 core components and 4 patterns behind every production AI agent, beyond the buzzwords.
(595 likes)

Cloud Shake-Up: OpenAI just ended its Microsoft exclusivity to sign with AWS. Sam Altman explains what this means for devs building agents on the world's biggest cloud.
(100,000 views)

Context Crunch: Does fine-tuning solve long context? This guide explains how 128K windows actually function and why that matters.
(1,700 bookmarks)

Design Stack: This freshly updated directory site compiles the top Claude skills for design engineering, so you ship UIs faster.
(1,100 bookmarks)

GitHub Hack: A cloud security researcher pulled off remote code execution on GitHub with a single git push, gaining access to millions of private repos. Here's how he did it.
(1 million views)

AI CODING HACK

How to make Claude react to errors in real time

Source: X/noahzweben

Tracking background tasks in Claude Code used to be a hassle. Users had to keep running “/loop” just to check for errors, which ended up wasting a lot of API calls. To streamline this, Claude Code PM Noah Zweben swapped out that clunky workaround for a built-in Monitor tool.

Now, a single prompt automates the process.

Start my dev server and use the Monitor tool to watch for errors.

Claude runs a background shell script and streams its stdout. It burns zero tokens while the server is healthy, but steps in to fix errors as soon as a stack trace appears.
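The idea of staying silent while the output is healthy and waking up only on errors can be sketched in a few lines. This is not how the Monitor tool is implemented, which the source does not describe. It is only an illustration of the pattern, and `error_events`, `ERROR_PATTERN`, and the sample log are all hypothetical.

```python
import re
from typing import Iterable, Iterator, List

# Assumed error markers; tune these for your own stack.
ERROR_PATTERN = re.compile(r"Traceback \(most recent call last\)|\bERROR\b")


def error_events(lines: Iterable[str], context: int = 3) -> Iterator[List[str]]:
    """Yield a chunk of log lines each time an error marker appears.

    Healthy lines are consumed silently, so a caller only does work
    (for example, prompting a model) when something goes wrong.
    """
    chunk: List[str] = []
    remaining = 0
    for raw in lines:
        line = raw.rstrip("\n")
        if chunk:
            # Already capturing: collect follow-up lines as context.
            chunk.append(line)
            remaining -= 1
        elif ERROR_PATTERN.search(line):
            chunk = [line]
            remaining = context
        else:
            continue  # healthy output: nothing to report
        if remaining == 0:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # stream ended mid-capture


sample_log = [
    "server listening on :3000",
    "GET / 200",
    "ERROR TypeError: x is undefined",
    "    at render (app.js:10)",
    "    at handle (server.js:42)",
    "GET /health 200",
]
events = list(error_events(sample_log, context=2))
for event in events:
    print("\n".join(event))
```

In practice the `lines` argument could be the `stdout` of a `subprocess.Popen` started with `stdout=subprocess.PIPE` and `text=True`, so the loop blocks quietly until the dev server prints something.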

TOP & TRENDING RESOURCES

Click here to watch the tutorial.

Top Tool

Tolaria: A note-taking app that stores your knowledge as plain Markdown files on your local disk. It adds a Notion-like editor, built‑in Git version control, and tight Claude Code/MCP integration so AI and CLI tools can work directly with your notes.

Top Repo

Context Mode (11k ⭐): Context window optimization for AI coding agents. It tracks sessions to keep large files and logs from cluttering the LLM context window, cutting context usage by nearly 98% while maintaining full continuity.

Trending Paper

Skill retrieval augmentation for agentic AI: Overloading AI agents with too many external tools at once strains their memory limits. While on-demand skill retrieval improves performance, current models still have a hard time knowing when they actually need the extra help.

Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 250K+ engineers and 150K+ followers on socials. Get in touch.

What did you think of today's newsletter?

Your feedback helps us create better emails for you!

You can also reply directly to this email if you have suggestions, feedback, or questions.

Until next time — The Code team