Welcome back. Coding agents have been writing PRs for a while now, but can you actually trust the code to work? Cursor thinks so — they just gave their agents their own computers. Now, these agents can click through UIs, debug in browsers, and ship PRs with video proof to show everything works.

Also: How to set up AI agents that code 24/7, fix a critical AGENTS.md mistake flagged by a DeepMind engineer, and see how Stripe ships a thousand PRs every week.

Today’s Insights

  • Powerful new updates and hacks for devs

  • Karpathy asks engineering teams to focus on CLIs

  • How to stop AI agents from writing useless tests

  • Trending social posts, top repos, and more

Welcome to The Code. This is a 4x weekly email that cuts through the noise to help devs, engineers, and technical leaders find high-signal news, releases, and resources in 5 minutes or less. You can sign up or share this email here.

TODAY IN PROGRAMMING

Click here to watch Cursor cloud agents in action.

Cursor gives agents their own computers: The AI code editor's cloud agents now run in isolated virtual machines where they can click through UIs, debug in browsers, and host servers on their own. Every agent delivers merge-ready PRs complete with video demos, screenshots, and logs for a smooth review process. CEO Michael Truell shared that over a third of Cursor's merged PRs now come from these cloud agents, and he predicts developers will soon manage entire fleets of AI teammates for major projects.

Anthropic now lets you code from your phone: The AI lab shipped Remote Control for Claude Code, which lets developers start a session in their terminal and pick it up from a phone, tablet, or any browser. Claude keeps running locally on your machine the whole time, so your filesystem, MCP servers, and project config are always available. It’s perfect for reviewing diffs on the go, approving PRs from the couch, or keeping a task moving between meetings.

Vercel open sources Chat SDK for cross-platform bots: The frontend cloud platform just dropped Chat SDK, a unified TypeScript library that lets devs write bot logic once and ship it across Slack, Microsoft Teams, Discord, Google Chat, and more. This open-source toolkit comes with JSX-based UI cards, real-time AI streaming, and built-in state management with Redis. It’s a huge time saver for engineering teams who are tired of maintaining separate codebases for different chat platforms.

Delve is the AI-native compliance platform that actually does the work for you, auto-collecting evidence from AWS, GitHub, and your stack so you’re not chasing screenshots or babysitting integrations. Use AI security questionnaires and an AI copilot to make compliance less dreadful.

The proof is in the pudding:

  • Bland unlocked $500k ARR in 7 days. 

  • 11x streamlined audits and moved faster on enterprise deals. 

  • micro1 scaled compliance without adding headcount.

Free migration. Zero disruption. No starting over.

Book a demo, trigger your migration, and get $2,000 when you’re onboarded.

INSIGHT

Karpathy asks engineering teams to focus on CLIs

Source: The Code, Superhuman

It started with a single CLI release. Polymarket (a prediction market company) just shipped a Rust-based CLI that gives AI agents a direct line to prediction markets. It lets them trade and pull data straight from the terminal without a browser.

Andrej Karpathy turned it into a manifesto. In a viral post, he argued CLIs are the perfect setup for agents since they already speak the language of stdin, stdout, and JSON. To show how it's done, he had Claude install Polymarket’s CLI and pull up a live prediction market dashboard in under three minutes.

The push for CLIs follows the same logic as MCP. With 97 million monthly downloads and backing from OpenAI and Google, MCP became the industry standard and moved to the Linux Foundation by late 2025. The takeaway is the same in both cases: without a CLI, MCP server, or machine-readable docs, your software is invisible to agents.

But some people aren't ready to get on board. OpenCode engineer Rhys Sullivan thinks CLIs are a dead end because they're hard to find and aren't secure enough. Sullivan’s bet is REST APIs that register clients on the fly. But no matter which side wins, software is splitting in two: the old path through websites and logins, and a new one where agents find and use tools on their own.
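To make the "stdin, stdout, and JSON" point concrete, here is a minimal sketch of an agent-friendly command handler in TypeScript. It is not Polymarket's CLI or anything Karpathy published; the `ping` and `sum` commands and the request/response shapes are hypothetical, chosen only to show the pattern of one JSON line in and one JSON line out, with errors reported as JSON too.

```typescript
// Hypothetical request/response shapes, for illustration only.
type AgentRequest = { command: string; args?: number[] };
type AgentResponse =
  | { ok: true; result: unknown }
  | { ok: false; error: string };

// Pure handler: takes one line of JSON text, returns one line of JSON text.
// Failures are returned as structured JSON so an agent can parse them.
function handleLine(line: string): string {
  let response: AgentResponse;
  try {
    const request = JSON.parse(line) as AgentRequest;
    response = dispatch(request);
  } catch (_err) {
    response = { ok: false, error: "invalid request" };
  }
  return JSON.stringify(response);
}

// Routes a parsed request to a (made-up) command implementation.
function dispatch(request: AgentRequest): AgentResponse {
  switch (request.command) {
    case "ping":
      return { ok: true, result: "pong" };
    case "sum": {
      const args = request.args ?? [];
      if (!args.every((n) => typeof n === "number")) {
        return { ok: false, error: "sum expects an array of numbers" };
      }
      return { ok: true, result: args.reduce((a, b) => a + b, 0) };
    }
    default:
      return { ok: false, error: `unknown command: ${request.command}` };
  }
}
```

In a real tool you would feed each line from standard input into `handleLine` and print the result to standard output, for example with Node's `readline` module. Keeping the handler pure makes it easy to test without spawning a process.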

IN THE KNOW

What’s trending on socials and headlines

  • Ghost Team: This developer's git history looks like he hired an entire engineering team. It's just him and a fleet of AI agents (4.4M views).

  • Bad Instructions: A Google DeepMind developer found that a bad AGENTS.md file actively hurts your AI agent's performance. Here's what to fix.

  • Autopilot Mode: This open-source setup lets you run AI coding agents for hours without touching them. The dev behind it completed 250 tasks in a single session.

  • Spec and Ship: An Anthropic engineer's workflow for shipping full features without writing a single line of code is going viral (1M views).

  • PR Factory: Stripe's AI agents now ship over a thousand PRs per week. An OpenAI developer broke down the emerging playbook behind it.

  • Alibaba drops Qwen 3.5 with smaller models that outperform their larger predecessors.

  • Perplexity unveils Computer, a digital worker that orchestrates top AI models.

  • Notion ships Custom Agents that run autonomously on a schedule across your tools.

  • MiniMax launches MaxClaw, a free 24/7 AI agent across major messaging apps.

AI CODING HACK

How to stop AI agents from writing useless tests

If you've ever used Claude Code or Cursor to write tests for a TypeScript project, you’ve probably run into a ton of redundant assertions. They often check things the type system already enforces at compile time, which makes those tests dead weight. Matt Pocock, an ex-Vercel engineer, shared a simple fix for this.

Just add this to your CLAUDE.md (or .cursorrules for Cursor):

Never write tests that verify what the TypeScript type system already guarantees.

Both tools will pick up the change during your next session. Pocock also put together a full TDD skill for Claude Code if you want to dive deeper into getting agents to write meaningful tests.
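Here is a rough sketch of the difference that rule is meant to capture. The `formatDisplayName` function, the `Profile` type, and the tiny `check` helper are all made up for illustration (they are not from Pocock's post), and a real project would use its own test runner instead of `check`.

```typescript
// Hypothetical type and function under test.
type Profile = { id: number; name: string };

function formatDisplayName(profile: Profile): string {
  const trimmed = profile.name.trim();
  // Fall back to an id-based label when the name is blank.
  return trimmed.length > 0 ? trimmed : `user-${profile.id}`;
}

// Minimal stand-in for a test runner's assertion.
function check(condition: boolean, label: string): void {
  if (!condition) throw new Error(`failed: ${label}`);
}

// Redundant: the signature already says the return value is a string,
// so the compiler rejects any implementation that returns something else.
function redundantTest(): void {
  const result = formatDisplayName({ id: 1, name: "Ada" });
  check(typeof result === "string", "returns a string");
}

// Meaningful: these check runtime behavior the types cannot express.
function meaningfulTest(): void {
  check(
    formatDisplayName({ id: 1, name: "  Ada  " }) === "Ada",
    "trims surrounding whitespace"
  );
  check(
    formatDisplayName({ id: 7, name: "   " }) === "user-7",
    "falls back to the id for blank names"
  );
}
```

The first test can never fail once the code compiles. The second one would catch a real regression, such as someone removing the `trim()` call.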

TOP & TRENDING RESOURCES

Click here to watch the tutorial.

Top Tutorial

How to build custom dev tools with agent teams: You'll learn to build custom software with a team of AI agents. This tutorial walks you through building a complete project from just a single spec file, using autonomous agents to handle all the coding, testing, and debugging for you.

Top Repo

memU: A memory framework that gives AI agents persistent, structured memory so they can run 24/7 without wasting tokens on old context. It automatically extracts preferences and patterns from background conversations, feeding that context back to the agent instantly.

Trending Paper

ActionEngine (by Microsoft Research): Traditional AI web agents are often slow and pricey because they have to analyze screens step by step. ActionEngine changes that by mapping websites offline first, letting the AI generate a single, reliable script that gets the job done faster and for less money.

Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 200K+ engineers and 100K+ followers on socials. Get in touch.

What did you think of today's newsletter?

Your feedback helps us create better emails for you!

You can also reply directly to this email if you have suggestions, feedback, or questions.

Until next time — The Code team
