Cursor drops two new features, Arena.AI rolls out Agent Mode

Welcome back. The internet just crossed a major milestone. For the first time ever, traffic from AI agents has officially surpassed human traffic online. As companies build infrastructure like bank accounts and emails for agents, it’s not hard to imagine a future where the economy of agents outgrows the economy of humans online.

Also: An OpenAI engineer's 7 tips on /goal, a Claude skill for maintaining a deep understanding of outputs during complex sessions, and how a veteran engineer's role shifted to directing systems.

Today’s Insights

Powerful new updates and hacks for devs
The flat org strategy for AI-native teams
How to run multiple Codex agents at once
Trending social posts, top repos, and more

TODAY IN PROGRAMMING

Cursor adds visual editing and token insights to canvases: The AI coding startup just unveiled two new features to help engineers move faster. Design Mode allows you to annotate UI elements directly so you don't have to describe every fix in a prompt. There's also a new context report that shows exactly how tokens are being used across system prompts, rules, and skills, with a quick-reset button to clear out the bloat and keep things lean.

OpenAI upgrades Codex with in-app iOS tools and user profiles: The ChatGPT maker just shipped a Build iOS Apps plugin that lets devs preview SwiftUI changes and hot-reload edits directly in an in-app browser. They also released a new Python SDK that embeds Codex into your own programs with a simple pip install. Plus, new activity profiles track your lifetime tokens and streaks with private, shareable cards, keeping your entire workflow in one place.

Arena.AI unveils a real-world benchmark for agents: The SF-based startup just rolled out Agent Mode, where frontier models like GPT-5.5, Claude Opus 4.7, and Gemini 3.1 Pro handle complex, multi-step tasks in a sandbox equipped with web search and coding tools. Every session contributes to a public leaderboard. It tracks how often models successfully complete jobs, follow instructions, and avoid making up tools. For teams choosing a model for agentic coding, this provides real-world performance data rather than just curated test results.

PRESENTED BY YOU.COM

Stop Guessing on AI Search Providers.

Picking a search provider based on a few test queries is a gamble, and hallucinations are the tax you pay for guessing wrong. This technical guide from You.com walks through an exact framework to evaluate AI search, build a golden query set, and measure accuracy, relevance, and confidence.

Download the AI Search Evaluation Guide.

INSIGHT

AI is dissolving the lines between engineers, designers, and PMs. Here’s how teams are changing:

Before and after comparison of Anthropic’s shift to an AI-native workflow.

The rote layer is collapsing. Anthropic recently shared that Claude now handles 95% of its analytics queries, freeing up data scientists to focus on forecasting and modeling. The same logic is reshaping how Anthropic builds software. Claude Code's engineering director, Fiona Fung, says roles are blurring on her team. PMs are writing code, and engineers are handling design. The old model of handing specs to coders is falling apart.

Anthropic isn’t the only one. Take Zencoder, for example. A product manager there built and shipped a feature in just one day, without even opening a ticket. Building is now so cheap that it can be faster than writing specs. But as Fung notes, old processes don't just go away. Instead, legacy tickets and handoffs pile up and become the team's new bottleneck.

Moving beyond coding speed. If coding is no longer the slow part, you can't just hire for speed anymore. Fung says her team has split into two groups. You have creative builders who focus on the product and user experience. Then you have systems experts who manage the architecture. The problem is that old management models don't work here. They focus too much on coding speed and standard team sizes. They just aren't built for this new way of working.

So what’s the solution? To maintain agility, companies like Anthropic and OpenAI are flattening their structures. Their playbooks center on two things: technical managers who use agentic tools themselves, and relentless dogfooding. They track success through faster onboarding and shorter PR cycle times. If you’re looking to make a move, we recommend diving deeper into the playbooks of Anthropic and OpenAI for running an AI-native engineering org.

IN THE KNOW

What’s trending on socials and headlines

Worth Cloning: A Claude Code engineer’s internal prompt for maintaining a "deep understanding" of Claude’s output went viral. It’s so effective that another dev turned it into an agent skill.
Against the Grain: The creator of OpenCode shared three rules for being good at your job that cut against how most devs approach problems.
Set and Forget: Codex can now grind toward one goal for hours or days. An OpenAI engineer dropped 7 tips to keep it off the wrong path (2K bookmarks).
Agent README: No more juggling a different instructions file for every coding tool. This open format works across Cursor, Codex, Gemini CLI, and more.
Under the Hood: A Hugging Face dev advocate recorded a deep dive into Pi, a minimalist coding agent, down to how its architecture holds together (1K bookmarks).
Build Your Fleet: A CEO running 14 agents wrote the playbook for wiring Claude Code skills that don't break when you swap the tools underneath them.
Less Babysitting: A Claude Code engineer broke down how to set up feedback loops so Claude verifies its own work and finishes ambitious tasks while you do something else (1K likes).
Direct, Don't Write: A veteran engineer behind Jest, Yarn, and Metro explains how AI shifted his role from writing code to directing systems that ship work that wouldn't have shipped before.

AI CODING HACK

How to run multiple Codex agents at once

One Codex session runs one task at a time, so your next task waits. Open two sessions in the same checkout and they fight over the working tree, overwriting each other's files and leaving your branches in a mess.

Git worktrees give each agent its own working directory and branch while sharing a single .git repository, the same isolation the open-source “oh-my-codex” launcher runs on. Make a worktree for every task, then launch Codex in each from its own terminal tab.

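# From the repo root: create a worktree on a new branch, then start Codex in it.
git worktree add ../codex-auth -b codex-auth
cd ../codex-auth && codex "Migrate auth to JWT"
# In a new tab, from the repo root again:
git worktree add ../codex-tests -b codex-tests
cd ../codex-tests && codex "Raise test coverage to 80%"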

Since each agent uses its own working tree, they won't interfere with each other's files. Swap in your own branch names and prompts.
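
If you want to see the isolation for yourself before pointing agents at a real project, here is a minimal sketch that runs in a throwaway repo. The temp directory, the demo identity, and the notes.txt file are assumptions made up for illustration, and no Codex session is launched. It only shows that a file written in one worktree never shows up in the other.

```shell
#!/usr/bin/env sh
set -eu

# Throwaway repo so the demo never touches a real project (hypothetical paths).
demo=$(mktemp -d)
git init -q "$demo/main"
cd "$demo/main"
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m "init"

# One worktree per task, each on its own new branch.
for task in codex-auth codex-tests; do
  git worktree add "../$task" -b "$task" >/dev/null 2>&1
done

# A file written in one worktree does not appear in the other.
echo "draft" > ../codex-auth/notes.txt
[ ! -e ../codex-tests/notes.txt ] && echo "isolated"

# List every worktree along with the branch it has checked out.
git worktree list
```

The last command is also handy in a real repo when you lose track of which tab is running which task.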

One catch: avoid running parallel agents on work that touches shared files, such as lockfile updates, database migrations, or major refactors, since their branches tend to conflict when you merge them. It’s better to handle those one at a time.
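
Once the agents finish, the same one-at-a-time idea applies to bringing their work back. The sketch below is one possible cleanup flow, not a prescribed one. It again uses a throwaway repo, with plain commits standing in for whatever the agents produced (the file names and demo identity are hypothetical). Each branch is merged on its own, then its worktree and branch are retired.

```shell
#!/usr/bin/env sh
set -eu

# Throwaway repo standing in for a real project (hypothetical paths).
demo=$(mktemp -d)
git init -q "$demo/main"
cd "$demo/main"
git config user.name demo
git config user.email demo@example.com
git commit -q --allow-empty -m "init"

# Simulate two agents that each committed work in their own worktree.
for task in codex-auth codex-tests; do
  git worktree add "../$task" -b "$task" >/dev/null 2>&1
  echo "$task" > "../$task/$task.txt"
  git -C "../$task" add .
  git -C "../$task" commit -q -m "work from $task"
done

# Integrate one branch at a time, then remove its worktree and branch.
for task in codex-auth codex-tests; do
  git merge -q --no-edit "$task"
  git worktree remove "../$task"
  git branch -d "$task"
done

git log --oneline
```

Merging sequentially means any conflict points at exactly one branch, which is easier to untangle than several landing at once.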

P.S. Get 50+ AI coding hacks for Claude Code, Cursor, and Codex here.

TOP & TRENDING RESOURCES

Top Tool

Boxes.dev: Cloud dev environments designed for agentic coding. You can run every Claude Code or Codex chat on its own dedicated cloud instance, sync across mobile and desktop, and build from anywhere.

Top Repo

Spec Kit (109K ⭐): An open-source toolkit that transforms specs into executable plans and production-ready code. It allows you to prioritize product scenarios and achieve predictable results through spec-driven development.

Trending Paper

When AI builds itself (by Anthropic): The core problem is that as AI systems learn to build their own successors, humans risk losing control over them. The key finding shows this shift is already underway, with AI now writing over 80% of Anthropic's merged code.

IN CASE YOU MISSED IT

Our most-clicked story from yesterday

Google has unveiled Gemma 4 12B, a highly efficient local model that brings powerful multimodal reasoning directly to standard laptops without the cloud costs.

Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 300K+ engineers and 150K+ followers on socials. Get in touch.

What did you think of today's newsletter?

Your feedback helps us create better emails for you!

You can also reply directly to this email if you have suggestions, feedback, or questions.

Until next time — The Code team