OpenAI just expanded Daybreak, Stripe debuts Directory

Welcome back. Google is feeling the heat in the talent wars. In just 48 hours, they lost two titans. Transformer co-inventor Noam Shazeer headed to OpenAI on Thursday, and Nobel laureate John Jumper left for Anthropic just a day later. Now, the stock is tumbling toward its biggest single-day drop in a year.

Also: How to make Codex catch its own bugs with one loop, keep your Claude Code context clean with subagents, and use Vercel’s framework to define an agent with .md files.

TODAY IN PROGRAMMING

Click here to see GPT-5.5-Cyber’s full benchmarks.

OpenAI hands security teams an AI that fixes vulnerabilities it finds: The ChatGPT maker just expanded Daybreak, its push to patch vulnerable software at scale. They’ve updated the Codex plugin so teams can scan code, validate findings from other tools, and auto-generate patches to clear backlogs. They also rolled out GPT-5.5-Cyber to vetted defenders, claiming the model hit 85.6% on the CyberGym benchmark, beating out the standard GPT-5.5.

Stripe debuts a search engine for autonomous agents: The payments giant just opened Directory in public preview, allowing both developers and AI agents to find businesses across its network. With a single terminal command, you can pull up apps, infrastructure providers, and pay-per-call APIs as machine-readable data. From there, an agent can evaluate the options and handle the integration entirely on its own.

Cursor makes its biggest push yet for autonomous coding: The AI coding startup just previewed three drops for developers at its first Compile keynote. Cloud agents now run in their own virtual machines, clicking through UIs and shipping merge-ready PRs. There's also a new iOS app so you can keep an eye on them and clear any blockers on the go. And it teased Origin, Git infrastructure rebuilt so thousands of agents can push code without drowning humans in merge conflicts.

PRESENTED BY WISPR

Cursor for code. Claude for thinking. What about input?

Your dev stack got an AI upgrade everywhere except the input layer. You're still typing every prompt, every ticket, every review comment by hand.

Wispr Flow closes that gap. Dictate into Cursor, VS Code, Slack, Linear, or anywhere else you work. It's syntax-aware: camelCase, snake_case, acronyms, and file names all come through clean. Mention a file in Cursor or Windsurf, and it auto-tags.

It's the voice layer for an AI-native workflow. Speak your intent. Your tools do the rest.

Available on Mac, Windows, iPhone, and Android. Used by millions of developers, including teams at OpenAI and Mercury.

Try free

INSIGHT

Stop paying top dollar for every coding task. GLM-5.2 has changed the math:

Source: The Code, Superhuman

Open models have caught up. For two years, open models have been chasing closed ones but have never gotten close enough to actually replace them. GLM-5.2 appears to have finally closed that gap. Z.ai dropped the model with a 1M-token context window, and leading engineers took notice.

Nathan Lambert, the open-model researcher at Ai2, called it the first open-weight model that feels natural in a coding setup. And many developers are ranking it on par with Opus 4.8 and GPT-5.5 — the first time an open model has passed the vibe check with developers.

Don't cancel the flagship. It's tempting to call this a win for open source and ditch the pricey Claude Code subscription. But that misses the point: heavy lifting still happens at the frontier. On Artificial Analysis's AA-Briefcase test for multi-week projects, GLM-5.2 trails both Opus 4.8 and Claude Fable 5, and even the best model nailed every part of a task only 3% of the time. Those long, messy projects are exactly why frontier intelligence is still worth the premium.

Price moved the burden of proof. What changed is the math, not the ranking. GLM-5.2 runs at a fraction of the per-token cost of GPT-5.5 or Opus, so the same coding work is far cheaper than it was a month ago. The default didn't break, but frontier models must now justify their cost on lower-priority tasks. The real question isn't which model is best, but which tasks actually require the best.

Route, don't switch. Send the deep reasoning and multi-step work to the frontier model. And send the routine, well-defined tasks to GLM-5.2. You don't need a new tool to make it happen, because GLM-5.2 runs inside Claude Code. You can refer to this guide from Baseten engineer Alex Ker to get started. Whoever nails that routing first gets to ship considerably cheaper without sacrificing any performance at the top end.
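
For a feel of what that routing rule could look like, here is a minimal Python sketch. Everything in it is an assumption for illustration: the model labels, the Task fields, and the three-step threshold are hypothetical, and it only picks a label rather than calling any real API.

```python
from dataclasses import dataclass

# Hypothetical labels; real model identifiers depend on your provider and setup.
FRONTIER_MODEL = "frontier-model"
OPEN_MODEL = "glm-5.2"


@dataclass
class Task:
    description: str
    steps: int                  # rough count of distinct steps in the task
    needs_deep_reasoning: bool  # your own judgment call, not something measured here


def route(task: Task, max_routine_steps: int = 3) -> str:
    """Return the model label a task should go to.

    Deep-reasoning or multi-step work goes to the frontier model;
    routine, well-defined work goes to the cheaper open model.
    The step threshold is an arbitrary illustrative default.
    """
    if task.needs_deep_reasoning or task.steps > max_routine_steps:
        return FRONTIER_MODEL
    return OPEN_MODEL


if __name__ == "__main__":
    tasks = [
        Task("Rename a helper and update its call sites", steps=2, needs_deep_reasoning=False),
        Task("Redesign the retry logic across services", steps=8, needs_deep_reasoning=True),
    ]
    for t in tasks:
        print(f"{route(t)}: {t.description}")
```

In practice the hard part is the classification itself. A hand-set flag like needs_deep_reasoning is a reasonable starting point before trying anything automated.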

PRESENTED BY IBM

What do you fund when ROI isn’t clear?

AI isn’t about ROI—yet. It’s a positioning move. It reshapes how work happens, how value is generated and how advantage accumulates. The CFO’s job isn’t defending AI spend. It’s making the cost of delay visible in financial terms. AI fails when companies hesitate. With 55% of executives tying future advantage to execution speed, waiting is the risk. Finance can set the pace—if it shifts:

From explaining variance to designing financial systems
From control to conditions for experimentation
From oversight to scaling AI with the C-suite

IN THE KNOW

What’s trending on socials and headlines

Meme of the day.

Subagent Setup: Thirty minutes in, your context drowns under 80k tokens of tool-call noise. This guide shows how to fork your context to keep it cheap.
SSD Killer: Codex secretly writes diagnostic logs to your disk 24/7, eating your drive's write limit. Here’s one command that kills it (5.3K bookmarks).
Loop Library: Coding agents run while you sleep, but most skip their own review. This breakdown of 15 loops hands you paste-ready commands.
Cross-Model Loop: A popular dev's favorite Codex loop secretly pulls in a rival model to check its work. This vastly improved his OpenAI code quality (4.3K likes).
Honeypot Trick: Ghostty's creator hides prompt injections in his AGENTS.md to catch devs shipping AI PRs they never read. Get caught, get banned.
Loop Engineering: The model is becoming a commodity, so the real work moved to the system around it. This guide explains the 4 parts of an agent loop that quietly break.
Markdown Mode: Vercel's CEO believes Markdown is the next big programming language. Their new framework lets you define an agent as a folder of .md files.

AI CODING HACK

How to let Claude Code test its own frontend

Claude Code can write your UI, but it can't actually see if it works. This usually means you're stuck clicking through the browser yourself after every single change. Microsoft's official Playwright MCP server fixes that. With just one command, you can hook a real browser right into Claude Code.

claude mcp add playwright npx @playwright/mcp@latest

Claude now controls Chromium directly. After making a change, just tell it: "Open localhost:5173, add two items to the cart, and confirm the total updates."

It will navigate, fill out forms, and fix bugs on its own. Use the “--scope project” flag to share this setup with your team, and run /mcp to verify the connection.

P.S. Get 50+ AI coding hacks for Claude Code, Cursor, and Codex here.

TOP & TRENDING RESOURCES

Click here to watch the tutorial.

Top Tool

Backgrind: It keeps your AI coding agent, whether it's Claude Code, Cursor, or its hosted model, in a floating, always-on-top window. Just fire off a task and get back to what you're doing. It'll only flash or chime when it actually needs your input.

Top Repo

Background Agents (2K ⭐): A coding agent that handles the heavy lifting in the background so you can stay focused. Just plug it into your existing stack (GitHub, Slack, Linear, or webhooks) and let it spin up dev sandboxes, automate fixes, and open PRs on autopilot.

Trending Paper

Can LLM agents infer world models: This research paper looks at whether interactive LLM agents can figure out an environment's underlying structure through tool queries. It shows that their performance takes a major hit as complexity grows, making them way less reliable and efficient than traditional algorithms.

IN CASE YOU MISSED IT

Our most-clicked story from yesterday

A PhD candidate heading to OpenAI went through 57 interviews across 11 companies before landing an ML role. Here's a breakdown of what every round taught her (4M views).

Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 300K+ engineers and 150K+ followers on socials. Get in touch.

What did you think of today's newsletter?

Your feedback helps us create better emails for you!

You can also reply directly to this email if you have suggestions, feedback, or questions.

Until next time — The Code team