Moonshot drops Kimi K2.6, OpenAI expands Codex memory with Chronicle

Welcome back. Every AI lab has come to the same conclusion: the road to frontier AI runs right through a developer's terminal. Google DeepMind has reportedly launched a strike team to match Anthropic's performance on coding benchmarks. With AI labs spending billions to build better models and attract users, there’s never been a better time to be a dev or engineer.

Also: An Atlassian engineer's system design guide from 60+ FAANG interviews, a prompt that audits your codebase for security flaws, and who's replacing Tim Cook as Apple CEO.

Today’s Insights

Powerful new updates and hacks for devs
Why token spend is the new vanity metric
How to stop Claude Code's permission spam
Trending social posts, top repos, and more

TODAY IN PROGRAMMING

Moonshot unveils open-source model for autonomous coding: The Chinese AI lab just dropped Kimi K2.6, an open-source, trillion-parameter multimodal model that handles up to 300 sub-agents across 4,000 parallel steps. It is capable of 13-hour autonomous coding runs, beats out other frontier models on the Humanity’s Last Exam benchmark, and introduces Claw Groups, which allow humans and agents to collaborate in a shared workspace from any device.

Alibaba previews its most powerful coding model yet: The Chinese tech giant just shared an early look at Qwen3.6-Max-Preview, its next proprietary model and the follow-up to Qwen3.6-Plus. It's already crushing it on real-world agent tasks and snagged the top spot on six different coding benchmarks. You can try it out on Qwen Studio now, and API access with support for multi-step workflows is coming soon.

OpenAI expands coding agent memory with screen context: The ChatGPT team has been building out memory in Codex, and they just took it a step further with Chronicle. This research preview uses what's on your screen to fill in the blanks, so Codex knows which error or open doc you mean without you having to explain it all over again. The more you use it, the more it picks up on your tools and how you work.

PRESENTED BY WISPR

You know what never gets written? The docs.

Not because you don't care. Because after 6 hours of building, typing paragraphs about what you built feels like punishment. So the PR description stays vague. The README gets "TODO." The architecture decision lives in your head until you leave the company.

Wispr Flow: syntax-aware voice dictation that works in every app. Speak your PR descriptions, commit messages, and docs. camelCase, snake_case, and acronyms stay intact.

Engineers at OpenAI and Vercel use it daily. Available on Mac, Windows, iPhone, and Android. The best docs are the ones that actually get written.

Try free

INSIGHT

Token spend is the new vanity metric. Here’s what you need to know.

Source: The Code, Superhuman

The 60 trillion token flex. An internal Meta leaderboard called Claudeonomics recently ranked 85,000 employees based on their AI token usage. Over just 30 days, the company consumed 60 trillion tokens, triple the amount of text in every book ever published. Even Nvidia CEO Jensen Huang noted he'd be concerned if a $500,000 engineer wasn't spending at least $250,000 a year on tokens (as the biggest seller of AI chips, he may be a little biased).

This trend led to immediate gaming of the system. Some Meta employees now leave AI agents running idle just to climb the rankings. At OpenAI, one engineer processed 210 billion tokens in a single week, which is enough to fill Wikipedia 33 times over. This phenomenon, dubbed "tokenmaxxing," mirrors the early 2000s when companies mistakenly measured developer productivity by lines of code.
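
To put those numbers in perspective, here's a quick back-of-the-envelope calculation in Python. It uses only the figures quoted above and assumes, purely for illustration, that Meta's usage is spread evenly across all 85,000 employees and all 30 days (the idle-agent gaming suggests it isn't).

```python
# Figures quoted in this section.
META_TOKENS_30_DAYS = 60e12      # 60 trillion tokens over 30 days
META_EMPLOYEES = 85_000          # employees ranked on the leaderboard
OPENAI_ENGINEER_WEEK = 210e9     # 210 billion tokens in a single week

# Assumption: usage is spread evenly across employees and days.
per_employee_per_day = META_TOKENS_30_DAYS / META_EMPLOYEES / 30
per_employee_per_week = per_employee_per_day * 7

# How many "average" Meta employee-weeks one OpenAI engineer's week equals.
ratio = OPENAI_ENGINEER_WEEK / per_employee_per_week

print(f"Average Meta employee: {per_employee_per_day:,.0f} tokens/day")
print(f"One OpenAI engineer's week = {ratio:,.0f} average Meta employee-weeks")
```

Under that even-spread assumption, the average works out to roughly 23.5 million tokens per employee per day, and the single OpenAI engineer's week is about 1,275 times the average Meta employee's.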

Pricey inner monologue. This case study shows how the incentive gets weird: reasoning models generate tokens one at a time, essentially narrating every thought before moving to the next. Inference-time compute was supposed to be a breakthrough, but here it acts as wasteful overhead that leads to trillion-token bills.

History rhymes. Measuring success by lines of code or velocity points never really worked, and tracking token usage is just that same old mistake repackaged for the API era. This guide on developer productivity breaks down what actually moves the needle when it comes to engineering impact.

PRESENTED BY EXE.DEV

From idea → running code in seconds

Booting a VM or configuring cloud services steals your focus. Provisioning, networking, secrets management…

Exe.dev fixes that with instant VMs, persistent disks, and built-in HTTPS, so you can go from idea to running code in seconds.

No platform needed, just run from your computer
Credentials are injected at request time, so you can use external APIs without putting secrets on your VM
Run AI agents, dev APIs, and internal tools in isolated environments

Get 2 CPUs, 8 GB of RAM, and 25 GB of disk—shared across up to 25 VMs for $20/month.

See for yourself

IN THE KNOW

What’s trending on socials and headlines

Interview Cheatsheet: An Atlassian Principal Engineer distilled 60+ FAANG interviews into the exact order to learn system design in 2026.
Opus Playbook: Anthropic just dropped the official Opus 4.7 guide for Claude Code, and it flags settings most devs are missing.
Vibecode Audit: Paste this prompt into your coding agent, and it'll scan your codebase for every security flaw an attacker could exploit.
Apple Era Ends: Tim Cook is stepping down as Apple CEO on Sept 1 after turning a $350B company into a $4T one. His replacement: the hardware chief behind the iPhone and Vision Pro.
Memory Mode: An OpenAI engineer shows how to stop re-explaining your codebase to Codex every single session.
Instant Architect: This viral AI tool turns any rough shape you draw into a real house plan with a new way to explore the different layouts (2.8M views).

AI CODING HACK

How to stop Claude Code's permission spam

Claude Code asks permission for every bash and MCP command, even ones you've approved fifty times. Most devs end up running “--dangerously-skip-permissions” just to silence it. Boris Cherny, the creator of Claude Code, shared a new skill that fixes the problem properly.

Run this inside any session:

/fewer-permission-prompts

This skill scans your session history to find safe bash and MCP commands that usually trigger prompts, then gives you a list to add to your permissions allowlist.

You approve them once, and you're good to go. It’s best to run this after a few days of work so there's enough history to pull from.
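
To make the idea concrete, here's a minimal Python sketch of how an allowlist suggestion could be derived from command history. This is not how the skill is actually implemented: the `SAFE_PREFIXES` list, the `suggest_allowlist` helper, the sample history, and the `min_count` threshold are all hypothetical. The `Bash(...:*)` rule format and the `permissions.allow` key follow our understanding of Claude Code's settings file, so check the official docs before copying anything.

```python
import json
from collections import Counter

# Hypothetical list of read-only command prefixes treated as "safe" here.
SAFE_PREFIXES = ("git status", "git diff", "git log", "ls", "cat")

def suggest_allowlist(history, min_count=3):
    """Return rules for safe commands seen at least min_count times."""
    counts = Counter()
    for command in history:
        for prefix in SAFE_PREFIXES:
            # Match the bare command or the command followed by arguments.
            if command == prefix or command.startswith(prefix + " "):
                counts[prefix] += 1
                break
    # Assumed rule format: Bash(<prefix>:*) allows any command with that prefix.
    return [f"Bash({p}:*)" for p, n in sorted(counts.items()) if n >= min_count]

# Hypothetical session history, for illustration only.
history = (
    ["git status"] * 4
    + ["git diff --stat"] * 3
    + ["rm -rf build"] * 5   # frequent, but not on the safe list
    + ["ls -la"] * 2         # safe, but not frequent enough
)

rules = suggest_allowlist(history)
settings = {"permissions": {"allow": rules}}
print(json.dumps(settings, indent=2))
```

In this sketch, only commands that are both on the safe list and used often make the cut, so the frequent `rm -rf build` never gets auto-approved. That's the same tradeoff the skill aims for: fewer prompts without skipping permissions entirely.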

TOP & TRENDING RESOURCES

Top Repo

GenericAgent (5.3k ⭐): This repo is a lightweight way to build AI agents with just 3K lines of code. It uses a simple 100-line loop and 9 basic tools to let AI control your whole system. It works with your browser, terminal, files, and mobile, making it easy to automate tasks without any extra fluff.

Trending Paper

Better AI models enable more ambitious work (by Cursor): This study explores whether improved AI helps developers simply do more work or tackle previously impossible tasks. It found that better AI boosts overall usage, eventually shifting developers' focus toward managing significantly more complex, system-wide challenges.

Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 240K+ engineers and 150K+ followers on socials. Get in touch.

What did you think of today's newsletter?

Your feedback helps us create better emails for you!

You can also reply directly to this email if you have suggestions, feedback, or questions.

Until next time — The Code team