Sakana drops Fugu, Hermes Agent ships a locked-down setup mode

Welcome back. Going forward, it may be safe to assume that the best models will not be available to you. After holding back Mythos, Anthropic has reportedly trained an even more capable Mythos successor — which it will hold back from the public again. Open-source AI is more vital than ever.

Also: A final-year PhD turned OpenAI researcher's guide to landing an ML role, 8 prompts to fine-tune your designs, and a Codex workflow that tests every feature in your app.

Today’s Insights

Powerful new updates and hacks for devs
The new software lifecycle in the agentic era
How to run GLM-5.2 in your terminal
Trending social posts, top repos, and more

TODAY IN PROGRAMMING

Click here to see Sakana Fugu’s full benchmarks.

Sakana drops a model that routes around export bans: The Tokyo-based AI lab just unveiled Fugu, a model that orchestrates a pool of rivals rather than competing with them. It selects the best system for each request and puts together a team of experts for more complex tasks. The main selling point is resilience: if a vendor cuts off access, Fugu simply reroutes to another model in the pool to keep things moving. Sakana claims its Ultra tier can rival Anthropic's export-restricted Fable 5 and Mythos Preview.
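The reroute-on-failure idea can be sketched in a few lines. This is a minimal illustration of fallback routing in general, not Sakana's implementation; the vendor names, the `ProviderUnavailable` exception, and the `route` function are all hypothetical.

```python
from typing import Callable


class ProviderUnavailable(Exception):
    """Raised by a (hypothetical) provider when access has been cut off."""


def route(prompt: str, providers: list[tuple[str, Callable[[str], str]]]) -> tuple[str, str]:
    """Try providers in preference order and fall through to the next on failure."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except ProviderUnavailable as exc:
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers unavailable: " + "; ".join(errors))


# Stand-in providers for illustration only.
def blocked_vendor(prompt: str) -> str:
    raise ProviderUnavailable("access revoked")


def backup_vendor(prompt: str) -> str:
    return f"backup answer to: {prompt}"


chosen, answer = route(
    "refactor this function",
    [("vendor-a", blocked_vendor), ("vendor-b", backup_vendor)],
)
```

A real orchestrator would also pick the best model per request; the sketch only covers the fallback half of the story.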

Open-weight GLM-5.2 is gaining major traction with devs: Z.ai's latest release has devs and tech leaders buzzing. The Chinese AI lab claims GLM-5.2 outpaces every open-weight competitor in long-horizon coding. It sits just a point behind Claude Opus 4.8 on the FrontierSWE benchmark. Developers have already started integrating it into tools like Claude Code and Cline. It ships under an MIT license, which avoids vendor lock-in, and offers a massive 1M-token context window. See how you can run it in your terminal in today's AI coding hack section below.

Hermes Agent ships a locked-down setup mode: The open-source AI lab just rolled out Blank Slate, which flips the script on how agents start. Instead of loading a bloated default setup, a new agent boots with almost nothing active: just the model, file operations, and a terminal. Everything else, from the browser to MCP servers, stays off unless a developer manually enables it. These settings are written to disk, so teams can lock in a single config across different machines and keep updates from causing configuration drift.
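To make the default-off pattern concrete, here is a rough sketch of opt-in settings persisted to disk. The file name, keys, and tool names are hypothetical and do not reflect Hermes Agent's actual config schema.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical baseline: only the essentials are on, everything else is off.
BLANK_SLATE = {
    "file_ops": True,
    "terminal": True,
    "browser": False,
    "mcp_servers": False,
}


def load_config(path: Path) -> dict:
    """Return saved settings, falling back to the baseline for anything unset."""
    saved = json.loads(path.read_text()) if path.exists() else {}
    return {**BLANK_SLATE, **saved}


def enable(path: Path, tool: str) -> dict:
    """Explicitly turn on one tool and persist the choice to disk."""
    if tool not in BLANK_SLATE:
        raise KeyError(f"unknown tool: {tool}")
    config = load_config(path)
    config[tool] = True
    path.write_text(json.dumps(config, indent=2, sort_keys=True))
    return config


config_path = Path(tempfile.mkdtemp()) / "agent-config.json"
fresh = load_config(config_path)     # nothing saved yet, so the baseline applies
enable(config_path, "browser")       # opt in; the choice is written to disk
reloaded = load_config(config_path)  # a later session reads the same settings
```

Because the file is the source of truth, copying it to another machine reproduces the same set of enabled tools.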

PRESENTED BY UNBLOCKED

[Webinar] 8 levels of context maturity in AI-native engineering

AI shows up in 60% of engineering work. But only about a fifth of it can be handed off without someone babysitting the output. That’s because agents are missing context.

This 8-stage context maturity model gives a real answer on why you haven't seen meaningful productivity gains for all the tokens burned.

Join Unblocked live June 24 (FREE) to learn:

Why more MCPs give agents access but not understanding
What it takes to deploy agents you can trust without supervision
How a context layer solves for quality, efficiency and cost

INSIGHT

Letting AI write more of your code won't free up your engineers. Here's why:

A comparison of the software lifecycle, before and after agents.

An uneven speedup. Addy Osmani (ex-Google Cloud Director) co-authored a Google whitepaper on how AI is reshaping the software lifecycle, and his read on it is worth every engineering leader's time. His argument: while some parts of the development lifecycle have accelerated, the process as a whole remains stalled.

Same phases, new proportions. The two lifecycles in the image above run the same steps: requirements, design, implementation, testing, review, and maintenance. AI hasn't removed any steps, but it has completely transformed implementation. What used to take weeks now happens in minutes, leaving the rest of the cycle to adapt to this rapid shift.

The slow parts didn't move. Gathering requirements, designing architecture, and handling verification all require human judgment. Those are things an agent just can’t fully replace (yet). Deciding what to build, weighing trade-offs, and confirming the final result is correct still fall on people. As a result, the bottleneck moved toward the initial specs and the review process.

This changes what leaders should optimize. Osmani's key advice is to set the bar at the evaluation stage, not just at the demo. A demo shows that an agent can work once. An eval shows whether it works reliably across repeated runs. The real work has shifted from writing code to specifying and verifying it. To dive deeper, check out Osmani's full post.
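The gap between a demo and an eval is easy to see in code. The sketch below is an illustration only: `flaky_agent`, the task list, and the 95% ship threshold are hypothetical stand-ins and do not come from Osmani's post.

```python
import random
from typing import Callable

Agent = Callable[[str, random.Random], bool]


def pass_rate(agent: Agent, tasks: list[str], trials: int, seed: int = 0) -> float:
    """Run every task `trials` times and return the fraction of passing runs."""
    rng = random.Random(seed)
    results = [agent(task, rng) for task in tasks for _ in range(trials)]
    return sum(results) / len(results)


def flaky_agent(task: str, rng: random.Random) -> bool:
    """Hypothetical agent that succeeds on roughly 80% of attempts."""
    return rng.random() < 0.8


tasks = ["add endpoint", "fix bug", "write migration"]

# A demo is a single run, which says little about reliability.
demo_passed = flaky_agent(tasks[0], random.Random(1))

# An eval repeats the work and gates on the aggregate result.
rate = pass_rate(flaky_agent, tasks, trials=50)
ship = rate >= 0.95  # hypothetical bar
```

An agent can pass the one-off demo and still fall short of the bar once it is measured over many runs.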

PRESENTED BY BITDRIFT

Mobile observability shouldn't suck

Mobile reflects reality, and it’s messy: Intermittent connections, mid-onboarding drop-offs, force quits, and more. bitdrift captures 100% of real-time data, unsampled across 1B+ installs, so you and your agents can query reality.

Try bitdrift: mobile observability for the real world.

IN THE KNOW

What’s trending on socials and headlines


Interview Gauntlet: A PhD heading to OpenAI sat 57 interviews across 11 companies before landing an ML role. Here's what every round taught her (1M views).
Retry Tax: In an agent loop, retries cost you more than tokens do. This guide rethinks what's worth optimizing once the agent is writing its own code.
Spacing's Off: Keep telling Claude Code the spacing looks wrong? The model isn't the problem. These 8 prompts hand your agent the design system it was missing.
Test Everything: A developer shares the agentic workflow he uses to have Codex turn every feature into a tested, fixed, and verified user story (4.1K likes).
Data Black Hole: Podcaster Dwarkesh Patel argues AI gains come from more data, noting that models need roughly a million times more examples than humans to match a skill.
Four Boxes: Before you build an agent loop, four things have to be true, or you're just burning tokens. This post explains exactly when one pays off and when it's a trap (6.4M views).
Feels Right: A design engineer packaged the tiny details that make interfaces feel natural into one skill your agent reads before every build.

AI CODING HACK

How to run GLM-5.2 in your terminal

Claude Code and Codex are tied to their own models, but LangChain CEO Harrison Chase has highlighted a way to break that lock. By using dcode, a model-agnostic harness, you can run GLM-5.2 instead. fast.ai co-founder Jeremy Howard ranks GLM-5.2's performance right up there with Opus 4.8, but at a much lower cost.

Step 1: Install dcode with Fireworks bundled in.

curl -LsSf https://langch.in/dcode | DEEPAGENTS_EXTRAS="fireworks" bash

Step 2: Add your Fireworks key (grab one here).

echo 'FIREWORKS_API_KEY=your-key-here' >> ~/.deepagents/.env

Step 3: Launch it on GLM-5.2.

dcode --model fireworks:accounts/fireworks/models/glm-5p2

That gives you a complete coding agent running on GLM-5.2, with its 1-million-token context window. You can find the docs right here.

P.S. Get 50+ AI coding hacks for Claude Code, Cursor, and Codex here.

TOP & TRENDING RESOURCES

Click here to watch the tutorial.

Top Tool

PumaBD: This tool provides AI agents with simple, persistent memory for notes and context without the hassle of setting up complex databases, vector stores, or infrastructure.

Top Repo

DeerFlow (72.9K ⭐): An open-source SuperAgent harness that uses sandboxes and tools to research, code, and complete complex multi-hour tasks autonomously.

Trending Cookbook

How to use CLAUDE.md + scoped rules (by Anthropic): Overloading CLAUDE.md with unstructured instructions wastes context and makes Claude less effective. To fix this, developers can use seven specific customization methods, like skills and rules, to manage exactly when and how instructions load.

Our most-clicked story from Friday

Anthropic's new feature turns Claude Code sessions into private, shareable web pages that update in real time, giving teams progress tracking, PR walkthroughs, and dashboards without manual status updates.

Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 300K+ engineers and 150K+ followers on socials. Get in touch.

What did you think of today's newsletter?

Your feedback helps us create better emails for you!

You can also reply directly to this email if you have suggestions, feedback, or questions.

Until next time — The Code team