Anthropic drops Claude Tag, Krea unveils Krea 2 Raw and Turbo

Welcome back. It’s a bad day for all the startups that raised money to build an AI assistant for Slack. Anthropic just gave Claude its own Slack account, and anyone can now delegate tasks or ask questions directly in Slack without opening a browser. See how.

Also: Snowflake’s CEO shares how GLM-5.2 compares against Opus-4.7, a tool to refine your agent loop before it drains your tokens, and how an Amazon senior engineer 10x'd her pay.

Today’s Insights

Powerful new updates and hacks for devs
Codex-maxxing for long-running work
How to run GLM-5.2 in Claude Code
Trending social posts, top repos, and more

TODAY IN PROGRAMMING

Click here to watch Anthropic’s Claude Tag in action.

Anthropic unveils an AI coworker you can tag in Slack: The AI lab just dropped Claude Tag, bringing the model into Slack as a multiplayer teammate. Once Claude is tagged, the entire channel can watch it work, hand off threads, and let it tackle bugs or tickets. It builds context as it goes and can even run solo projects for days. It is now available in beta for Enterprise and Team customers. Watch how it works.

Mistral's new model turns documents into structured data: The French AI lab just released OCR 4. It returns not just clean text but bounding boxes, typed blocks, and confidence scores for every region. That data feeds directly into RAG and agent pipelines. It gives teams citation-ready chunks instead of raw dumps. The model runs self-hosted in a single container and supports 170 languages. API access starts at $4 per 1,000 pages. More details on the API.

Krea drops open image models that render in seconds: The creative tools startup just released the weights for two new image generators on Hugging Face. Developers can train custom styles and LoRAs using Krea 2 Raw, an unaligned base checkpoint that adopts any aesthetic without clashing with built-in defaults. You can then port that tuning directly to Krea 2 Turbo, which renders native 2K visuals in about two seconds on consumer hardware. You can download it here.

PRESENTED BY IBM

Scaling AI code with shared standards

AI agents now reportedly generate an average of 48% of code for surveyed organizations, but alignment is falling behind. 55% of engineering leaders surveyed are concerned about losing shared understanding of their codebase, and 39% are worried about shipping with confidence.

The challenge shows up in inconsistent frameworks, patterns and practices. Without project-specific guidance, each AI interaction produces different results. Project-level rules help teams encode standards and help keep output aligned.

Dive deeper

INSIGHT

How to run 24-hour Codex agents without torching your budget

Source: The Code, Superhuman

How it crept up. Last year, AI spending was barely a blip on the radar. But as coding agents started running solo for hours, the costs exploded. One developer kicked off a refactor on Friday and returned to a $4,200 API bill. Since every turn resends the full context, long runs quietly snowball. Now, teams say AI costs are becoming the second-biggest expense after salaries.
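
To see why long runs snowball, here is a rough back-of-the-envelope sketch in Python. It assumes every turn resends the whole history and that the history grows by a fixed amount per turn. The token counts are made-up illustrative values, and the sketch ignores prompt caching and context compaction, both of which would lower the real bill.

```python
def cumulative_input_tokens(turns: int, base_context: int, tokens_added_per_turn: int) -> int:
    """Total input tokens sent when each turn resends the full history.

    All three parameters are illustrative assumptions, not measured values.
    """
    total = 0
    context = base_context
    for _ in range(turns):
        total += context                   # the whole context goes out again this turn
        context += tokens_added_per_turn   # the history grows before the next turn
    return total


if __name__ == "__main__":
    for turns in (10, 100):
        tokens = cumulative_input_tokens(turns, base_context=5_000, tokens_added_per_turn=2_000)
        print(f"{turns} turns -> {tokens:,} input tokens")
```

Under these assumptions, 10 turns send 140,000 input tokens while 100 turns send 10,400,000. Ten times the turns costs roughly 74 times the input, because the total grows with the square of the run length.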

Nobody pays the bill. Engineers picking models rarely see the invoice. To play it safe, they default to top-tier models and max settings every time. But at scale, those costs multiply. A 25-person team can easily burn $72,000 a year on tasks that would've cost $7,200 (or less) on a cheaper model. Most of that extra spend is just paying a top model to double-check work it already finished correctly on the first attempt.
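
Broken down per engineer, those team-level figures look like this. The short Python sketch below uses only the numbers quoted above; the function name is ours, chosen for illustration.

```python
def per_engineer_monthly(annual_team_spend: float, team_size: int) -> float:
    """Spread an annual team bill evenly across engineers and months."""
    return annual_team_spend / team_size / 12


TEAM_SIZE = 25
top_tier = per_engineer_monthly(72_000, TEAM_SIZE)   # top-tier model, max settings
cheaper = per_engineer_monthly(7_200, TEAM_SIZE)     # same tasks on a cheaper model

if __name__ == "__main__":
    print(f"Top-tier: ${top_tier:.0f} per engineer per month")
    print(f"Cheaper:  ${cheaper:.0f} per engineer per month")
    print(f"Annual difference for the team: ${72_000 - 7_200:,}")
```

That works out to $240 versus $24 per engineer per month, a gap of $64,800 a year for the whole team. The per-person number is small enough that nobody notices it individually, which is part of the problem.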

Point it, don't kill it. Expensive runs are where you see the real payoff, so the move is to guide them rather than to shut them down. OpenAI once let Codex run for 25 hours on one task, and it actually shipped working code. What kept it on track was a clear "definition of done" it could test against. If you give an agent a vague goal like "build the feature," it'll drift for hours. But if you tell it to "port this library, keep the API compatible, and finish once the original tests pass," it has a specific target to check its work against.
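
One way to make a "definition of done" machine-checkable is to tie it to a test command's exit status. The Python sketch below is a minimal illustration of that idea, not how Codex works internally. The helper name and the timeout value are assumptions.

```python
import subprocess
from typing import Sequence


def is_done(test_command: Sequence[str], timeout_seconds: int = 600) -> bool:
    """Return True only when the test command exits with status 0.

    A timeout counts as "not done" so a hung test run cannot pass by accident.
    """
    try:
        result = subprocess.run(
            list(test_command),
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

For the porting example, and assuming the original project's tests run under pytest, the check might be `is_done(["pytest", "tests/", "-q"])`. The agent keeps working until that returns True, which gives it a concrete stopping point instead of a vague goal.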

Keep it on a leash. Even on long runs, you need to stay on top of the checkpoints. Steer the model mid-task rather than letting it run blind, and always review diffs before merging. Avoid leaving expensive automation unattended; set medium reasoning as your default and require an opt-in for heavy-duty tasks. Ultimately, measure success by code shipped and PRs closed, not tokens burned. OpenAI's Codex-maxxing guide is the blueprint here.
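
The "medium by default, opt in for heavy" rule can be written down as a small policy function. This Python sketch is hypothetical: the effort names, the `RunRequest` type, and the `heavy_opt_in` flag are illustrative and do not mirror any real Codex configuration.

```python
from dataclasses import dataclass

ALLOWED_EFFORTS = ("low", "medium", "high")  # hypothetical effort levels
DEFAULT_EFFORT = "medium"


@dataclass(frozen=True)
class RunRequest:
    task: str
    requested_effort: str = DEFAULT_EFFORT
    heavy_opt_in: bool = False  # must be set explicitly for heavy-duty runs


def resolve_effort(request: RunRequest) -> str:
    """Return the effort level a run is allowed to use under the policy."""
    if request.requested_effort not in ALLOWED_EFFORTS:
        raise ValueError(f"unknown effort level: {request.requested_effort!r}")
    if request.requested_effort == "high" and not request.heavy_opt_in:
        # No explicit opt-in, so fall back to the cheaper default.
        return DEFAULT_EFFORT
    return request.requested_effort
```

A request for "high" without the flag quietly resolves to "medium", so the expensive path only runs when someone asked for it on purpose.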

PRESENTED BY UNBLOCKED

Unblocked: The context engine to save you time and tokens

AI is in your engineering workflow. While the token spend shows it, the throughput doesn't. The human is very much still in the loop, and that's a context problem. Unblocked turns code, docs, tickets, and conversations into actionable context, so your agent stays on track without the babysitting.

Register for free.

IN THE KNOW

What’s trending on socials and headlines

Double-Checker: Snowflake's CEO posted a benchmark on how much more GLM-5.2 double-checks its work compared to Opus-4.7. The results aren’t what you'd expect.
One-Word Fix: A single-word swap in your GitHub Actions scripts shaves 200 to 400ms off every call. Most developers still run the slower default.
Loop Design: A poorly designed agent loop burns tokens and hands you slop fast. This open-source tool sharpens your loop before Claude Code ever runs.
Income 10x: A Senior Applied Scientist at Amazon went from $60K to over $600K in eight years. The first move? A pay cut she took on purpose.
AI Vocab: Most people who say they know AI freeze on the basics. A Microsoft AI director broke down 10 core terms with hand-drawn visuals.
Fired for Shipping: A Google engineer built a viral Workspace CLI that hit #1 on HN. Google rolled out its own version, then fired him two days later.
Status Bar: A developer built a tiny macOS menu bar app that shows Claude Code's live status. You see when it's thinking, running a tool, or waiting on you (1.4K likes).
Learn Anything: Hermes Agent's new command reads any source you throw at it (code, API docs, manuals, PDFs) and distills a reusable skill you can run again.

AI CODING HACK

How to run GLM-5.2 in Claude Code

Running Opus 4.8 in Claude Code can get pretty pricey during long sessions, since those per-token costs really start to stack up. Alex Ker, who handles engineering and GTM at Baseten, shared a workaround that points Claude Code toward GLM-5.2 instead. This lets you keep the same interface while running on a much cheaper model.

Step 1. Install the latest Claude Code.

npm install -g @anthropic-ai/claude-code

Step 2. Create an account at baseten.co and grab an API key from “app.baseten.co/settings/api_keys”. Save it for the next step.

Step 3. Open “~/.claude/settings.json” and add this "env" block inside the file's top-level JSON object, swapping in your key. The auth token is the only value you fill in.

"env": {
  "ANTHROPIC_AUTH_TOKEN": "your_baseten_api_key",
  "ANTHROPIC_BASE_URL": "https://inference.baseten.co",
  "ANTHROPIC_DEFAULT_HAIKU_MODEL": "zai-org/GLM-5.2",
  "ANTHROPIC_DEFAULT_SONNET_MODEL": "zai-org/GLM-5.2",
  "ANTHROPIC_DEFAULT_OPUS_MODEL": "zai-org/GLM-5.2"
}

Make sure to keep the "https://" in the base URL, as Claude Code requires the full scheme. Once you launch Claude Code, every model slot will route to GLM-5.2.
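
If you'd rather not hand-edit the file, the same change can be scripted. The Python sketch below merges the "env" values from Step 3 into an existing settings file without touching other keys, and it refuses a base URL that lacks the "https://" scheme. The function names are ours, and the script is an illustration rather than an official Baseten or Anthropic tool.

```python
import json
from pathlib import Path
from urllib.parse import urlparse

GLM_MODEL = "zai-org/GLM-5.2"
BASE_URL = "https://inference.baseten.co"


def build_env(api_key: str, base_url: str = BASE_URL) -> dict:
    """Build the env block from Step 3, rejecting URLs without https://."""
    if urlparse(base_url).scheme != "https":
        raise ValueError("base URL must include the https:// scheme")
    return {
        "ANTHROPIC_AUTH_TOKEN": api_key,
        "ANTHROPIC_BASE_URL": base_url,
        "ANTHROPIC_DEFAULT_HAIKU_MODEL": GLM_MODEL,
        "ANTHROPIC_DEFAULT_SONNET_MODEL": GLM_MODEL,
        "ANTHROPIC_DEFAULT_OPUS_MODEL": GLM_MODEL,
    }


def merge_env(settings: dict, api_key: str) -> dict:
    """Return a copy of settings with the env values merged in."""
    merged = dict(settings)
    env = dict(merged.get("env", {}))
    env.update(build_env(api_key))
    merged["env"] = env
    return merged


def update_settings_file(path: Path, api_key: str) -> None:
    """Read the settings file if it exists, merge the env block, write it back."""
    settings = {}
    if path.exists() and path.read_text(encoding="utf-8").strip():
        settings = json.loads(path.read_text(encoding="utf-8"))
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_env(settings, api_key), indent=2) + "\n", encoding="utf-8")
```

To apply it, call `update_settings_file(Path.home() / ".claude" / "settings.json", "your_baseten_api_key")`. Back up the file first, since the script rewrites it with fresh indentation.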

You'll have the same terminal, tool usage, and multi-turn sessions, but at roughly one-fifth of the per-token cost of Opus 4.8.

P.S. Get 50+ AI coding hacks for Claude Code, Cursor, and Codex here.

TOP & TRENDING RESOURCES

Click here to watch the tutorial.

Top Tool

Conduit: Slash AI token usage by 90% using a local gateway. It replaces bloated tool lists with an on-demand search, making your agents faster and way more efficient.

Top Repo

Self Hosting Guide (21.6K ⭐): Master self-hosting with this end-to-end guide covering everything from Docker and LLMs to VPNs, backups, and home lab hardware. It's built for devs who want to own their stack, cut SaaS costs, and ship reliable infrastructure like pros.

Trending Cookbook

Mastering Codex Remote for engineering (by OpenAI): A lot of devs write off Codex Remote as nothing more than a status checker for their phones. In reality, it's a real breakthrough. It serves as a full command center, giving you the power to direct, review, and manage complex projects from anywhere.

Our most-clicked story from yesterday

Check out this 8-hour tutorial to master building secure, scalable agents with guardrails and evaluation tools.

Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 300K+ engineers and 150K+ followers on socials. Get in touch.

Until next time — The Code team