Welcome back. Your private data should stay exactly where it belongs: on your device. OpenAI just dropped a new open-weight model that lets developers redact sensitive info locally before it ever hits the cloud. And it wasn't the only thing they shipped yesterday.
Also: Plug 80 Nvidia-hosted models into your coding agent at no cost, 5 design patterns for long-running agents, and why Karpathy says a 1,800x smaller model can match a 1.8T giant.
Today’s Insights
Powerful new updates and hacks for devs
How AI is rewriting engineering interviews
How to stop restarting Gemini CLI on long tasks
Trending social posts, top repos, and more

TODAY IN PROGRAMMING
OpenAI ships open-weight model for redacting private data: The ChatGPT maker just dropped Privacy Filter, a new 1.5B-parameter open-weight model that automatically hides personal info like emails and API keys on your own device. The lab also introduced workspace agents for Business and Enterprise users. These Codex-powered agents can handle complex workflows in Slack and ChatGPT and are currently free to try through May 6. Try the new privacy model on Hugging Face now.
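To make the on-device idea concrete, here's a tiny shell sketch that masks emails and API-key-shaped strings with regexes before text leaves the machine. This is not the Privacy Filter model itself, just an illustration of the pre-upload filtering concept; the patterns are illustrative and far from exhaustive.

```shell
# Regex-based sketch of local redaction (NOT the Privacy Filter model):
# mask emails and obvious API-key-shaped tokens before any text is uploaded.
redact() {
  sed -E \
    -e 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/[EMAIL]/g' \
    -e 's/sk-[A-Za-z0-9]{20,}/[API_KEY]/g'
}

echo "Contact jane@example.com, key sk-abcdefghijklmnopqrstuv" | redact
# → Contact [EMAIL], key [API_KEY]
```

A model-based filter catches far more than regexes can (names, addresses, free-form secrets), but the pipeline shape is the same: sanitize locally, then send.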
Alibaba unveils a smaller model that tops coding benchmarks: The Chinese tech giant just open-sourced Qwen3.6-27B, a multimodal model that beats its previous 397B-parameter flagship across all major coding benchmarks. Its dense architecture makes it much easier to deploy than complex MoE models, bringing top-tier coding performance to a more practical scale. Run the model locally.
Google evolves Vertex AI into an agent platform: The search giant just unveiled the Gemini Enterprise Agent Platform at Cloud Next '26, effectively replacing its flagship AI development suite. Moving forward, all Vertex AI updates will live on this new platform, which features a low-code Agent Studio and a graph-based kit for building sub-agent networks. The new runtime now powers autonomous workflows that can run for days, all while Google’s models handle a staggering 16 billion tokens per minute via API.

PRESENTED BY MONGODB
MongoDB.local London 2026 brings together developers, founders, and AI leaders for a full day of real-world talks, hands-on sessions, and networking.
Learn how teams are taking AI from prototype to production, explore modern data architectures, and connect with experts solving today’s toughest engineering challenges.
London | May 7. Save your spot.

INSIGHT
How AI is rewriting engineering interviews

Source: Augment Code
Code stopped being the test. Coding interviews became obsolete the second Claude Code and Codex started passing them on autopilot. Sierra, the AI agent startup founded by former Salesforce co-CEO Bret Taylor, just published the most in-depth look yet at what's taking their place.
The signal cratered first. Traditional interviews mostly measured syntax and framework recall, things LLMs now handle in seconds. Karat, an interview-as-a-service firm that has run over 600,000 interviews for companies like Atlassian and PayPal, sees the same shift. Stick to pre-LLM rubrics today, and you’re basically just testing whether a candidate remembered to use AI.
Live builds replaced live coding. A consistent pattern is emerging. Give candidates two hours, their AI of choice, and a real product to build, then grade their output and decision-making. This is the core of Sierra’s onsite interview. Augment Code shipped a similar model in March, evaluating engineers on dimensions like product taste and architectural judgment.
Calibration is the new headache. Open-ended interviews are harder to grade and often spark debate during debriefs. But the payoff is worth it. It’s better to hire for standout strengths like strong product taste and sharp architectural instincts instead of just checking for a lack of weaknesses. The real question has shifted from whether someone can code to what they build and why.

IN THE KNOW
What’s trending on socials and headlines

Meme of the day.
Free Inference: Nvidia is quietly hosting nearly 80 frontier models you can plug straight into your coding agent (21K bookmarks).
Pixel Stream: Shopify's CEO is hyping a prototype that streams entire UIs directly from an AI model, skipping the frontend stack entirely (2.6M views).
Agent Blueprint: Google Cloud AI Director Addy Osmani mapped out the 5 design patterns behind production-grade AI agents (1.7K bookmarks).
Code Guard: This new Claude Code command runs a fleet of cloud bug-hunters before you merge (1.8M views).
Data Over Size: OpenAI co-founder Andrej Karpathy claims today's 1.8T parameter frontier could be matched by a model 1,800x smaller (465K views).
Agent Fluency: Google open-sourced a spec that gives AI agents a shared language across projects (3.8M views).
Buy the Future: AngelList's Naval Ravikant just launched a fund that lets you own a slice of OpenAI, Anthropic, and xAI for $500 (3.9M views).

AI CODING HACK
How to stop restarting Gemini CLI on long tasks
Long agentic runs often stall when context windows max out, forcing a restart that wipes your state. Evan Otero from the Gemini CLI team solved this by porting Geoffrey Huntley's Ralph loop technique into a new extension. It fixes the issue in a single command. You'll need Gemini CLI v0.26.0 or newer to use the hooks required for the setup:
gemini extensions install https://github.com/gemini-cli-extensions/ralph

Then run a loop with an iteration cap and a completion phrase:

/ralph:loop "Build a REST API for todos with CRUD and full test coverage. Output 'DONE' when all tests pass." --completion-promise "DONE" --max-iterations 20

An AfterAgent hook clears session memory and restarts the agent with the original prompt after every turn, keeping the context stable until it finishes or hits the limit. Use the -y flag in sandbox mode to keep the loop running without pausing for tool calls.
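If you want to see the loop mechanic in isolation, here's a plain-shell sketch of the Ralph-loop idea: re-run the same prompt in a fresh invocation until a completion phrase appears or the iteration cap is hit. The run_agent function is a hypothetical stand-in for a real agent call, not part of the extension.

```shell
# Sketch of the Ralph-loop idea: same prompt, fresh invocation each turn,
# stop on a completion phrase or an iteration cap. `run_agent` is a stub.
PROMPT="Build a REST API for todos. Output 'DONE' when all tests pass."
COMPLETION="DONE"
MAX_ITERATIONS=20

run_agent() {
  # Stand-in for a real agent call: pretend it succeeds on the 3rd attempt.
  [ "$1" -ge 3 ] && echo "All tests pass. DONE" || echo "Still working..."
}

i=0
while [ "$i" -lt "$MAX_ITERATIONS" ]; do
  i=$((i + 1))
  output=$(run_agent "$i" "$PROMPT")   # fresh invocation = fresh context
  echo "iteration $i: $output"
  case "$output" in
    *"$COMPLETION"*) echo "completed after $i iterations"; break ;;
  esac
done
```

The extension's value over a bare loop like this is the AfterAgent hook: it resets session memory between turns, so each iteration starts from the original prompt instead of an overflowing context window.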

TOP & TRENDING RESOURCES
Top Tutorial
How to run parallel agents in Claude Code: This tutorial shows developers how to scale their AI coding output. You'll learn how to safely run multiple AI sessions simultaneously using Git worktrees, isolate databases to prevent conflicts, and automate your workflow from GitHub issues all the way to fully validated pull requests.
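The worktree setup the tutorial describes can be sketched in a few commands: each agent session gets its own working directory and branch, so parallel edits never collide. Paths and branch names below are made up for illustration.

```shell
# Minimal sketch: one Git worktree (and branch) per agent session.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git -c user.email=ci@example.com -c user.name=ci commit -q --allow-empty -m "init"

# Each agent works in its own checkout on its own branch.
git worktree add -q "$repo-auth" -b feature/auth
git worktree add -q "$repo-billing" -b feature/billing

git worktree list   # three checkouts: main repo + two agent sandboxes
```

Because every worktree shares one object store but has its own index and working directory, two agents can commit to their branches simultaneously without stepping on each other's files.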
Top Tool
CodeRabbit: An AI-powered code review assistant that integrates with GitHub, GitLab, and Bitbucket to provide automated, line-by-line feedback on code quality, security vulnerabilities, and bugs.
Top Repo
Claude Context (8k ⭐): This repo contains an MCP plugin that enables semantic code search for Claude Code and other AI coding assistants, providing them with deep context from your entire codebase.
Trending Paper
When should agents use direct APIs vs CLIs vs MCP? (by Anthropic): Linking AI agents to external systems through APIs or CLIs often leads to massive integration headaches at scale. MCP fixes this by providing a standardized, portable layer that seamlessly connects cloud-based agents to remote tools and data.
Grow customers & revenue: Join companies like Google, IBM, and Datadog. Showcase your product to our 240K+ engineers and 150K+ followers on socials. Get in touch.
What did you think of today's newsletter?
You can also reply directly to this email if you have suggestions, feedback, or questions.
Until next time — The Code team




