A new AI-powered phishing kit is targeting Fortune 500 companies, OpenAI and Anthropic release powerful new tools for developers, and learn how companies like Box are building and implementing AI agents.

Today’s Insights

  • New releases from Google, OpenAI, and Anthropic

  • Key Trends & Insights for engineering leaders

  • What’s trending on socials and dev communities

  • Trending tutorials, repos, and papers

  • Deep Dive: How to get faster RAG in production

Welcome to The Code. This is a short weekly email that cuts through the noise to help devs, engineers, and technical leaders find high-signal news, releases, and resources in 5 minutes or less. You can sign up or share this email here.

THIS WEEK IN PROGRAMMING

AI-powered phishing kit targets Fortune 500 companies: Cybercriminals are flocking to Salty2FA — a platform that uses AI to generate pixel-perfect replicas of Microsoft 365 login pages and intercept authentication codes in real time, rendering traditional 2FA protections ineffective. The kit has already compromised thousands of accounts across Fortune 500 companies, with victims receiving convincing emails that redirect them to fake login portals.

OpenAI and Anthropic introduce new tools for developers: OpenAI's ChatGPT Developer Mode now supports full read-write access to external systems via Model Context Protocol (MCP). Pro and Plus subscribers can now connect MCP servers to automate workflows. Meanwhile, Anthropic released its ‘web fetch’ tool for Claude, enabling developers to retrieve full webpage and PDF content through API requests.
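For a feel of how the web fetch tool is wired up, here is a sketch of a Messages API request body that enables it. The tool type string, version, and parameter names below are assumptions based on Anthropic's documentation at the time of writing — check the current docs before relying on them.

```python
# Sketch of a Messages API request body enabling Claude's web fetch tool.
# The "web_fetch_20250910" type string is an assumption -- verify against
# Anthropic's current documentation.
def build_web_fetch_request(url: str) -> dict:
    return {
        "model": "claude-sonnet-4-20250514",  # example model id
        "max_tokens": 1024,
        "messages": [
            {"role": "user", "content": f"Summarize the page at {url}"}
        ],
        "tools": [
            {
                "type": "web_fetch_20250910",  # assumed tool version string
                "name": "web_fetch",
                "max_uses": 5,  # cap how many fetches Claude may perform
            }
        ],
    }

request = build_web_fetch_request("https://example.com/report.pdf")
```

The request would be sent through the regular Messages API; Claude then fetches the page or PDF itself and folds the content into its answer.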

A security weakness in Cursor can let hackers in: A reported vulnerability in Cursor could allow attackers to execute malicious code simply by tricking developers into opening a booby-trapped repository. The vulnerability stems from Cursor shipping with "Workspace Trust" disabled by default — a security feature that prevents untrusted code from auto-executing.
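Since Cursor is a VS Code fork, the mitigation is presumably the same Workspace Trust settings VS Code exposes. A sketch of the relevant `settings.json` entries (setting names from VS Code's documentation; assumed to carry over to Cursor):

```json
// settings.json -- re-enable Workspace Trust so untrusted repos open in
// restricted mode instead of auto-running workspace code and tasks.
{
  "security.workspace.trust.enabled": true,
  "security.workspace.trust.startupPrompt": "always",
  "security.workspace.trust.untrustedFiles": "prompt"
}
```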

Google builds an AI model that can't leak your personal data: Google just released VaultGemma, the largest language model trained from scratch with differential privacy. The 1B-parameter model adds calibrated mathematical "noise" during training, making it provably hard to extract personal information from the data it learned on.

TRENDS & INSIGHTS

What Engineering Leaders Need to Know This Week

How Box Builds AI Agents: Box CTO Ben Kus highlights three urgent threats for enterprises deploying AI: unauthorized data access, unpredictable agent behavior, and adversarial attacks through prompt injection. His solution involves "secure RAG" with systematic permission checks and confidence scoring systems.
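The "secure RAG" idea — check permissions and confidence at retrieval time, before anything reaches the model — can be sketched in a few lines. All names here are hypothetical illustrations, not Box's implementation:

```python
# Toy sketch of permission-checked retrieval ("secure RAG").
# Every name here is hypothetical -- this is not Box's code.
from dataclasses import dataclass

@dataclass
class Doc:
    text: str
    allowed_users: set   # who may read this document
    confidence: float    # retriever's relevance score

def secure_retrieve(docs, user: str, min_confidence: float = 0.5):
    """Return only chunks the user may see, above a confidence floor."""
    return [
        d for d in docs
        if user in d.allowed_users and d.confidence >= min_confidence
    ]

docs = [
    Doc("Q3 revenue forecast", {"alice"}, 0.9),
    Doc("Public press release", {"alice", "bob"}, 0.8),
    Doc("Weak match", {"bob"}, 0.2),
]
context = secure_retrieve(docs, user="bob")
# Only the press release survives: bob lacks access to the forecast,
# and the weak match falls below the confidence floor.
```

The point of the pattern is that the permission check happens in retrieval code, not in the prompt, so a prompt injection cannot talk the agent into surfacing documents the user was never allowed to see.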

Engineering managers will thrive in the AI agent era: Writing code was never the hardest part of engineering jobs. EMs already excel at understanding business context, managing stakeholder expectations, and solving people problems — skills that become more valuable as AI handles routine coding.

The End of Engineering's Blank Check: DX’s CTO Laura Tacho identifies a troubling pattern among younger engineering leaders from the ‘blank check era’ who lack fundamental business acumen. Now they're scrambling as companies demand clear ROI calculations for everything from developer tooling to technical debt paydown.

Why most AI startups fail at R&D: Former OpenAI researcher Shyamal argues that companies trying to do both research and product development typically excel at neither. His fix: tie every research exploration to a user pain point or a metric the product team cares about.

IN THE KNOW

What’s trending on socials and headlines

  • Developer Drift: r/ClaudeCode is full of developers shifting to OpenAI’s Codex.

  • Billion Breach: Code security platform Aikido has flagged what may be the largest npm supply-chain compromise in history.

  • Code Cleaner: Want to generate simpler code with Cursor and other code gen tools? Try this simple prompt.

  • Decoded: Elon Musk revealed X’s updated recommendation algorithm, and one builder deconstructed how it works.


TOP & TRENDING RESOURCES

3 Tutorials to Level Up Your Skills


Anthropic’s Prompting 101 for Developers: This tutorial teaches professional prompt engineering through a practical Swedish car accident analysis case. The instructor shows how iterative refinement transforms basic prompts into production-ready systems for reliable AI outputs.

Complete tutorial on building Coding Agents for Enterprises: A detailed guide to building production-ready coding agents, from choosing the right LLM and designing memory systems to implementing tool chains and handling the edge cases that crash most agents.

Building a Docker-like Container From Scratch: This tutorial walks through building a container filesystem from scratch, teaching containerization fundamentals that go beyond surface-level Docker commands.

Top Repos

Trending Papers

Defeating Nondeterminism in LLM Inference: Research from Thinking Machines Lab explains why LLMs give different answers to the same question, even when set to be deterministic, and finds that the real culprit is a lack of batch invariance: GPU kernels compute results differently depending on server workload.
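The mechanism behind this rests on a basic numerical fact: floating-point addition is not associative, so when changing batch sizes reshuffle the order in which a GPU sums the same numbers, the result can change. A minimal illustration:

```python
# Floating-point addition is not associative: summing the same numbers
# in a different order (as happens when GPU batch sizes change the
# reduction tree) can produce a different result.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # one summation order
right = a + (b + c)   # the same numbers, grouped differently
print(left == right)  # False on IEEE-754 doubles
```

Tiny per-operation differences like this compound across billions of operations and can eventually flip a sampled token, after which the two generations diverge entirely.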

Scaling LLM Pretraining Performance: MIT researchers show how LLMs scale across hundreds of GPUs, focusing on dataset handling, distributed training, and parallelism. They share recommendations to maximize GPU efficiency and achieve near-linear scaling.

LightAgent: Researchers from Shanghai University of Finance and Economics introduce LightAgent, an ultra-lightweight framework for building production-ready multi-agent AI systems with just 1,000 lines of Python code. The framework eliminates heavyweight dependencies like LangChain to deploy agents 10x faster.

DEEP DIVE

31× Faster RAG Performance with Meta’s REFRAG

Source: The Code, Superhuman

REFRAG is a framework from Meta Superintelligence Labs that enables AI models to handle 16x more context while running up to 31x faster. Instead of forcing the model to read every single word from long documents, REFRAG compresses chunks of text into compact summaries that capture the same meaning but take up much less space.

How Does REFRAG Work?

The Core Problem:

  • When you give an AI model twice as much text to read, it becomes 4 times slower

  • This happens because AI attention mechanisms get expensive very quickly

  • Memory usage also grows quadratically with longer text
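The scaling above can be sketched numerically. Self-attention compares every token with every other token, so its cost grows with the square of sequence length:

```python
# Self-attention compares every token with every other token, so compute
# grows with the square of sequence length: double the text, ~4x the work.
def attention_pairs(n_tokens: int) -> int:
    """Number of token-to-token comparisons in one attention pass."""
    return n_tokens * n_tokens

base = attention_pairs(1_000)     # 1,000,000 comparisons
doubled = attention_pairs(2_000)  # 4,000,000 comparisons
print(doubled / base)             # 4.0
```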

The Process:

  • Step 1: Question Input

    • The system receives a query like "Who is the President of the US?" which gets processed through standard tokenization and embedding.

  • Step 2: Context Processing

    • Retrieved documents containing relevant information are broken into chunks.

    • Each chunk gets processed by lightweight encoders running in parallel.

  • Step 3: Compression Strategy

    • Each encoder creates a single "chunk embedding" that captures the meaning of multiple tokens. This is where the major compression happens — many tokens become one dense representation.

  • Step 4: Smart Selection

    • The RL-trained policy acts as a quality controller, deciding which chunks stay compressed versus which expand back to full tokens. Critical information gets preserved while redundant content stays compressed.

  • Step 5: Final Processing

    • The decoder receives a mix of compressed embeddings and expanded tokens — a much shorter sequence than traditional RAG, while maintaining accuracy.
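The steps above can be sketched as a toy pipeline. This is an illustration of the compress-and-select idea only, not Meta's implementation; the chunk size, scores, and placeholder representation are all invented for clarity:

```python
# Toy illustration of REFRAG's compress-and-select idea -- not Meta's code.
# Chunks of k tokens collapse to one embedding each; a stand-in "policy"
# (here: a relevance score per chunk) expands only the most important
# chunks back to full tokens, shortening what the decoder must read.

def chunk(tokens, k=4):
    """Split the retrieved context into fixed-size chunks."""
    return [tokens[i:i + k] for i in range(0, len(tokens), k)]

def compress(chunks, scores, expand_top=1):
    """Keep the highest-scoring chunks as raw tokens; pool the rest."""
    ranked = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)
    expand = set(ranked[:expand_top])
    seq = []
    for i, ch in enumerate(chunks):
        if i in expand:
            seq.extend(ch)                # critical chunk: full tokens
        else:
            seq.append(("CHUNK_EMB", i))  # placeholder for one embedding
    return seq

tokens = list(range(16))           # a 16-token retrieved context
chunks = chunk(tokens)             # 4 chunks of 4 tokens
scores = [0.1, 0.9, 0.2, 0.3]      # stand-in policy scores
seq = compress(chunks, scores)
print(len(tokens), "->", len(seq)) # 16 -> 7
```

The decoder then reads 7 positions instead of 16: one expanded chunk of full tokens plus three single-embedding placeholders, which is where the speedup comes from.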

Why This Matters

For Businesses:

  • Scale AI applications profitably while delivering more powerful answers

  • Analyze entire reports instead of just pages, faster and cheaper than before

For Developers:

  • No more choosing between large contexts and reasonable costs

  • Get comprehensive analysis and faster performance without changing your setup


What did you think of today's newsletter?

Your feedback helps us create better emails for you!


You can also reply directly to this email if you have suggestions, feedback, or questions.

Until next time — The Code team
