
Welcome back. Every developer using AI faces the same frustrations: rate limits, mounting API costs, and privacy concerns. With open-source LLMs now matching GPT-4 performance and quantization squeezing 120B models onto consumer hardware, running locally is finally practical. It eliminates subscription fees, removes usage restrictions, and delivers fast responses while keeping your proprietary code completely private: it never leaves your machine.
Here's a 30-minute tutorial to set up your own local LLM with Ollama. Happy coding!
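If you want a taste before the full walkthrough, here's a minimal sketch (assuming you've installed Ollama, started it, and pulled a model such as llama3 with `ollama pull llama3`) that queries the local HTTP API Ollama serves on port 11434 by default:

```python
# Minimal sketch: query a locally running Ollama model over its HTTP API.
# Assumes Ollama is installed and running ("ollama serve") and that a model
# such as "llama3" has already been pulled with "ollama pull llama3".
import requests

def ask_local_llm(prompt: str, model: str = "llama3") -> str:
    # Ollama listens on localhost:11434 by default; with streaming disabled,
    # /api/generate returns the whole completion in one JSON object.
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=120,
    )
    response.raise_for_status()
    return response.json()["response"]

if __name__ == "__main__":
    print(ask_local_llm("Explain quantization in two sentences."))
```

Swap in any model you've pulled; neither the prompt nor the response ever leaves your machine.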
Today’s Insights
Codemaps helps devs understand codebases
Curation of 2,300+ Claude Skills for devs
McKinsey’s guide to implementing agentic AI
Trending social posts, top repos, new research & more
Welcome to The Code. This is a 2x weekly email that cuts through the noise to help devs, engineers, and technical leaders find high-signal news, releases, and resources in 5 minutes or less. You can sign up or share this email here.

THIS WEEK IN PROGRAMMING
Cognition puts code understanding back in engineers' hands: While most AI coding assistants are racing to write more code faster, Cognition is taking a different approach with Windsurf Codemaps — helping engineers understand their codebases before they start vibe coding on top of them. The feature addresses a critical pain point: senior engineers lose 5+ hours weekly onboarding others, while new hires take 3-9 months to fully ramp.
Indian founders launch the world’s second-ranked open-source TTS model: Two 23-year-old Indian founders released Maya1 through their Bengaluru-based startup, Maya Research. The 3B-parameter open-source text-to-speech model supports over 20 emotions and zero-shot voice cloning, and runs on a single GPU with under 100ms latency, ranking 21st overall and second among open-weight models per Artificial Analysis. Developers can download the model here.
Gemini to power Siri’s brain for $1B: After testing models from OpenAI, Anthropic, and Google, Apple has chosen Gemini to rescue its struggling voice assistant. The iPhone maker will pay about $1B annually for a custom 1.2 trillion parameter model — eight times larger than its current AI — to power Siri's biggest upgrade yet.

TRENDS & INSIGHTS
What Engineering Leaders Need to Know This Week

Amazon veteran’s advice for new tech leaders: Eugene Yan's comprehensive guide reveals the counterintuitive truth about principal engineering — your coding skills got you promoted, but that’s no longer your main job. Drawing from Amazon's engineering culture and personal experience, Yan outlines how successful principals transition from doing the work to making it happen through others.
McKinsey’s guide for seizing the full potential of agentic AI: With 80% of companies using gen AI but seeing little P&L impact, the consulting giant just dropped a roadmap for the missing piece: management transformation. Their new research shows managers must evolve from task supervisors to "orchestrators" of blended human-AI teams.
From code to CTO in 7 years—without a CS degree: Engineering leader Gregor Ojstersek documented his rapid ascent from self-taught JavaScript developer to startup CTO, emphasizing how freelance work secretly turbocharged his career. His data is compelling: juggling side projects while in full-time roles doubled his experience and led to managing 35+ engineers.

IN THE KNOW
What’s trending on socials and headlines

Meme of the week
AI Teacher: The founder of a billion-dollar startup shares his hack for understanding research papers with AI.
Tokens Matter: A list of 15 Claude Code habits to bring down weekly costs from $400 to $15.
Code with Skills: A curated collection of 2,300+ free Claude Skills, plugins, and workflows for developers. Just download the zip file and start using them.
Claude Check: One user used Claude to get a $195,000 hospital bill reduced to $33,000.
Anthropic announces free Claude Code web usage credits for Pro and Max subscribers until November 18.
Gemini comes to Google Maps, allowing users to have a conversation about their surroundings using Lens.
Edison Scientific launches Kosmos, its newest AI Scientist, which can read 1,500 papers and write 42,000 lines of code in a single run.
LangChain’s new chatbot lets developers ask docs and API-related questions directly.

TOP & TRENDING RESOURCES
3 Tutorials to Level Up Your Skills
How to automate code reviews with Codex: OpenAI's team just walked through how Codex's new code review feature actually works, and the setup couldn't be simpler. Toggle it on in your Codex web settings and every PR gets automatically reviewed.
How Anthropic saved 98% of context window space: Developers connecting AI agents to hundreds of tools through Anthropic's MCP standard no longer have to watch their context windows balloon with tool definitions. Anthropic engineers demonstrated a new approach where agents write code to interact with MCP servers, reducing a 150,000-token overhead to just 2,000 tokens (a rough sketch of the idea follows these tutorials).
How to run your own LLMs locally: Modern local LLMs now match GPT-4 on benchmarks, with quantization shrinking 120B models to run on consumer hardware. The guide covers installation, model selection, quantization tricks to fit larger models in limited RAM, and turning local AI into an API server any app can call (see the second sketch below).
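For the MCP tutorial above, the gist is easier to see in code. This is a rough, hypothetical sketch of the "agents write code" pattern, not Anthropic's actual implementation; the `call_mcp_tool` helper and the server and tool names are made up for illustration.

```python
# Illustrative sketch of the code-execution-with-MCP idea. The helper
# call_mcp_tool() and the server/tool names are hypothetical placeholders,
# not Anthropic's actual API.
#
# Old approach: every connected server's full tool schema is injected into
# the model's context up front, which is where the ~150,000-token overhead
# comes from when hundreds of tools are attached.
#
# New approach: the agent writes a small script like this one. Only the
# definition of the tools it actually calls ever needs to be loaded, and
# intermediate results live in variables instead of the context window.

def run_agent_step(call_mcp_tool):
    # Fetch a document from one MCP server...
    doc = call_mcp_tool(server="google-drive", tool="get_document",
                        args={"document_id": "abc123"})
    # ...filter it in plain code, so the full text never re-enters the
    # model's context...
    todo_lines = [line for line in doc["text"].splitlines() if "TODO" in line]
    # ...and pass only the small result on to a second server.
    return call_mcp_tool(server="salesforce", tool="update_record",
                         args={"record_id": "xyz789",
                               "notes": "\n".join(todo_lines)})
```

The saving comes from loading only the tool definitions the script actually touches, and from keeping intermediate data in code rather than in the model's context.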
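And for the local-LLM tutorial, the "API server for any app" trick can look as simple as this sketch, which assumes Ollama is running locally with llama3 pulled and uses its OpenAI-compatible endpoint:

```python
# Sketch: point an existing OpenAI-client app at a local Ollama server.
# Assumes Ollama is running locally and "llama3" has been pulled.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # required by the client, ignored by Ollama
)

reply = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Write a haiku about local LLMs."}],
)
print(reply.choices[0].message.content)
```

Because the interface matches the hosted API, most tools built on the OpenAI client only need the base URL swapped to run against your local model.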
Top Repos
DeepCode: This repo helps you turn research papers and text prompts into working code.
skyvern: It automates browser-based workflows using LLMs and computer vision. It provides a simple API endpoint to fully automate manual workflows on a large number of websites, replacing brittle or unreliable automation solutions.
chef: This repo lets you build full-stack web apps with a built-in database, auth, file uploads, and real-time UIs. It uses AI to generate backend and frontend code automatically.
Trending Papers
Continuous Autoregressive Language Models: CALM kills the “next-token” paradigm every LLM is built on. Instead of predicting one token at a time, it predicts continuous vectors that each represent multiple tokens at once. The model doesn’t think “word by word”; it thinks in ideas per step.
Supervised Reinforcement Learning: For years, small open-source models could memorize solutions or copy examples, but they couldn’t reason. Now, a research team at Google has introduced SRL, a training method that finally helps models learn to think step by step instead of guessing the right answer at the end.
AgentFold: LLM web agents struggle with long tasks because they either accumulate too much noisy information or lose important details when summarizing. AgentFold actively manages context like human memory, treating it as a workspace to shape rather than just fill.
What did you think of today's newsletter?
You can also reply directly to this email if you have suggestions, feedback, or questions.
Until next time — The Code team


