“Engineers are becoming sorcerers” | The future of software development with OpenAI's Sherwin Wu - YouTube video summary with timestamps by sumari.io

“Engineers are becoming sorcerers” | The future of software development with OpenAI's Sherwin Wu

Lenny's Podcast


Lenny's Podcast is a popular tech and product management podcast hosted by Lenny Rachitsky. This episode features Sherwin Wu, head of engineering for OpenAI's API and developer platform, discussing the future of software development. (1:19:39)

Part 1/2 (0:00 - 1:00:00)

TL;DR

  • Engineers Becoming AI Managers - Software engineers are transforming from code writers to managers of AI agents, orchestrating multiple AI threads simultaneously like wizards casting spells.
  • Golden Age of Micro-Startups - The one-person billion-dollar startup will spawn thousands of smaller companies building bespoke software, creating a B2B SaaS boom.
  • Business Process Automation Opportunity - Massive untapped potential exists in automating repetitive business processes outside Silicon Valley's focus on knowledge work and engineering.
  • Build for Tomorrow's Models - Companies should design products for where AI models are heading, not current capabilities, as models rapidly eat existing scaffolding.
  • Bottom-Up AI Adoption Strategy - Successful AI implementations require excited internal champions and tiger teams, not just top-down executive mandates without grassroots support.

Topic Breakdown with Timestamps

Introduction and AI Code Generation Statistics at OpenAI
0:00 - 3:15

Lenny Rachitsky opens his podcast with a striking revelation from guest Sherwin Wu, head of engineering for OpenAI's API and developer platform: 95% of OpenAI's engineers use Codex, and 100% of their pull requests undergo AI review. Wu paints a vivid picture of software engineering's transformation, describing how engineers are evolving into "wizards" who orchestrate fleets of AI agents rather than writing code line by line. This shift, he argues, could spawn one-person billion-dollar startups and trigger a golden age of B2B SaaS. His strategic advice cuts through the hype: companies should design for AI's future capabilities, not its current limitations, because "this is the worst the models will ever be."

Sherwin Wu Introduction and AI's Impact on Engineering Productivity
3:15 - 6:58

The productivity gains Wu describes defy conventional expectations. Engineers who embrace Codex heavily generate 70% more pull requests than their less AI-reliant colleagues, with this performance gap expanding over time rather than plateauing. What's most remarkable isn't just the volume—it's that nearly 100% of code now originates from AI generation. However, Wu acknowledges the human element remains crucial during this transition period, as engineers gradually build confidence in the models' reliability. Echoing Kevin Weil's insight that "this is the worst the models will ever be," Wu anticipates that growing trust will only accelerate as AI capabilities continue their relentless improvement.

The Transformation of Software Engineering Jobs with AI
6:58 - 12:27

Software engineering has undergone a philosophical transformation that Wu traces back to classical computer science literature. Drawing from MIT's seminal programming textbook SICP, which compared programmers to wizards casting spells through code, Wu notes this metaphor has become literal reality. Modern engineers wield natural language "incantations" to direct AI tools like Codex and Cursor, simultaneously managing 10-20 parallel threads like executives overseeing vast teams. Lenny embraces Wu's comparison to Disney's Sorcerer's Apprentice, observing that while these AI tools offer unprecedented leverage, engineers must maintain vigilance to prevent their digital "brooms" from spiraling into chaos like Mickey Mouse's magical mishap.

Managing AI Agent Stress and the 100% Codex Experiment
12:27 - 17:55

Managing multiple AI agents creates genuine stress for engineers—a challenge Wu openly acknowledges as teams adapt to working with imperfect AI systems. OpenAI's boldest internal experiment involves a team writing 100% of their codebase through Codex with no manual coding escape hatch. Their discoveries prove illuminating: most AI failures stem from inadequate context and documentation rather than fundamental model weaknesses. Meanwhile, the explosion of AI-generated pull requests demanded a systematic response. Codex now reviews every PR at OpenAI, slashing review times from 10-15 minutes to just 2-3 minutes while automating the tedious linting and CI processes that previously frustrated engineers.

AI Code Review Process and Cross-Model Validation
17:55 - 19:30

When Lenny probes about using external models to double-check Codex's work, Wu reveals OpenAI's measured approach to AI oversight. Rather than eliminating human judgment entirely, they've reduced engineers' attention requirements from 100% to roughly 30%—maintaining meaningful human oversight while dramatically improving efficiency. While OpenAI primarily "dogfoods" their own models to gather feedback, Wu confirms they do employ different internal model variants to provide diverse perspectives during code review. Though nearly every engineer relies heavily on Codex and the vast majority of code likely originates from AI, Wu stops short of claiming 100% AI authorship, citing the complexity of precise attribution in modern development workflows.

How AI is Changing Engineering Management Roles
19:30 - 24:17

Engineering management faces less dramatic disruption than individual contributor roles, but Wu identifies two pivotal trends reshaping leadership. AI tools like Codex create an amplification effect that widens productivity gaps, with top performers gaining disproportionate advantages from these technologies. This reinforces Wu's management philosophy of investing heavily in his highest achievers—those who can most effectively leverage AI capabilities. Looking ahead, he predicts managers will oversee much larger teams than today's recommended 6-8 people, similar to how engineers now coordinate 20-30 AI coding agents. Examples like using ChatGPT for performance reviews and organizational research hint at AI's expanding role in management workflows.

The One-Person Billion Dollar Startup and Second-Order Effects
24:17 - 31:46

Sam Altman's concept of the "one-person billion-dollar startup" captivates Wu, though he argues most people underestimate its cascading effects on the entire startup ecosystem. As AI democratizes software development, Wu envisions an explosion of small startups and vertical B2B SaaS companies—potentially hundreds of $100 million businesses and tens of thousands earning $10 million annually. This transformation would fundamentally reshape venture capital toward smaller, more individually profitable enterprises. While Lenny expresses skepticism about solo founders managing customer support at scale, Wu counters that specialized micro-startups will emerge to handle specific functions, allowing billion-dollar founders to outsource rather than build every capability internally.

Core Management Philosophy - The Surgeon Analogy and Supporting Top Performers
31:46 - 37:29

Wu's management philosophy centers on an elegant surgical metaphor borrowed from "The Mythical Man-Month." Just as a surgeon operates while an entire team provides tools and support, Wu dedicates over 50% of his time to his top 10% of performers, ensuring they feel empowered and unencumbered. His approach involves "looking around corners"—proactively identifying and eliminating organizational blockers before they impede his team's progress. Lenny's suggestion that AI could help managers anticipate future obstacles by analyzing company communications and knowledge bases sparks Wu's enthusiasm, recognizing this as a novel application he hadn't previously considered for management enhancement.

Negative ROI AI Deployments and Implementation Challenges
37:29 - 43:58

Many companies are generating negative ROI from their AI investments, a problem Wu attributes to Silicon Valley's insular perspective. Tech leaders assume universal AI fluency, but most employees outside the industry ask only basic questions and lack deep technological understanding. Successful AI implementations require both executive commitment and grassroots adoption—a dual approach many companies bungle by relying solely on top-down mandates. Wu advocates for "tiger teams" composed of technically-adjacent employees—often operations leads or Excel experts rather than software engineers—who can explore AI capabilities, apply them to specific workflows, and generate organic enthusiasm throughout the organization.

Why Listening to Customers Can Lead You Astray in AI Development
43:58 - 50:16

Customer feedback can mislead AI product development because models evolve so rapidly they "eat your scaffolding for breakfast," rendering complex tooling obsolete as capabilities advance. Wu points to popular 2022-2023 solutions like vector stores and agent frameworks that became unnecessary as models improved, with simpler approaches often outperforming elaborate systems customers had requested. His strategic insight challenges conventional product wisdom: build for AI's future trajectory rather than current limitations. Products designed around anticipated capabilities—even if only 80% functional today—will suddenly become exceptional as models rapidly evolve, while those built for today's constraints will quickly become outdated.

Future of AI Models - Multi-Hour Tasks and Audio Capabilities
50:16 - 53:36

Over the next 12-18 months, Wu anticipates AI models will dramatically extend their task duration capabilities, evolving from minute-long optimizations to coherently handling multi-hour work sessions—potentially 6-hour tasks users can "dispatch and let run." He champions audio as a vastly underrated domain, noting that while the tech industry obsesses over text-based coding applications, much of global business operates through spoken communication. Wu expects significant breakthroughs in native multimodal speech-to-speech models that could transform how people interact with AI systems. Lenny distills these predictions into two key trends: AI agents executing longer, more complex tasks and audio/speech interfaces becoming central to AI experiences.

Business Process Automation as AI's Biggest Opportunity and OpenAI's Platform Strategy
53:36 - 1:00:00

Business process automation represents AI's most underexplored opportunity, according to Wu, because Silicon Valley fixates on open-ended knowledge work while ignoring the vast economy of standardized, repeatable processes that power most companies. Unlike creative engineering work, most jobs follow standard operating procedures with high determinism—from customer support scripts to utility company workflows—making them ideal candidates for AI automation that could revolutionize enterprise operations over the next two decades. Addressing Lenny's concerns about OpenAI competing with startups, Wu advises entrepreneurs not to fear being "squashed," emphasizing that the AI opportunity space is so massive that successful startups like Cursor thrive by building beloved products. OpenAI views itself fundamentally as an ecosystem platform company, releasing all models through APIs to foster rather than stifle innovation.

Part 2/2 (1:00:00 - 1:19:39)

TL;DR

  • Platform Strategy and Democratization - OpenAI's mission to spread AI benefits globally through open APIs, enabling developers to build diverse applications and reach underserved markets.
  • AI Development Infrastructure Evolution - Technical progression from basic API endpoints to sophisticated agent SDKs, evaluation tools, and UI components for streamlined AI development.
  • Seizing the AI Opportunity Window - Sherwin Wu's emphasis on actively engaging with AI tools now, predicting the next 2-3 years as exceptionally transformative for technology careers.
  • Massive Scale and Global Impact - ChatGPT reaching 800 million weekly users (10% of world population), demonstrating unprecedented adoption of AI technology across diverse demographics.
  • Practical AI Engagement Strategy - Advice to start small with existing tools, avoid information overload, and focus on hands-on experimentation rather than following every news cycle.

Topic Breakdown with Timestamps

OpenAI's Platform Strategy and Mission to Benefit All Humanity
1:00:00 - 1:04:10

Sherwin Wu from OpenAI explains how their platform strategy flows directly from the company's mission to spread AI benefits to all of humanity. Recognizing that OpenAI cannot reach every corner of the world alone, Wu emphasizes their platform-neutral approach—they don't block competitors and maintain open access to their models. This philosophy of "a rising tide lifts all boats" actually helps grow their API business by empowering others to build on their platform. With ChatGPT reaching 800 million weekly active users (roughly 10% of the world's population), Wu defends OpenAI's democratization efforts, pointing out that their free version gives anyone access to capabilities not dramatically different from what billionaires can access.

ChatGPT's Scale - 800 Million Weekly Users and Democratizing AI Access
1:04:10 - 1:05:22

The democratization of AI becomes even more striking when Wu compares today's access to what was available just two years ago. Users now receive GPT-4o for free—a massive improvement from the much weaker capabilities of 2022. Drawing parallels to smartphones, Wu notes that for merely $20 monthly, users can access essentially the same AI technology that billionaires use. This "raising the floor" globally through improved free and affordable tiers drives OpenAI's work across sectors like healthcare and education, making advanced AI capabilities remarkably accessible to ordinary users worldwide.

OpenAI API Technical Capabilities - Response API, Agents SDK, and Developer Tools
1:05:22 - 1:08:22

On the technical side, Wu outlines OpenAI's sophisticated layered API architecture. At the foundation sits the Responses API, enabling developers to build long-running agents by feeding text to models and receiving responses over time. The platform scales upward with increasingly complex tools: the Agents SDK for orchestrating multi-agent systems, AgentKit for UI components, and evaluation APIs for testing workflows. Developers can choose their preferred level of abstraction—from the low-level, unopinionated Responses API where they can build anything, all the way up to full-stack solutions for rapidly deploying polished agent applications.
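The layered architecture Wu describes can be sketched at its lowest level. As an illustration, here is a minimal, credential-free example that builds the JSON body for a Responses API call using only the standard library; the endpoint path, model name, and prompt are assumptions drawn from OpenAI's public API documentation, not details from the episode.

```python
import json

# Illustrative sketch: build the JSON body for a POST to the
# Responses API (https://api.openai.com/v1/responses). The model
# name and prompt below are placeholders, not from the episode.
def build_responses_request(model: str, user_input: str) -> str:
    payload = {
        "model": model,
        # "input" carries the free-form text (or message list) the model reads.
        "input": user_input,
    }
    return json.dumps(payload)

body = build_responses_request("gpt-4o-mini", "Draft a release note.")
print(body)
```

With the official `openai` Python SDK, the same request is made via `client.responses.create(model=..., input=...)`; the higher layers (Agents SDK, AgentKit) build on exactly this primitive.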

The Next 2-3 Years as the Most Exciting Period in Tech History
1:08:22 - 1:10:04

Looking ahead, Wu believes the next 2-3 years will mark the most exciting period in tech and startup history. Having entered the workforce in 2014, he experienced a 5-6 year lull before the current AI wave began three years ago. When Lenny asks how people can avoid missing this historic opportunity, Wu's advice is refreshingly inclusive: actively engage with AI tools regardless of your profession. Since many jobs will be transformed by this technology, you don't need software engineering skills to participate. Instead of waiting on the sidelines, Wu encourages people to use the tools, understand their limitations, and track their evolving capabilities as models improve.

Managing Information Overload in the Fast-Paced AI Industry
1:10:04 - 1:11:40

Addressing the overwhelming pace of AI news, Wu admits he's "chronically online" and absorbs most information flow—making him a poor example for balance. However, he offers reassurance that much of the constant news cycle is simply noise, and people don't need to consume everything to stay relevant. Rather than trying to track every new tool announcement, Wu recommends starting small: experiment with just one or two AI tools like Cursor or ChatGPT, connect them to your personal data sources, and learn through hands-on experience rather than information overload.

Lightning Round - Book Recommendations on Fiction and US-China Relations
1:11:40 - 1:13:51

During Lenny's lightning round, Wu shares his current reading recommendations across fiction and geopolitics. For science fiction, he enthusiastically endorses "There Is No Antimemetics Division" by qntm—a brilliantly written and unexpectedly hilarious story about a government agency fighting things that make people forget. His non-fiction picks focus on US-China relations: Dan Wang's "Breakneck," which presents the compelling analogy of America as a "lawyerly society" versus China as an "engineering society," and Patrick McGee's book on Apple and China, which fascinated Wu with insider information about the company's Chinese operations.

Lightning Round - Anime Preferences and Ubiquiti Home Networking Products
1:13:51 - 1:16:19

Continuing the personal questions, Wu reveals his entertainment preferences and lifestyle philosophy despite having limited time due to two kids and a demanding job. He recently watched the new season of Jujutsu Kaisen and champions Japanese anime for creating novel plots and universes that Western media typically avoids. For home technology, Wu enthusiastically recommends Ubiquiti networking products as "the Apple of home networking," particularly praising their security cameras and well-designed management apps. His personal motto—"never feel sorry for yourself"—emphasizes maintaining agency to overcome challenges in both work and life.

Opendoor House Pricing Models - Surprising Variables That Impact Home Values
1:16:19 - 1:19:39

Reflecting on his pre-OpenAI experience, Wu shares fascinating insights from building home pricing models at Opendoor. Several variables proved more impactful than expected: high-voltage power lines significantly affected values due to buzzing noise and safety concerns for families with children. Floor plans were critically important but incredibly difficult to quantify in code, often existing only as paper documents held by a few people in each market. Perhaps most surprisingly, Wu had underestimated the importance of curb appeal and front doors—with front door replacement typically offering the highest ROI for home improvements, according to Zillow research.

Step-by-Step Guide

  1. Access OpenAI's Responses API for basic model interaction (1:05:42)

     Use the most popular API endpoint that lets you give the model text input and receive responses. This is the lowest-level primitive for building with OpenAI models and allows maximum flexibility.

  2. Build long-running agents using the Responses API (1:05:48)

     Create agents that work for extended periods by feeding text to the model, monitoring its progress, and receiving responses when tasks complete.

  3. Implement the Agents SDK for structured development (1:06:30)

     Use OpenAI's higher-level abstraction layer that builds on the Responses API to create traditional AI agents with agentic loops and delegation capabilities.

  4. Configure sub-agents and task delegation systems (1:06:52)

     Set up your main agent to delegate subtasks to other specialized agents and orchestrate a swarm of agents working together with proper guardrails.

  5. Deploy AgentKit and UI widgets for the interface (1:07:16)

     Use OpenAI's pre-built UI components to quickly create beautiful user interfaces on top of your API or Agents SDK implementation.

  6. Set up evaluation systems using the Evals API (1:07:35)

     Test your models, agents, and workflows quantitatively using OpenAI's evaluation products to ensure they're working correctly before deployment.

  7. Install ChatGPT and connect internal data sources (1:11:23)

     Connect ChatGPT to your existing tools like Notion, Slack, and GitHub to understand what AI can and cannot do with your actual data.

  8. Experiment with the Cursor client for coding assistance (1:11:23)

     Install and play around with AI coding tools to get hands-on experience with how AI is transforming software development workflows.