A love letter to Pi | Lucas Meijer
Your codebase is a Marble Madness level and your AI agent is the marble—your job is removing hazards so it rolls smoothly, plus a vision of "Barbapapa software" that reshapes itself while running to fit your exact needs.
TLDR
• Think of your repo as a Marble Madness level: incomplete docs, ignored warnings, and outdated instructions are cliffs your agent will fall off—read full transcripts to find friction points
• Make agents create "evaluation packs" (videos, screenshots, HTML reports) before you review their work—this catches their errors AND makes your evaluation 10x faster since you're now the bottleneck
• Use context branching to avoid paying for dead-end side quests: never argue with agents, just tree-navigate back and rephrase. Stay under 50% of the context window, or the agent's intelligence drops
• Pi enables "Barbapapa software": agents that write extensions for themselves while running, morphing the tool to fit your specific workflow instead of waiting for SF product teams to ship features
• Always ask "how will I evaluate this?" BEFORE sending the agent off, then put that evaluation method in the prompt—agents perform better when they know how they'll be judged (humans too)
In Detail
Lucas Meijer, former Unity engineer, presents a radically practical framework for working with AI coding agents built on a counterintuitive premise: nobody actually knows what they're doing yet, so stop chasing "stage nine" agent swarms and focus on solving problems you actually have. His core insight is treating your codebase like a Marble Madness level where the agent is a marble that needs to roll smoothly—your job is identifying and removing hazards like incomplete AGENTS.md files, build warnings you've been ignoring for years, or outdated documentation. The only way to find these hazards is reading full agent transcripts to see where it went wrong, then fixing the repo accordingly.
The talk's most actionable framework is "evaluation packs": forcing agents to package their work for easy human review (screen recordings, annotated screenshots, HTML slide decks) before you evaluate it. This serves two purposes. It makes agents actually test their code (they can't fake a video demo), and it dramatically reduces your evaluation time, which matters because the human reviewer is now the bottleneck in the workflow. He demonstrates this by one-shotting a photo timeline website and having the agent record itself using every feature, catching bugs automatically. He also advocates HTML output over terminal text for everything, context management through tree-navigation to avoid paying token costs for dead-end side quests, and staying under 50% of the context window to avoid the "dumb zone."
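To make the evaluation-pack idea concrete, here is a minimal sketch of what an agent could be asked to produce: a single HTML page that pairs each claim with a screenshot or recording. This is not Pi's API or the format shown in the talk. The `Evidence` fields, the function names, and the `evaluation_pack.html` file name are all hypothetical, chosen only for illustration.

```python
import html
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Evidence:
    """One piece of proof from the agent (all field names are hypothetical)."""
    title: str       # the feature or check this item demonstrates
    media_path: str  # relative path to a screenshot or screen recording
    note: str        # what the reviewer should expect to see


def render_pack(task: str, items: list[Evidence]) -> str:
    """Return one self-contained HTML page summarizing the agent's work."""
    sections = []
    for item in items:
        src = html.escape(item.media_path, quote=True)
        # Recordings get a <video> tag; anything else is treated as an image.
        if item.media_path.lower().endswith((".mp4", ".webm")):
            media = f'<video controls src="{src}"></video>'
        else:
            alt = html.escape(item.title, quote=True)
            media = f'<img src="{src}" alt="{alt}">'
        sections.append(
            f"<section><h2>{html.escape(item.title)}</h2>"
            f"{media}<p>{html.escape(item.note)}</p></section>"
        )
    body = "\n".join(sections)
    return (
        '<!doctype html>\n<html><head><meta charset="utf-8">'
        f"<title>{html.escape(task)}</title></head>\n"
        f"<body><h1>{html.escape(task)}</h1>\n{body}\n</body></html>\n"
    )


def write_pack(task: str, items: list[Evidence],
               out_file: str = "evaluation_pack.html") -> Path:
    """Write the pack to disk so the reviewer can open it in a browser."""
    path = Path(out_file)
    path.write_text(render_pack(task, items), encoding="utf-8")
    return path
```

In practice the prompt would tell the agent up front that its work will be judged from this page, so every feature it claims to have built needs a matching screenshot or recording before the human looks at anything.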
The talk culminates in a vision of "Barbapapa software" (named after shape-shifting cartoon characters from the '70s): software that modifies and extends itself while running on the user's machine to fit their specific needs. He demonstrates Pi writing a Doom overlay extension for itself in real time, exemplifying a future where software isn't written once for thousands of users but instead morphs into whatever shape each user requires. This represents a fundamental shift from static software to self-modifying systems that adapt to context.