Mistakes engineers make in large established codebases | sean goedecke
The biggest mistake in large codebases isn't writing messy code—it's trying to keep your code clean and isolated from the legacy patterns, when you should be sinking deeper into the mess to maintain consistency.
TLDR
• The cardinal sin: implementing features "the right way" while ignoring existing patterns—this creates landmines and makes future improvements impossible
• Always research prior art first and follow those patterns even if ugly, because existing code represents a safe path through hidden complexity you don't know about
• Inconsistency kills large codebases by making general improvements impossible, creating a vicious cycle where the hardest 5% gets left behind
• Large established codebases (5M+ LOC, 100-1000 engineers, 10+ years old) produce 90% of value at big tech companies—they're not technical debt, they're your actual job
• You cannot split up or redesign a large codebase without first becoming fluent at shipping features inside it
In Detail
The author's core thesis challenges conventional wisdom: in large codebases (5M+ lines, 100-1000 engineers, 10+ years old), the worst mistake is trying to keep your code clean and isolated from legacy patterns. Instead, you must sink deeply into the existing codebase to maintain consistency above all else. This means researching prior art before every feature and following those patterns even when they seem ugly, overkill, or poorly designed.
The reasoning is threefold. First, existing patterns represent safe paths through minefields of hidden complexity: special user types, edge cases, and internal tooling quirks that you don't know about. Second, inconsistency creates a vicious cycle: when auth logic is scattered across different implementations, you can't make general improvements without updating every variant. In practice, the hardest 5% gets left behind, further decreasing consistency. Third, you cannot test every combination of states in development, so you need defensive coding, slow rollouts, and monitoring instead.
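The auth point can be illustrated with a minimal sketch. Everything in it is hypothetical and not taken from the original post: the `User` fields, the shared `can_edit_settings` helper, and both endpoint functions are invented for illustration. It assumes a codebase where one shared helper encodes edge cases (here, suspended users and bot accounts) that a newcomer might not know about.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    # Hypothetical user model; the field names are assumptions.
    id: int
    role: str  # e.g. "member", "admin", "bot"
    suspended: bool = False


def can_edit_settings(user: User) -> bool:
    """Hypothetical shared helper: the prior art every endpoint calls."""
    if user.suspended:
        return False
    if user.role == "bot":  # special user type handled in one place
        return False
    return user.role == "admin"


def update_billing(user: User) -> str:
    """Consistent endpoint: reuses the shared helper, ugly or not."""
    if not can_edit_settings(user):
        return "403 Forbidden"
    return "200 OK"


def update_billing_isolated(user: User) -> str:
    """Inconsistent endpoint: a hand-rolled check that looks cleaner
    but silently skips the suspension rule."""
    if user.role != "admin":
        return "403 Forbidden"
    return "200 OK"


suspended_admin = User(id=1, role="admin", suspended=True)
print(update_billing(suspended_admin))           # 403 Forbidden
print(update_billing_isolated(suspended_admin))  # 200 OK
```

The isolated version is shorter and arguably tidier, but it lets a suspended admin through. It also means any later change to the permission rule has to be made in two places instead of one.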
The author also defends large codebases against the "split it into microservices" crowd. Large established codebases produce 90% of the value at big tech companies: all the productizing code (settings, billing, enterprise features) lives there even when core features are in elegant services. You cannot redesign these systems from first principles because too many accidental details support tens of millions of dollars in revenue. Teams that successfully split up large codebases were already fluent at shipping features inside them first.