
Greg Young on CQRS and Event Sourcing (Code on the Beach 2014)

Event sourcing isn't new tech—it's how every mature industry (banking, accounting, insurance) has worked for centuries: store immutable facts, derive current state, never lose information, and separate how you read from how you write.

· software engineering
Summary

• Your bank balance isn't a column—it's the sum of all transactions. Current state should always be a derivative of immutable facts, not the source of truth
• Every structural model loses information through updates/deletes. Event sourcing is the only model that preserves everything—you can time-travel to any point and see what reports would have shown
• CQRS is mandatory with event sourcing because querying events directly is terrible. Build separate, eventually-consistent read models (projections) optimized for different query patterns
• Real business value: provable audit logs (can't be corrupted), deterministic debugging (replay to any state), and the ability to answer questions about data you "should have been tracking" retroactively
• Scale reads independently by making copies of projections. Most systems do 100x more reads than writes—stop optimizing for write performance with normalized schemas

Greg Young argues that event sourcing—storing immutable facts rather than current state—is how every mature industry actually works, and for good reason. Your bank balance isn't a column in a table; it's the sum of all transactions. If it were just a column, disputes would be resolved with "the column says $103, that's what you have." Instead, you can add up the facts and verify correctness. This same principle applies to any business domain: current state should be a transient derivative of immutable events, not the source of truth. The key insight is that structure changes more often than behavior—your use cases are stable, but how you model internal data changes constantly. By storing behavior (events) instead of structure, you can refactor without migration scripts and run multiple versions side-by-side.
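The "balance is a derivative, not a column" idea can be shown in a few lines. This is a minimal sketch, not Young's code; the event shapes are hypothetical, but the core pattern is real: current state is a left-fold over immutable facts.

```python
from functools import reduce

# Hypothetical event shapes for illustration: (type, amount) tuples.
events = [
    ("Deposited", 100),
    ("Withdrew", 40),
    ("Deposited", 43),
]

def apply(balance, event):
    """Derive the next state from one immutable fact."""
    kind, amount = event
    return balance + amount if kind == "Deposited" else balance - amount

# Current state is a transient derivative: a left-fold over the facts.
balance = reduce(apply, events, 0)
print(balance)  # 103 -- and every fact behind that number is auditable
```

Because the events never change, refactoring here means changing `apply` (behavior), not migrating stored data (structure): the same facts can be re-folded by any number of model versions side by side.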

The business case is stronger than the technical one. Every structural model loses information through updates and deletes—each one is a decision about what data is valuable, made without consulting your CEO or predicting future needs. With event sourcing, when the business asks for a report on "items removed from cart within 5 minutes of checkout," you don't just get current data—you can time-travel and see what that report would have shown on any date in history. You get deterministic debugging by loading state at event N and stepping through exactly what happened in production. You can rerun every command your system ever processed as smoke tests. And you get provable audit logs by writing events to WORM drives—defeating super-user attacks in which a rogue admin manipulates data.
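Time-travel falls out of replay almost for free: rebuild state using only the events that existed as of some cutoff. The sketch below uses hypothetical cart events with timestamps (not Young's schema) to show the mechanism.

```python
from datetime import datetime

# Hypothetical cart events, stored in occurrence order.
events = [
    {"type": "ItemAdded",   "item": "book", "at": datetime(2014, 1, 1)},
    {"type": "ItemRemoved", "item": "book", "at": datetime(2014, 1, 2)},
    {"type": "ItemAdded",   "item": "pen",  "at": datetime(2014, 1, 3)},
]

def cart_as_of(events, cutoff):
    """Rebuild the cart exactly as it existed at `cutoff`."""
    items = set()
    for e in events:
        if e["at"] > cutoff:
            break  # ignore facts that hadn't happened yet
        if e["type"] == "ItemAdded":
            items.add(e["item"])
        elif e["type"] == "ItemRemoved":
            items.discard(e["item"])
    return items

print(cart_as_of(events, datetime(2014, 1, 1)))  # {'book'}
print(cart_as_of(events, datetime(2014, 1, 3)))  # {'pen'}
```

The same stop-at-event-N trick gives deterministic debugging: load state at event N, then step through the next events exactly as production saw them.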

CQRS (Command Query Responsibility Segregation) becomes mandatory because event sourcing is terrible for queries. Finding "all users named Greg" by replaying every user's event stream amounts to an O(n) scan of every stream—the equivalent of a full table scan. The solution: build separate read models (projections) that are eventually consistent. Most systems do 1-2 orders of magnitude more reads than writes, yet people optimize for write performance with third normal form. Instead, use multiple specialized projections—a document DB for UI screens, OLAP cubes for reporting, Lucene for search. Each projection is just a left-fold over the event stream. You can delete and rebuild them anytime, add new ones by replaying from event zero, and scale reads linearly by making copies. Snapshots (memoization of the left-fold) handle high-volume streams, but avoid them until you hit 1000+ events per aggregate. The fundamental lesson: one model cannot do everything well—embrace polyglot persistence and optimize each model for its specific job.
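A projection really is just a left-fold into a query-optimized shape. The sketch below (hypothetical event shapes, in-memory index standing in for a real read store) turns the "all users named Greg" question from a replay of every stream into a single dictionary lookup.

```python
from collections import defaultdict

# Hypothetical user events for illustration.
events = [
    {"type": "UserCreated", "id": 1, "name": "Greg"},
    {"type": "UserCreated", "id": 2, "name": "Ada"},
    {"type": "UserRenamed", "id": 2, "name": "Greg"},
]

def project(events):
    """Left-fold the event stream into a name -> user-ids index."""
    by_name = defaultdict(set)
    current = {}  # current name per user id
    for e in events:
        if e["type"] in ("UserCreated", "UserRenamed"):
            old = current.get(e["id"])
            if old is not None:
                by_name[old].discard(e["id"])  # name changed: re-index
            current[e["id"]] = e["name"]
            by_name[e["name"]].add(e["id"])
    return by_name

index = project(events)       # disposable: rebuild anytime from event zero
print(sorted(index["Greg"]))  # [1, 2]
```

Because the index is derived entirely from the events, it is disposable: delete it, change `project` to suit a new query pattern, and replay from event zero. Scaling reads means running more copies of the same fold.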