
Distributed Elixir Made Simple - Johanna Larsson | ElixirConf EU 2025

Distributed Elixir doesn't require consensus algorithms or PhDs—if you're okay with occasional data inconsistency (rate limiting, caching), you can build distributed features using the same message-passing primitives you already know, keeping your stack simple.


• Message passing works the same across clustered nodes as it does locally—same send/receive semantics, no HTTP clients or gRPC needed
• The safe distributed features checklist: Is it okay if data is lost/wrong/inaccessible? If yes to all three, you can skip the distributed systems complexity
• Hash rings provide consistent key-to-node routing that survives nodes joining/leaving, enabling distributed caches with just two added lines of code
• Clever persistence pattern for rolling deploys: new nodes ask all old nodes for their full state on startup—inelegant but simple and works
• Real war story: swoosh's global process registration caused cascading crashes during deployments when the leader node kept shutting down

The talk challenges the perception that distributed Elixir requires deep expertise by showing there's a sweet spot of practical use cases where clustering is straightforward. The key insight: message passing in Elixir works identically whether processes are on the same node or on different servers—you just need a destination and a message. This is fundamentally simpler than reaching for HTTP APIs, Redis, or gRPC when scaling, each of which introduces its own serialization, error handling, and monitoring overhead.
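A minimal sketch of that insight (module and node names here are illustrative, not from the talk): a GenServer registered by name answers a local call, and the exact same API would reach it on a remote clustered node by swapping the destination for a `{name, node}` tuple.

```elixir
defmodule Counter do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, 0, name: __MODULE__)

  @impl true
  def init(count), do: {:ok, count}

  @impl true
  def handle_call(:increment, _from, count), do: {:reply, count + 1, count + 1}
end

{:ok, _pid} = Counter.start_link([])

# Local call:
1 = GenServer.call(Counter, :increment)

# Remote call to the same registered name on another clustered node —
# identical semantics, only the destination changes (node name hypothetical):
# GenServer.call({Counter, :"app@other-host"}, :increment)
```

No serialization layer, HTTP client, or retry middleware appears anywhere: Erlang's distribution handles transport, and the calling code is unchanged.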

The speaker provides a three-question checklist for identifying safe distributed features: Is it okay if some data is lost? Is it okay if some data is wrong? Is it okay if you can't access the data? If you answer yes to all three, you can build distributed functionality without consensus algorithms. This applies perfectly to rate limiting (occasional extra requests are fine), caching (stale data is acceptable), and custom reporting (recent data is good enough). The implementation uses hash rings for consistent key-to-node routing—when nodes join or leave, most keys still route to the same place. Adding distributed caching to a local cache requires just two lines: check which node owns the key, then send the message there.
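To make the routing step concrete, here is a toy hash ring (my own sketch, with no virtual nodes — real implementations use a library): each node is hashed onto a ring of 2^32 points, and a key routes to the first node clockwise from its own hash, so most keys keep their owner when the node list changes.

```elixir
defmodule TinyRing do
  @ring_size 4_294_967_296  # 2^32 points on the ring

  # Return the node that owns `key` given the current node list.
  def owner(key, nodes) do
    key_point = :erlang.phash2(key, @ring_size)

    nodes
    |> Enum.map(fn node -> {:erlang.phash2(node, @ring_size), node} end)
    |> Enum.sort()
    |> Enum.find(fn {point, _node} -> point >= key_point end)
    |> case do
      {_point, node} -> node
      # Wrapped past the top of the ring: take the lowest node point.
      nil -> Enum.min_by(nodes, &:erlang.phash2(&1, @ring_size))
    end
  end
end

nodes = [:"app@host-a", :"app@host-b", :"app@host-c"]
owner = TinyRing.owner("user:42", nodes)

# The "two added lines" from the talk, conceptually: find the owner,
# then send the cache request there instead of handling it locally —
# e.g. GenServer.call({MyCache, owner}, {:get, "user:42"}).
true = owner in nodes
```

Because ownership depends only on hash positions, removing one node reassigns only the keys that node owned; everything else keeps routing to the same place.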

For persistence across rolling deploys, the speaker shares an "inelegant but simple" pattern: new nodes ask all existing nodes for their complete state on startup, guaranteeing the new deployment has everything. The talk concludes with a debugging story about swoosh's global process registration causing cascading crashes—when the global leader shut down during deployment, all linked processes crashed and raced to become the new leader, but if the winner was also shutting down, the cycle repeated until supervisor restart limits were hit, taking down entire nodes.
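The "ask everyone on startup" pattern can be sketched as follows (module names and the `:dump` message are my own, hypothetical choices): on `init`, a cache process multi-calls every currently connected node for its full state and merges the replies before serving traffic.

```elixir
defmodule HandoverCache do
  use GenServer

  def start_link(_), do: GenServer.start_link(__MODULE__, %{}, name: __MODULE__)

  @impl true
  def init(state) do
    # Ask all currently connected (old) nodes for their complete state.
    # multi_call returns {replies, bad_nodes}; unreachable nodes are skipped,
    # which is fine under the "okay to lose some data" checklist.
    {replies, _bad_nodes} =
      GenServer.multi_call(Node.list(), __MODULE__, :dump, 5_000)

    merged =
      Enum.reduce(replies, state, fn {_node, remote_state}, acc ->
        Map.merge(acc, remote_state)
      end)

    {:ok, merged}
  end

  @impl true
  def handle_call(:dump, _from, state), do: {:reply, state, state}
  def handle_call({:put, k, v}, _from, state), do: {:reply, :ok, Map.put(state, k, v)}
  def handle_call({:get, k}, _from, state), do: {:reply, Map.get(state, k), state}
end
```

On a lone node `Node.list/0` is empty and the merge is a no-op, so the same code works in development and in a cluster; the inelegance is that every new node pulls every old node's entire state, which is exactly the simple/correct trade-off the talk endorses.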