
How to Rack 30 Petabytes of Storage

A small AI lab built 30PB of storage for $354k/year instead of AWS's $12M/year by questioning whether ML training data needs enterprise reliability—spoiler: it doesn't.

infrastructure

• ML training data doesn't need AWS's "13 nines" reliability—you can lose 5% of training data with minimal impact, so 2 nines is plenty
• They achieved $1/TB/month vs AWS's $38/TB/month and Cloudflare's $10/TB/month by racking used enterprise drives in a San Francisco colocation center
• Kept software obsessively simple: 200 lines of Rust + nginx + SQLite instead of Ceph/MinIO, which would require dedicated specialists to debug
• Threw a 36-hour "Storage Stacking Saturday" party where friends helped rack 2,400 hard drives in exchange for food and engraved drives
• Includes complete parts list and setup guide: 100 DS4246 chassis, used Intel servers, Arista switch, specific HBA/NIC recommendations

This is a detailed case study of how an AI research lab built 30 petabytes of storage for video training data at 1/40th the cost of AWS. The core insight: ML training data is a commodity that doesn't need enterprise-grade reliability. Since losing any 5% of pretraining data has minimal impact, they don't need AWS's "13 nines" of reliability—2 nines works fine. This single requirement difference unlocks massive cost savings that cloud providers can't match because their pricing assumes everyone needs maximum redundancy.
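The two-nines argument can be sanity-checked with back-of-envelope arithmetic. Assuming an annualized failure rate (AFR) of about 2% for used enterprise drives (an assumed figure, not one from the article) and no redundancy, the expected fraction of data lost per year stays well under the 5% tolerance:

```python
# Back-of-envelope reliability check with ASSUMED numbers:
# ~2% annualized failure rate (AFR) for used enterprise drives, no redundancy.
drives = 2400
afr = 0.02  # assumed AFR; losing a drive loses its data outright

expected_failures_per_year = drives * afr  # ~48 drives
expected_data_lost = afr                   # fraction of total data, drives being uniform

print(f"expected drive failures/year: {expected_failures_per_year:.0f}")
print(f"expected data lost/year: {expected_data_lost:.1%} (tolerance: 5%)")
```

Even at double that AFR, annual loss stays under the 5% threshold, which is why no redundancy layer is needed at all.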

The technical approach prioritized radical simplicity over features. Instead of Ceph (which requires dedicated specialists to debug) or MinIO (which adds S3-compatibility overhead they didn't need), they wrote 200 lines of Rust for writes, used nginx for reads, and tracked metadata in SQLite. No redundancy, no sharding, just XFS-formatted drives that roughly saturate their 100Gbps connection. Key lessons: frontloaders were a mistake (they had to screw in 2,400 drives individually), daisy-chaining chassis bottlenecked performance (each chassis should have had its own HBA), and having the datacenter a few blocks from their office was worth the premium for hands-on debugging access. They provide a complete replication guide including specific part numbers, vendor recommendations, and networking setup tips.
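The "roughly saturate their 100Gbps connection" claim is easy to sanity-check. Assuming ~150 MB/s sequential reads per used SATA drive (an illustrative figure, not from the article), only a small fraction of the 2,400 drives need to stream concurrently to fill the link:

```python
# Rough throughput arithmetic behind "saturate a 100Gbps connection".
# Per-drive speed is an ASSUMED figure for used SATA drives on sequential reads.
link_gbps = 100
link_bytes_per_s = link_gbps / 8 * 1e9   # 12.5 GB/s
drive_bytes_per_s = 150e6                # assumed ~150 MB/s sequential per drive

drives_needed = link_bytes_per_s / drive_bytes_per_s
print(f"link capacity: {link_bytes_per_s / 1e9:.1f} GB/s")
print(f"drives streaming concurrently to saturate it: ~{drives_needed:.0f} of 2400")
```

This also illustrates why daisy-chaining hurt: funneling many chassis through one HBA caps aggregate drive bandwidth well below what the network link can carry.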

The broader implication: small teams can compete with frontier labs by questioning default assumptions about infrastructure. When your requirements genuinely differ from enterprise defaults, cloud pricing becomes irrational. The article includes full cost breakdowns ($17.5k/month recurring + $12k/month depreciation), failure modes they encountered, and a parts list for anyone wanting to replicate this. They even got rate-limited by Cloudflare's R2 metadata layer during large training runs, confirming their scale genuinely needed custom infrastructure.
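The quoted figures are internally consistent: recurring costs plus depreciation reproduce the $354k/year and roughly $1/TB/month cited at the top of the article.

```python
# Checking the article's cost figures against each other.
recurring = 17_500     # $/month (colo, power, bandwidth)
depreciation = 12_000  # $/month (hardware amortization)
capacity_tb = 30_000   # 30 PB expressed in TB

monthly = recurring + depreciation
yearly = monthly * 12
per_tb = monthly / capacity_tb
print(f"${yearly:,}/year, ${per_tb:.2f}/TB/month")
# → $354,000/year, $0.98/TB/month
```

At $38/TB/month, the same 30PB on AWS runs to eight figures a year, which is the roughly 40x gap the article is built around.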