How Vercel Cut Build Wait Times From 90 Seconds To 5
Vercel achieved an 18x speedup by accepting a harder constraint—treating every build as potentially malicious—and building from scratch with Firecracker microVMs instead of containers, then stacking three optimizations on top.
TLDR
• Vercel's Hive platform uses Firecracker microVMs (125ms boot, few MB memory) instead of containers because containers share a kernel—one exploit could reach every customer's build on the same machine
• The 90s→5s improvement came from three stacked optimizations: local image caching (which cut 45s from cold starts) combined with block device snapshotting, warm pools of pre-booted cells (which eliminate cold starts in the common case), and Firecracker's millisecond boot times
• Each build runs in an ephemeral "cell" (one microVM, one container, one build) that's destroyed after completion—reusing would be faster but creates a security leak path
• Warm pools burn substantial money on idle compute, but that tradeoff lets most builds start with no wait, with the 5s provisioning time applying only when the pool is empty; owning the platform also unlocked product features like enhanced build machines and Secure Compute
• The lesson: threat model drives architecture—cooperative tenants can use containers, but hostile multi-tenancy requires microVMs or sandboxed runtimes, and starting with the right primitive unlocks optimizations you can't retrofit later
In Detail
Vercel's Hive platform treats every customer build as potentially malicious code running on shared hardware—a constraint called hostile multi-tenancy. Containers were inadequate because they share the same Linux kernel; one kernel exploit in a customer's build could reach every other build on the machine. Traditional VMs provide kernel-level isolation but take 30-60 seconds to boot and consume hundreds of megabytes just to exist. Vercel chose Firecracker microVMs, which boot in 125ms and use only a few MB of memory while providing hardware-enforced isolation with separate kernels. This foundation made the speed optimizations possible.
The 18x improvement came from three compounding layers. First, faster boots: Vercel caches the build container image locally (shaving 45 seconds off cold starts) and uses block device snapshotting to start new cells from a saved known-good state instead of building from scratch. Second, warm pools: Vercel keeps cells pre-booted with the build container loaded, so most builds start immediately with zero wait. The 5-second provisioning time only applies when the warm pool is empty during traffic spikes. Third, Firecracker's baseline speed makes warm pools practical at scale—traditional VMs boot too slowly to make this work.
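The warm-pool behavior described above, together with the single-use cell rule from the TLDR, can be sketched roughly as follows. This is a minimal illustration, not Vercel's implementation: the names `Cell`, `WarmPool`, `acquire`, and `run_build` are hypothetical, and booting a cell is reduced to creating a Python object rather than restoring a Firecracker microVM from a snapshot.

```python
import itertools
from collections import deque
from dataclasses import dataclass, field

_cell_ids = itertools.count(1)


@dataclass
class Cell:
    """Hypothetical model of a cell: one microVM, one container, one build."""
    cell_id: int = field(default_factory=lambda: next(_cell_ids))
    destroyed: bool = False

    def run(self, build_id: str) -> str:
        if self.destroyed:
            raise RuntimeError("cells are single-use and this one is destroyed")
        return f"build {build_id} ran in cell {self.cell_id}"

    def destroy(self) -> None:
        self.destroyed = True


class WarmPool:
    """Keeps a target number of pre-booted cells ready (illustrative only)."""

    def __init__(self, target_size: int) -> None:
        self.target_size = target_size
        self.cold_starts = 0
        self._ready: deque = deque()
        self.refill()

    def _boot_cell(self) -> Cell:
        # Stand-in for booting a microVM from a cached image or snapshot.
        return Cell()

    def refill(self) -> None:
        # Top the pool back up to its target size.
        while len(self._ready) < self.target_size:
            self._ready.append(self._boot_cell())

    def ready_count(self) -> int:
        return len(self._ready)

    def acquire(self) -> Cell:
        if self._ready:
            return self._ready.popleft()  # warm hit: no provisioning wait
        self.cold_starts += 1             # pool empty: pay the cold-start cost
        return self._boot_cell()


def run_build(pool: WarmPool, build_id: str) -> str:
    cell = pool.acquire()
    try:
        return cell.run(build_id)
    finally:
        # The cell is destroyed, never handed back to the pool, so no state
        # from one customer's build can leak into the next.
        cell.destroy()
```

With a pool of two cells, a burst of three builds would be served by two warm hits and one cold start, after which `refill()` restores the pool. The key design point is that `run_build` never returns a used cell to the pool: speed comes from pre-booting fresh cells, not from reuse.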
The architecture has real costs. Warm pools burn money on idle compute, and building Hive from primitives required enormous engineering investment versus using Kubernetes or ECS. But owning the substrate gave Vercel leverage to ship features like enhanced build machines and Secure Compute that would be difficult to retrofit onto a third-party platform. The broader lesson: threat model determines architecture, and sometimes accepting more complexity at the foundation unlocks capabilities you can't add later. Vercel got faster by choosing the harder path first.
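To make the warm-pool tradeoff concrete, here is a rough back-of-the-envelope sketch. The 5-second and zero-wait figures come from the article; everything else, including the function names, the hit rate, the pool size, and the utilization figure, is a hypothetical input chosen for illustration rather than data from Vercel.

```python
COLD_PROVISION_SECONDS = 5.0  # article: provisioning time when the pool is empty
WARM_PROVISION_SECONDS = 0.0  # article: warm hits start with no wait


def expected_wait_seconds(warm_hit_rate: float) -> float:
    """Average provisioning wait for a given fraction of warm-pool hits."""
    if not 0.0 <= warm_hit_rate <= 1.0:
        raise ValueError("warm_hit_rate must be between 0 and 1")
    miss_rate = 1.0 - warm_hit_rate
    return (warm_hit_rate * WARM_PROVISION_SECONDS
            + miss_rate * COLD_PROVISION_SECONDS)


def idle_cell_hours(pool_size: int, hours: float, utilization: float) -> float:
    """Cell-hours spent waiting in the pool (simplified, hypothetical model).

    `utilization` is the assumed fraction of pooled capacity that ends up
    serving builds; the remainder is idle compute that is still paid for.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return pool_size * hours * (1.0 - utilization)
```

Under these assumptions, a 90% warm hit rate would give an average wait of about half a second, while a pool of 10 cells at 75% utilization would idle for 60 cell-hours per day. A larger pool raises the hit rate and the idle bill together, which is the cost the article describes.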