AI Agent Claude Plays RollerCoaster Tycoon: Lessons for B2B SaaS
Ramp modded Claude Code into RollerCoaster Tycoon to test AI agents in a B2B SaaS proxy environment, discovering that agent success depends less on intelligence and more on interface legibility—a lesson that applies directly to modern business software.
Coding Agents are a tremendous tailwind; build sails
TLDR
• Claude excels at information synthesis (analyzing 100+ park metrics) and "pulling digital levers" (pricing, hiring, configs) but fails at spatial tasks (pathways, rollercoaster placement)—revealing that agents automate diligence, not intelligence
• RCT is secretly a B2B SaaS simulator with customer metrics, financial dashboards, and operational feedback loops, making it a better AI testbed than Minecraft or StarCraft
• Ramp "vibe-coded" the entire mod with four parallel Claude Code instances over 40 hours, creating rctctl (a kubectl-like CLI) and a JSON-RPC layer for Claude to control the game
• Key insight: environment legibility is the limiting factor for agents, not model capability—clean information surfaces and strong interfaces matter more than raw AI power
• Managing multiple coding agents felt like "a management simulation game," foreshadowing how programming itself becomes an orchestration task
In Detail
Ramp's experiment putting Claude Code into RollerCoaster Tycoon reveals a critical insight about AI agents: the bottleneck isn't intelligence, it's interface design. They chose RCT because it's actually a B2B SaaS simulator disguised as a game—complete with customer satisfaction metrics, financial dashboards, workforce management, and operational feedback loops. Unlike Minecraft (no capitalism) or StarCraft (not customer-centric), RCT mirrors the exact environment where Ramp and its customers operate.
Claude's performance split cleanly along interface quality lines. It excelled at information-dense tasks: scanning 100+ data points across park financials, ride breakdowns, and guest complaints to generate CFO reports and prioritized task lists. It reliably handled "digital lever" operations like pricing adjustments, hiring staff, and launching marketing campaigns. But it struggled badly with spatial reasoning—placing pathways, connecting ride entrances, and positioning large rollercoasters—because the ASCII map representations were fundamentally awkward interfaces for spatial tasks. The pattern is clear: agents automate diligence (processing clean information surfaces) rather than intelligence (navigating ambiguous spatial environments).
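The "diligence" side of that split, scanning many metrics and emitting a prioritized task list, can be pictured as a small triage function over a park snapshot. A minimal Python sketch; the data shape, metric names, and thresholds below are all invented for illustration (the actual mod surfaces 100+ metrics):

```python
# Hypothetical park snapshot; the real mod exposes far richer state.
snapshot = {
    "cash": 12_400,
    "rides": [
        {"name": "Loopy Coaster", "uptime": 0.62, "ticket_price": 4.0},
        {"name": "Haunted House", "uptime": 0.97, "ticket_price": 2.5},
    ],
    "guest_complaints": {"too_expensive": 31, "paths_dirty": 9, "lost": 2},
}

def prioritize(park: dict) -> list[str]:
    """Rank operational issues by a simple severity heuristic (illustrative only)."""
    tasks = []
    # Rides breaking down often are urgent: weight by how much uptime is lost.
    for ride in park["rides"]:
        if ride["uptime"] < 0.8:
            tasks.append((1 - ride["uptime"], f"Dispatch mechanic to {ride['name']}"))
    # Widespread guest complaints are next: weight by complaint volume.
    for complaint, count in park["guest_complaints"].items():
        if count >= 10:
            tasks.append((count / 100, f"Address guest complaint: {complaint}"))
    return [task for _, task in sorted(tasks, reverse=True)]

print(prioritize(snapshot))
# Most severe first: the 62%-uptime coaster, then the price complaints.
```

The point of the sketch is that this kind of work is legible: clean key-value state in, ranked actions out, which is exactly where the agent excelled. No equivalent clean surface existed for the spatial tasks.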
The technical implementation itself demonstrated coding agents' current capabilities. They "vibe-coded" the entire mod using four parallel Claude Code instances over 40 hours, building rctctl (a kubectl-inspired CLI), a JSON-RPC layer, and ASCII map rendering. The biggest bottleneck wasn't coding speed but feedback loops—agents couldn't QA their own work, requiring manual testing. The experience of orchestrating multiple coding agents felt "like a management simulation game," a meta-commentary on how programming is shifting from direct implementation to agent orchestration. The takeaway for B2B SaaS: don't wait for AGI to build agents. Start now with task-specific agents and clean interfaces—the models are already capable enough if your software is legible enough.
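The rctctl-plus-JSON-RPC design can be imagined as a thin CLI that maps kubectl-style subcommands onto RPC calls against the game. A hypothetical Python sketch; the real rctctl's subcommands and method names aren't documented here, so every identifier below is assumed:

```python
import argparse
import json

def build_parser() -> argparse.ArgumentParser:
    """kubectl-style verbs for park control (subcommands are illustrative)."""
    parser = argparse.ArgumentParser(prog="rctctl", description="control a running park")
    sub = parser.add_subparsers(dest="command", required=True)

    get = sub.add_parser("get", help="read park state")
    get.add_argument("resource", choices=["rides", "guests", "finances"])

    setp = sub.add_parser("set-price", help="pull a digital lever: adjust a ride's ticket price")
    setp.add_argument("ride_id", type=int)
    setp.add_argument("price", type=float)
    return parser

def to_rpc(args: argparse.Namespace) -> dict:
    """Map a parsed CLI command onto a JSON-RPC 2.0 request envelope."""
    if args.command == "get":
        return {"jsonrpc": "2.0", "id": 1,
                "method": f"park.get_{args.resource}", "params": {}}
    return {"jsonrpc": "2.0", "id": 1, "method": "park.set_ride_price",
            "params": {"ride_id": args.ride_id, "price": args.price}}

if __name__ == "__main__":
    # In the real mod this request would go to the game's RPC endpoint;
    # here we just print the envelope.
    print(json.dumps(to_rpc(build_parser().parse_args())))
```

The design choice this illustrates is the article's thesis in miniature: a small, predictable command surface with typed arguments is what makes the environment legible to an agent, independent of how smart the model is.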