
Choosing the Right Metric: A Guide to Percentiles and Averages | Product Blog • Sentry

Averages hide the performance issues that matter most—here's why p75 beats p95 for frontend apps and when each percentile actually tells you something useful.

· software engineering

• Averages mask outliers and only work when data is consistent—they're terrible for monitoring user experience on variable frontend apps
• p75 is the sweet spot for frontend (high variability), p95 for backend (uniform data), p99 for catching rare but impactful edge cases
• The right metric depends on your goal: p75/p95 for typical user experience, p99 for anomalies, averages for capacity planning
• Frontend percentile charts show steep climbs from p50 to p90; backend charts show gradual rises to p95 then sharp spikes to p99
• Wrong metrics = missed performance issues, poor user satisfaction, and longer resolution times

The core problem with performance monitoring is that averages lie when data is variable. If you're tracking page load times, a few users on slow networks can inflate your average and hide the fact that 95% of users have a great experience, or a healthy-looking average can hide a long tail of slow loads. Averages only work for uniform data such as server capacity monitoring, where a moving average is a reliable signal of system saturation.
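A quick sketch makes the failure mode concrete. The numbers below are hypothetical page load times, not data from the article: two slow-network outliers pull the mean far above what a typical user actually sees, while the median stays representative.

```python
import statistics

# Hypothetical page load times in ms: most users are fast,
# two are on very slow networks.
load_times = [180, 190, 200, 210, 220, 230, 240, 250, 6000, 9000]

mean = statistics.mean(load_times)      # pulled far upward by the two outliers
median = statistics.median(load_times)  # still reflects the typical user

print(f"mean:   {mean:.0f} ms")   # nowhere near any real user's experience
print(f"median: {median:.0f} ms")
```

Here the mean lands at 1672 ms, a value no single user actually experienced, while the median (225 ms) describes the typical session.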

Percentiles solve this by showing distribution. p50 (median) captures typical performance. p75 is the goldilocks metric for frontend apps because it balances central trends with the wide variability of user devices and networks. p95 works for backend apps with consistent data, capturing what 95% of users experience while highlighting bottlenecks. p99 marks extreme cases—useful for backend systems with uniform data, but in variable frontend contexts it just represents noise. The author provides visual examples: backend percentile charts show gradual rises from p25 to p95, then sharp spikes to p99; frontend charts show steeper climbs from p50 to p90 due to higher variability.
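A minimal nearest-rank percentile implementation shows the shape the article describes. The sample data is synthetic and chosen to mimic a high-variability frontend distribution: a dense cluster of typical loads, a moderate tail, and a few extreme stragglers.

```python
import math

def percentile(data, p):
    """Nearest-rank percentile: smallest value such that at least
    p% of the samples are less than or equal to it."""
    xs = sorted(data)
    k = max(0, math.ceil(p / 100 * len(xs)) - 1)
    return xs[k]

# Synthetic frontend-style load times (ms): 90 typical sessions,
# 8 slow ones, 2 extreme outliers.
samples = (
    [200 + i for i in range(90)]
    + [1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400]
    + [8000, 9000]
)

for p in (50, 75, 90, 95, 99):
    print(f"p{p}: {percentile(samples, p)} ms")
```

On this distribution, p50 and p75 stay in the typical range, then the curve climbs steeply through p95 and p99, the same steep frontend shape the charts illustrate; p99 is dominated by the two outliers, i.e., noise.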

The decision framework: Use p75 or p95 to understand majority user experience (p75 for frontend variability, p95 for backend consistency). Use p95 (frontend) or p99 (backend) to detect rare anomalies affecting small user subsets. Use averages for resource allocation and scalability planning during load spikes. The stakes are real—wrong metrics mean you miss performance issues, deliver poor user experiences, and take longer to resolve problems.
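The decision framework above is small enough to encode directly. This is a hypothetical helper (the names `choose_metric`, `app`, and `goal` are my own, not from the article) that maps app type and monitoring goal to the recommended metric:

```python
def choose_metric(app: str, goal: str) -> str:
    """Hypothetical lookup encoding the article's decision framework.

    app:  "frontend" (high variability) or "backend" (uniform data)
    goal: what you are trying to learn from the metric
    """
    table = {
        # Majority user experience: p75 tolerates frontend variability,
        # p95 suits consistent backend data.
        ("frontend", "typical_experience"): "p75",
        ("backend", "typical_experience"): "p95",
        # Rare anomalies affecting small user subsets.
        ("frontend", "rare_anomalies"): "p95",
        ("backend", "rare_anomalies"): "p99",
        # Resource allocation and scalability planning.
        ("frontend", "capacity_planning"): "average",
        ("backend", "capacity_planning"): "average",
    }
    return table[(app, goal)]

print(choose_metric("frontend", "typical_experience"))  # p75
print(choose_metric("backend", "rare_anomalies"))       # p99
```

The point of the table form is that the metric is a deliberate choice per goal, not a single default applied everywhere.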