← Bookmarks 📄 Article

Gemini 3.5 Flash: more expensive, but Google plan to use it for everything

Google's new Gemini 3.5 Flash costs 3-6x more than previous Flash models, yet they're deploying it to billions of free users—a bold bet that API customers will absorb price increases while consumer deployment drives ecosystem lock-in.

· ai ml
Read Original
Listen to Article
0:000:00
Summary used for search

• Gemini 3.5 Flash is 3x the price of 3 Flash Preview ($1.50/M input, $9/M output) and 6x more than 3.1 Flash-Lite, approaching Gemini 3.1 Pro pricing
• Despite the price hike, Google is rolling it out across all consumer products (Gemini app, Search, enterprise) for free—testing price elasticity with API customers while subsidizing consumer usage
• This follows industry trend: GPT-5.5 was 2x GPT-5.4's price, Claude Opus 4.7 is 1.46x more than 4.6
• Artificial Analysis benchmark reveals 3.5 Flash (high) actually costs MORE to run than the previous 3.1 Pro Preview
• All three major AI labs are simultaneously raising API prices while deploying expensive models to free users—a strategic bet on developer price tolerance

Google released Gemini 3.5 Flash with a significant price increase—$1.50/million input and $9/million output tokens, making it 3x more expensive than Gemini 3 Flash Preview and 6x more than 3.1 Flash-Lite. This puts it nearly on par with Gemini 3.1 Pro ($2/$12), yet Google is deploying it to "billions of people globally" across the Gemini app, AI Mode in Search, and all enterprise products. The model supports 1M input tokens and 65K output tokens with a January 2025 knowledge cutoff, though notably lacks computer use capabilities.

The pricing strategy reveals a broader industry pattern: OpenAI's GPT-5.5 doubled GPT-5.4's price, and Claude Opus 4.7 is roughly 1.46x more expensive than 4.6 when accounting for the new tokenizer. Artificial Analysis's proprietary benchmark data shows that running tests against 3.5 Flash (high) actually costs more than the previous 3.1 Pro Preview—a counterintuitive result given the "Flash" branding typically signals efficiency. Google is also introducing a new Interactions API (beta) that mirrors OpenAI's server-side history management approach.

The simultaneous price increases across all major AI labs while deploying expensive models to free consumer products suggests a deliberate strategy: subsidize consumer adoption to drive ecosystem lock-in while testing how much API customers will tolerate. It's a bet that developers and enterprises have higher price tolerance than previously assumed, and that widespread consumer deployment creates enough strategic value to justify the costs.