Google Cuts AI Ultra to $100, Launches Gemini 3.5 Flash at Half the Price of Frontier Rivals
Krasa AI
2026-05-19
5 minute read
Google Cuts AI Ultra to $100, Launches Gemini 3.5 Flash at Half the Price of Frontier Rivals
Google used its I/O 2026 keynote to do something the major AI labs have mostly avoided over the last 18 months: cut prices. The AI Ultra subscription drops from $250 to $200 a month at its top tier and gains a new $100 mid-tier. The new Gemini 3.5 Flash model lands at $1.50 input and $9.00 output per million tokens, undercutting comparable frontier models by close to half.
The pricing moves arrived alongside Gemini Spark and the intelligent eyewear lineup, but they may end up being the most consequential announcement of the day. AI consumer subscriptions and API pricing have crept upward all year. Today, Google flipped the direction.
The Subscription Reset
The Ultra changes are aggressive on both ends. The flagship Ultra plan — previously the only Ultra tier and priced at $250 a month — drops to $200 and keeps every existing benefit, including Veo 4 video generation, the highest Gemini usage caps, and 30TB of cloud storage.
The bigger shift is the new tier underneath. Ultra at $100 a month is aimed squarely at heavy users who don't want to manage their own API bill but were turned off by the $250 ceiling. It includes Gemini 3.1 Pro access, full Spark agent capability, and most of the headline features minus the highest usage caps.
For comparison, ChatGPT Pro is $200 a month and ChatGPT Business sits between Pro and the corporate tier. Anthropic's Claude Max maxes at $200. Google has now opened a $100 gap that didn't exist 24 hours ago.
Why this matters: pricing is the lever that converts category adoption into category leadership. Google's bet is that the agentic features users actually want — Spark, the new search agents, Imagen and Veo for creative — are now sticky enough that lowering the price pulls a year of pent-up demand into the funnel at once.
Gemini 3.5 Flash: Cheap and Fast Enough to Replace Pro
The other half of the announcement is a new model. Gemini 3.5 Flash sits between the existing Flash and Pro tiers and is priced at $1.50 per million input tokens and $9.00 per million output tokens. Google claims it runs roughly 4x faster than other frontier-class models and ships with a full 1M-token context window.
The pricing is more interesting than the speed claim. At $1.50 input, Flash 3.5 is 40% cheaper than Gemini 3.1 Pro ($2.50 input) and dramatically cheaper than the equivalent OpenAI and Anthropic frontier offerings. Google is explicitly positioning Flash 3.5 as the new default for production workloads where Pro is overkill.
It is also the model running underneath Gemini Spark. That choice is a tell: Google is betting that agentic workloads — high-volume, low-latency, mostly tool use rather than deep reasoning — should run on Flash-class models, not Pro-class ones. Every other lab will be re-running their internal cost math after this benchmark.
The catch is that Flash 3.5 is roughly 3x more expensive than the existing Gemini 3 Flash at $0.50 / $3. So while it undercuts the Pro and Opus tier, it cedes the absolute floor of the market to the previous-generation Flash. Google appears comfortable letting Flash 3 hold the budget tier while Flash 3.5 takes the mainstream production tier.
Industry Impact
For OpenAI, the pricing pressure is immediate. The GPT-4 successor lineup has held steady at premium pricing for two quarters, and the IPO road show has been built partly on revenue-per-user growth. A real cheaper-and-good-enough alternative inside the Gemini app — with Workspace and Android distribution already attached — pushes OpenAI toward either a matching cut or a stronger differentiation argument.
For Anthropic, the math is more nuanced. Claude's growth has been driven by enterprise contracts where price is rarely the deciding factor. But Anthropic also runs a consumer product, and Claude Pro at $20 a month plus Claude Max at $200 a month now look expensive relative to the $100 Ultra tier with comparable feature breadth.
For API customers, this is a clear win. Flash 3.5 sets a new price-performance reference point that the rest of the market has to respond to. Expect aggressive matching from OpenAI's GPT-5 tier and Anthropic's Haiku-class models within the next 60 days.
Expert Perspectives
Analyst reaction has trended positive. Constellation Research framed the launch as Google's "token cost and performance balance" play — a deliberate decision to commoditize the middle of the model market before competitors establish a premium-only foothold. API resellers like OpenRouter were simpler: Flash 3.5 is now the recommended default for most production workloads.
The contrarian view on X is that lower pricing reflects pressure, not strength. Google's enterprise AI revenue has lagged Anthropic's, and Ultra subscriber numbers have never been broken out. The price cut may be a tacit acknowledgment that demand at $250 was lower than hoped.
What to Watch
Three signals matter over the next quarter. First, whether OpenAI and Anthropic match the Flash 3.5 pricing within 90 days — that tells you whether the market has actually re-priced or just Google has. Second, the AI Ultra subscriber numbers, if Google chooses to disclose them at the next earnings call. Third, the third-party developer reaction: a real shift in default model recommendations from API resellers would confirm that the new Flash tier is now the production standard.
The Bottom Line
Google did two things at I/O that the rest of the market has been avoiding: cut subscription prices and ship a serious model at a serious discount. If the strategy works, AI Ultra becomes the new floor for consumer AI subscriptions and Flash 3.5 becomes the new default for production APIs. If competitors match within 60 days, the consumer AI market has officially entered its commoditization phase — which is good news for everyone except the labs that were counting on premium pricing to fund the next round of training runs.
Don't fall behind
Expert AI Implementation →Related Articles
NVIDIA Cosmos 3: First Open Physical AI Omnimodel Cuts Training Cycles to Days
NVIDIA's Cosmos 3 launches at Computex 2026 — a fully open foundation model that unifies vision, world generation, and action for robots and autonomous systems.
min read
Anthropic Adds Services Track and Partner Hub to Claude Network
Anthropic launches a 3-tier Services Track and a public Partner Hub. 40,000 firms have applied; 10,000 consultants are certified.
min read
Apoha Exits Stealth With $36M to Build 'Liquid Brain' AI for Materials
UK startup Apoha emerges with $36M Series A and a wild new data type: how materials vibrate in liquid. The pitch is AI for materials discovery.
min read