Blog/analysis

Google Moved the Gemini App Default Model. Google's Own Migration Doc Tells Developers It Costs More.

2026-05-19·8 min read·by gotnerfed

PENDING EVIDENCE: Archive snapshots of the four primary Google sources have not been captured at time of writing. Google PR has not been contacted. The Vendor Response section is provisional. Do not broadcast this post until both are resolved. Specific API pricing figures ($1.50 input / $9.00 output per 1M tokens) are widely reported by third-party trackers and Google Cloud Console screenshots, but as of fetch time these figures are not yet visible on ai.google.dev/gemini-api/docs/pricing - Google's own developer pricing page. We have intentionally not cited those figures as primary-source confirmed.

On May 19, 2026, Google announced Gemini 3.5 Flash at Google I/O 2026. The headline product story was a faster, more capable model with frontier-level performance on agentic and coding benchmarks. The headline distribution story was that the new model became the default behind the Gemini app and AI Mode in Search globally on the same day.

A line in Google's developer migration documentation is more interesting than either: "Gemini 3.5 Flash is more expensive than Gemini 3 Flash Preview." That sentence is the receipt.

The swap, in Google's own words

The Gemini 3.5 announcement post on Google's official blog is direct about the default-model change. From the announcement: "3.5 Flash is now the default model for the Gemini app and AI Mode in Search globally." The accompanying Search I/O 2026 post repeats it: "we're upgrading Search with Gemini 3.5 Flash - our newest Flash model delivering sustained frontier performance for agents and coding - as the new default model in AI Mode for everyone globally."

These are Google's words, on Google's blog, dated May 19, 2026. The new model became the default for the consumer Gemini app and the AI Mode default in Search the same day it was announced.

Sundar Pichai's I/O 2026 keynote post confirms the user scale: the Gemini app had 400 million monthly active users at I/O 2025, and as of May 19, 2026, it has surpassed 900 million. AI Mode in Search has surpassed 1 billion monthly active users. These are the surfaces where the default swap took effect.

The change is documented in our receipt at /changes/gemini-2026-05-19-flash-default-swap. What that receipt records: on May 18, the default model serving consumer Gemini app and AI Mode users was Gemini 3 Flash, the model Google launched in December 2025. On May 19, the default became Gemini 3.5 Flash, a different model with different reasoning characteristics, different output style, and (per Google's developer docs) different pricing implications for anyone paying for it.

No proactive user-facing notice accompanied the consumer swap. The launch was framed as a capability announcement, not a model-substitution notice. There is no email to existing users explaining that the model behind their Gemini app conversations has changed.

Google's developer docs say the new model is more expensive

The more important document is buried two levels deep in the developer documentation, at ai.google.dev/gemini-api/docs/interactions/whats-new-gemini-3.5. It's the "What's new in Gemini 3.5 Flash" migration guide for developers moving from the prior model.

The guide instructs developers to make a series of changes when migrating. The relevant line: "Update model name: gemini-3-flash-preview → gemini-3.5-flash. Review pricing. Gemini 3.5 Flash is more expensive than Gemini 3 Flash Preview. See the pricing page for details."

This is Google, in its own documentation, telling developers to switch off the old model and that the new model costs more. It is not a "tier added" framing. It is a migration instruction, with a cost warning attached. The same migration doc also notes that the default reasoning effort changed from high to medium for 3.5 Flash, meaning developers who switch will get less reasoning by default and need to manually configure higher levels to match prior behavior. The doc also notes that Computer Use is not supported in 3.5 Flash and that developers needing that capability should "continue using Gemini 3 Flash Preview."

Read in combination, the migration doc tells a clearer story than the consumer-facing announcement does. On the API side: the new model is the recommended default, costs more, has reduced default reasoning effort, and supports fewer features than the model it is replacing. We've filed the API-side change as a separate receipt at /changes/gemini-2026-05-19-3-5-flash-tier-added. At the moment Gemini 3 Flash Preview is still available on the API at its prior pricing, so the technical classification is tier-added rather than price-increase. But the migration doc itself signals where Google expects developers to land.

The pricing page hasn't caught up to the announcement

There is a wrinkle worth flagging because it shows up across multiple Google surfaces.

As of fetch time, the official Gemini API pricing page at ai.google.dev/gemini-api/docs/pricing does not list Gemini 3.5 Flash. It lists Gemini 3 Flash Preview at $0.50 per million input tokens and $3.00 per million output tokens, the December 2025 prices. The same gap exists on Google's own DeepMind model card page at deepmind.google/models/gemini/flash, which displays content describing Gemini 3 Flash, not the model Google just launched.

This isn't a conspiracy. It's a documentation lag. Big launches don't propagate to every page on the same day. But the lag means that the most authoritative Google-owned source for the new model's API pricing, as of right now, is the developer migration guide warning that the new model is "more expensive than Gemini 3 Flash Preview," without a number attached. Multiple third-party outlets and developer screenshots from the Google Cloud Console put the figure at $1.50 input and $9.00 output per million tokens, which would be three times the prior Flash pricing. We are noting that figure here because the secondary-source convergence is strong, but we are not treating it as primary-source-confirmed until Google updates its own pricing page or publishes the figure in an official press release. The receipt will be updated when that lands.

The fact that Google's pricing page hasn't been updated on launch day is itself a data point. The model is the new default on a billion-plus consumer surfaces, but the developer documentation describing its commercial terms isn't fully published.

Why this works on consumer surfaces

The consumer-side swap doesn't require user notice because the consumer-side product never exposed model versions in the first place. The Gemini app uses thinking-level labels and product names for its model picker, not direct version strings. Users select a level of reasoning, not a numbered model. When the model behind that label changes, the picker continues to display the same options. The product is the label. The model is the implementation. They are intentionally decoupled.

This is not unique to Google. Every major consumer AI app has shipped this architecture in the last two years. The architecture is what makes a same-day default swap possible without surfacing the change to users. The vendor controls both halves: the label that users select, and the model that label routes to. Decoupling them means the vendor can iterate the underlying model whenever it wants and the product UI looks unchanged.

The relevant question is when the architecture is exercised on this scale. Today's swap moved 900 million consumer Gemini app users and over a billion AI Mode users onto a different model behind the same product labels they had yesterday. The architecture has existed for some time. The exercise of it on this scale, on launch day, with no proactive user notice, is the receipt.

For paying subscribers, the question is sharper. Google AI Plus at $7.99 per month, AI Pro at $19.99 per month, and AI Ultra starting at $99.99 per month are sold on the basis of access tiers and rate limits, not specific model versions. The tier descriptions on Google's subscription page do not commit to a specific model. Whether your Pro tier queries today route to the same model they routed to yesterday is not user-inspectable from inside the app. We are not alleging that Pro routing shifted today. We are flagging that the architecture which made today's default swap invisible is the same architecture that would make any future routing change invisible.

The launch-week pricing playbook is now industry-wide

Today's event is the third launch-week pricing change we have documented in 2026. The pattern is becoming hard to ignore.

On April 2, 2026, OpenAI moved Codex pricing to token-based credits across ChatGPT Plus, Pro, Business, and new Enterprise plans. The change was documented in OpenAI's official help center on the day it shipped and expanded to remaining Enterprise plans on April 23. For consumer Plus and Pro users, the practical effect was a less predictable monthly cost on a feature many had been using under a flat-rate pricing assumption. The receipt is at /changes/openai-2026-04-codex-token-credits.

On April 15, 2026, Anthropic shipped a change that affected how API users could pin to specific Claude model versions. A Hacker News thread titled "Tell HN: Anthropic no longer allows you to fix to specific model version" surfaced the issue. Anthropic's documentation has since clarified that dated model IDs remain pinnable, but the user experience on April 15 was that the pinning behavior had changed without proactive notice. The receipt is at /changes/anthropic-2026-pin-model-removed.

Today, Google made Gemini 3.5 Flash the default for the consumer Gemini app and AI Mode in Search, with no in-app notice, while simultaneously publishing developer documentation that tells API users to migrate to a more expensive successor model.

Three vendors. Three different mechanisms. One common pattern. A new model or pricing structure ships on a launch day. The press cycle covers the capability story. Underneath, the surface that users actually depend on has been modified - silently on the consumer side, with developer warnings buried in migration documentation on the API side.

This is not a new playbook. SaaS products have changed implementations underneath stable feature labels for as long as SaaS has existed. What is new is the rate, the scale, and the fact that the implementation underneath is a foundation model. Swapping a foundation model changes reasoning patterns, refusal behavior, output style, latency distribution, and cost economics. None of those properties are exposed in the consumer product label. All of them affect downstream user workflows.

What to watch next

Three threads worth tracking.

First: the Gemini 3 Flash Preview deprecation timeline. As of today, the December 2025 model is still available on the API at its original pricing. Google's developer migration doc tells developers to switch, but does not announce a deprecation date. If Google publishes a deprecation notice for Gemini 3 Flash Preview within the next quarter, the API-side receipt converts from tier-added to a forced migration, and the analysis here changes accordingly. We are setting a 60-to-90-day watch on the deprecation notice.

Second: the Gemini 3.5 Pro release. Google's announcement says Pro is currently in internal use and will ship "next month." If 3.5 Pro lands at a higher price point than Gemini 3.1 Pro and becomes the new Pro-tier default in the same fashion, the line from today gets clearer. The pattern is: ship the new model, make it the default, leave the old model temporarily available at lower pricing, deprecate later.

Third: the consumer subscription tiers. Google's subscription page describes Plus, Pro, and Ultra tiers in terms of access levels and rate limits, not specific model versions. As Google continues to ship new Gemini models, the question of which model a paid subscriber's queries reach is increasingly opaque from the user side. That opacity is the surface where a future change would do the most damage with the least public scrutiny.

We will update both this post and the underlying receipts when Google PR responds, when Google's pricing page is updated to include 3.5 Flash, when the Gemini 3 Flash deprecation notice arrives, or when the 3.5 Pro release lands. Until any of those happen, the facts as documented stand: Google made a new model the default for the consumer Gemini app and AI Mode globally on May 19, 2026, and Google's own developer documentation tells the people paying for the model that the replacement costs more.