Gemini 3 Flash Outperforms Gemini 3 Pro and GPT-5.2 in These Key Benchmarks

Bonisiwe Shabane

Google today released its fast and cheap Gemini 3 Flash model, based on the Gemini 3 family released last month, looking to steal OpenAI’s thunder. The company is also making this the default model in the Gemini app and AI Mode in Search. The new Flash model arrives six months after Google announced Gemini 2.5 Flash and offers significant improvements. On benchmarks, Gemini 3 Flash outperforms its predecessor by a wide margin and matches the performance of other frontier models, like Gemini 3 Pro and GPT-5.2, on some measures. For instance, it scored 33.7% without tool use on Humanity’s Last Exam, a benchmark designed to test expertise across different domains. In comparison, Gemini 3 Pro scored 37.5%, Gemini 2.5 Flash scored 11%, and the newly released GPT-5.2 scored 34.5%.

On the multimodality and reasoning benchmark MMMU-Pro, the new model outscored all competitors with an 81.2% score. Google is making Gemini 3 Flash the default model in the Gemini app globally, replacing Gemini 2.5 Flash. Users can still choose the Pro model from the model picker for math and coding questions. Gemini 3 Flash delivers 90.4% on GPQA Diamond and 78% on SWE-bench Verified at $0.50 per million input tokens, a combination that matters for anyone planning AI infrastructure around the fastest frontier models. Google launched Gemini 3 Flash on December 17, 2025, delivering frontier-class performance at Flash-level speed and cost.

The model achieves 90.4% on GPQA Diamond and 78% on SWE-bench Verified while costing just $0.50 per million input tokens, roughly 6x cheaper than Claude Opus 4.5. For inference-heavy deployments, Gemini 3 Flash processes 218 tokens per second, outperforming GPT-5.1 (125 t/s) and DeepSeek V3.2 reasoning mode (30 t/s). Google released Gemini 3 Flash on December 17, 2025, one month after Gemini 3 Pro topped the LMArena leaderboard. The model combines Pro-grade reasoning with Flash-level latency and efficiency, targeting high-volume production workloads where cost and speed matter as much as capability. Gemini 3 Flash immediately became the default model in the Gemini app and AI Mode in Google Search, signaling Google's confidence in deploying frontier intelligence at consumer scale. The model outperforms Gemini 2.5 Pro across benchmarks while running 3x faster according to Artificial Analysis testing.
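To put those throughput figures in perspective, here is a minimal Python sketch that estimates wall-clock generation time for a fixed response length at each reported rate. The 1,000-token response size is an arbitrary assumption for illustration; only the tokens-per-second numbers come from the reporting above.

```python
# Rough wall-clock estimate for generating a fixed-length response
# at the throughput figures reported above (tokens per second).
THROUGHPUT_TPS = {
    "Gemini 3 Flash": 218,
    "GPT-5.1": 125,
    "DeepSeek V3.2 (reasoning)": 30,
}

OUTPUT_TOKENS = 1_000  # assumed response length, for illustration only

for model, tps in THROUGHPUT_TPS.items():
    seconds = OUTPUT_TOKENS / tps
    print(f"{model:>28}: ~{seconds:.1f}s for {OUTPUT_TOKENS} output tokens")
```

At these rates the Flash model returns a thousand-token answer in under five seconds, while a 30 t/s reasoning model needs over half a minute, which is the gap that keeps a model inside or outside an interactive feedback loop.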

In several benchmarks, it trades blows with GPT-5.2, the model OpenAI rushed out to counter Gemini 3 Pro. Gemini 3 Flash is now available in Kilo Code. Google released the model this week, positioning it differently from previous Flash variants. Instead of the usual “fast but less capable” tradeoff, Gemini 3 Flash delivers Pro-grade reasoning at Flash-level speed and cost. On SWE-bench Verified, it scored 78%, beating both Gemini 2.5 Pro and Gemini 3 Pro. Within 24 hours of release, Gemini 3 Flash hit the top 20 on the Kilo leaderboard, outranking models several times its price.

We ran it through the same three coding challenges we used in our comparisons of GPT-5.1, Gemini 3 Pro, Claude Opus 4.5, and GPT-5.2/Pro. TL;DR: Gemini 3 Flash scored 90% average across three tests while costing $0.17 total. That’s 7 points higher than Gemini 3 Pro (84.7%), 6x cheaper, and 3x faster. Gemini 3 Flash is built for agentic coding workflows and responsive applications where speed matters. At a quarter of Pro pricing, it completed our three tests for $0.17 total compared to $1.10 for Gemini 3 Pro, finishing in 2.5 minutes versus 9 minutes. To test the latest model from Google, we used the same three tests from our previous comparisons of GPT-5.1, Gemini 3 Pro, Claude Opus 4.5, and GPT-5.2/Pro.

Gemini 3 Flash is our latest model with frontier intelligence built for speed that helps everyone learn, build, and plan anything — faster. Google is releasing Gemini 3 Flash, a fast and cost-effective model built for speed. You can now access Gemini 3 Flash through the Gemini app and AI Mode in Search. Developers can access it via the Gemini API in Google AI Studio, Google Antigravity, Gemini CLI, Android Studio, Vertex AI and Gemini Enterprise. Today, we're expanding the Gemini 3 model family with the release of Gemini 3 Flash, which offers frontier intelligence built for speed at a fraction of the cost.
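For developers, a minimal call through the Gemini API might look like the sketch below, using the google-genai Python SDK. The model identifier "gemini-3-flash" is an assumption here; the exact released model string may differ, so check Google AI Studio for the current name.

```python
# Minimal sketch of calling the model via the Gemini API with the
# google-genai Python SDK. The model ID "gemini-3-flash" is assumed;
# verify the exact identifier in Google AI Studio before use.
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-3-flash",  # assumed model ID
    contents="Summarize the tradeoffs between Flash and Pro model tiers.",
)
print(response.text)
```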

With this release, we’re making Gemini 3’s next-generation intelligence accessible to everyone across Google products. Last month, we kicked off Gemini 3 with Gemini 3 Pro and Gemini 3 Deep Think mode, and the response has been incredible. Since launch day, we have been processing over 1T tokens per day on our API. We’ve seen you use Gemini 3 to vibe code simulations to learn about complex topics, build and design interactive games and understand all types of multimodal content. The funniest part of modern model launches is not the hype. It’s the moment the community does the math.

That moment hit hard on Dec 17, 2025, when Gemini 3 Flash showed up and people started asking the impolite question, “Why does the cheaper tier look like it’s dunking on the expensive one?” The “Code Black for OpenAI” memes were predictable. The more interesting reaction was quieter: engineers began rewriting their default choices. Not for ideology. For physics. If a model is fast enough to stay inside your feedback loop and smart enough to avoid constant babysitting, it becomes the thing you reach for first.

That’s the real story of Gemini 3 Flash. It’s not “a Flash model that’s kinda good.” It’s a distilled reasoning engine that makes the price-to-competence curve bend in a new direction. A lot of teams will stop treating Pro models as the default and start treating them as the escalation path. At a high level, this is Google’s efficiency play inside the broader Gemini 3 lineup: keep the multimodal foundation, keep strong reasoning, then optimize the whole package for speed and cost. The model card describes Gemini 3 Flash as a natively multimodal reasoning model built off the Gemini 3 Pro reasoning foundation, with “thinking levels” to control the quality, cost, and latency tradeoff. Those thinking levels, along with the pricing and throughput figures discussed below, are the specs that actually change product decisions.
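The “escalation path” idea can be made concrete with a small sketch: route every request to the Flash tier first and retry on the Pro tier only when the cheap attempt fails a quality gate. Everything below is hypothetical scaffolding for illustrating the pattern; call_model and looks_good are placeholder names, not part of any real SDK.

```python
# Hypothetical sketch of "Flash by default, Pro as the escalation path".
# call_model() and looks_good() are placeholders for your own inference
# call and quality check; they are not part of any real API.

def call_model(model: str, prompt: str) -> str:
    """Placeholder for a real inference call; returns a canned reply here."""
    return f"[{model}] draft answer to: {prompt}"

def looks_good(draft: str) -> bool:
    """Placeholder quality gate: a schema check, unit tests, or a verifier."""
    return len(draft.strip()) > 0

def answer_with_escalation(prompt: str) -> str:
    # Cheap, fast attempt first.
    draft = call_model("flash-tier-model", prompt)
    if looks_good(draft):
        return draft
    # Escalate to the Pro tier only when the cheap attempt fails the gate.
    return call_model("pro-tier-model", prompt)

print(answer_with_escalation("Refactor this function to remove global state."))
```

The design choice is the interesting part: the expensive model is no longer the default path, it is the retry path, so its cost is only paid on the fraction of requests the cheap model cannot handle.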

Google has just expanded its family of AI models with Gemini 3 Flash, a version designed to be extremely fast and inexpensive while maintaining the intelligence level of the recently introduced Gemini 3 models. The company describes it as “cutting-edge intelligence designed for speed,” and positions it as the default model in the Gemini app and in AI Mode of the search engine. This means that millions of users are already using this version without having to change any settings. Until now, the mental framework was simple: if you wanted speed and a good price, you chose a lightweight model; if you needed complex reasoning, you chose a “large” model such as Gemini 3 Pro. Gemini 3 Flash aims precisely to break that dichotomy: it preserves the Pro-level reasoning of Gemini 3, yet delivers response times three times faster than Gemini 2.5 Pro at a fraction of the cost. The technical key to Gemini 3 Flash lies in how it manages its “thinking.” The model can modulate how much work it does according to the task, allocating more internal reasoning to hard problems and less to simple ones.
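On the API side, this "thinking" budget is typically something developers can tune per request. The sketch below uses the google-genai SDK's ThinkingConfig, which is how the 2.5-series exposed the control; whether Gemini 3 Flash surfaces its thinking levels through this exact field is an assumption, so treat it as a pattern rather than the definitive interface.

```python
# Sketch of capping the model's internal "thinking" per request with the
# google-genai SDK. ThinkingConfig/thinking_budget is the 2.5-series
# mechanism; assuming Gemini 3 Flash exposes a similar control.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-3-flash",  # assumed model ID
    contents="Classify this support ticket as billing, bug, or feature request.",
    config=types.GenerateContentConfig(
        # Small budget: a simple task, so prioritize latency and cost.
        thinking_config=types.ThinkingConfig(thinking_budget=128),
    ),
)
print(response.text)
```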

This approach directly benefits users financially: on average, Flash uses 30% fewer tokens than Gemini 2.5 Pro on typical traffic while completing everyday tasks with high precision. For the same task it processes fewer “units of text,” making it more cost-effective, as monthly credits last longer. The official pricing is $0.50 per million input tokens and $3 per million output tokens (audio remains at $1 per million input tokens), which is significantly lower than the costs of the Pro models. A detailed comparison of OpenAI's GPT‑5 and Google's Gemini 3 Flash covers model features, token pricing, API costs, performance benchmarks, and real-world capabilities to help choose the right LLM. GPT‑5 is OpenAI’s most advanced and versatile model to date, launched on August 7, 2025. It manages reasoning, creative writing, coding, health queries, and visual comprehension within a unified system.

Equipped with intelligent routing and adjustable reasoning effort and verbosity, GPT‑5 delivers expert-level responses with reduced hallucinations and enhanced chain‑of‑thought transparency. Gemini 3 Flash is Google's frontier-speed model built to deliver strong multimodal and coding performance at a fraction of Gemini 3 Pro's cost. It combines advanced visual and spatial reasoning with faster response times, surpassing Gemini 2.5 Pro on many benchmarks while remaining highly efficient for production workloads. Gemini 3 Flash is four months newer than GPT‑5 and has a larger context window (1M vs. 400K tokens). Unlike GPT‑5, Gemini 3 Flash supports voice and video processing.

The AI wars continue to heat up. Just weeks after OpenAI declared a “code red” in its race against Google, the latter released its latest lightweight model: Gemini 3 Flash. This particular Flash is the latest in Google’s Gemini 3 family, which started with Gemini 3 Pro and Gemini 3 Deep Think. But while this latest model is meant to be a lighter, less expensive variant of the existing Gemini 3 models, Gemini 3 Flash is actually quite powerful in its own right. In fact, it beats out both Gemini 3 Pro and OpenAI’s GPT-5.2 in some benchmarks.

Lightweight models are typically meant for more basic queries, lower-budget requests, or lower-powered hardware. That means they’re often faster than the more powerful models, which take longer to process but can do more. According to Google, Gemini 3 Flash combines the best of both worlds, producing a model with Gemini 3’s “Pro-grade reasoning” and “Flash-level latency, efficiency, and cost.” While that likely matters most to developers, everyday users benefit too, since Flash is now the default model in the Gemini app. You can see these improvements in Google’s reported benchmarking stats for Gemini 3 Flash. In Humanity’s Last Exam, an academic reasoning benchmark that tests LLMs on 2,500 questions across over 100 subjects, Gemini 3 Flash scored 33.7% with no tools and 43.5% with search and code execution. Compare that to Gemini 3 Pro’s 37.5% and 45.8% scores, respectively, or OpenAI GPT-5.2’s scores of 34.5% and 45.5%.

In MMMU-Pro, a benchmark that tests a model’s multimodal understanding and reasoning, Gemini 3 Flash got the top score (81.2%), compared to Gemini 3 Pro (81%) and GPT-5.2 (79.5%). In fact, across the 21 benchmarking tests Google highlights in its announcement, Gemini 3 Flash has the top score in three: MMMU-Pro (tied with Gemini 3 Pro), Toolathlon, and MMMLU. Gemini 3 Pro still takes the number one spot on the most tests here (14), and GPT-5.2 topped eight tests, but Gemini 3 Flash is holding its own. Google notes that Gemini 3 Flash also outperforms both Gemini 3 Pro and the entire 2.5 series on the SWE-bench Verified benchmark, which tests a model’s coding-agent capabilities. Gemini 3 Flash scored 78%, while Gemini 3 Pro scored 76.2%, Gemini 2.5 Flash scored 60.4%, and Gemini 2.5 Pro scored 59.6%. (Note that GPT-5.2 scored the best of the models Google mentions in this announcement.) It’s a close race, especially when you consider this is a lightweight model scoring alongside these companies’ flagship models.

That might present an interesting dilemma for developers who pay to use AI models in their programs. Gemini 3 Flash costs $0.50 for every million input tokens (what you ask the model to do) and $3.00 for every million output tokens (the result the model returns from your prompt). Compare that to Gemini 3 Pro, which costs $2.00 per million input tokens and $12.00 per million output tokens, or GPT-5.2’s $3.00 and $15.00 costs, respectively. For what it’s worth, it’s not as cheap as Gemini 2.5 Flash ($0.30 and $2.50), or Grok 4.1 Fast for that matter ($0.20 and $0.50), but it does outperform those models in Google’s reported benchmarks. Google notes that Gemini 3 Flash uses 30% fewer tokens on average than 2.5 Pro, which will save on cost, while also being three times faster.
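To see what those per-token prices mean in practice, here is a small Python sketch that prices a hypothetical workload at the rates quoted above. The 200M-input / 50M-output monthly volume is an arbitrary assumption for illustration, and the sketch does not model the 30% token-efficiency difference Google cites.

```python
# Price a hypothetical monthly workload at the per-million-token rates
# quoted above: (input_price, output_price) in USD per 1M tokens.
PRICES = {
    "Gemini 3 Flash":   (0.50, 3.00),
    "Gemini 3 Pro":     (2.00, 12.00),
    "GPT-5.2":          (3.00, 15.00),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Grok 4.1 Fast":    (0.20, 0.50),
}

# Assumed workload, for illustration only.
INPUT_TOKENS_M = 200   # 200M input tokens per month
OUTPUT_TOKENS_M = 50   # 50M output tokens per month

for model, (in_price, out_price) in PRICES.items():
    cost = INPUT_TOKENS_M * in_price + OUTPUT_TOKENS_M * out_price
    print(f"{model:>16}: ${cost:,.2f}/month")
```

On those assumptions the same workload costs roughly four times more on Gemini 3 Pro and over five times more on GPT-5.2 than on Gemini 3 Flash, which is the kind of gap that makes the default-versus-escalation question worth asking.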
