Google reveals Gemini 2.5 Flash, its ‘most cost-efficient thinking model’

Just weeks after unveiling Gemini 2.5 Pro, Google is on to its next top-performing model.

On Thursday, the company released an "early version" of Gemini 2.5 Flash in preview in the Gemini API, AI Studio, and Vertex AI. The model has a knowledge cutoff of January 2025. It can take text, image, video, and audio prompts, and has a one-million-token context window.

Google says the new version expands on Flash 2.0 with improved reasoning, but "without compromising its renowned speed or cost." Reasoning models spend more time "thinking" (that is, interpreting a query) before responding, which results in more thorough and direct output that, ideally, aligns better with a user's needs than the output of earlier models that prioritize speed. Models that reason are also better equipped to deliver accurate results on multi-step problems or tasks.

"Gemini 2.5 Flash performs strongly on Hard Prompts in Chatbot Arena, second only to 2.5 Pro," Google notes in the announcement.

Referring to the new model as its most cost-efficient, Google notes that 2.5 Flash "allows developers to configure the amount of thinking it does to maximize performance." This gives developers a "thinking budget," or the ability to pay for reasoning only when they need it most. With reasoning on, the output cost jumps from 60 cents per one million tokens to $3.50.
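
To make the price gap concrete, here is a minimal Python sketch of the arithmetic. It uses only the two output prices quoted above; the function name, the flat per-token rate, and the sample token count are illustrative assumptions, not Google's billing logic.

```python
# Output prices quoted in the article, in US dollars per one million output tokens.
PRICE_PER_MILLION_REASONING_OFF = 0.60
PRICE_PER_MILLION_REASONING_ON = 3.50


def output_cost(output_tokens: int, reasoning: bool) -> float:
    """Estimate output cost in dollars, assuming one flat rate per token."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if reasoning:
        rate = PRICE_PER_MILLION_REASONING_ON
    else:
        rate = PRICE_PER_MILLION_REASONING_OFF
    return output_tokens / 1_000_000 * rate


if __name__ == "__main__":
    tokens = 2_000_000  # hypothetical output volume, chosen for illustration
    print(f"reasoning off: ${output_cost(tokens, reasoning=False):.2f}")
    print(f"reasoning on:  ${output_cost(tokens, reasoning=True):.2f}")
```

At these rates, the same volume of output costs a little under six times as much with reasoning on, which is the case for reserving it for prompts that need it.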

If developers don't give the model a budget, it determines the query's thinking needs itself by evaluating the request for complexity. For example, it can distinguish prompts with minimal reasoning needs, like "How many states are there in the US?", from multi-step math problems. Google notes that to replicate Flash 2.0 latency and cost, developers should set the budget to 0.
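
As a rough sketch of the two modes described above, the Python helper below builds a request body that either omits the budget (leaving the decision to the model) or sets it explicitly. The helper name is hypothetical, and the JSON field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) are assumptions about the shape of the Gemini REST API that should be checked against Google's documentation. Nothing is sent over the network.

```python
import json
from typing import Optional


def build_request(prompt: str, thinking_budget: Optional[int] = None) -> dict:
    """Build a request body; leave the budget out so the model decides itself."""
    body: dict = {"contents": [{"parts": [{"text": prompt}]}]}
    if thinking_budget is not None:
        if thinking_budget < 0:
            raise ValueError("thinking_budget must be non-negative")
        # Assumed field names; a budget of 0 is meant to turn reasoning off.
        body["generationConfig"] = {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        }
    return body


if __name__ == "__main__":
    # No budget given: the model evaluates the prompt's complexity on its own.
    print(json.dumps(build_request("How many states are there in the US?")))
    # Budget of 0: aims to match Flash 2.0 latency and cost, per the article.
    print(json.dumps(build_request("How many states are there in the US?", 0)))
```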

Gemini 2.5 Flash scored 12% on Humanity's Last Exam (HLE), a new benchmark offered as an alternative to industry tests that have become too easy for rapidly evolving models. This score outperformed competitor models, including Claude 3.7 Sonnet and DeepSeek R1, but not OpenAI's just-launched o4-mini, which came in at 14% on the test.

You can try Gemini 2.5 Flash in preview through the Gemini API in Google AI Studio and Vertex AI.
