The buzz around the imminent launch of Google Gemini 2 is heating up. According to a recent leak on X, Google is preparing to launch a new model: Gemini-2.0-Pro-Exp-0111.
On X, Google’s senior product manager Logan Kilpatrick posted, “AI is cool I guess,” in what seems like a subtle nod to OpenAI chief Sam Altman.
The new model is expected to appear under the Advanced section, though it’s unclear whether it’s intended for an internal testing group or a public launch. The user who spotted the leak tried prompting it and received responses; according to them, it works quite fast, but they’re still unsure whether the responses are truly from version 2.0.
AIM had previously explored ‘why Google will make a better model than OpenAI’s o1’, and it looks like that prediction is coming true.
“An unknown Gemini model is available in the LMSYS Arena (battle). While it’s unclear if this is Gemini 2.0, the ‘Gemini-test’ outperformed OpenAI o1-mini in one of my tests,” posted a user on X.
Meanwhile, AI insider Jimmy Apples shared a scoop on Gemini 2, posting, “Someone may have gotten too drunk and said Gemini 2.0 has already been deployed to select B2B customers…”
Similar to Gemini 1.5, Gemini 2 will continue to generate images and perform web searches—features likely included to help Google compete with OpenAI’s SearchGPT and Perplexity AI. Meta is also expected to join the search race.
Interestingly, Google AI Studio and the Gemini API recently introduced ‘Grounding with Google Search’, allowing developers to improve response accuracy by incorporating real-time data from Google Search. With this update, Gemini 1.5 models can pull live information from Google Search, increasing accuracy and transparency.
Developers can access grounding features directly through the “Tools” section in Google AI Studio or by enabling the ‘google_search_retrieval’ tool in the Gemini API. Gemini 2 and its APIs will likely have this feature.
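As a rough sketch of what enabling grounding looks like, the snippet below assembles a `generateContent`-style request body with the `google_search_retrieval` tool attached. The tool name and the `dynamic_retrieval_config` shape follow Google’s published API docs; the prompt and the threshold value are purely illustrative, and the actual SDK call is shown only as a comment since it requires an API key.

```python
# Sketch: enabling 'Grounding with Google Search' in a Gemini API request.
# The 'google_search_retrieval' tool name comes from Google's docs; the
# prompt and threshold below are illustrative assumptions.

# With dynamic retrieval, the model grounds a response in live search
# results only when its confidence that search would help exceeds the
# threshold (0.0 = always ground, 1.0 = never).
grounding_tool = {
    "google_search_retrieval": {
        "dynamic_retrieval_config": {
            "mode": "MODE_DYNAMIC",
            "dynamic_threshold": 0.3,
        }
    }
}

def build_request(prompt: str) -> dict:
    """Assemble a generateContent request body with grounding enabled."""
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "tools": [grounding_tool],
    }

request = build_request("Who won the latest F1 Grand Prix?")

# With the google-generativeai SDK installed and an API key configured,
# the equivalent call would look like (not executed here):
#   import google.generativeai as genai
#   genai.configure(api_key=os.environ["GEMINI_API_KEY"])
#   model = genai.GenerativeModel("gemini-1.5-pro",
#                                 tools="google_search_retrieval")
#   response = model.generate_content("Who won the latest F1 Grand Prix?")
```

The same structure applies whether the request is sent via the Python SDK or the raw REST endpoint, which is presumably why Google expects Gemini 2 and its APIs to inherit the feature unchanged.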
A user on X who attended Kilpatrick’s session in San Francisco revealed that Gemini 2 will be a larger model with multi-turn capabilities, vision, audio, embeddings, and more.
Inspired by Anthropic
Google plans to release a new feature that can take control of a user’s web browser to perform tasks like gathering research, purchasing products, or booking flights. This feature will also be integrated into Gemini 2.
Code-named ‘Jarvis’, the product was recently leaked; according to a report, it was briefly available for download through Google’s Chrome web browser extension store and described itself as ‘a helpful companion that surfs the web with you’.
This is quite similar to Anthropic’s ‘Computer Use’ feature, which can take control of a user’s screen to perform actions such as viewing the screen, moving the cursor, clicking buttons, and typing text.
Similarly, Microsoft is testing Copilot Vision, a feature that enables its AI to understand and interact with content on web pages. With Copilot Vision, the AI can interpret what users are viewing in Microsoft Edge, answer questions about the content, and suggest next steps based on what’s displayed.
Google Steals Spotlight from OpenAI
Google has recently seen success with its latest products, with NotebookLM as a prime example. It has been widely praised and even called Google’s “ChatGPT Moment.” Furthermore, during the company’s recent earnings call, Google chief Sundar Pichai revealed that Gemini API calls have increased 14x in the past six months.
GitHub recently partnered with Google to bring Gemini 1.5 Pro to GitHub Copilot. Gemini 1.5 is known for its two-million-token context window and ability to process code, images, video, and text simultaneously.
Gemini’s reasoning capabilities are expected to be better than OpenAI’s o1. A recent report reveals that Google is working on AI with reasoning capabilities similar to those of humans, most likely for its Gemini platform.
Kilpatrick said in an exclusive interview with AIM that Google plans to release Gemini 2, which will feature better reasoning quality and a longer context window—potentially up to billions or trillions of tokens. According to Kilpatrick, the model will be fully multimodal, with the capability to understand large videos as well.
Recently, Apples shared a document on X, dated last year, revealing that Google is planning to integrate a ‘PLANNING’ component into its LLMs. Moreover, in an old Wired article, Google’s Demis Hassabis said that his team will combine the technologies used in AlphaGo to give the system new capabilities, such as planning and solving new problems.
Notably, Google recently published a paper titled ‘Training Language Models to Self-Correct via Reinforcement Learning’, in which Google DeepMind developed a multi-turn online reinforcement learning approach to improve an LLM’s ability to self-correct.
With further improvements to Google DeepMind’s RL techniques and their integration with Chain of Thought in Gemini, Google could easily create a model that outperforms OpenAI’s o1.
Kilpatrick told AIM that Google Gemini and Google DeepMind collaborate closely, with Google DeepMind focused on making AI accessible to developers and the public. Google DeepMind’s recent models AlphaProof and AlphaGeometry 2 won a silver medal at the International Mathematical Olympiad (IMO), while OpenAI’s o1-preview achieved 83% in a similar test.
Meanwhile, OpenAI is preparing to launch the full version of o1. According to a recent Reddit thread, Altman appears more confident about the imminence of AGI, likely due to their latest model, o1.
He even said that they have achieved human-level reasoning and will now move on to level 3 in their AGI roadmap. Many are now suggesting that OpenAI’s o1 may be regarded as the first successful commercial launch of a System 2 LLM.
The battle is on, and it seems like Google is finally ready to steal the spotlight from OpenAI. As one X post put it, “We will finally see Gemini 2.0 Pro soon, long overdue. But they will probably wait for the release of full o1 to steal the show from OpenAI, so to speak – just as OpenAI has done to Google every time.”
The post Google Gemini 2 Likely to Dethrone OpenAI o1 appeared first on Analytics India Magazine.