All eyes are on Google. The tech giant has reportedly been stealing the spotlight from OpenAI. Even AI insider Jimmy Apples agrees. “Beat down,” said Apples, sharing a meme featuring the faces of Google CEO Sundar Pichai and co-founder Larry Page morphed into Justice’s Stress video, burning OpenAI (the car) in a chaotic celebration of dominance.
Adding fuel to the fire, Google recently took the internet by storm with the latest release of the much-anticipated Gemini 2 – a first-of-its-kind model that supports multimodal inputs like images, video, and audio.
OpenAI, on the other hand, is going through brief turbulence: Sora and ChatGPT’s website experienced an outage, leaving its team with sleepless nights fixing them against the backdrop of the ‘12 Days of OpenAI’ livestream sprint.
As they say, after every storm comes a rainbow. OpenAI is hinting at dropping something big, maybe even an ‘AGI bomb’, at the end of the event.
This comes after an OpenAI executive accidentally revealed the ‘Super Secret AGI’ calendar on the fifth day of shipmas. “AGI is in the air,” said OpenAI co-founder and president Greg Brockman in a cryptic post on X, hinting at a likely release of GPT-4.5 or GPT-5 in the coming days.
That explains why Google has been in a hurry to ship products as if there is no tomorrow. Meanwhile, Amazon has also released Nova and plans to release an any-to-any model sometime next year.
To date, OpenAI has launched Sora, its video generation model, as well as o1, o1 Pro, and Canvas, and has rolled out support for Apple Intelligence for free.
Going Beyond the AGI Hype
One thing is becoming clearer: Google seems to be focusing more on impact and less on hype, as seen in the release of its quantum chip Willow and of Genie 2, a playable 3D world model, which has, suspiciously, been earning praise from OpenAI co-founder Elon Musk, who once warned about Google DeepMind’s AI dominance.
Despite the differences, OpenAI chief Sam Altman also lauded Google’s quantum breakthrough. “Big congrats!”
Needless to say, Google showed the world how AI is really done in 2024. It is ‘the OG of AI’ in every sense, and Pichai couldn’t agree more.
Speaking of impact, the tech giant recently released the Multimodal Live API, which offers real-time audio and video-streaming input, as well as the capability to combine various tools. Since its launch, many in the developer community have been experimenting with it.
“Gemini 2.0 real-time AI is absolutely wild! Watch how I use it as an AI research assistant by sharing my screen and asking it about an AI paper. Ten times your paper reading skills, or just let Gemini summarise key points. What an incredible time to be alive!” said Elvis Saravia, co-founder of DAIR.AI, in a post on X.
“Google Gemini 2.0 Flash’s multimodal feature is next-level! I can’t believe Google rolled this out before OpenAI. It takes live feeds and answers in real time with almost no latency. Pretty mind-blowing,” AI evangelist Ashutosh Shrivastava said.
“If you try nothing else today, give the demo at https://t.co/kUyz0fxIfE a go – it lets you stream video and audio directly to Gemini 2.0 Flash and get audio back, so you can have a real-time audio conversation about what you can see with the model. Feels like science fiction!” said Simon Willison (@simonw) in a post on X on December 11, 2024.
OpenAI might be feeling the pressure as it hasn’t yet launched advanced voice mode (AVM) with vision capabilities. However, a recent video surfaced showing Brockman demonstrating ChatGPT with real-time voice and vision features, indicating that the release might be imminent.
Agentic Era Loading
Similar to OpenAI’s ‘Operator’ agent, which is set to roll out early next year, Google launched Project Mariner yesterday, challenging Microsoft’s Copilot Vision and Anthropic’s Computer Use.
This AI agent, Mariner, can browse the internet and reason across everything on your browser screen, including pixels and web elements like text, code, images, and forms. It can type, scroll, or click in the active tab on your browser.
“Book a flight from San Francisco to Berlin, departing on March 5 and returning on the 12. The era of being able to give a computer a fairly complex high-level task and have it go off and do a lot of the work for you is becoming a reality,” said Jeff Dean, chief scientist at Google DeepMind.
“I think it’s the future of computer control, where systems can understand what’s on your browser, the interface, and create a new way of using the web through a more intuitive UI,” said Google DeepMind chief Demis Hassabis in a recent interview with The Rundown CEO Rowan Cheung.
In addition, Google announced new updates to Project Astra, which it introduced earlier this year at Google I/O. Google described it as a universal digital assistant capable of understanding and interacting with the world in real time through multiple modalities, including text, speech, images, and video.
Hassabis explained that the future goal with Astra is to build an assistant that knows the preferences of the users and what they are trying to achieve. He said their long-term goal is for it to have infinite memory.
This is similar to what Microsoft is trying to do with Copilot Vision. Mustafa Suleyman, chief of Microsoft AI, recently revealed that AI with “near-infinite memory that just doesn’t forget” is coming in 2025 and will be “truly transformative”.
There is more. Google released Trillium, its sixth-generation tensor processing unit, and ‘Deep Research’, an AI research assistant feature in Gemini Advanced. Google also introduced Jules, a developer-focused agent that integrates with GitHub workflows to assist with coding tasks under supervision.
Google currently deserves praise for its groundbreaking products and AI models, but equal credit goes to OpenAI for setting the benchmark in the AI landscape.
With just seven days left in the ‘12 Days of OpenAI,’ the ball is firmly in its court. OpenAI clearly knows what it’s doing – its quirky sense of fashion, Christmas vibes, and dad jokes have kept the livestreams entertaining and well-deserving of all the spotlight. This is unlike anything the industry has seen before, sending shockwaves through competitors and allies alike.
The post Google Crushes ‘12 Days of OpenAI’ With Just 1 Day of Gemini 2 appeared first on Analytics India Magazine.