No one is cooking up innovations quite like Google. At I/O 2025, the search giant dropped a slew of announcements that left everyone stunned and wondering whether what they had witnessed was even real.
Google CEO Sundar Pichai and DeepMind CEO Demis Hassabis showed no mercy to their rivals, firmly securing Google’s place in the lead of the AGI race.
The biggest buzz is around Google’s new video generation model, Veo 3. Not only does it create high-quality videos, but it also adds audio, a feature we haven’t seen before. Even OpenAI’s Sora lacks this feature. Other tools like Runway ML Gen-4, Meta’s MovieGen, Pika Labs, and Stability AI’s Stable Video 4D 2.0 don’t support it either.
Veo 3 can generate the sound of traffic in the background of a city street scene, birds singing in a park, and even dialogue between characters.
“Veo 3 is the AGI moment for AI video,” quipped AI influencer Ashutosh Shrivastava on X.
Social media platforms are flooded with clips generated by Veo 3, and the buzz shows no sign of slowing down. The model is surprisingly good at capturing real-world physics, from the noise and movement of water to the look and sound of walking in snow. It even handles lip-syncing with impressive accuracy.
One user on X posted a video imagining how Greek philosopher Pythagoras might have explained the Pythagorean theorem in ancient Greece. Another user shared a clip of a man performing a stand-up set, which, surprisingly, was actually funny.
"Pythagoras explaining his theorem, in ancient Greece"
Video and audio generated by Veo 3 natively. pic.twitter.com/vR1gbrLYYj (Pietro Schirano, @skirano, May 20, 2025)
Veo 3 is now available to Ultra subscribers in the US through the Gemini app and Flow, as well as to enterprise users via Vertex AI.
Filmmaking is Slated to Change Completely
The tech giant has launched a new tool called Flow for filmmakers. This tool allows users to generate cinematic clips and scenes, combine assets across shots, and reference creative elements in plain language.
According to Google, Flow is inspired by what it feels like when time slows down and creation is effortless, iterative and full of possibility.
For decades, Steven Spielberg has been the gold standard in cinematic storytelling, known for blending emotional depth with visual spectacle in films like E.T., Jurassic Park, and Schindler’s List. If Veo 3 had existed in his early days, he might have been one of its early users.
My first Veo 3 gen
> a video with dialogue of two muffins while baking in an over, the first muffin says "I can't believe this Veo 3 thing can do dialogue now!", the second muffin says "AAAAH, a talking muffin!" pic.twitter.com/VA2VUZF8sS (fofr, @fofrAI, May 20, 2025)
Flow includes features such as camera controls, a scene builder for editing and extending existing shots, and asset management tools. A showcase section called Flow TV provides access to clips and channels generated with Veo, along with the exact prompts and techniques used, allowing users to “learn and adapt new styles”.
Experts and users alike are already imagining the future impact of Veo 3.
Derya Unutmaz, professor at The Jackson Laboratory, believes AI could soon bring feature-length films to life at a fraction of the cost and time. “Soon we’ll have Toy Story quality feature-length films created with AI, possibly even using Veo 3 or near-future versions, in just a matter of days and for a few thousand dollars,” he said, adding that Toy Story originally cost $30 million and took four years to produce.
Meanwhile, a user on X called Google’s Veo 3 “beyond crazy”, predicting that within two years, movies could start using AI instead of traditional CGI for shorter scenes. They added that this shift could accelerate quickly, potentially resulting in a big-budget film made almost entirely with AI, with humans still guiding the creative process.
Separately, Google DeepMind is partnering with Primordial Soup, a new storytelling venture founded by director Darren Aronofsky. The goal is to explore how advanced video generation models can support more creative and emotionally rich storytelling.
As part of the partnership, Primordial Soup will produce three short films using DeepMind’s generative AI tools, including Veo. Each film will be directed by an emerging filmmaker, with Aronofsky providing mentorship and DeepMind’s research team offering technical support.
At the same time, Google is also expanding access to Lyria 2, offering musicians more tools to create music.
Bye Bye Ghibli
Google wasn’t done yet. It also launched Imagen 4, the latest version of its text-to-image model, which combines speed with precision to produce strikingly detailed visuals.
The new image generation model delivers remarkable clarity in fine textures like intricate fabrics, water droplets, and animal fur, while handling both photorealistic and abstract styles with ease.
Imagen 4 supports a range of aspect ratios and can generate images at up to 2K resolution, making it ideal for printing and presentations. It also shows significant improvements in spelling and typography, opening up new use cases like personalised greeting cards, posters, and comics.
The model is available today in the Gemini app, Whisk, Vertex AI and across Slides, Vids, Docs and more in Workspace. It will compete directly with OpenAI’s image generation model, which went viral recently after users flooded social media with Ghibli-style images.
Note: The headline has been updated for better clarity.
The post Google’s Veo 3 Just Did to Video What ChatGPT Did to Text appeared first on Analytics India Magazine.