Why AI Keeps Creating Body Horror

Luma AI’s Dream Machine has some pretty impressive capabilities, but its most notorious one might be creating body horror.

While many have succeeded in jailbreaking the relatively new video generation model to produce gory or NSFW videos, far more users have stumbled into some pretty shocking results entirely by accident.

CW: Body Horror?
This AI video attempt to show gymnastics is one of the best examples I have seen that AI doesn’t actually understand the human body and it’s motion but is just regurgitating available data. (Which appears to be minimal for gymnastics) pic.twitter.com/8dD2q30e4G

— 🌟Cheshire Cat ᓚᘏᗢ (@autismsupsoc) June 29, 2024

This isn’t uncommon, as generative AI has been pretty notorious for creating nightmare fuel when it comes to generating humans. From too many fingers to botched body proportions and fused faces, users have been pointing out these flaws since the first iterations of DALL-E, Midjourney and Stable Diffusion.

Responding to Dream Machine’s attempt at generating a video of a gymnast, Meta’s chief AI scientist Yann LeCun argued that current video generation models simply do not understand basic physics.

Video generation models do not understand basic physics.
Let alone the human body. https://t.co/qas7HS2m5p

— Yann LeCun (@ylecun) June 30, 2024

Are We Doomed to Have AI Mess-Ups?

Early image generation models did not compose pictures from any real understanding of what they depict. Diffusion-based systems learn statistical patterns from their training images and then iteratively refine random noise into a prompt-relevant result. With no structural notion of anatomy, the models often mistook hands and other body parts for something else.

diffusion is such an unserious algorithm
mf you're literally just repeatedly exclaiming "Enhance!" at an image of grey noise until it becomes an anime girl or whatever

— henry (@arithmoquine) June 30, 2024

This is partly down to the dataset the model relies on and partly down to how the model goes about identifying different body parts, resulting in some pretty outlandish hallucinations.
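To see why this process so easily garbles small, intricate structures like hands, here is a deliberately toy sketch of iterative denoising. Everything in it, the 8x8 “target”, the toy_denoiser function, the step count, is an assumption made up for illustration; real diffusion models replace the hand-written nudge with a large neural network trained to predict noise, and the fine details are exactly where that prediction tends to go wrong.

```python
# Toy sketch of iterative denoising (not any vendor's actual sampler).
# A hand-written "denoiser" nudges pure noise toward a fixed 8x8 target;
# real models learn this nudge from data instead.
import numpy as np

rng = np.random.default_rng(0)
target = (rng.random((8, 8)) > 0.5).astype(float)  # stand-in for "the data"

def toy_denoiser(x, t):
    # Hypothetical stand-in for a learned noise predictor: pull the sample
    # toward the target, more strongly as t approaches 0.
    return x + (1.0 - t) * 0.2 * (target - x)

x = rng.standard_normal((8, 8))  # start from pure Gaussian noise
steps = 50
for i in range(steps, 0, -1):
    t = i / steps
    x = toy_denoiser(x, t)                          # one small "Enhance!" step
    x += 0.05 * t * rng.standard_normal(x.shape)    # re-inject a little noise

print(np.round(np.abs(x - target).mean(), 3))       # leftover error per pixel
```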

Responding to a query from BuzzFeed last year, Stability AI explained the reason behind this. “It’s generally understood that within AI datasets, human images display hands less visibly than they do faces. Hands also tend to be much smaller in the source images, as they are relatively rarely visible in large form,” a spokesperson said.

Midjourney and other image generation models have, over time, managed to rectify these issues by refining their datasets to better cover trouble spots like hands and by improving the models’ capabilities.

LeCun conceded that video generation models would improve over time, just as image generation models did. His bolder prediction, however, was that the systems that eventually do understand physics will not be generative.

“Video generation systems will get better with time, no doubt. But learning systems that actually understand physics will not be generative. All birds and mammals understand physics better than any video generation system. Yet none of them can generate detailed videos,” he said.

Forget the Horrors, What About Physics?

While the body horror aspects of AI-generated content have garnered significant attention, the more fundamental challenge lies in creating AI systems that truly understand and replicate real-world physics.

As LeCun points out, even the most advanced video generation models struggle with basic physical principles that animals intuitively grasp. Maybe improving this could solve the issue of body horror altogether.

This goes beyond aesthetics or generating uncanny valley humans. A core challenge in AI, one that extends all the way to achieving AGI, is bridging the gap between pattern recognition and a genuine understanding of how the world works.

Current generative models excel at producing visually convincing imagery, but, as LeCun and many others have pointed out, they lack the underlying comprehension of cause and effect, motion, and physical interactions that govern our reality.

Addressing this challenge could require a shift in approach. Rather than focusing solely on improving generative capabilities, researchers might need to develop new architectures that can learn and apply physical principles.

This could involve incorporating physics engines, simulations, or novel training methods that emphasise understanding over mere reproduction. It might even mean folding 3D models into datasets to give systems a better sense of how objects, including human bodies, can move in a given situation.

Though lesser known, models like MotionCraft, PhyDiff and MultiPhys already make use of physics simulators and 3D models.
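As a rough illustration of the general idea, and emphatically not how those models work internally, the sketch below takes a “generated” ball trajectory (here just a hand-made noisy arc), re-simulates it under constant gravity, and blends the physically consistent version back in. All names, numbers and the blending weights are assumptions made for the example.

```python
# Toy physics correction (assumed setup, not MotionCraft/PhyDiff/MultiPhys):
# blend a sloppy "generated" trajectory with one re-simulated under gravity.
import numpy as np

g, dt = 9.81, 0.1
rng = np.random.default_rng(1)

# Pretend a generative model proposed a noisy, physically sloppy ball arc.
t = np.arange(0, 2, dt)
generated_y = 5 + 9 * t - 4 * t**2 + rng.normal(0, 0.4, t.size)

# Physics step: infer initial height and velocity from the first two frames,
# simulate the arc under constant gravity, then blend it back in.
y0 = generated_y[0]
v0 = (generated_y[1] - generated_y[0]) / dt
simulated_y = y0 + v0 * t - 0.5 * g * t**2
corrected_y = 0.3 * generated_y + 0.7 * simulated_y

print(np.round(corrected_y[:5], 2))  # first few corrected heights
```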

The future of AI in visual content creation may not lie in increasingly realistic generative models but in systems that can reason about and manipulate physical concepts. Such advances could lead to AI that not only avoids body horror but also produces outputs that are fundamentally more coherent and aligned with our physical world.
