Meta has been one of the biggest proponents of self-supervised learning in AI. Today, Meta AI announced I-JEPA, a self-supervised computer vision model that learns about the world by predicting it, based on Yann LeCun’s vision of autonomous machine intelligence that learns and reasons the way humans and animals do.
The paper is being presented at CVPR 2023 next week.
The model’s training code and checkpoints are open-sourced under a non-commercial licence.
I-JEPA (Image Joint Embedding Predictive Architecture) learns by building an internal model of the outside world and comparing abstract representations of images rather than the pixels themselves.
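To make the idea concrete, here is a minimal sketch of a joint-embedding predictive training step. This is not Meta's implementation: the tiny transformer encoders, the single contiguous mask block, and the crude pooled predictor are all illustrative stand-ins. The key property it does show is that the loss is computed between predicted and target embeddings, never between pixels.

```python
import torch
import torch.nn as nn

# Minimal sketch of the joint-embedding predictive idea (not Meta's code).
# An image is split into patch embeddings; a context encoder sees only the
# visible patches, and a predictor must match the *representations* of the
# masked patches produced by a target encoder, so the loss lives in
# embedding space rather than pixel space.

dim = 64
context_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
target_encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
    num_layers=2,
)
predictor = nn.Linear(dim, dim)  # stand-in for I-JEPA's narrow ViT predictor

patches = torch.randn(8, 16, dim)      # batch of 8 images, 16 patch embeddings each
mask = torch.zeros(16, dtype=torch.bool)
mask[4:8] = True                       # one contiguous target block

with torch.no_grad():                  # target branch receives no gradients;
    targets = target_encoder(patches)[:, mask]  # in I-JEPA its weights are an
                                                 # EMA of the context encoder's

context = context_encoder(patches[:, ~mask])         # encode visible patches only
pred = predictor(context.mean(dim=1, keepdim=True))  # crude pooled prediction,
                                                     # broadcast over targets

loss = ((pred - targets) ** 2).mean()  # L2 distance in representation space
loss.backward()
```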
According to the paper, the model delivers strong performance on various computer vision tasks and is substantially more computationally efficient than similar models.
I-JEPA offers versatile applicability without requiring extensive fine-tuning. Meta AI trained a 632M-parameter vision transformer using 16 A100 GPUs in under 72 hours. The model attains state-of-the-art results for low-shot classification on ImageNet with a mere 12 labelled examples per class. In comparison, alternative approaches often require two to 10 times as many GPU-hours and achieve worse error rates when trained on the same amount of data.
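As an illustration of how such low-shot evaluation typically works, the sketch below fits a linear probe on frozen features. The random arrays stand in for I-JEPA embeddings, and the protocol shown is an assumed, simplified version rather than the paper's exact evaluation recipe.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Illustrative low-shot probe (assumed setup, not the paper's protocol):
# the pretrained encoder stays frozen, and only a linear classifier is fit
# on a handful of labelled examples per class.

rng = np.random.default_rng(0)
num_classes, shots, dim = 10, 12, 64   # e.g. 12 labelled examples per class

# Stand-ins for frozen I-JEPA embeddings of the labelled subset.
train_feats = rng.normal(size=(num_classes * shots, dim))
train_labels = np.repeat(np.arange(num_classes), shots)

probe = LogisticRegression(max_iter=1000).fit(train_feats, train_labels)

test_feats = rng.normal(size=(100, dim))  # frozen embeddings of test images
preds = probe.predict(test_feats)
```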
By predicting in representation space rather than pixel space, the model avoids the biases and limitations introduced by the hand-crafted augmentations of invariance-based pre-training. It also does not have to fill in every low-level detail of the input, a problem Meta AI attributes to generative approaches, including current LLMs, that predict raw inputs directly.
The Theory
Last year, Yann LeCun, Meta’s Chief AI Scientist, introduced the world model, an architecture designed to address the significant constraints of contemporary AI systems. LeCun envisions machines that can rapidly acquire internal models of how the world works, enabling them to learn efficiently, plan complex tasks, and adapt to novel circumstances.
Meta AI's work builds heavily on the hypothesis that common-sense knowledge is the key to intelligent behaviour, and that this knowledge is acquired by passively observing the world and stored as background knowledge in the mind. Meta believes that self-supervised learning is the path towards human-like intelligence.
For this to work, the system needs to acquire these representations through self-supervised learning, which means learning directly from unlabelled data such as images or sounds rather than from manually curated labelled datasets.
Meta AI demonstrates I-JEPA's potential to learn competitive off-the-shelf image representations without relying on the additional prior knowledge encoded in manually designed image transformations.
Advancing JEPAs further to learn broader world models from richer modalities would be particularly intriguing. Such an advance could enable long-range spatial and temporal predictions about future events in a video from a short context, with those predictions conditioned on audio or text prompts.