Meta Likely to Release Llama 4 Early Next Year, Pushing Towards Advanced Machine Intelligence (AMI)

AI models are getting better at reasoning, at least of a sort. OpenAI’s o1, for instance, has improved enough to earn a cautious nod even from Apple’s researchers.

Meanwhile, Kai-Fu Lee’s 01.AI is making waves with Yi-Lightning, claiming to outpace GPT-4o on reasoning benchmarks. With China’s models catching up fast, Meta is stepping up Llama’s game.

The big question: can Meta bring Llama’s reasoning closer to the likes of GPT-4o and o1? Manohar Paluri, VP of AI at Meta, told AIM that the team is exploring ways for Llama models not only to “plan” but also to evaluate decisions in real time and adjust when conditions change.

This iterative approach, using techniques like ‘Chain of Thought,’ supports Meta’s vision of achieving “advanced machine intelligence” that can effectively combine perception, reasoning, and planning.
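At its simplest, chain-of-thought is a matter of how the model is prompted: asking it to lay out intermediate steps before committing to an answer. A minimal sketch of the idea follows; the prompt wording is illustrative, not Meta’s actual format.

```python
def build_prompt(question: str, chain_of_thought: bool = True) -> str:
    """Build a prompt that either asks for the answer directly or
    asks the model to reason step by step before answering."""
    if chain_of_thought:
        return (
            f"Question: {question}\n"
            "Let's think step by step, checking each intermediate "
            "result before giving the final answer."
        )
    return f"Question: {question}\nAnswer:"

# The same question, with and without intermediate reasoning requested.
direct = build_prompt("What is 17 * 24?", chain_of_thought=False)
stepwise = build_prompt("What is 17 * 24?")
```

The stepwise variant gives the model room to produce and check intermediate results, which is the behaviour Paluri describes: knowing it is on the right track, and backtracking when it is not.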

Meta AI chief Yann LeCun believes that advanced machine intelligence, or AMI (“ami” is also the French word for “friend”), can truly help people in their daily lives. This, according to him, involves developing systems that can understand cause and effect and model the physical world.

AMI might simply be an alternative term for AGI or ASI, which OpenAI is so fixated on achieving (and, some speculate, may have already reached internally). That would explain why Sam Altman recently dismissed rumours of Orion (GPT-5) launching this December as “fake news out of control”, perhaps content to wait for Google, Meta and others to catch up.


AGI or AMI talks aside, Paluri further highlighted that reasoning in AI, particularly in “non-verifiable domains”, requires breaking down complex tasks into manageable steps, which allows the model to dynamically adapt.

For example, planning a trip involves not only booking a flight but also handling real-time constraints like weather changes, which may mean rerouting to alternative transportation. “The fundamental learning aspect here is the ability to know that I’m on the right track and to backtrack if needed. That’s where future Llama versions will excel in complex, real-world problem solving,” he added.

Recently, Meta unveiled Dualformer, a model that dynamically switches between fast, intuitive thinking and slow, deliberate reasoning, mirroring human cognitive processes and enabling efficient problem-solving across tasks like maze navigation and complex maths.
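The article does not detail Dualformer’s internals, but the fast/slow switching idea can be caricatured in a few lines: try a cheap greedy heuristic first, and fall back to an exhaustive search only when it fails. The maze and policy below are illustrative, not Meta’s actual method.

```python
from collections import deque

# 0 = open cell, 1 = wall; a tiny illustrative maze.
MAZE = [
    [0, 0, 0, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]

def neighbours(cell):
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] == 0:
            yield (nr, nc)

def fast_greedy(start, goal, max_steps=20):
    """'Fast' mode: always step toward the goal, never backtrack.
    Cheap, but fails whenever the direct route hits a wall."""
    path, cell = [start], start
    for _ in range(max_steps):
        if cell == goal:
            return path
        options = [n for n in neighbours(cell) if n not in path]
        if not options:
            return None  # dead end: fast mode gives up
        cell = min(options, key=lambda n: abs(n[0] - goal[0]) + abs(n[1] - goal[1]))
        path.append(cell)
    return None

def slow_bfs(start, goal):
    """'Slow' mode: exhaustive breadth-first search; always finds a
    shortest path if one exists, at a higher compute cost."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for n in neighbours(path[-1]):
            if n not in seen:
                seen.add(n)
                queue.append(path + [n])
    return None

def solve(start, goal):
    """Dual-mode controller: take the fast answer when it works,
    switch to deliberate search when it does not."""
    return fast_greedy(start, goal) or slow_bfs(start, goal)
```

In this maze the greedy walker heads straight for the goal and dead-ends at the wall, so the controller falls back to the deliberate search, which finds the detour through the bottom row.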

What’s Llama’s Secret Sauce?

Meta said that it leverages self-supervised learning (SSL) during its training to help Llama learn broad representations of data across domains, which allows for flexibility in general knowledge.

RLHF (reinforcement learning from human feedback), which currently powers GPT-4o and the majority of other models today, focuses instead on refining behaviour for specific tasks, ensuring that the model not only understands data but also aligns with practical applications.

Meta combines the two to build models that are both versatile and task-oriented. Paluri said that SSL builds a foundational understanding from raw data, while RLHF aligns the model with human-defined goals by providing specific feedback after tasks.

“Self-supervised learning enables models to pick up general knowledge from vast data autonomously. In contrast, RLHF is about task-specific alignment; it’s like telling the model ‘good job’ or ‘try again’ as it learns to perform specific actions.”
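The distinction can be made concrete with a toy example: in self-supervised learning the labels come from the data itself (here, “predict the next word”), while RLHF-style feedback adjusts the model’s preferences after it acts. A deliberately simplified sketch, nothing like Meta’s actual training code:

```python
from collections import defaultdict

# --- Self-supervised: the labels come from the raw text itself. ---
# Counting next-word statistics needs no human annotation at all.
corpus = "the cat sat on the mat the cat ran".split()
next_word_counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    next_word_counts[prev][nxt] += 1

def ssl_predict(word):
    """Predict the most frequent continuation seen in the raw data."""
    options = next_word_counts[word]
    return max(options, key=options.get) if options else None

# --- RLHF-style: a 'good job' / 'try again' signal after the act. ---
preference = defaultdict(float)

def rlhf_feedback(response, reward):
    """Shift preference toward responses humans rated highly."""
    preference[response] += reward

rlhf_feedback("sat", +1.0)   # human approves this continuation
rlhf_feedback("ran", -1.0)   # human disapproves

def aligned_predict(word):
    """Combine data-driven statistics with human feedback."""
    options = next_word_counts[word]
    return max(options, key=lambda w: options[w] + preference[w])
```

The SSL counter learns general statistics from the corpus alone; the feedback term then breaks ties and steers the model toward human-preferred outputs, which is the division of labour Paluri describes.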

Enables High-Quality Synthetic Data Generation

That explains how Llama has become a preferred choice for synthetic data generation. Llama 3.1 405B, in particular, is trained to generate data that supports language-specific nuances, improving model effectiveness in regions where data scarcity is a barrier. By focusing on Indic language datasets, Meta is looking to strengthen its model’s ability to cater to multilingual environments, boosting accessibility and functionality for speakers of these languages.

Case in point: these models power startups like Sarvam AI and scale across Meta’s own platforms—WhatsApp, Instagram, Facebook, and Threads—for impactful, region-specific AI solutions.

Speaking at India’s biggest AI summit, Cypher 2024, Vivek Raghavan, the chief of Sarvam AI, revealed that they used Llama 3.1 405B to build Sarvam 2B. He explained that it is a 2-billion-parameter model trained on 4 trillion tokens, of which 2 trillion are Indian-language tokens.

“If you look at the 100 billion tokens in Indian languages, we used a clever method to create synthetic data for building these models using Llama 3.1 405B. We trained the model on 1,024 NVIDIA H100s in India, and it took only 15 days,” said Raghavan.

“So, one important thing to consider is when you think about the flagship models like 405B, they’re very expensive for inference,” shared Paluri. He added that these models can generate high-quality synthetic data, particularly for underserved languages like those in the Indic family, which are often challenging to source in traditional datasets.

“Synthetic data generated by Llama 3.1 405B is instrumental in building diverse language resources, making it feasible to support languages like Hindi, Tamil, and Telugu, which are often underserved in standard datasets,” said Paluri.
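The pipeline Paluri describes, a large “teacher” model producing training text for an underserved language, can be sketched as follows. `call_teacher_model` is a hypothetical stand-in for however one serves Llama 3.1 405B (here it returns a canned response so the sketch runs), and the prompt wording is illustrative.

```python
import json

def call_teacher_model(prompt: str) -> str:
    """Hypothetical stand-in for an inference call to a large teacher
    model such as Llama 3.1 405B; replace with a real serving client."""
    # Canned response so the sketch runs end to end.
    return "प्रश्न: भारत की राजधानी क्या है?\nउत्तर: नई दिल्ली"

def generate_synthetic_pairs(language: str, topic: str, n: int):
    """Ask the teacher model for Q&A pairs in the target language,
    yielding records a smaller student model can be trained on."""
    records = []
    for i in range(n):
        prompt = (
            f"Write one question and answer in {language} "
            f"about {topic}. Label them clearly."
        )
        records.append({
            "id": i,
            "language": language,
            "text": call_teacher_model(prompt),
        })
    return records

dataset = generate_synthetic_pairs("Hindi", "geography", n=3)
print(json.dumps(dataset[0], ensure_ascii=False))
```

The expensive flagship model is called once per record at data-generation time; the resulting corpus then trains a much smaller model, which is the trade Raghavan describes for Sarvam 2B.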

Llama 4, when?

Meta CEO Mark Zuckerberg, in a recent interview with AI influencer Rowan Cheung, said the company has already started pre-training for Llama 4. Zuckerberg added that Meta has set up compute clusters and data infrastructure for Llama 4, which he expects to be a major advancement over Llama 3.

Meta’s VP of product, Ragavan Srinivasan, hinted at Meta’s Build with AI Summit that “next gen” Llama models would arrive by 2025, bringing native integrations, extended memory and context capabilities, cross-modality support, and expanded third-party collaborations, while advancing memory-based applications for coding and leveraging deep hardware partnerships.

Paluri joked that if you asked Zuckerberg about the timeline, he’d probably say it would be released “today”, highlighting his enthusiasm and push for rapid progress in AI development.

Citing the Llama 3 release in April, 3.1 in July, and 3.2 in September, he pointed to the rapid iteration of Llama releases, noting that the team strives to ship new versions every few months to continually improve AI capabilities.

“We want to maintain a continuous momentum of improvements in each generation, so developers can expect predictable, significant upgrades with every release,” said Paluri, hinting at ‘next gen’ Llama to be released—potentially around early to mid-2025 if Meta continues this cadence of frequent updates.

Quantisation of LLMs: Meta recently introduced quantised versions of its Llama 3.2 models, enhancing on-device AI performance with up to four times faster inference speeds, a 56% reduction in model size, and a 41% decrease in memory usage.
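Meta has not detailed the exact scheme here, but the arithmetic behind such headline numbers is easy to illustrate: storing weights as 8-bit integers plus a per-tensor scale, instead of 32-bit floats, cuts raw weight storage by roughly 4x (the reported 56% figure will depend on the baseline precision and which layers are quantised). A minimal symmetric int8 sketch:

```python
import array

def quantize_int8(weights):
    """Symmetric per-tensor int8 quantisation: store one float scale
    plus one signed byte per weight instead of a 4-byte float."""
    scale = max(abs(w) for w in weights) / 127.0
    q = array.array("b", (round(w / scale) for w in weights))
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02, 1.27]
q, scale = quantize_int8(weights)

# Raw storage: 4 bytes/weight as float32 vs 1 byte/weight as int8.
fp32_bytes = 4 * len(weights)
int8_bytes = len(q)  # signed bytes, itemsize 1
```

Each recovered weight is within half a quantisation step of the original, which is why inference quality can survive the 4x compression on the weight tensors.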
