This Summer, Expect GPT-5 and Llama 3 to Heat Up the LLM Race

The LLM race has indeed heated up, with many models now reaching GPT-4-level capabilities. Cohere’s latest open-weights model, Command R+, recently climbed to the 6th spot on the LMSYS leaderboard, matching GPT-4 on the strength of over 13,000 human votes.

Things are really heating up in AI:
* New @MistralAI 7x22B MoE (170B) model just came out – we'll see how it performs over the next few weeks!
* @cohere released Command R+, by far the best public (non-commercial use-case only) LLM judging by the lmsys benchmark.
* New GPT-4… pic.twitter.com/NcjHQ2rCtq

— Aleksa Gordić 🍿🤖 (@gordic_aleksa) April 10, 2024

Anthropic’s Claude 3 Opus outperforms GPT-4 on common benchmarks like MMLU and HumanEval. Meanwhile, Elon Musk has announced that xAI’s next model, Grok-2, will begin training in May and is expected to surpass GPT-4. Most recently, Mistral has introduced its latest model, Mixtral 8x22B.

Apparently the new Mistral model beats Claude Sonnet and is a tad bit worse than GPT-4
In a couple of months, the open source community will fine tune it to beat GPT-4
This is a fully open weights model with an Apache 2 license! I can’t believe how quickly the OSS community…

— Bindu Reddy (@bindureddy) April 10, 2024

Google’s Gemini 1.5 Pro, meanwhile, which offers a 1-million-token context window, the longest currently available, is now in public preview in 180+ countries via the Gemini API. It also includes native audio (speech) understanding and a new File API for simplified file handling.

Apple has not been left behind either. Its latest LLM, ReALM, matches the performance of OpenAI’s GPT-4.

Hey, Look Who’s Catching Up

It’s quite uncommon for OpenAI to be catching up with the new models emerging in the market. While GPT-4 has maintained its top position for the past year, it has lost its lead for the first time.

At the recent Google Cloud Next ’24, Google enhanced several capabilities of Gemini 1.5 Pro, including better system instructions and JSON mode. Shortly after, OpenAI announced GPT-4 Turbo with Vision, which has ‘improved reasoning capabilities.’

OpenAI’s release of GPT-4 Turbo with Vision is definitely a stopgap measure to ensure the hottest AI startup stays relevant.

“The new GPT-4 definitely feels better at coding. It is less lazy and more willing to write code. I was able to give it a few files, and it wrote perfect code (which was very uncommon before),” wrote Sully Omar, founder of Cognosys, on X.

Altman himself feels that while GPT-4 is great, it’s time for the company to introduce a new model that is far better than GPT-4. “I think it sucks,” said Altman regarding GPT-4 in a recent interview with Lex Fridman. “I expect that the delta between five and four will be the same as between 4 and 3,” he added.

He further said that the company will release GPT-5 in the ‘coming months,’ adding that OpenAI has more important things to release first. “Before we talk about a GPT-5-like model… I know we have a lot of other important things to release first,” said Altman.

In an episode of Unconfuse Me with Bill Gates, Altman also spoke at length with Gates about how GPT-5 would emphasise customisation and personalisation. “The ability to know about you, your email, your calendar, how you like appointments booked, connected to other outside data sources, all of that. Those will be some of the most important areas of improvement,” said Altman.

Furthermore, he claimed that GPT-5 would have much better reasoning capabilities. “GPT-4 can reason in only extremely limited ways. Also, reliability is a concern. If you ask GPT-4 most questions 10,000 times, one of those 10,000 is probably pretty good, but it doesn’t always know which one. You’d like to get the best response of 10,000 each time,” said Altman.
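Altman's point about picking the best response out of many corresponds to what is commonly called best-of-N sampling. The sketch below illustrates the idea only and is not OpenAI's method: `best_of_n`, `toy_generate` and `toy_score` are hypothetical names, and the "model" is a random stand-in. The hard part Altman describes is the scorer: a model that "doesn't always know which one" is good has no reliable `score` function.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    generate: Callable[[], str],
    score: Callable[[str], float],
    n: int,
) -> Tuple[str, float]:
    """Draw n candidate responses and return the highest-scoring one."""
    if n < 1:
        raise ValueError("n must be at least 1")
    candidates: List[str] = [generate() for _ in range(n)]
    best = max(candidates, key=score)
    return best, score(best)


# Hypothetical stand-ins: a noisy "model" answering "what is 2 + 2?"
# and a scorer that prefers answers close to 4.
rng = random.Random(0)


def toy_generate() -> str:
    return str(rng.choice([3, 4, 4, 5, 22]))


def toy_score(answer: str) -> float:
    return -abs(int(answer) - 4)


if __name__ == "__main__":
    answer, quality = best_of_n(toy_generate, toy_score, n=20)
    print(answer, quality)
```

With a trustworthy scorer, more samples can only help the selected answer; without one, extra samples add cost but not reliability, which is the gap Altman describes.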

Meta’s Llama 3 is Around the Corner

While everyone is excitedly waiting for GPT-5, Meta is quietly working on Llama 3. During a recent event in London, Meta announced that the initial release of Llama 3 is planned for within the next month. The company did not disclose the number of parameters in Llama 3, but the model is expected to have about 140 billion.

According to recent reports, Meta is planning to launch two smaller versions of Llama 3 next week. These smaller models are expected to serve as a precursor to the launch of the largest version of Llama 3, anticipated this summer.

“Can’t wait to start playing with the 7B version of Llama-3. It will be a HUGE winner if it can beat Claude Haiku. It will also give us a huge clue, if the big Llama-3 model beats Claude Opus,” wrote Bindu Reddy, Abacus AI chief, on X.

Llama 3 is expected to be multimodal, capable of understanding and generating both text and images. Additionally, it is expected to have enhanced reasoning skills.

Meta researchers are working to ensure that Llama 3 can handle controversial and tricky questions, a capability that Llama 2 lacked. They are enhancing Llama 3 to engage users effectively, providing context and addressing difficult questions instead of avoiding them.

Meta is planning to integrate its new AI model into WhatsApp and its Ray-Ban smart glasses as well.

The company is planning to launch Llama 3 in a range of model sizes suitable for various applications and devices. “There will be a number of different models with different capabilities and different versatility [released] during the course of this year, starting really very soon,” said Chris Cox, Meta’s Chief Product Officer.

Earlier this year, Meta chief Mark Zuckerberg announced that Meta is training Llama 3 using a massive compute infrastructure. The company plans to procure 350k H100s by the end of this year, for a total of almost 600k H100 equivalents of compute once other resources are included.

With both OpenAI and Meta planning to launch their new models this summer, temperatures are bound to rise.

The post This Summer, Expect GPT-5 and Llama 3 to Heat Up the LLM Race appeared first on Analytics India Magazine.
