xAI’s Grok-2 Ranks Second on the Chatbot Arena Leaderboard, Competing with Gemini 1.5 and GPT-4o


xAI's Grok-2 and Grok-2-Mini have officially secured positions on the LMSys Chatbot Arena leaderboard. Based on over 6,000 community votes, Grok-2 has taken the #2 spot, surpassing GPT-4o (May) and tying with the latest Gemini model.

Meanwhile, Grok-2-Mini has earned the #5 position.

Chatbot Arena update❤️‍🔥
Exciting news—@xAI's Grok-2 and Grok-mini are now officially on the leaderboard!
With over 6000 community votes, Grok-2 has claimed the #2 spot, surpassing GPT-4o (May) and tying with the latest Gemini! Grok-2-mini also impresses at #5.
Grok-2 excels in… pic.twitter.com/5lyQgratJQ

— lmsys.org (@lmsysorg) August 23, 2024

Grok-2 has done particularly well in mathematical tasks, ranking #1 in that category, and has taken the #2 position in several other categories, including hard prompts, coding, and instruction-following.

Additionally, Grok-2-Mini has received a significant speed boost and now runs twice as fast as before. The gain came after xAI's inference team completely rewrote the inference stack using SGLang, enabling more efficient multi-host inference and improved accuracy.

The team also introduced new algorithms for computation and communication kernels, alongside better batch scheduling and quantisation, further enhancing the models’ performance.

Grok 2 mini is now 2x faster than it was yesterday. In the last three days @lm_zheng and @MalekiSaeed rewrote our inference stack from scratch using SGLang (https://t.co/M1M8BlXosH). This has also allowed us to serve the big Grok 2 model, which requires multi-host inference, at a… pic.twitter.com/G9iXTV8o0z

— ibab (@ibab) August 23, 2024

Some people remain sceptical about the rankings. They point out that OpenAI's GPT-4o, which holds the top spot, does not perform as well as Claude 3.5, which sits in fifth place. Even so, people who have started experimenting with Grok-2 claim that the model is very strong at coding and maths-related tasks.

Released in beta this month, the Grok-2 family of models is also available for testing on X. Users can also generate images with it using the FLUX.1 image generation model.

The post xAI’s Grok-2 Ranks Second on the Chatbot Arena Leaderboard, Competing with Gemini 1.5 and GPT-4o appeared first on AIM.
