OpenAI’s newly released GPT-4o mini dominates the Chatbot Arena. Here’s why.


One week ago, OpenAI released GPT-4o mini. In that short time, the leaderboard of the Large Model Systems Organization (LMSYS) Chatbot Arena has already been updated, and GPT-4o mini has climbed ahead of giants such as Claude 3.5 Sonnet and Gemini Advanced.

The LMSYS Chatbot Arena is a crowdsourced platform where users can evaluate large language models (LLMs) by chatting with two LLMs side by side and comparing their responses to each other without knowing the models' names.

Also: Want to try GPT-4o mini? 3 ways to access the smarter, cheaper AI model – and 2 are free

Immediately after its unveiling, GPT-4o mini was added to the Arena, where it quickly climbed to near the top of the leaderboard, just behind GPT-4o. This is especially notable because GPT-4o mini is 20 times cheaper than its predecessor.

As the results came out, some users took to social media to express concern that such a new mini model could rank higher than more established, robust, and capable models such as Claude 3.5 Sonnet. To address the concerns, LMSYS, posting on X, explained the factors behind GPT-4o mini's high placement, highlighting that Chatbot Arena rankings reflect human preference as expressed through user votes.
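To make that concrete, here is an illustrative sketch, not LMSYS's actual code, of how pairwise preference votes can be turned into an Elo-style rating that orders models on a leaderboard. The model names, votes, and K-factor below are hypothetical:

```python
# Illustrative sketch: converting pairwise human-preference votes into an
# Elo-style rating, in the spirit of arena-style leaderboards.
# NOT LMSYS's actual implementation; the votes below are made up.

from collections import defaultdict

K = 32  # update step size (assumed value for illustration)

def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that model A beats model B under an Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))

def update(ratings, model_a: str, model_b: str, outcome: float) -> None:
    """Apply one vote: outcome is 1.0 if A wins, 0.0 if B wins, 0.5 for a tie."""
    exp_a = expected_score(ratings[model_a], ratings[model_b])
    ratings[model_a] += K * (outcome - exp_a)
    ratings[model_b] += K * ((1.0 - outcome) - (1.0 - exp_a))

# Hypothetical votes: (model shown as A, model shown as B, outcome for A)
votes = [
    ("gpt-4o-mini", "claude-3.5-sonnet", 1.0),
    ("gpt-4o-mini", "gemini-advanced", 0.5),
    ("claude-3.5-sonnet", "gpt-4o-mini", 1.0),
]

ratings = defaultdict(lambda: 1000.0)  # every model starts at the same rating
for a, b, outcome in votes:
    update(ratings, a, b, outcome)

for model, rating in sorted(ratings.items(), key=lambda x: -x[1]):
    print(f"{model}: {rating:.1f}")
```

LMSYS has described using Elo-style and, more recently, Bradley-Terry-style statistical models for the official leaderboard; the point here is simply that the ranking comes from aggregated human votes rather than from fixed benchmark scores.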

For users interested in which model performs better on technical tasks, LMSYS encourages looking at the per-category breakdowns. These can be accessed by clicking the Category dropdown that says "Overall" and selecting a different category. When you visit the various category breakdowns, such as coding, hard prompts, and longer queries, you will see the results vary.

Also: OpenAI launches SearchGPT – here's what it can do and how to access it

In the coding category, GPT-4o mini ranks third, behind GPT-4o and first-place Claude 3.5 Sonnet. However, GPT-4o mini is number one in other categories, such as multi-turn (conversations of two or more turns) and longer query (prompts of 500 or more tokens).

Chatbot Arena results in the "coding" category.

If you want to try GPT-4o mini, visit the ChatGPT site and log into your OpenAI account. If you would rather take your chances in the Chatbot Arena and hope it serves up GPT-4o mini, you can start by visiting the LMSYS website, clicking Arena side-by-side, and entering a sample prompt.
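If you prefer to try the model programmatically rather than through the ChatGPT interface, a minimal sketch using the OpenAI Python SDK looks roughly like this. It assumes the openai package is installed and an OPENAI_API_KEY environment variable is set; the prompt is just an example:

```python
# Minimal sketch: calling GPT-4o mini through the OpenAI Python SDK.
# Assumes `pip install openai` and that OPENAI_API_KEY is set in the environment.

from openai import OpenAI

client = OpenAI()  # reads the API key from OPENAI_API_KEY

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "user", "content": "In two sentences, what is the LMSYS Chatbot Arena?"},
    ],
)

print(response.choices[0].message.content)
```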
