Mistral AI Unveils Mistral Large 2, Beats Llama 3.1 on Code and Math

A day after Meta released Llama 3.1, Mistral AI announced Mistral Large 2, the latest generation of its flagship model, offering substantial improvements in code generation, mathematics, and multilingual support. The model introduces advanced function-calling capabilities and is available on la Plateforme.

Yesterday, @AIatMeta dropped Llama 405B. Today, @MistralAI drops Mistral Large, their biggest dense model with 123B parameters 🤯.
TL;DR:
🧮 123B Instruct model with 128k context
📃 Non-Commercial License, Research only
🌐 Strong Multilingual, including English, German, French,… pic.twitter.com/cW4PAoAQxD

— Philipp Schmid (@_philschmid) July 24, 2024

With a 128k context window and support for dozens of languages, including French, German, Spanish, and Chinese, Mistral Large 2 aims to cater to diverse linguistic needs. It also supports more than 80 programming languages, including Python, Java, and C++. At 123 billion parameters, the model is designed for single-node inference with long-context applications in mind.

Mistral Large 2 is released under the Mistral Research License, which allows research and non-commercial use; commercial deployment requires a separate license from Mistral AI. It achieves 84.0% accuracy on the MMLU benchmark, setting a new performance/cost frontier for open models. In code generation and reasoning, it competes with leading models such as GPT-4o and Llama 3.1 405B.

A significant part of the model’s training focused on minimizing hallucinations and sharpening its reasoning and problem-solving skills. Mistral Large 2 is trained to acknowledge when it cannot find a solution or lacks sufficient information to give a confident answer, rather than generating plausible but incorrect output.

Improvements in instruction-following and conversational capabilities are evident, with the model excelling in benchmarks such as MT-Bench, Wild Bench, and Arena Hard. Mistral AI emphasizes concise responses, vital for business applications.

Mistral Large 2’s multilingual proficiency includes languages like Russian, Japanese, and Arabic, performing strongly on the multilingual MMLU benchmark. It also features enhanced function calling skills, making it suitable for complex business applications.
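To illustrate what function calling looks like in practice, here is a minimal sketch assuming the mistralai Python client (v1.x); the get_exchange_rate tool and its schema are hypothetical placeholders for illustration, not part of Mistral's API:

```python
import json
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Hypothetical tool: a JSON-schema description of a function the model
# may decide to call. Only this schema is sent to the API.
tools = [{
    "type": "function",
    "function": {
        "name": "get_exchange_rate",
        "description": "Return the exchange rate between two currencies.",
        "parameters": {
            "type": "object",
            "properties": {
                "base": {"type": "string", "description": "Base currency, e.g. EUR"},
                "quote": {"type": "string", "description": "Quote currency, e.g. USD"},
            },
            "required": ["base", "quote"],
        },
    },
}]

response = client.chat.complete(
    model="mistral-large-2407",
    messages=[{"role": "user", "content": "What is the EUR/USD exchange rate?"}],
    tools=tools,
    tool_choice="auto",  # let the model decide whether to call the tool
)

msg = response.choices[0].message
if msg.tool_calls:
    # The model chose to call the tool: it returns the function name and
    # JSON-encoded arguments instead of a plain text answer.
    call = msg.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
else:
    print(msg.content)
```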

Users can access Mistral Large 2 via la Plateforme under the name mistral-large-2407. Mistral AI is consolidating its offerings around two general-purpose models, Mistral NeMo and Mistral Large, and two specialist models, Codestral and Embed, and has extended fine-tuning capabilities to these models.
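For reference, a minimal sketch of calling the model through the API, assuming the mistralai Python client (v1.x) and an API key from la Plateforme:

```python
import os

from mistralai import Mistral

# Assumes MISTRAL_API_KEY is set to a la Plateforme API key.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="mistral-large-2407",  # API name for Mistral Large 2
    messages=[{"role": "user", "content": "Summarize Mistral Large 2 in one sentence."}],
)
print(response.choices[0].message.content)
```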

The model is available through partnerships with Google Cloud Platform, Azure AI Studio, Amazon Bedrock, and IBM watsonx.ai. This expansion aims to bring Mistral AI’s advanced models to a global audience, enhancing accessibility and application development.

Mistral Large 2 is the fourth model from the company in the past week, following the release of MathΣtral, a specialized 7B model designed for advanced mathematical reasoning and scientific exploration.

The company also released Codestral Mamba 7B, based on the Mamba 2 architecture and built for code generation tasks; Mistral AI reports testing its in-context retrieval capabilities at up to 256k tokens. Additionally, Mistral AI introduced Mistral NeMo, a 12-billion-parameter model with a 128k token context length, developed in partnership with NVIDIA.
