India Loves Llama, But Flirts with Mistral and Qwen

Mistral is back in the game. The French AI startup is rolling out new models one after another, winning over developers globally. However, it faces stiff competition from Alibaba’s Qwen, Meta’s Llama, and DeepSeek R1.

Indian AI startup Sarvam AI recently launched Sarvam-M, a 24-billion-parameter hybrid language model built on top of Mistral Small. However, some, like Menlo Ventures’ Deedy Das, raised doubts about the need for Indic LLMs unless they are clearly world-class. But this doesn’t take attention away from Mistral.

In a blog post, Sarvam AI shared that it applied SFT and RLVR techniques to fine-tune Mistral Small, which was released under the Apache 2.0 license. The result was Sarvam-M, where “M” stands for Mistral. The model shows strong gains, with a 20% average improvement on Indian language benchmarks, 21.6% on math, and 17.6% on programming.

Sarvam said it chose Mistral Small because it could be significantly improved for Indic languages, making it a strong foundation for a hybrid reasoning model that supports India’s linguistic diversity.

Mistral Small launched in January and competes strongly with larger models like Llama 3.3 70B and Qwen 2.5 32B. It matches Llama’s performance while running over three times faster on the same hardware, offering a solid open alternative to closed-source models like GPT-4o-mini.

Mistral has recently launched several models, including Mistral Medium 3, Devstral for coding, Mistral Document AI (OCR), and Mistral Saba for South Asian languages.

Mistral Saba supports Arabic and many Indian-origin languages, and is particularly strong in South Indian languages such as Tamil.

“An underrated feature of Mistral Saba is its capability with Indic languages. I find it to be one of the best models for its size when it comes to Hindi, and I’m super excited to have it running on Groq Inc,” said Aarush Sah, head of evals at Groq.

“Compared to models like Llama or Falcon, Mistral loads faster, responds quicker, and handles constrained environments better. For applications where every millisecond and every GB of RAM matters, it’s a pragmatic choice,” said Pradeep Sanyal, AI and data leader at a global tech consulting company.

Sarvam AI is Not Alone

AIM reached out to several other Indian developers to understand their experience working with Mistral.

Shantipriya Parida, senior AI scientist at Silo AMD, told AIM that in an open-source research project using available open-source LLMs, Mistral emerged as one of the top three models his team evaluated for Indic languages, alongside Llama and Qwen.

“While building a Hindi-based AI tutor, we found Mistral-7B particularly effective for tasks involving Indic language understanding and generation. For example, one of our Hindi AI tutors noted its fluency and contextual accuracy in dialogue systems,” he said.

He added that they preferred Mistral-7B over other open-source LLMs for AI tutor applications, based on both automated and manual evaluations. They conducted the manual evaluation of question-and-answer tasks using three key metrics: readability, correctness and perplexity.

The Challenge from Qwen and Llama

However, with the growing popularity of other open-source models, developers are exploring options.

Adithya S Kolavi, founder of CognitiveLabs, told AIM that they initially started with Mistral, but now mostly use Llama models and Qwen for their research, as these cover all their needs.

However, he added that people seem to like Mistral because fine-tuning is well-supported, allowing the team to adapt models effectively for specific tasks. He also mentioned that developers continue to experiment with different architectures, including mixture of experts (MoE) and dense variants, to explore performance trade-offs and optimise model efficiency.

On the other hand, Pratik Desai, founder of KissanAI, doesn’t think too highly of Mistral models. “I don’t love Mistral. I don’t use them, never trained them, and I’m not a big fan. If I have to choose, Qwen models are far superior in every aspect,” he told AIM.

Notably, Alibaba recently launched the Qwen3 family of open-weight models, ranging from 0.6B to 235B parameters. The flagship 235B model, with 22B active parameters, beats OpenAI’s o1 and o3-mini on math and coding benchmarks and matches Google’s Gemini 2.5 Pro on several tests.

Desai further added that Llama fell out of favour after Meta trained huge MoE models that are difficult for smaller startups or resource-constrained companies to work with.

Llama 3.1 saw strong adoption in India, unlike the newer Llama 4 models Scout and Maverick, which have yet to gain similar traction. Sarvam AI had previously worked with Llama 3.1. Founder Vivek Raghavan told AIM they used the 405B variant to build Sarvam 2B.

Don’t Forget DeepSeek

DeepSeek R1, which made waves globally, is now seeing adoption in India. As announced by IT minister Ashwini Vaishnaw, the model is currently hosted on Indian servers.

Ola chief Bhavish Aggarwal has also made DeepSeek R1 available through the Krutrim Cloud platform.

Meanwhile, Mumbai-based AI company Fractal has launched a new open-source large language model, Fathom-R1-14B. The model delivers strong mathematical reasoning performance, surpassing o1-mini and o3-mini, and coming close to o4-mini, with a post-training cost of just $499.

Fathom-R1-14B is a 14-billion-parameter model derived from DeepSeek-R1-Distilled-Qwen-14B. It was developed as part of a proposed initiative to build India’s first large-scale reasoning model under the IndiaAI mission.

It wouldn’t be wrong to say India’s AI battleground has shifted. The race is no longer about having the biggest model but about delivering smarter, faster, and relevant solutions for India’s unique challenges.

The post India Loves Llama, But Flirts with Mistral and Qwen appeared first on Analytics India Magazine.
