Sarvam AI Releases Indic Dataset ‘Samvaad’

Sarvam AI has released “Samvaad,” a new open-source series of carefully curated Indic datasets. This release includes 100,000 high-quality, multi-turn conversations, totaling over 700,000 turns, in English, Hindi, and Hinglish.

The datasets were curated with an exclusive focus on Indic context and are now available on Hugging Face. For developers and enthusiasts working in the Indic space, the release offers a valuable resource, and more releases may follow in the near future.

Sarvam AI, in collaboration with Hugging Face, invites the community to stay tuned for forthcoming updates. To engage with the community and explore the datasets further, Sarvam AI encourages interested parties to join its Discord channel at https://lnkd.in/eXCbKTF5.

Sarvam AI recently partnered with Microsoft Azure to make its Indic voice large language model (LLM) available on Azure. Sarvam AI is building generative AI models targeting Indic languages and contexts. The startup aims to make the development and deployment of generative AI apps in India more accurate and cost-effective.

Sarvam AI’s Indic voice LLM aims to offer a natural voice-based interface to LLMs and will initially be available in Hindi. Sarvam AI is working to expand coverage to more Indian languages while ensuring support for colloquial language use.

Sarvam AI recently released OpenHathi-Hi-v0.1, the first Hindi LLM in the OpenHathi series. The model, which extends Llama2-7B and was developed on a budget-friendly platform, is said to deliver GPT-3.5-like performance for Indic languages.

The Bangalore-based startup also raised USD 41 million in a Series A funding round led by Lightspeed and supported by Peak XV Partners and Khosla Ventures. Sarvam’s objective is not just to build open-source Indic LLMs but also to develop a platform that helps build AI-powered applications deployable at population scale.

The post Sarvam AI Releases Indic Dataset ‘Samvaad’ appeared first on Analytics India Magazine.
