Indian AI Models That Made History in 2024

2024 is turning out to be a game-changer for AI in India. With the government backing initiatives for homegrown AI startups, the country is seeing major shifts across its tech and business sectors. By integrating local languages, cultural relevance, and practical applications, this year's AI models have empowered Indian businesses and set benchmarks for the global AI ecosystem.

This article describes the best Indian AI models launched in 2024, focusing on how they help India advance AI developments worldwide.

  1. BharatGen

The Indian government recently unveiled BharatGen, the first government-funded initiative for developing multimodal LLMs. BharatGen stands for building “GenAI for Bharat, by Bharat.”

The initiative has also introduced e-vikrAI, a solution powered by vision language models and designed for product images in Indic e-commerce. e-vikrAI simplifies cataloguing by removing the need for manual input: a seller uploads a product image and receives auto-generated titles, descriptions, features, and pricing suggestions, all informed by an understanding of Indic culture.

  2. NVIDIA Nemotron-4-Mini-Hindi-4B

During his visit to India, NVIDIA CEO Jensen Huang launched the Nemotron-4-Mini-Hindi-4B model, a small but powerful Hindi language model designed to help businesses create AI solutions for regional demands.

This model, available as an NVIDIA NIM microservice, can be deployed on NVIDIA GPU-accelerated systems, optimising performance for various applications.

Tech Mahindra is the first to implement this model, creating Indus 2.0, which focuses on Hindi and its dialects. The Nemotron Hindi model has 4 billion parameters and was derived from a 15-billion-parameter multilingual model, Nemotron-4. It was trained with real-world Hindi data and synthetic data, including English.

After fine-tuning with NVIDIA NeMo, it leads on multiple accuracy benchmarks for AI models with up to 8 billion parameters. Packaged as a microservice, it supports various industry applications, including education and healthcare.

  3. Krutrim

Earlier this year, Bhavish Aggarwal’s Ola Krutrim launched its first AI model, Krutrim. The model can understand 22 Indian languages and generate text in 10, including Hindi, Marathi, Bengali, Tamil, Kannada, Telugu, Odia, Gujarati, and Malayalam.

Krutrim has been trained on over 2 trillion tokens, with a strong focus on Indian data, making it uniquely suited to reflect the nuances of India’s heritage and culture.

  4. Sarvam-1

Indian AI startup Sarvam AI has launched Sarvam-1, the first LLM optimised specifically for Indian languages.

Built with 2 billion parameters, Sarvam-1 supports 10 major Indian languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, and Telugu.

One of Sarvam-1’s key features is its computational efficiency. The model offers four to six times faster inference speeds compared to larger models like Gemma-2-9B and Llama-3.1-8B while maintaining competitive performance levels. This makes Sarvam-1 particularly suitable for deployment in production environments, including edge devices where computing resources may be limited.

  5. Surya OCR

Surya OCR is a comprehensive optical character recognition (OCR) toolkit developed to handle a wide range of document types and languages. Created by Vik Paruchuri, it recently launched its v2, which offers improved accuracy across all document types and outperforms both Tesseract and Google Cloud OCR.

Surya can detect layout elements such as tables, images, and headers, and determine their arrangement within the document.

Surya OCR can be easily installed via pip and requires Python 3.9+ along with PyTorch. The model weights are automatically downloaded during the first run, simplifying the setup process. A reliable API is also available for those preferring hosted solutions. This API supports various document formats and ensures consistent performance for different OCR needs.

Surya also offers a robust command-line interface, allowing users to perform tasks like OCR, text detection, layout analysis, and reading order detection directly from the terminal. Additionally, Surya can be integrated into Python scripts, enabling custom OCR pipelines and preprocessing workflows for more advanced use cases.
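The terminal workflow described above can be wrapped from Python. The following is a minimal sketch, not Surya's own Python API: it assumes the package was installed with `pip install surya-ocr` and that the command-line entry points are named `surya_ocr`, `surya_detect`, `surya_layout`, and `surya_order`. Those names are assumptions that may differ between releases, so check them against the installed version. The helper names `build_command` and `run_task` are hypothetical.

```python
import shutil
import subprocess

# Assumed CLI entry points, one per task named in the text.
# Verify these names against the Surya version you have installed.
SURYA_COMMANDS = {
    "ocr": "surya_ocr",
    "detect": "surya_detect",
    "layout": "surya_layout",
    "order": "surya_order",
}


def build_command(task: str, input_path: str) -> list[str]:
    """Return the argument list for one Surya task on a file or folder."""
    if task not in SURYA_COMMANDS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(SURYA_COMMANDS)}")
    return [SURYA_COMMANDS[task], input_path]


def run_task(task: str, input_path: str) -> int:
    """Run the task and return the process exit code."""
    cmd = build_command(task, input_path)
    # Fail early with a clear message if the executable is not on PATH.
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found; is Surya installed?")
    return subprocess.run(cmd, check=False).returncode
```

Calling `run_task("ocr", "scan.pdf")` would then run OCR on a single file. Any extra flags, such as output directories or language hints, are left out here because they vary by version.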

  6. Everest 1.0

SML’s Hanooman introduced Everest 1.0, a multilingual AI system supporting Indian languages, including Hindi, Bengali, Tamil, and Telugu. It currently supports 35 languages, with plans to expand to 90 languages in the coming months. Powered by the executable expert model (EEM) architecture, Everest 1.0 handles tasks such as real-time data access, predictive insights, and image analysis.

It seeks to improve access and inclusivity in areas like customer service, education, healthcare, and finance.

  7. Chitralekha

Chitralekha is an open-source video transcreation platform supported by the EkStep Foundation and built using AI models developed in-house by AI4Bhārat. The platform is open to further enhancements and new feature integrations.

Originally developed for video annotation, it allows users to auto-generate and edit audio transcripts in Indic languages. Its key features include subtitle generation and download, audio/video dubbing, and video translation across multiple Indic languages.

  8. Airavata

Airavata is an open-source instruction-tuned LLM for Hindi, developed by AI4Bharat, a research lab at IIT Madras. Released in 2024, Airavata was created by fine-tuning the OpenHathi model (developed by Sarvam AI) using diverse instruction-tuning datasets in Hindi.

The model, named after Airavata, the divine elephant of Hindu mythology, improves performance on a wide range of assistive tasks in Hindi. Airavata was trained using human-curated, license-friendly datasets, including translated versions of English instruction-tuning datasets, to ensure sustainability and avoid licensing restrictions.

  9. SUTRA

Two AI, a startup founded by Pranav Mistry, has introduced a family of models called SUTRA. Its dual-transformer architecture powers multilingual, cost-efficient AI solutions in over 50 languages, including several South Asian languages like Gujarati, Marathi, Tamil, and Telugu. The model features two mixture-of-experts transformers: a concept model and an encoder-decoder for translation, working together to deliver efficient and accurate language processing.

  10. Devika

Devika is an open-source AI software engineer developed by Mufeed VH, a 21-year-old developer from Kerala. Created as an alternative to Cognition Labs’ Devin AI, Devika understands high-level human instructions, breaks them down into steps, researches relevant information, and writes code to achieve given objectives.

Devika utilises advanced AI models like Claude 3, GPT-4, GPT-3.5, and local LLMs via Ollama. It can execute the code it generates, autonomously fix errors, and even deploy static websites on Netlify.
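The generate, execute, and repair cycle described above can be illustrated with a small loop. This is a hypothetical sketch of the general pattern, not Devika's actual code: `solve`, `run_python`, and `stub_writer` are made-up names, and `stub_writer` stands in for a call to an LLM such as the ones listed. Planning, research, and deployment are left out.

```python
import subprocess
import sys
from typing import Callable


def run_python(code: str) -> tuple[bool, str]:
    """Run code in a fresh interpreter; return (succeeded, combined output)."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=30,
    )
    return proc.returncode == 0, proc.stdout + proc.stderr


def solve(
    objective: str,
    write_code: Callable[[str, str], str],
    max_attempts: int = 3,
) -> tuple[bool, str]:
    """Ask for code, run it, and feed any error output into the next attempt."""
    feedback = ""
    for _ in range(max_attempts):
        code = write_code(objective, feedback)
        ok, output = run_python(code)
        if ok:
            return True, output
        feedback = output  # the error text becomes context for the retry
    return False, feedback


def stub_writer(objective: str, feedback: str) -> str:
    """Hypothetical stand-in for an LLM: buggy first draft, fixed on retry."""
    if not feedback:
        return "print(1 / 0)"
    return "print('fixed')"


if __name__ == "__main__":
    print(solve("demo objective", stub_writer))
```

In a real agent, `write_code` would send the objective and the captured error text to a model, and the generated code would run in a sandbox rather than directly on the host.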

The post Indian AI Models That Made History in 2024 appeared first on Analytics India Magazine.
