In August 2023, IBM Chairman and CEO Arvind Krishna said India must develop sovereign capability in AI along with a computing and data infrastructure. “You need a way for the government and private companies to be able to leverage that in a way unique to India,” he said during his visit to India for the B20 Summit.
And India appears to be doing exactly that. Union Minister Rajeev Chandrasekhar, speaking at a Financial Express event, recently said India has the opportunity to develop something sovereign and unique, in line with Digital Public Infrastructure (DPI) such as the Unified Payments Interface (UPI) and Aadhaar, with which India has found tremendous success. Now, with AI, India wants to take the same DPI approach.
“We are determined that we must have our own sovereign AI. We can take two options. One is to say, as long as there is an AI ecosystem in India whether that is driven by Google, Meta, Indian startups, and Indian companies, we should be happy about it. But we certainly don’t think that is enough,” the minister said.
AI as Digital Public Infrastructure
The concept of sovereign AI is no longer theoretical and is gaining traction globally. Countries like France, the UAE, and Singapore, and even Europe as a whole, are considering its implementation.
However, India’s approach to technology has been quite distinct from that of the West. India sees technology as an enabler, and the focus will always be on maximising economic development and real-life use cases in agriculture, healthcare and governance.
For example, Bhashini, a division of the Ministry of Electronics and Information Technology (MeitY), is testing a WhatsApp chatbot powered by OpenAI’s GPT models, which will answer queries that users send as voice notes in Indic languages. The chatbot could be particularly useful for Indian farmers who may not be accustomed to typing on smartphones.
Future use cases are expected to follow a similar trajectory, with the technology designed or leveraged by Indian enterprises to benefit the common citizen. Chandrasekhar has also emphasised that AI is a kinetic enabler for India’s digital economy and could play a key role in the country’s ambitious plan to build a USD 1 trillion digital economy by 2026.
India’s approach towards sovereign AI
While Chandrasekhar has not revealed what India’s approach towards sovereign AI will be, it could closely resemble Singapore’s approach: developing a base model with regional context that caters to the country’s unique linguistic characteristics.
As it stands, using large language models like GPT-3.5 or GPT-4 costs more in non-English languages, partly because their tokenisers, built largely around English text, tend to split the same content into more tokens. Limited training data, linguistic nuances, resource-intensive training, and the need for extensive evaluation and customisation make adaptation even more challenging and costly.
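One reason for the higher tokenisation cost can be illustrated with a rough sketch. Byte-level tokenisers start from the UTF-8 bytes of the text, and Devanagari characters take more bytes than Latin ones. The snippet below is an illustration under that assumption, not a measurement of any real model: the helper name `utf8_stats` and the sample words are hypothetical, and real tokenisers merge frequent byte sequences, so actual token counts will differ.

```python
# Illustration only: UTF-8 byte counts as a rough proxy for the raw
# material a byte-level tokeniser starts from. Real token counts depend
# on the tokeniser's learned merges and are not computed here.

def utf8_stats(text: str) -> dict:
    """Return code-point count, UTF-8 byte count, and bytes per code point."""
    chars = len(text)
    n_bytes = len(text.encode("utf-8"))
    ratio = n_bytes / chars if chars else 0.0
    return {"chars": chars, "bytes": n_bytes, "bytes_per_char": ratio}

english = "farmer"
hindi = "किसान"  # Hindi word for "farmer" (sample chosen for illustration)

for label, word in (("English", english), ("Hindi", hindi)):
    stats = utf8_stats(word)
    print(f"{label}: {stats['chars']} code points, "
          f"{stats['bytes']} bytes, "
          f"{stats['bytes_per_char']:.1f} bytes per code point")
```

Each Devanagari code point occupies three bytes in UTF-8, against one byte for an ASCII letter, so a tokeniser with few learned merges for Hindi has more pieces to work through for a word of similar length.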
Considering India’s linguistic diversity, the government could create an open-source base model with multilingual capabilities. This model could be fine-tuned and utilised by the public and private sectors for various applications.
“The only other way for sovereign AI is to have a government, not curated, or managed, or approved but a government-sponsored India database platform,” the minister said. During the discussion, he added that it could be registered as a Section 8 type of non-profit company or a public-private partnership project over time.
Moreover, as per media reports, India’s sovereign AI programme could be announced during the three-day Global Partnership on Artificial Intelligence (GPAI) summit hosted by India in New Delhi.
Data infrastructure
India has many languages. Besides the 22 officially recognised languages, there are hundreds of other languages and thousands of dialects spoken in different parts of the country. Building datasets for these distinct languages will be the first big challenge, but work is already underway.
Through Bhashini, the government is already building open-source datasets for Indic languages such as Assamese, Bengali, Hindi, Marathi and Dogri, among others. The minister also said that the government is developing a framework to allow Indian startups and enterprises to use anonymised personal data with consent.
“We’re also creating a framework where Indian startups and the Indian AI research and innovation ecosystem will have preferential access, curated access to this huge data sets programme,” the minister said.
However, it could take the government years to build datasets for all Indian languages. “If we had to collect as much data in Indian languages as went into a LLM like GPT, we’d be waiting another 10 years,” Kalika Bali, principal researcher at Microsoft Research India, told the Thomson Reuters Foundation.
Hence, a public-private partnership (PPP) model could be the right approach to make this possible. IT giant Tech Mahindra is already working on a project to develop a 7-billion-parameter LLM, which will initially support 40 different Hindi dialects.
Moreover, building a base model capable of conversing in over 100 Indian languages is a challenging endeavour. “So what we can do is create layers on top of generative AI models such as ChatGPT or Llama,” Bali suggested.
It will be intriguing to observe the approach India adopts. However, one clear requirement for the initiative’s success is the support of the right infrastructure, including robust computing capabilities.
Computing infrastructure
To meet India’s computation needs, the government is turning its attention to NVIDIA, the company making the most advanced Graphics Processing Units (GPUs), used for accelerating the training and inference processes of AI models.
Colette Kress, Chief Financial Officer at NVIDIA, speaking during the company’s earnings call on November 21, said that NVIDIA is already working with India’s government and largest tech companies including Infosys, Reliance and Tata to boost their sovereign AI infrastructure.
“Many countries are awakening to the need to invest in sovereign AI infrastructure to support economic growth and industrial innovation. With investments in domestic compute capacity, nations can use their own data to train LLMs and support their local generative AI ecosystems,” she said.
In September, NVIDIA founder and CEO Jensen Huang met Prime Minister Narendra Modi at his official residence in New Delhi. The government is also looking to build a 25,000-GPU cluster for INR 8,000-10,000 crore. The project is expected to follow a public-private partnership model, and the AI compute capacity will be provided to startups as a service.
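For a sense of scale, the reported budget can be divided by the reported GPU count. The sketch below does only that arithmetic. It assumes the whole budget is spread evenly across the 25,000 GPUs, which the reports do not state, so the result is an all-in cluster cost per GPU (including whatever networking, power, and facilities the budget covers), not a hardware price. The function name is hypothetical.

```python
# Back-of-the-envelope arithmetic on the reported figures.
# Assumption: the full budget is divided evenly across all GPUs.

CRORE = 10_000_000  # 1 crore = 10^7 rupees
LAKH = 100_000      # 1 lakh  = 10^5 rupees

def cost_per_gpu_inr(total_crore: float, gpus: int) -> float:
    """All-in budget per GPU, in rupees."""
    return total_crore * CRORE / gpus

low = cost_per_gpu_inr(8_000, 25_000)
high = cost_per_gpu_inr(10_000, 25_000)

print(f"Budget per GPU: INR {low / LAKH:.0f} lakh to {high / LAKH:.0f} lakh")
```

On these figures, the budget works out to roughly INR 32 lakh to 40 lakh per GPU.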
The post India Plans to Replicate UPI Model with AI appeared first on Analytics India Magazine.