
How are Indian firms training LLMs? | Explained
Explore how Indian firms are training Large Language Models, overcoming challenges with data, capital, and innovative architectures.
The story so far:
At the AI Impact Summit, the Bengaluru-based startup Sarvam AI released two Large Language Models (LLMs), the kind of foundation models behind AI services like Google’s Gemini and OpenAI’s ChatGPT. The two models have 35 billion and 105 billion parameters respectively, and were less power- and compute-intensive to train than comparable models, while demonstrating improvements over other models in Indian languages, Pratyush Kumar, a Sarvam co-founder, said.
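To get a sense of what parameter counts of this size imply, a back-of-envelope sketch follows; the 16-bit weight format is an assumption for illustration, not something stated in the article, and actual training needs far more memory once optimizer state and activations are counted.

```python
# Rough memory footprint of the raw weights alone, assuming 16-bit
# (bf16/fp16) parameters; an illustrative assumption, not a reported figure.
for params_billion in (35, 105):
    params = params_billion * 1e9
    weights_gb = params * 2 / 1e9          # 2 bytes per parameter
    print(f"{params_billion}B parameters -> ~{weights_gb:,.0f} GB of raw weights")
```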
LLMs are trained and operated on clusters of Graphics Processing Units (GPUs). The combined cost of the GPUs and of the electricity needed to run them long enough to train a model runs into millions of dollars. The grist for this mill is data, largely scraped from the Internet, where English, European languages and East Asian languages like Korean and Japanese are far more richly represented than Indian languages.
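The following sketch shows why such costs reach millions of dollars; every figure in it is an assumption chosen for illustration, not a number reported for any particular model.

```python
# Illustrative training-cost arithmetic; all values are assumptions.
gpus = 1024                      # assumed cluster size
training_hours = 30 * 24         # assumed one month of wall-clock training
usd_per_gpu_hour = 2.5           # assumed blended cost of hardware and power
total_usd = gpus * training_hours * usd_per_gpu_hour
print(f"Estimated training cost: ${total_usd:,.0f}")   # ~$1.8 million here
```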
This creates a twofold challenge for training an LLM on Indian soil with Indian capital. First, with data sources scarce, many LLMs either perform worse in Indian languages, or burn more “tokens” during inference translating prompts into English (and translating the responses back) to perform better; because machine translation for Indian languages has improved dramatically, this route remains the gold standard for many LLMs. Second, since capital is also scarce, efforts by Indian firms to train an LLM for Indian users can be challenging, especially if there is no immediate business use case for doing so.
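A minimal sketch of that pivot-through-English pattern is below. The functions translate() and generate_in_english() are hypothetical placeholders standing in for a machine-translation system and an English-centric LLM; they are not a real API, and the point is only that each extra hop consumes additional tokens.

```python
# Hypothetical sketch of the pivot-through-English approach described above.

def translate(text: str, source: str, target: str) -> str:
    """Stub standing in for a machine-translation call."""
    return text

def generate_in_english(prompt: str) -> str:
    """Stub standing in for an English-centric LLM."""
    return f"(answer to: {prompt})"

def answer_via_english_pivot(prompt_in_hindi: str) -> str:
    # Each hop consumes extra tokens compared with a model that
    # handles the Indian-language prompt natively.
    prompt_en = translate(prompt_in_hindi, source="hi", target="en")
    answer_en = generate_in_english(prompt_en)
    return translate(answer_en, source="en", target="hi")

print(answer_via_english_pivot("भारत की राजधानी क्या है?"))
```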
Leaning on translation can be a challenge for developers who want to build on local LLMs, such as Sarvam’s 35 billion parameter model, which was demonstrated running on a feature phone during the summit’s research symposium: suboptimal performance in Indian languages can hurt both adoption and the quality of results.
The IndiaAI Mission has subsidised training in India by commissioning over 36,000 GPUs in data centres operated by Indian firms like Yotta, and by allowing researchers and startups to run training and inference workloads for a relatively nominal fee. The government gave Sarvam access to 4,096 GPUs from its common compute cluster, and the subsidy so far is estimated at almost ₹100 crore. The “bill of materials” for this cluster is ₹246 crore, though these GPUs will probably continue to be used by others.
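Simple per-GPU arithmetic on the figures cited above, assuming the ₹246 crore “bill of materials” refers to the 4,096-GPU allotment (an assumption about the referent, with 1 crore taken as 10^7 rupees):

```python
# Per-GPU arithmetic on the reported figures; the mapping of the bill of
# materials to the 4,096-GPU allotment is an assumption for illustration.
gpus = 4096
bill_of_materials_inr = 246 * 10**7
subsidy_so_far_inr = 100 * 10**7
print(f"Bill of materials per GPU: ₹{bill_of_materials_inr / gpus:,.0f}")
print(f"Estimated subsidy per GPU so far: ₹{subsidy_so_far_inr / gpus:,.0f}")
```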
