
AI: is India falling behind?
The Hindu
Government and startups in India aim to create a foundational AI model, facing challenges in data and cost.
The Government of India and a clutch of startups have set their sights on creating an indigenous foundational Artificial Intelligence large language model (LLM), along the lines of OpenAI’s ChatGPT, Google’s Gemini, and Meta’s Llama. Foundational models, or LLMs, are systems trained on vast quantities of text that can churn out responses to queries. Training them requires large amounts of data and enormous computing power: the first is abundant on the open internet, while the second is concentrated largely in Western countries.
In India, creating a homegrown LLM is likely to be an uphill climb, albeit one that the government and startups are keen on achieving. Hopes have been heightened especially after the success of DeepSeek. The Chinese firm was able, at a far lower cost than Western tech companies, to train a so-called ‘reasoning’ model: one that arrives at a response through a series of logical steps, shown to users in an abstracted form, and that generally produces much better answers. Policymakers have cited India’s low-cost advances in space exploration and telecommunications as evidence that a similar breakthrough could be achieved here, and soon.
LLMs and small language models (SLMs) are generally built by ‘training’ a neural network on massive volumes of text data, typically scraped from the web. A neural network is a machine learning model that roughly imitates the way a human brain works: pieces of information are linked and passed through ‘layers’ of nodes, with the interactions in the hidden layers shaping the output, until the system produces an acceptable response.
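As a rough illustration of that idea, the toy Python sketch below passes a made-up input through two ‘hidden’ layers of nodes to produce an output. The layer sizes, random weights, and activation choice are placeholders for illustration only; in a real system the weights would be learned during training rather than drawn at random.

```python
# A minimal sketch of a neural network 'forward pass': an input is passed
# through layers of nodes, each layer mixing the signals it receives.
import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    """One fully connected layer: weighted links between nodes, then a
    non-linear 'activation' so the network can capture complex patterns."""
    w = rng.normal(size=(x.shape[-1], n_out))  # connection strengths (learned in real training)
    b = np.zeros(n_out)                        # per-node offsets
    return np.maximum(0.0, x @ w + b)          # ReLU activation

x = rng.normal(size=(1, 8))              # a toy 'input' of 8 numbers
h1 = layer(x, 16)                        # first hidden layer
h2 = layer(h1, 16)                       # second hidden layer
output = h2 @ rng.normal(size=(16, 1))   # final output node
print(output)
```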
Neural networks have been a tremendous breakthrough in machine learning and have for years been the backbone of automated social media moderation, machine translation, recommendation systems on platforms such as YouTube and Netflix, and a host of business intelligence tools.
As deep learning and machine learning surged in the 2010s, the underlying research produced several landmark advances, such as the ‘attention mechanism’, a natural language processing technique that gave developers a way to break a sentence down into components and weigh how its parts relate to one another, allowing computer systems to edge ever closer to ‘understanding’ an input that was not a piece of code. Even if this involved no actual intelligence, it was a massive leap in machine learning capabilities.
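A hedged sketch of what ‘attention’ computes is below: every word in a sentence is scored against every other word, and those scores decide how much each word draws on the others. The vectors and dimensions here are invented for illustration; real systems learn them from data.

```python
# Scaled dot-product attention, the core of the 'attention mechanism':
# compare queries with keys, turn the scores into weights, blend the values.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)        # how relevant each word is to every other word
    weights = softmax(scores, axis=-1)   # scores turned into proportions that sum to 1
    return weights @ V                   # a weighted mix of the other words' information

rng = np.random.default_rng(0)
tokens = 5                                # e.g. a five-word sentence
Q = K = V = rng.normal(size=(tokens, 4))  # toy 4-number representation per word
print(attention(Q, K, V).shape)           # (5, 4): one blended vector per word
```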
The transformer, which built on these advances, was the key breakthrough that paved the way for LLMs such as ChatGPT. A 2017 paper by researchers at Google described the transformer architecture, setting out for the first time how such models could practically be trained on graphics processing units (GPUs), the chips that have since become critical to the entire tech industry’s AI pivot.
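The sketch below is a rough, self-contained picture of a single ‘transformer block’ of the kind that paper describes: an attention step followed by a small feed-forward network, both expressed as matrix multiplications, which is precisely the sort of arithmetic GPUs accelerate. The sizes and random weights are placeholders, not the paper’s actual settings, and details such as normalisation are omitted.

```python
# One simplified transformer block: attention mixes information between words,
# then a feed-forward layer transforms each word's representation.
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def transformer_block(x, d_ff=16):
    d = x.shape[-1]
    Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
    attn = softmax((x @ Wq) @ (x @ Wk).T / np.sqrt(d)) @ (x @ Wv)
    h = x + attn                                  # residual connection keeps the original signal
    W1, W2 = rng.normal(size=(d, d_ff)), rng.normal(size=(d_ff, d))
    return h + np.maximum(0.0, h @ W1) @ W2       # feed-forward layer plus residual

words = rng.normal(size=(5, 8))          # a toy five-word input
print(transformer_block(words).shape)    # (5, 8): same shape, richer representation
```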
It took quite some time before OpenAI put these findings into practice in a way the public could witness. ChatGPT’s first model was released more than five years after the Google researchers’ paper, for a reason that has since become a headache both for firms looking to leverage AI commercially and for countries looking to build their own capabilities: cost.
