How are Indian languages faring in the age of AI and language models?

The Hindu
Tuesday, May 30, 2023 10:54:20 AM UTC

As large language models like ChatGPT find more applications around the world, their adoption also passively spreads a prejudice against languages other than English, including Indian languages. Some researchers are working to remedy this.

“Sanskrit suits the language of computers and those learning artificial intelligence learn it,” Indian Space Research Organisation chairman S. Somanath said at an event in Ujjain on May 25. His was the latest in a line of statements exalting Sanskrit and its value for computing but without any evidence or explanation.

But beyond Sanskrit, how are other Indian languages faring in the realm of artificial intelligence (AI), at a time when its language-based applications have taken the world by storm?

The answer is a mixed bag. There is some passive discrimination even as the languages’ fates are buoyed by public-spirited research and innovation.

Behind both seemingly intelligent chatbots and art-making computers, algorithms and data-manipulation techniques turn linguistic and visual data into mathematical objects (like vectors), and combine them in specific ways to produce the desired output. This is how ChatGPT is able to respond to your questions.
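As a rough illustration of text becoming vectors that can then be combined, the sketch below counts how often each word from a small vocabulary appears in a sentence and compares the resulting vectors with cosine similarity. The vocabulary, the sentences and the function names are hypothetical, and real language models learn far richer vectors than these word counts.

```python
import math


def to_vector(text, vocabulary):
    """Turn a sentence into a list of word counts, one entry per vocabulary term."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]


def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy vocabulary chosen only for this illustration.
vocabulary = ["star", "sky", "bright", "language"]

v1 = to_vector("a bright star in the sky", vocabulary)
v2 = to_vector("the sky has a star", vocabulary)
v3 = to_vector("language", vocabulary)

print(v1, v2, v3)
print(cosine_similarity(v1, v2))  # sentences sharing words score closer to 1
print(cosine_similarity(v1, v3))  # sentences sharing no words score 0
```

Sentences that share vocabulary end up pointing in similar directions, which is the simplest sense in which a machine can "compare" pieces of text once they are numbers.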

When working with a language, a machine first has to break a sentence or a word down into little bits in a process called tokenisation. These are the bits that the machine’s data-processing model will work with. For example, “there’s a star” can be tokenised to “there”, “is”, “a”, and “star”.
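The example above can be sketched in a few lines of Python. This is a minimal illustration rather than how any production tokeniser works: the `tokenise` function and its one-entry contraction table are assumptions made for this sketch, and the table treats "'s" as "is" even though it can also mark possession or stand for "has".

```python
import re

# Hypothetical, deliberately tiny contraction table for this sketch only.
CONTRACTIONS = {"'s": "is"}


def tokenise(text):
    """Split text into lowercase word tokens, expanding known contractions."""
    # Normalise the typographic apostrophe so "there’s" and "there's" behave alike.
    text = text.lower().replace("\u2019", "'")
    tokens = []
    # Match either a run of letters or an apostrophe followed by letters.
    for piece in re.findall(r"[a-z]+|'[a-z]+", text):
        tokens.append(CONTRACTIONS.get(piece, piece))
    return tokens


tokens = tokenise("there's a star")
print(tokens)  # ['there', 'is', 'a', 'star']
```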

There are several tokenisation techniques. A treebank tokeniser breaks up words and sentences based on the rules that linguists use to study them. A subword tokeniser allows the model to learn a common word and the modifications to that word separately, such as “dusty” and “dustier”/“dustiest”.
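One common way to learn subword units is to start from single characters and repeatedly merge the most frequent adjacent pair, which is the idea behind byte-pair encoding. The sketch below applies that idea to the three words above. The word counts are made up for illustration, the function names are hypothetical, and this toy version works on characters, so it is not the tokeniser any particular model ships with.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_merges(word_counts, num_merges):
    """Learn up to `num_merges` merges from a {word: frequency} table."""
    # Each word starts as a tuple of single characters.
    vocab = {tuple(word): count for word, count in word_counts.items()}
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair, weighted by word frequency.
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {merge_pair(symbols, best): count for symbols, count in vocab.items()}
    return merges, vocab


# Hypothetical frequencies, invented for this illustration.
word_counts = {"dusty": 5, "dustier": 3, "dustiest": 2}
merges, vocab = learn_merges(word_counts, 3)

print(merges)
for symbols in vocab:
    print(symbols)
```

After three merges the shared stem "dust" has become a single symbol in all three words, while the endings are still split into smaller pieces. The model can therefore reuse what it learns about the stem across every form of the word.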

OpenAI, the maker of ChatGPT and the GPT series of large language models, uses a type of subword tokeniser called byte-pair encoding (BPE). Here’s an example of the OpenAI API using this on a statement by Gayatri Chakravorty Spivak:
