
IISc researchers devise a new language for ML models
The Hindu
IISc researchers develop STRONG language to encode nanopore structure, train machine learning models for property prediction.
Indian Institute of Science researchers have devised a new language that encodes the shape and structure of nanopores in the form of a sequence of characters.
The language was devised by Ananth Govind Rajan’s lab, and the study was published in the Journal of the American Chemical Society. It can be used to train any machine learning model to predict the properties of nanopores in a wide variety of materials.
IISc said the language, called STRONG (STring Representation Of Nanopore Geometry), assigns different letters to different atom configurations and strings together all the atoms on the edge of a nanopore into a sequence that specifies its shape.
“For instance, a fully bonded atom (having three bonds) is represented as ‘F’, and a corner atom (bonded to two atoms) is represented as ‘C’ and so on. Different nanopores have different kinds of atoms at their edge, which dictates their properties,” IISc said.
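As a rough illustration of the idea in the quote, the sketch below turns a list of bond counts for the atoms around a pore edge into a string of letters. Only the two letters named by IISc ('F' for three bonds, 'C' for two) are included; the function name, the input format and the error handling are assumptions made for this example, not the published STRONG scheme.

```python
# Hypothetical mapping: only the two letters quoted in the article are known.
LETTER_FOR_BOND_COUNT = {3: "F", 2: "C"}


def encode_edge(bond_counts):
    """Turn the bond counts of edge atoms (listed in order around the pore)
    into a string, one letter per atom."""
    letters = []
    for count in bond_counts:
        if count not in LETTER_FOR_BOND_COUNT:
            # The article says "and so on" but does not list further letters,
            # so this sketch refuses to guess them.
            raise ValueError(f"no letter known for an atom with {count} bonds")
        letters.append(LETTER_FOR_BOND_COUNT[count])
    return "".join(letters)


if __name__ == "__main__":
    print(encode_edge([3, 2, 2, 3]))
```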
It added that STRONGs allowed the team to devise fast ways for identifying functionally equivalent nanopores having identical edge atoms, such as those related by rotation or reflection. This drastically cuts down on the amount of data that needs to be analysed for predicting nanopore properties.
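One simple way to picture this, offered as a sketch and not as the team's actual algorithm: if the edge string is treated as a closed loop, a rotation of the pore corresponds to a cyclic shift of the string and a reflection to reading it backwards. Picking the alphabetically smallest of all such variants gives every equivalent pore the same label. The function names are hypothetical, and the sketch assumes each letter stays the same when the pore is mirrored.

```python
def canonical_form(strong):
    """Return the smallest string among all cyclic shifts of the edge
    sequence and of its reverse (assumed to model rotation and reflection)."""
    if not strong:
        return strong
    n = len(strong)
    candidates = []
    for seq in (strong, strong[::-1]):
        doubled = seq + seq  # every cyclic shift is a length-n window of this
        candidates.extend(doubled[i:i + n] for i in range(n))
    return min(candidates)


def equivalent(a, b):
    """Two edge strings describe the same pore shape (under the assumptions
    above) if their canonical forms match."""
    return len(a) == len(b) and canonical_form(a) == canonical_form(b)


if __name__ == "__main__":
    print(equivalent("FFCC", "CCFF"))      # rotation of the same loop
    print(equivalent("FFCFCC", "CCFCFF"))  # mirror image of the same loop
    print(equivalent("FFCC", "FCFC"))      # genuinely different edge
```

Grouping pores by canonical form means each distinct shape needs to be analysed only once, which is the kind of data reduction the researchers describe.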
Just as ChatGPT predicts text, neural networks (machine learning models) can read the letters in a STRONG to infer what a nanopore looks like and predict its properties, it added.
The team turned to a variant of a neural network used in Natural Language Processing that works well with long sequences and can selectively remember or forget information over time. Unlike traditional programming, in which the computer is given explicit instructions, neural networks can be trained to figure out how to solve a problem they have not encountered so far.













