
Does Alexa understand your toddler? Western University researchers are looking into it
CBC
Olivia Daub’s two-year-old son is obsessed with “doodidees.” He talks about them and screams for them at 5 a.m. every day.
Daub said most people don’t have a clue what her son is saying, but she knows to bring him the tiny, dark blue fruit that he actually wants: blueberries.
“We’ve all been children and we’ve all had the experience of not being understood by adults. Inversely, we [adults] have all had a really hard time understanding children because they produce speech and language in ways that are different from adults.”
Daub, an assistant professor in Western University's school of communication sciences and disorders in London, Ont., said understanding toddler-speak is even trickier for artificial intelligence (AI). That’s why she is leading new research on how AI can better understand the way toddlers talk.
Daub said that while automatic speech recognition software, such as automatic closed captions on Zoom meetings and Amazon’s Alexa virtual assistant, has become good at recognizing adult speech, it still struggles to accurately pick up what young children are saying.
“I think we’ve all seen YouTube clips of a toddler asking Alexa to play a song, and getting something completely different and really inappropriate,” she said. “This study is trying to understand how we can leverage AI and machine-learning principles to improve recognition for toddlers and preschoolers.”
To do that, she’s working with Western electrical and computer engineering assistant professor Soodeh Nikan to train an AI model on toddlers’ common speech patterns and shortcuts.
“Most of the speech models that we have are trained with adult speech, so that’s why most of these models are not very successful in recognizing that toddler speech, especially the mistakes that they make,” Nikan said.
“You have to provide examples to AI [for it] to be able to understand and distinguish normal mistakes and speech disorder problems.”
Daub plans to bring in a sample of 30 children to play, tell stories and speak to research assistants. Each session will be recorded and transcribed by humans, who will also collect data on children’s speaking patterns.
One common pattern, Daub said, is that many English-speaking toddlers struggle to pronounce the “r” sound and instead use a “w” sound.
Data like this will be handed over to Nikan, who will feed the information into a private AI model in order to train it.
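To make that concrete, here is a minimal, hypothetical sketch (in Python) of what one annotated utterance could look like. The Utterance class, its field names, and the example values are assumptions for illustration, not the study's actual data format.

```python
# Hypothetical annotation record for one transcribed utterance.
# Illustrative only: the class, fields, and values are invented,
# not the study's actual schema.
from dataclasses import dataclass, field

@dataclass
class Utterance:
    child_age_months: int
    audio_file: str      # path to the recorded clip
    produced: str        # what the child actually said
    target: str          # the adult-form transcription
    patterns: list[str] = field(default_factory=list)  # annotated speech patterns

# Example: a toddler swapping the "r" sound for a "w" sound.
sample = Utterance(
    child_age_months=26,
    audio_file="session01_clip042.wav",
    produced="wabbit",
    target="rabbit",
    patterns=["r -> w substitution"],
)
```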
“We can fine-tune these models using the data that is specifically annotated for this purpose,” said Nikan, adding that the model will also build on existing OpenAI models available online.
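In broad strokes, fine-tuning a pretrained speech model on clips like these could look like the sketch below. It assumes the Hugging Face "transformers" implementation of OpenAI's Whisper (the article does not say which OpenAI model the team uses), and the audio and transcript here are placeholders rather than real study data.

```python
# Minimal fine-tuning sketch, assuming OpenAI's Whisper via Hugging Face
# transformers. Not the researchers' actual code; model choice, learning
# rate, and the sample data are all assumptions for illustration.
import numpy as np
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-small")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Placeholder sample: 3 seconds of 16 kHz audio plus its human transcript.
# In practice this would be a recorded clip from a play session.
audio = np.zeros(16000 * 3, dtype=np.float32)
transcript = "I want blueberries"

inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
labels = processor.tokenizer(transcript, return_tensors="pt").input_ids

model.train()
optimizer.zero_grad()
loss = model(input_features=inputs.input_features, labels=labels).loss
loss.backward()   # one gradient step of fine-tuning
optimizer.step()
```

A real pipeline would loop this over the full annotated dataset, but the core idea is the same: start from a model trained largely on adult speech and nudge its weights with labelled child-speech examples.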
Daub and her team have so far met with nine toddlers and are still seeking more study participants.