Artificial Intelligence converts thoughts into speech

Wednesday, February 13, 2019

Neuroscientists at Columbia University have discovered a ground-breaking way of turning thoughts into speech that could potentially give a voice to individuals who have lost their ability to speak. Professor Mesgarani and his team at Columbia University are using Artificial Intelligence (AI) to recognise patterns that appear in someone’s brain when they listen to speech. The AI is similar to the algorithms used by Apple for Siri and Amazon for Alexa.

Using computer processing software, the scientists monitor the brainwaves of patients who are unable to vocalise their thoughts. They also use neural networks, together with technology that channels brain activity into a device where it is translated into speech.

Neural networks are fundamentally based on connections, their strengths and their functions. The model, taking inspiration from the brain, is composed of layers (at least one of which is hidden) consisting of simple connected units, or neurons, followed by nonlinearities. What makes neural networks special is this hidden layer of weighted functions, the neurons, with which a network can effectively approximate many other functions. Without a hidden layer, a neural network would be just a set of simple weighted functions. As in the biological brain, the network transmits signals from one neuron to the next.
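To make this concrete, here is a minimal sketch of such a network in NumPy: one hidden layer of weighted units followed by a nonlinearity. The weights here are random placeholders, not anything trained on brain data; the point is only the structure.

```python
import numpy as np

def relu(x):
    # The nonlinearity applied after the hidden layer's weighted sums
    return np.maximum(0.0, x)

def tiny_network(x, W1, b1, W2, b2):
    # One hidden layer of "neurons" followed by a nonlinearity.
    # Without relu(), the two matrix products would collapse into a
    # single linear map -- just a set of simple weighted functions.
    hidden = relu(x @ W1 + b1)   # hidden layer
    return hidden @ W2 + b2      # output layer

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))                    # a toy input signal
W1, b1 = rng.normal(size=(4, 8)), np.zeros(8)  # input -> hidden weights
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)  # hidden -> output weights
print(tiny_network(x, W1, b1, W2, b2).shape)   # (1, 2)
```

Stacking more such hidden layers is what makes a network "deep", as in the Deep Neural Network used later in the study.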

Figure 1. This technology could give people a voice
The aim of the experiment was to teach a vocoder to interpret brain activity using patterns of neural behaviour. By mimicking the structure of neurons in the brain, the researchers were able to produce a robotic-sounding voice that could almost perfectly translate the patients’ brainwaves. The process began with neural signals recorded from the patients’ brains. Feature extraction networks then decoded the signals, and feature summation networks prepared them to be fed into the vocoder, which generated the reconstructed speech.
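The data flow described above can be sketched as a simple pipeline. Every function body below is a hypothetical stand-in (the study's actual components are trained deep networks and a speech vocoder); only the order of the stages reflects the description.

```python
import numpy as np

def extract_features(neural_signals):
    # Stand-in for the feature extraction networks: map raw
    # electrode recordings to per-channel feature values.
    return np.tanh(neural_signals)

def summarise_features(features):
    # Stand-in for the feature summation networks: pool the
    # channel features into one parameter vector.
    return features.mean(axis=0)

def vocoder_synthesise(params, n_samples=16):
    # Placeholder "vocoder": turn the pooled parameters into a
    # waveform-like array (the real vocoder synthesises speech).
    t = np.linspace(0.0, 1.0, n_samples)
    return np.sin(2 * np.pi * params.sum() * t)

# 8 electrodes, 10 time steps of toy "neural signals"
signals = np.random.default_rng(1).normal(size=(8, 10))
speech = vocoder_synthesise(summarise_features(extract_features(signals)))
print(speech.shape)  # (16,)
```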

Researchers tried to decipher the brain’s language signals by monitoring parts of the brain while people read aloud and listened to recordings. By compiling this data, they were able to convert the brain signals into words and simple sentences humans would be able to understand. The data collection was very invasive, so the researchers could only do it for 15 minutes at a time.

They trained a vocoder, a computer algorithm capable of synthesizing speech after being trained, with the help of epilepsy patients who were undergoing brain surgery to treat their condition. A vocoder analyses and synthesizes human voice signals: it takes these signals and compresses them in order to emit manipulated sounds. The process resulted in around 75% of thoughts being translated per patient.
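The analyse-compress-synthesise idea behind a vocoder can be illustrated with a toy channel-vocoder sketch: the analysis step reduces a signal to a handful of frequency-band energies, and the synthesis step rebuilds a sound from only those energies. This is a deliberately crude illustration, not the speech vocoder used in the study.

```python
import numpy as np

def vocoder_analyse(signal, n_bands=4):
    # Analysis: compress the signal into a few band energies.
    spectrum = np.abs(np.fft.rfft(signal))
    bands = np.array_split(spectrum, n_bands)
    return np.array([band.mean() for band in bands])

def vocoder_synthesise(band_energies, n_samples=256, seed=0):
    # Synthesis: rebuild a sound whose spectrum is shaped by the
    # stored band energies, with random phase -- a manipulated
    # version of the original signal.
    rng = np.random.default_rng(seed)
    freqs = np.fft.rfftfreq(n_samples)
    out_spectrum = np.zeros(freqs.size)
    bins = np.array_split(np.arange(freqs.size), band_energies.size)
    for i, chunk in enumerate(bins):
        out_spectrum[chunk] = band_energies[i]
    phase = rng.uniform(0.0, 2 * np.pi, freqs.size)
    return np.fft.irfft(out_spectrum * np.exp(1j * phase), n=n_samples)

tone = np.sin(2 * np.pi * 12 * np.linspace(0.0, 1.0, 256))
energies = vocoder_analyse(tone)      # 4 numbers instead of 256 samples
resynth = vocoder_synthesise(energies)
print(energies.shape, resynth.shape)  # (4,) (256,)
```

The compression is what makes the output "manipulated": detail is thrown away in analysis, so the resynthesised sound only approximates the original.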

The study first used linear and spectrogram models to establish a baseline. The researchers then paired a Deep Neural Network (DNN) with the vocoder and compared this with the original spectrogram approach. The combination of the DNN and vocoder achieved the highest objective and subjective intelligibility scores, producing sounds that are clear to the listener. The technology is able to reconstruct the words a person hears and artificially generate them with a staggering rate of clarity: have a listen for yourself!

Figure 2. Sound waves

Famously, the scientist Stephen Hawking, diagnosed with ALS at just 21, used a rudimentary version of speech synthesis to communicate. His system involved a cheek switch attached to his glasses, with which he chose words that were then spoken by a voice synthesizer. The findings made by the team at Columbia University have the potential to cut out that middle man, so individuals could produce speech without the help of a computer or a movement-sensitive system.

There are, of course, limitations to these developments, mainly due to the small size of the sample used. To take the technology further, many more studies will need to be done on much larger sample sizes to obtain reliable results that can be generalised to a wider public. There is also the issue of individualization: as in the days of early speech recognition systems, the algorithms and decoders need to be individualized for each user.

However, the team at Columbia University have definitely given hope to those without the ability to verbalize thoughts. With many people around the world suffering from devastating illnesses that prevent them from communicating verbally, this advancement could be a turning point in medicine and give people the chance to have a real voice.
