Brain activated speech for the paralysed

International researchers have designed a communication device which could provide hope for people who cannot speak or type.


The research, published on 8 November 2022 in Nature Communications, detailed how the ‘neuroprosthesis’ could decode brain activity relating to single letters to spell out full sentences in real time, demonstrated in a participant whose severe paralysis had left them with very limited means of communication.

The device decoded brain activity to produce sentences from a 1,152-word vocabulary at a speed of 29.4 characters per minute, with an average character error rate of 6.13%. In further experiments, the authors found that the approach generalized to larger vocabularies containing over 9,000 words, with an average character error rate of 8.23%.
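
Character error rate is conventionally computed as the edit distance between the decoded sentence and the intended sentence, divided by the length of the intended sentence. The short Python sketch below illustrates that calculation; the function names and example strings are purely illustrative and are not taken from the study.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and
    substitutions needed to turn `hyp` into `ref`."""
    dp = list(range(len(hyp) + 1))
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, len(hyp) + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                           # deletion
                        dp[j - 1] + 1,                       # insertion
                        prev + (ref[i - 1] != hyp[j - 1]))   # substitution
            prev = cur
    return dp[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalised by the length of the intended sentence."""
    return edit_distance(ref, hyp) / max(len(ref), 1)


# Illustrative example, not data from the study:
print(character_error_rate("hello world", "hallo world"))  # ~0.09
```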

Existing neuroprosthetic systems have managed to decode only up to 50 words, were limited to a fixed vocabulary, and required the participant to attempt to speak the words aloud – a significant effort given their diagnosis.

Lead author and neurosurgeon Professor Edward Chang, of the Weill Institute for Neurosciences at the University of California, San Francisco, said that his team built on the previous approach by designing their system to decode brain activity associated with the letters of the NATO phonetic alphabet.

“One drawback of the current approach is that it relies on code words instead of letters during spelling. Although the use of these longer code words improved neural discriminability, they are less natural to use,” Professor Chang explained.

“In tests, our device was able to decode the brain activity of the participant as they attempted to silently speak each letter phonetically to produce sentences.

“The results highlight the potential of silently controlled speech neuroprostheses to generate sentences through a spelling-based approach using phonetic code words, though further work is required to demonstrate if this approach is reproducible in more participants.”

During training sessions, Professor Chang’s team cued the participant to attempt to produce individual code words and a hand-motor movement, using simultaneously recorded cortical activity from an implanted 128-channel electrocorticography (ECoG) array to train classification and detection models.
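
The article does not describe the models themselves, but the basic training step of pairing windows of recorded ECoG features with the cued code-word labels and fitting a classifier can be sketched roughly as follows. Everything in the sketch (the feature shapes, the random data, the scikit-learn classifier and the variable names) is an assumption for illustration; the study used its own neural-network models rather than this stand-in.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one window of ECoG features per cued code word.
# The shapes are assumptions (128 electrodes x 50 feature samples per window).
n_channels, n_samples = 128, 50
rng = np.random.default_rng(0)

# 26 NATO code words ("alpha" ... "zulu") plus the hand-motor command,
# with 10 cued attempts per class in this toy data set.
labels = np.repeat(np.arange(27), 10)
rng.shuffle(labels)
ecog_windows = rng.standard_normal((labels.size, n_channels, n_samples))

# Flatten each window into a feature vector and fit a simple classifier.
# The study used neural-network models; logistic regression is only a stand-in.
X = ecog_windows.reshape(labels.size, -1)
classifier = LogisticRegression(max_iter=1000)
classifier.fit(X, labels)

# At decoding time the classifier emits one probability per code word,
# which the downstream beam search and language model then consume.
print(classifier.predict_proba(X[:1]).shape)  # (1, 27)
```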

A beam-search algorithm then used the predicted code-word probabilities from the classification model to find the most likely sentence given the neural activity, automatically inserting spaces between decoded words.
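
A beam search of this kind keeps the few highest-scoring partial sentences at each step and extends each one with every candidate, rescoring as it goes. The sketch below shows that general idea with made-up letter probabilities; it is not the authors' decoder, and it omits the vocabulary constraints and automatic space insertion handled by the real system.

```python
import math

def beam_search(step_probs, beam_width=3):
    """Return the most likely letter sequence given per-step probabilities.

    step_probs: one dict per decoding step, mapping each candidate letter
    (decoded from its code word) to the classifier's predicted probability.
    """
    beams = [("", 0.0)]  # (letters so far, cumulative log-probability)
    for probs in step_probs:
        candidates = []
        for prefix, score in beams:
            for letter, p in probs.items():
                candidates.append((prefix + letter, score + math.log(p)))
        # Keep only the highest-scoring partial sequences.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]

# Toy example with made-up classifier outputs:
steps = [
    {"c": 0.7, "k": 0.2, "s": 0.1},
    {"a": 0.6, "e": 0.3, "o": 0.1},
    {"t": 0.8, "d": 0.2},
]
print(beam_search(steps))  # ('cat', ...)
```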

“To initiate spelling, the participant silently attempted to speak, and a speech-detection model identified this start signal directly from the ECoG activity,” Professor Chang explained.
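
The article gives no detail on the detection model itself; conceptually, it watches the incoming ECoG feature stream and flags the moment the signal starts to look like attempted speech. A minimal, purely illustrative stand-in, reduced here to thresholding a smoothed activity trace, might look like the following.

```python
import numpy as np

def detect_speech_onset(activity, threshold=2.0, window=10):
    """Return the first sample index where the smoothed activity trace
    crosses `threshold`, or None if no speech attempt is detected.

    `activity` stands in for a per-sample summary of ECoG high-gamma power;
    the real system used a trained detection model, so this simple threshold
    rule is only a conceptual placeholder."""
    kernel = np.ones(window) / window
    smoothed = np.convolve(activity, kernel, mode="same")
    above = np.flatnonzero(smoothed > threshold)
    return int(above[0]) if above.size else None

# Synthetic example: quiet baseline, then a burst of attempted-speech activity.
rng = np.random.default_rng(1)
trace = np.concatenate([rng.normal(0, 0.5, 200), rng.normal(3, 0.5, 100)])
print(detect_speech_onset(trace))  # roughly sample 200
```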

In addition to covering the lateral speech-motor cortical regions, the implanted ECoG array provided simultaneous access to neural populations in the cortical area typically associated with hand movements. After spelling out the intended sentence, the participant could therefore make a hand-motor movement to disengage the device.

“When the classification model identified this hand-motor command from ECoG activity, a large neural network-based language model rescored the potential sentence candidates from the beam search and finalized the sentence,” Professor Chang said.
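
Rescoring of this kind typically combines each candidate's neural-decoder score with a language-model score of how plausible the candidate is as an English sentence, then keeps the best combination. The sketch below illustrates that idea with a toy stand-in language model; the weighting, scoring function and candidate sentences are assumptions, not the large neural-network language model used in the study.

```python
import math

def rescore(candidates, language_model_logprob, lm_weight=1.0):
    """Pick the best sentence by combining each candidate's decoder score
    with a weighted language-model score.

    candidates: (sentence, decoder_log_prob) pairs from the beam search.
    language_model_logprob: callable returning a log-probability for a sentence.
    """
    def combined(item):
        sentence, decoder_score = item
        return decoder_score + lm_weight * language_model_logprob(sentence)
    return max(candidates, key=combined)[0]

# Toy stand-in "language model": favours sentences made of known English words.
KNOWN_WORDS = {"i", "am", "thirsty", "very"}
def toy_lm(sentence):
    words = sentence.split()
    hits = sum(w in KNOWN_WORDS for w in words)
    return math.log(1e-6 + hits / max(len(words), 1))

beam_output = [("i am thirsty", -4.1), ("i an thirsty", -4.0), ("i am thirty", -4.3)]
print(rescore(beam_output, toy_lm))  # 'i am thirsty'
```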

“Our approach is the first to combine the two cortical areas to control a BCI [brain-computer interface].”