Words, like people, can achieve a lot more when they work together than when they stand on their own. Words working together make sentences, and sentences can express meanings that are unboundedly rich. How the human brain represents the meanings of sentences has been an unsolved problem in neuroscience, but my colleagues and I recently published work in the journal Cerebral Cortex that casts some light on the question. Here, my aim is to give a bigger-picture overview of what that work was about, and what it told us that we did not know before.
To measure people's brain activation, we used fMRI (functional Magnetic Resonance Imaging). When fMRI studies were first carried out, in the early 1990s, they mostly just asked which parts of the brain "light up," i.e. which brain areas are active when people perform a given task.
However, in the last decade or so, a different approach has been rapidly gaining in popularity and influence: instead of just asking which areas light up, neuroscientists try to do what is known as "neural decoding": we observe a pattern of brain activation, and then try to figure out what gave rise to it. As an analogy, consider walking through a forest and seeing an animal footprint in the mud. By looking at the pattern in the mud, i.e. the shape of the footprint, we might be able to figure out which animal made it. But in order to do that, we first need to learn what the footprints of different animals tend to look like, and, harder still, learn how to decode these footprints even when the mud is smudged or the imprint is faint.
Neural decoding is very similar. Instead of deducing which animal gave rise to a pattern in the mud, drawing on what we know about the footprints of familiar animals, we decode which stimuli (words and sentences, in this case) might have given rise to a given brain pattern, based on the patterns we've seen in the past from known words and sentences.
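To make the footprint-matching idea concrete, here is a minimal sketch in Python. It is only an illustration, not the method from our paper: the three words, the five-voxel activation values, and the function names `pearson` and `decode` are all made up for this example. The sketch stores one known pattern per word and decodes a smudged pattern by picking the stored pattern it correlates with most strongly.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation between two equal-length activation patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(dev_a, dev_b))
    den = sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return num / den

# Made-up "neural footprints": activation across five voxels for known words.
# These numbers are invented for illustration; they are not real fMRI data.
KNOWN_PATTERNS = {
    "beach":  [0.9, 0.1, 0.4, 0.0, 0.7],
    "soccer": [0.2, 0.8, 0.1, 0.6, 0.3],
    "family": [0.5, 0.4, 0.9, 0.2, 0.1],
}

def decode(observed):
    """Return the known word whose stored pattern best matches the observed one."""
    return max(KNOWN_PATTERNS, key=lambda w: pearson(observed, KNOWN_PATTERNS[w]))

# A smudged footprint: the "beach" pattern with some made-up noise added.
noisy_beach = [0.8, 0.2, 0.5, 0.1, 0.6]
print(decode(noisy_beach))  # prints: beach
```

Real brain scans involve many thousands of voxels and far more noise, so matching is much less clear-cut than in this toy case, but the underlying logic is the same.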
The new aspect of our study was that neural decoding had not previously been achieved at the level of entire sentences. To give a rough idea of why sentence-level decoding is difficult, let us return to the animal footprint analogy. Suppose the person inside the MRI scanner were just reading a single word at a time. This would be like seeing just one footprint in the mud, and trying to figure out which animal it came from. In contrast, when the person in the scanner is reading an entire sentence, brain activation patterns from several words are present at the same time. Decoding that is like having several different species of animal all run over the same patch of wet mud together, and then trying to identify as many of those animals as possible from the compound mass of tracks.
However, our study also went beyond that. We built a computer model that didn't only learn the "neural footprints" of specific words. The model also used information about different sensory, emotional, social and other aspects of the words, so that it could learn to predict brain patterns for new words, and also for new sentences made out of recombinations of the words. Extending our animal footprint metaphor, this would be as if we were trained to recognize the footprints of a deer and of a cow, and then we get confronted for the first time with a footprint that we have never seen before, e.g. that of a moose. If we have a model that tells us that a moose is a bit like a cow-sized deer, then that model can predict that a moose footprint will be a bit like a cow-footprint-sized deer-print. That prediction isn't exactly right, but it isn't far off either. It's good enough to do a lot better than a random guess.
Along similar lines, our computer model could predict the brain patterns for a new sentence that it had not been trained on, as long as it had been trained on enough of the words in that sentence in different contexts. For example, our model could predict the brain pattern for "The family played at the beach," using the patterns that it had been trained on for other sentences sharing some of the same words, such as "The young girl played soccer" and "The beach was empty."
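For readers who like to see the idea in code, here is a rough sketch of this kind of feature-based prediction. Everything in it is an assumption made for illustration rather than a description of our actual model: the three semantic features per word, their values, the four simulated voxels, the hidden `TRUE_MAP` that generates noise-free "scans", and the choice to represent a sentence as the average of its words' features. The sketch fits a linear map from features to voxels with ridge regression, then predicts the pattern for a sentence containing a word ("family") that never appeared in training.

```python
from math import sqrt

# Hypothetical semantic features per word: (people, action, place).
FEATURES = {
    "family": [1.0, 0.0, 0.2], "girl":   [0.9, 0.1, 0.0],
    "played": [0.2, 1.0, 0.1], "soccer": [0.1, 0.8, 0.3],
    "beach":  [0.0, 0.2, 1.0], "empty":  [0.0, 0.0, 0.6],
}

# Hidden "ground truth" used only to simulate brain data: 4 voxels x 3 features.
TRUE_MAP = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, -0.5, 0.3]]

def sentence_features(words):
    """Average the feature vectors of the words in a sentence (an assumption)."""
    return [sum(FEATURES[w][k] for w in words) / len(words) for k in range(3)]

def apply_map(weights, feats):
    """Activation of each voxel as a weighted sum of the features."""
    return [sum(w * f for w, f in zip(row, feats)) for row in weights]

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit(X, Y, ridge=1e-6):
    """Ridge regression: learn one weight vector per voxel from features."""
    d = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    weights = []
    for v in range(len(Y[0])):
        Xty = [sum(x[i] * y[v] for x, y in zip(X, Y)) for i in range(d)]
        weights.append(solve(XtX, Xty))
    return weights

def pearson(a, b):
    """Pearson correlation between two equal-length activation patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    dev_a, dev_b = [x - mean_a for x in a], [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(dev_a, dev_b))
    return num / sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))

# Toy training sentences (content words only); "family" is deliberately absent.
TRAIN = [["girl", "played", "soccer"], ["beach", "empty"],
         ["girl", "beach"], ["soccer", "empty"]]
X = [sentence_features(s) for s in TRAIN]
Y = [apply_map(TRUE_MAP, x) for x in X]  # simulated, noise-free "scans"
learned = fit(X, Y)

new_sentence = ["family", "played", "beach"]
predicted = apply_map(learned, sentence_features(new_sentence))
observed = apply_map(TRUE_MAP, sentence_features(new_sentence))

# Decode: which candidate sentence's predicted pattern best matches the scan?
candidates = [new_sentence, ["girl", "empty"]]
best = max(candidates,
           key=lambda s: pearson(observed, apply_map(learned, sentence_features(s))))
print(best)  # prints: ['family', 'played', 'beach']
```

Because the simulated data here are noise-free and generated by a linear map, the prediction is almost perfect. With real brain data the predictions are much rougher, in the spirit of the moose footprint: not exactly right, but good enough to beat a random guess.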
This process of using a computer model to extract information from brain data is, in many ways, the same as other types of technology that are becoming woven into our everyday lives. Computer models that extract meaningful information from large patterns of data are developed in the field of research known as "machine learning," also often referred to as "data science." When you point your phone camera at someone and it draws a box around their face, the phone is taking in lots of data (millions of pixels) and extracting the meaningful information of where the face is.
Voice-recognition software such as Siri takes in lots of data about rapidly changing air vibrations (speech sounds) and extracts words from them. Neural decoding takes in brain data in the form of three-dimensional pixels (called "voxels") that depict brain activation on fMRI scans, and extracts information from them. In our study, that information consisted of the meanings of words and sentences, which people were reading while their brains were being scanned.
To decode information about sets of words, we needed an interdisciplinary team of people. The study was led by Andy Anderson, a postdoctoral research fellow in my lab. Andy has expertise that spans all the required domains: computational models of the meanings of words, machine learning, and brain imaging. Another key member of the team was Dr. Jeffrey Binder of the Medical College of Wisconsin, a neurologist and world-renowned investigator of how the brain represents meaning.
But the full team was much larger than that: our paper has nine authors, all of whom played different and crucial roles, and the authors span six different nationalities, from both industry and academia. Our funding came in part from two different government agencies (the Intelligence Advanced Research Projects Activity and the National Science Foundation). Scientific advances these days often get made by large collaborative teams, made up of people from many different countries of origin.
Decoding sentences from the brain may well be intriguing, but why does it matter? There are two answers to this question. One answer is that the human brain literally makes us who we are, and language is one of the most fundamental aspects of human cognition.
Beyond the intrinsic scientific interest, such work may also one day have practical applications. Our study extracts meaning from people's brains, and there are many people with traumatic brain injuries who have meaning trapped in their heads that they are unable to express verbally, e.g. patients with damage to a brain region called Broca's area.
Our study also used computer models to represent meaning. Existing computer models work much better than they did just a few years ago, as can be seen from the success of systems such as Siri and Google Translate. But these existing models also have many imperfections, as can also be seen from those same computer systems, which all too often produce garbled output. By far the best representer of meaning in the world is the human brain. In seeking to understand how the brain achieves that, we might be able to make our computer models of meaning work better.
These practical pay-offs won't come tomorrow, or even next year. The question of how the brain represents meaning is extremely difficult, and our new paper, although an advance, leaves many problems still unaddressed. To tackle difficult problems, science needs to work with a long time horizon. Just as the words in a sentence work together to build a richer meaning, many individual scientific studies jointly help us better to understand our world.