From Conditioning Monkeys to Drug Addiction: Understanding Prediction and Reward

Frans de Waal; http://creativecommons.org/licenses/by/2.5/deed.en

Breakthroughs in cognitive neuroscience: Highlighting influential research from the past 20 years

This series will explore influential papers in cognitive neuroscience, as measured by the number of times they are cited each year. The papers featured are just a sampling of many important works in the field over the past 20 years.

Life for all organisms is a constant cycle of trying to anticipate where and when to find food, shelter, and a mate, all the while avoiding danger. How we predict these events and learn from them is vital to our and other animals’ success. For years, cognitive neuroscientists have been exploring these processes with an eye toward better understanding a variety of complex functions of the human brain, from decision-making to working memory. Their work got a big boost in the late 1990s, with the publication of two papers, long in the making, that matched behavior and physiology with a computer model to explain how we predict and respond to rewards.

The roots of the work lie in reinforcement learning theory – the computer science-based idea of engineering artificial systems that can use experience to adapt to their environment the way humans and other organisms do. In a 1996 paper published in the Journal of Neuroscience, P. Read Montague, Peter Dayan, and Terry Sejnowski explored how reinforcement learning could be applied to earlier work on learning among primates. A year later in Science, Wolfram Schultz, along with Dayan and Montague, published a review of that earlier work.

Together, the papers describe the role of the neurotransmitter dopamine in constructing information about possible rewarding events. In a series of experiments, researchers implanted electrodes into monkeys and recorded neuron firings in an area of the brain with many dopamine neurons. They gave the monkeys an unexpected reward – fruit juice – and watched the neurons fire. They then conditioned the monkeys with a tone or light associated with the fruit juice to learn to anticipate the reward.

When the monkeys were stimulated with the tone or light but no juice arrived, there was a spike in dopamine firing associated with the anticipation of the juice, followed by a decrease in dopamine neuron firing at the moment the juice failed to arrive. The monkeys knew both when the reward was expected and whether it had been obtained – thereby showing, for the first time, that dopamine neurons encode both the expectation of the reward and the response to it. The papers match these observations to a mathematical model that has become a paradigm for understanding reward prediction.
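The firing pattern described above can be illustrated with a small temporal-difference learning simulation, the kind of reinforcement learning rule these papers are associated with. What follows is a minimal sketch and not the authors' implementation: the function names, trial length, cue and reward times, learning rate, and discount factor are all assumptions chosen for illustration. It also assumes that time steps before the cue carry no prediction, since the monkey cannot know when the cue will come.

```python
# Illustrative temporal-difference (TD) sketch of a reward prediction error.
# All names and parameter values here are assumptions, not taken from the papers.

def run_trial(values, cue_t, reward_t, alpha=0.3, gamma=1.0,
              deliver_reward=True, learn=True):
    """Run one trial and return the prediction error at each time step.

    values[t] is the predicted future reward at time step t. Steps before
    the cue are never updated, so their prediction stays at zero.
    """
    n_steps = len(values)
    errors = [0.0] * n_steps
    for t in range(1, n_steps):
        reward = 1.0 if (deliver_reward and t == reward_t) else 0.0
        # Prediction error: what arrived plus what is now expected,
        # minus what was expected one step earlier.
        delta = reward + gamma * values[t] - values[t - 1]
        errors[t] = delta
        if learn and t - 1 >= cue_t:
            values[t - 1] += alpha * delta
    return errors


def simulate(n_steps=20, cue_t=5, reward_t=15, n_training_trials=500):
    """Return prediction errors before learning, after learning, and on omission."""
    values = [0.0] * n_steps
    # Before conditioning: the juice is unexpected.
    naive = run_trial(values, cue_t, reward_t, learn=False)
    # Conditioning: the cue is repeatedly followed by juice.
    for _ in range(n_training_trials):
        run_trial(values, cue_t, reward_t)
    # After conditioning: cue followed by juice, then cue with the juice withheld.
    trained = run_trial(values, cue_t, reward_t, learn=False)
    omitted = run_trial(values, cue_t, reward_t, deliver_reward=False, learn=False)
    return naive, trained, omitted


if __name__ == "__main__":
    naive, trained, omitted = simulate()
    print("before learning, error at reward:", round(naive[15], 3))
    print("after learning, error at cue:", round(trained[5], 3))
    print("after learning, error at reward:", round(trained[15], 3))
    print("reward omitted, error at reward time:", round(omitted[15], 3))
```

In this sketch the error signal plays the role of the dopamine response: before conditioning it is positive when the juice arrives, after conditioning it moves to the cue, and when the expected juice is withheld it turns negative at the time the juice was due.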

“Somehow we tripped and guessed a basically correct setting for understanding the, at the time, complicated changes in dopamine activity transients,” says Montague of the Virginia Tech Carilion Research Institute. “The mathematics behind this description has a natural connection to certain ways of modeling in classical decision sciences so we immediately had something that accounted in a cool way for changes in dopamine firing and could be used to understand how to make choices based on the signal.”

“For the non-scientist, I think this work has provided a plausible path from important issues like drug addiction all the way down to the neural substrate,” Montague says. The reward-prediction model explains how drugs become “over-valued” in the brain through the normal way that the brain learns to assign values to events in the world. “For example, the white powder of cocaine has no intrinsic value to the nervous system until someone takes it, perhaps repeatedly,” he explains. “However, its influence on dopamine signaling causes the entire system to learn to value this powder and behavioral settings that lead to it. This effect is explained quite well by the model, but in terms of malfunctioning computations.”

For the scientific community, the papers gained so much traction, Montague says, because they brought together years of work in computer science, the psychology of learning, and cognitive neuroscience. “So there were lots of stakeholders in diverse disciplines that at least ‘could’ be interested in the model.” Montague recalls, however, the long road to these papers, noting that the initial study on primates was rejected seven times before its 1996 publication.

“Over the last decade, reinforcement learning concepts have been highly influential in cognitive neuroscience,” says David Badre of Brown University, whose lab studies the neural systems supporting the cognitive control of memory and action. “The Schultz et al. (1997) paper, along with Montague, Dayan, and Sejnowski (1996), highlighted a striking correspondence between behavioral and physiological markers of reward-based learning and the reward prediction error signals used by artificial systems that rely on reinforcement learning. By drawing this link, these papers were among those that inspired a generation of new research investigating reinforcement learning theory in cognitive neuroscience.”

Two big developments in the last two decades enabled the work on prediction and reward to advance, Montague says. “The first is the meteoric growth, acceptance, and diversification of computational neuroscience – models in these domains are now taken more seriously and there are simply more people pursuing that approach,” he explains. “The second big development in my opinion is human neuroimaging and the possibility of testing reward prediction error models in healthy humans using fMRI.”

Looking to the future of this research, Montague says he would like to see healthy humans becoming an even better source of important neurobiological data for understanding cognition. “There will always be a need to use model organisms and certainly this has been very successful in the mouse models,” he says. “But there is always a stretch to understand how to connect behaviors in these organisms, and the biology that underwrites it, with the human analog.”

-Lisa M.P. Munoz

“A neural substrate of prediction and reward,” W. Schultz, P. Dayan, P.R. Montague, Science, March 14, 1997, 275(5306):1593-9.

“A Framework for Mesencephalic Dopamine Systems Based on Predictive Hebbian Learning,” P. Read Montague, Peter Dayan, and Terrence J. Sejnowski, Journal of Neuroscience, March 1, 1996, 16(5):1936-1947.

Media contact: Lisa M.P. Munoz, CNS Public Information Officer, cns.publicaffairs@gmail.com
