Learning to predict the outcomes of actions happens through two separate cognitive processes. Although the two are distinct, it is often difficult to tell which one an individual is using at any given moment. A new study in mice uses a novel experimental approach to untangle them, and pins down how a specific brain structure represents the various features involved in the decision-making process.
Predicting the outcomes of actions in order to make good decisions is a critical function of the brain. This process is thought to work through two fundamentally different mechanisms called “model-free” and “model-based” learning. Although model-based learning is fundamental for flexible and adaptive behaviour, its neurobiology remains poorly understood.
Now, in a study published in the scientific journal Neuron, scientists pin down a brain area crucial for this type of learning and demonstrate how its activity encodes multiple aspects of the decision-making process.
Disentangling cognitive schemes
“Model-free and model-based learning are distinct, but complementary”, says lead author Thomas Akam, a researcher at Oxford University, who worked on this study with Rui Costa, investigator at the Champalimaud Centre for the Unknown and now Director and CEO of Columbia’s Zuckerman Institute, and Peter Dayan, Director of the Max Planck Institute for Biological Cybernetics in Tübingen.
Whereas the model-based approach relies on understanding the underlying structure of the problem and creating a plan, for instance figuring out the best route to get to a new restaurant, the model-free approach allows you to act quickly with less mental effort in familiar situations.
“The model-free approach simply consists of opting for actions that gave a good result in the past. So in this example, after multiple visits to the restaurant, making the trip would become completely habitual, freeing your mind to focus on other things”, Costa adds.
According to the authors, we switch between these modes of acting all the time without even realising. For instance, if you find a closed road on your habitual route to the restaurant, you may quickly transition to the model-based approach to come up with an alternative.
“The two approaches often operate in parallel, which creates a challenge in studying the neural basis of model-based decision-making”, says Akam.
A custom-made puzzle
To isolate the contribution of these two cognitive schemes, the researchers developed a novel experimental task.
“We adapted a task that was originally developed for humans so that we could study brain mechanisms in mice”, says Akam.
Mice would initiate a trial by poking their noses into one of two central ports, located one above the other. This would light up one of two side ports where the mice could collect a water reward (one located to the left and the other to the right of the central ports).
To do the task well, the mice had to figure out two key variables. The first was which side-port was more likely to offer a reward. And the second was which of the central ports activated the more rewarding side port. Once the mice learned the task, they would opt for the action sequence that offered the best outcome.
Though this task may seem artificial, Akam points out that it captures certain important features of real-world decision-making. “Just like in real life, the subject has to perform extended sequences of actions, with uncertain consequences, in order to obtain desired outcomes”, he explains.
To promote flexible learning strategies, every now and then, one of two changes would happen. “One manipulation was to switch the mapping between the central and the side ports. The other was to change which of the side ports had a higher probability of giving reward”, Akam explains.
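The structure of the task can be made concrete with a small simulation. The sketch below is an illustration only, not the authors' code: the class name, the port labels and every probability are assumptions, since the article does not give the actual values. It also assumes that each central port usually, but not always, lights up the same side port.

```python
import random


class TwoStepTask:
    """Toy version of the task described above.

    All probabilities are illustrative assumptions, not values from the study.
    """

    def __init__(self, p_common=0.8, p_reward_good=0.8, p_reward_bad=0.2, seed=0):
        self.rng = random.Random(seed)
        self.p_common = p_common
        # Which side port each central port usually lights up.
        self.mapping = {"top": "left", "bottom": "right"}
        # Probability that each side port delivers water.
        self.reward_prob = {"left": p_reward_good, "right": p_reward_bad}

    def trial(self, choice):
        """Poke a central port; return the side port reached and whether it paid out."""
        common = self.mapping[choice]
        rare = "right" if common == "left" else "left"
        side = common if self.rng.random() < self.p_common else rare
        reward = self.rng.random() < self.reward_prob[side]
        return side, reward

    def switch_mapping(self):
        """First manipulation: swap which central port leads to which side port."""
        self.mapping = {"top": self.mapping["bottom"], "bottom": self.mapping["top"]}

    def switch_reward(self):
        """Second manipulation: swap which side port is more likely to be rewarded."""
        self.reward_prob = {
            "left": self.reward_prob["right"],
            "right": self.reward_prob["left"],
        }
```

Calling `switch_mapping()` or `switch_reward()` every so often during a simulated session reproduces the two kinds of change that the mice had to adapt to.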
How was this experimental approach useful for disentangling the different cognitive schemes? “In principle, the task can be solved by either model-free or model-based learning; mice could simply learn the model-free prediction ‘top is good’, or they could learn a model of the task ‘top leads to left, left to reward’”, Akam says. “However, these different strategies would generate different patterns of choices. By looking at the subjects’ behaviour we were able to assess the contribution of either approach.”
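One way the two strategies can come apart is sketched below. This is a textbook-style illustration rather than the analysis used in the study: the learning rate, the transition probability and the function names are all assumptions. It considers a single trial in which the mouse chooses the top port, unexpectedly ends up at the right port, and is rewarded there.

```python
ALPHA = 0.5     # learning rate (assumed)
P_COMMON = 0.8  # chance that a central port lights its usual side port (assumed)


def model_free_update(q, choice, reward, alpha=ALPHA):
    """'Top is good': nudge the chosen central port's value toward the reward."""
    q = dict(q)
    q[choice] += alpha * (reward - q[choice])
    return q


def model_based_values(side_value, mapping, p_common=P_COMMON):
    """'Top leads to left, left to reward': value each central port by
    the side ports it is expected to lead to."""
    other = {"left": "right", "right": "left"}
    return {
        centre: p_common * side_value[side] + (1 - p_common) * side_value[other[side]]
        for centre, side in mapping.items()
    }


mapping = {"top": "left", "bottom": "right"}

# The trial: chose top, rare transition to the right port, reward delivered.
q_mf = model_free_update({"top": 0.5, "bottom": 0.5}, "top", reward=1.0)

side_value = {"left": 0.5, "right": 0.5}
side_value["right"] += ALPHA * (1.0 - side_value["right"])
q_mb = model_based_values(side_value, mapping)

print("model-free prefers:", max(q_mf, key=q_mf.get))
print("model-based prefers:", max(q_mb, key=q_mb.get))
```

Under these assumptions the model-free learner becomes more likely to repeat "top", because that choice was followed by reward. The model-based learner instead shifts toward "bottom", the port that usually leads to the side where the reward was actually found. Opposite predictions of this kind are what allow choice patterns to reveal which strategy is in use.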
When the team analysed the results – about 230,000 individual decisions – they learned that the mice were using both approaches in parallel. “This confirmed that the task was suitable for studying the neural basis of these mechanisms”, Costa says. “We then moved on to the next step – investigating the neural basis of this behaviour.”
A neural map of model-based learning
The team focussed on a brain region called the anterior cingulate cortex (ACC). “Previous studies established that ACC is involved in action selection and provided some evidence that it could be involved in model-based predictions,” Costa explains. “But no one had checked the activity of individual ACC neurons in a task designed to differentiate between these different types of learning.”
Remarkably, the researchers discovered that the activity of the neurons created a map that represented various aspects of the behaviour of the mice. “By looking at the pattern of activity across the population we could decode very accurately where in the trial the subject was. For instance, if it was about to choose the bottom port, or was moving from the top to the right port, or receiving a reward on the left”, Akam recounts.
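The idea of decoding task state from population activity can be illustrated with a toy example. The sketch below is not the method used in the study: it applies a simple nearest-centroid decoder to synthetic firing rates, and the state names, neuron count and rate values are all invented for illustration.

```python
import random


def centroid(vectors):
    """Average population vector across a list of trials."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two population vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit(train):
    """Learn one average activity pattern per task state."""
    return {state: centroid(vectors) for state, vectors in train.items()}


def decode(centroids, vector):
    """Guess the task state whose average pattern is closest to this activity."""
    return min(centroids, key=lambda state: sq_dist(centroids[state], vector))


# Synthetic data: four hypothetical neurons with made-up mean rates per state.
rng = random.Random(0)
TRUE_RATES = {
    "choose_bottom": [8.0, 1.0, 1.0, 4.0],
    "top_to_right": [1.0, 9.0, 2.0, 4.0],
    "reward_left": [2.0, 1.0, 9.0, 4.0],
}


def sample(state):
    """One noisy population vector for the given state."""
    return [rng.gauss(mu, 1.0) for mu in TRUE_RATES[state]]


train = {state: [sample(state) for _ in range(50)] for state in TRUE_RATES}
held_out = [(state, sample(state)) for state in TRUE_RATES for _ in range(50)]

centroids = fit(train)
accuracy = sum(decode(centroids, v) == s for s, v in held_out) / len(held_out)
print(f"decoding accuracy on held-out trials: {accuracy:.2f}")
```

Because the invented activity patterns differ strongly between states, the decoder recovers the state on almost every held-out trial. Real recordings are far noisier, so this shows only the principle, not the accuracy reported by the authors.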
In addition to representing the animal’s current location in the task, ACC neurons also encoded which state was likely to come next. “This provided direct evidence that ACC is involved in making model-based predictions of the specific consequences of actions, not just whether they are good or bad”, says Costa.
Moreover, ACC neurons also represented whether the outcome of actions was expected or surprising, potentially providing a mechanism for updating predictions when they turn out to be wrong.
Finally, to test whether the ACC was needed for model-based decision-making, the team silenced ACC neurons on individual trials while the animals were deciding which option to choose. As a result, “mice failed to correctly update their strategy, suggesting that silencing ACC prevents the animals from using model-based predictions. Consistent with this interpretation, ACC silencing had a stronger effect on subjects who relied more on a model-based strategy”, Akam explains.
“These results were very exciting,” Costa points out. “These data identify the anterior cingulate cortex as a key brain region in model-based decision-making, more specifically in predicting what will happen in the world if we choose to do a particular action versus another.”
According to the authors, a big challenge in contemporary neuroscience is understanding how the brain controls complex behaviours such as planning and sequential decision-making. “Our study is one of the first to demonstrate that it is possible to study these aspects of decision-making in mice”, says Akam.
“These results will allow us and others to use the powerful tools for monitoring and manipulating brain activity available in this species to build mechanistic understanding of flexible decision making”, he concludes.