Information integration and endogenous control during exploration and exploitation
Nathan Tardiff¹, Sharon L Thompson-Schill¹; ¹University of Pennsylvania
Learning and decision-making in dynamic environments involve a fundamental challenge: whether to continue pursuing the current behavioral policy (exploitation) or to abandon it in favor of alternative and potentially more beneficial courses of action (exploration). To inform the decision to change control state between exploitation and exploration, learners must identify, track, and integrate relevant environmental variables, such as reward outcomes. Notably, both unexpected environmental changes and decisions to explore are associated with changes in arousal, which is thought to partly reflect activity of the locus coeruleus-norepinephrine (LC-NE) system. This suggests that brain regions involved in changing control state may be ones that integrate information about decision-relevant environmental variables with information about the current control and arousal state. Here we sought to identify such integrative regions. Subjects underwent concurrent fMRI and pupillometry (an indirect measure of LC-NE activity) while completing a bandit task. As expected, pupil diameter increased following both changes in outcome value and decisions to explore. We then used an fMRI conjunction analysis to identify brain regions that showed greater activation to exploration than to exploitation, showed greater activation to changes in outcome value than to no change, and were also modulated by continuous pupil diameter. This analysis implicated several regions in joint or interdigitated coding of all three variables, including cingulate and paracingulate cortices, anterior insula, inferior frontal junction, and intraparietal sulcus, as well as some visual areas. These results shed light on regions involved in endogenously motivated (as opposed to cued) changes in control state in dynamic environments.
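The logic of a three-way conjunction can be illustrated with a minimal sketch. The abstract does not specify how the conjunction was computed, so this assumes a minimum-statistic approach, in which a voxel survives only if all three statistical maps exceed a common threshold. The function name `conjunction_mask`, the toy z-values, the threshold of 3.1, and the treatment of pupil modulation as a positive effect are all hypothetical choices made for illustration, not details taken from the study.

```python
import numpy as np


def conjunction_mask(stat_maps, threshold):
    """Return a boolean mask of voxels where EVERY map exceeds `threshold`.

    Assumption: a minimum-statistic conjunction; the abstract does not
    state which conjunction method was actually used.
    """
    stacked = np.stack(stat_maps, axis=0)   # shape: (n_maps, n_voxels)
    min_stat = stacked.min(axis=0)          # weakest effect at each voxel
    return min_stat > threshold


# Hypothetical z-statistics for four voxels (illustrative values only).
explore_vs_exploit = np.array([3.5, 1.0, 4.2, 2.9])
change_vs_no_change = np.array([3.2, 3.8, 0.5, 3.3])
pupil_modulation = np.array([4.0, 3.6, 3.9, 3.1])

mask = conjunction_mask(
    [explore_vs_exploit, change_vs_no_change, pupil_modulation],
    threshold=3.1,  # hypothetical threshold
)
print(mask)
```

In this toy example only the first voxel passes, because it is the only one where the weakest of the three effects still exceeds the threshold. A real analysis would apply the same idea to whole-brain maps with an appropriate multiple-comparisons correction.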
Topic Area: THINKING: Decision making