The 1st Sagol Fellow Minisymposium:
The School of Psychological Sciences & Sagol School of Neuroscience present:

Decision-making in the brain: consensus and controversies.

March 31st | Yaglom Hall, Tel Aviv University

Titles and abstracts

Yael Niv: Model-based predictions for dopamine

Phasic dopamine responses are thought to encode a prediction-error signal consistent with model-free reinforcement learning theories. However, a number of recent findings highlight the influence of model-based computations on dopamine responses, and suggest that dopamine prediction errors reflect more dimensions of an expected outcome than scalar reward value. In this talk I will focus on these challenges to the scalar prediction-error theory of dopamine, and to the strict dichotomy between model-based and model-free learning, suggesting that these may better be viewed as a set of intertwined computations rather than two alternative systems. Alas, phasic dopamine signals, until recently a beacon of computationally-interpretable brain activity, may not be as simple as we once hoped they were.

Nathaniel Daw: Revisiting dopamine and value estimation

The reward prediction error theory of midbrain dopamine function has been celebrated for offering a formal, if stylized, computational account that spans all the way from neural spiking to behavior. One of the traditional strengths of the model has been that the global, scalar error signal it describes seems well matched to the sweeping, diffuse projections of dopamine neurons and their apparently homogeneous phasic responses. However, I review recent evidence that now clearly demonstrates that the dopamine response is instead heterogeneous from target area to target area and even from neuron to neuron.

I revisit the role of the scalar error signal in temporal-difference learning, and lay out a pair of related computational proposals for how the model might accommodate heterogeneous prediction errors. I aim for a maximally simple and generic account, making few assumptions beyond the standard model and changing only its mapping onto the circuitry. The core insight is that in a realistic biological system the state input to the learning system is continuous and high-dimensional (unlike in most previous models). If this input is represented with a distributed feature code, then this population code may be inherited by the prediction error signal for which it serves as both input and target. These interactions between prediction error and a high-dimensional state space can explain many of the seemingly anomalous features of the heterogeneous dopamine response.

Dr. Peter Dayan: Savouring and its Modulation by Prediction Errors

Humans and animals apparently extract intrinsic value from anticipating, or savoring, impending rewards. Further, when these outcomes are uncertain, people typically prefer to know their fate in advance. We link these two phenomena through the suggestion that reward prediction errors occasioned by the revelation can boost the level of savoring. The result is a behavioral anomaly that has consequences for maladaptivity such as gambling. We formalize this proposal, and investigate its neurobiology in humans using fMRI. In a task involving delayed probabilistic rewards, we found that participants had a greater preference for advance information for greater delays and lower probabilities, consistent with the boosting hypothesis. Ventromedial prefrontal cortex (vmPFC) BOLD signals covaried with the time-varying anticipatory value signal predicted by the behavioral model. Reward prediction errors, encoded in midbrain BOLD, were coupled to vmPFC via hippocampus. We suggest that boosting might be driven by enhanced hippocampus-based imagination of future outcomes.