Figure 3.

TD learning models can explain dopamine responses in contingency degradation with an appropriate ITI representation.
(a) Temporal Difference Zero, TD(0), model: The state representation determines the value. The reward, plus the gamma-discounted value of the future state, minus the value of the current state, determines the reward prediction error (dopamine). This error drives updates of the weights.
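In the standard TD(0) formulation with a linear value function (notation assumed here rather than given in the caption: learning rate \(\alpha\), feature vector \(x(s)\), weights \(w\)), the error and weight update read
\[
\delta_t = r_t + \gamma\, V(S_{t+1}) - V(S_t), \qquad V(s) = w^{\top} x(s), \qquad w \leftarrow w + \alpha\, \delta_t\, x(S_t).
\]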
(b) Belief-State Model: After the ISI, the animal is in the Wait state and transitions to the pre-transition ('Pre') state with a fixed probability p. The animal leaves the Pre state only after observing an odor or reward.
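As an illustration, a minimal simulation sketch of these ITI dynamics (not the authors' code; the value of p, the step duration, and the odor timing below are assumptions) is:

```python
import numpy as np

# Minimal sketch of the ITI dynamics in panel (b): after the ISI the agent
# occupies a 'Wait' state, moves to the pre-transition ('Pre') state with a
# fixed per-step probability p, and leaves 'Pre' only when an odor or reward
# is observed. p, the step grid, and the odor time are illustrative.

def iti_states(p, odor_step, rng):
    """Yield the hidden state occupied on each ITI step until the odor arrives."""
    state = "Wait"
    for _ in range(odor_step):
        if state == "Wait" and rng.random() < p:
            state = "Pre"          # one-way Wait -> Pre transition
        yield state                # 'Pre' persists until the odor observation

rng = np.random.default_rng(0)
trajectory = list(iti_states(p=0.1, odor_step=30, rng=rng))
print(trajectory)                  # e.g. ['Wait', 'Wait', ..., 'Pre', 'Pre']
```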
(c) State representations, from left: the Complete Serial Compound (CSC) with no ITI representation, the CSC with ITI states, the Cue-Context model, and the Belief-State model.
(d) Value during Odor A trials for each state representation, computed with TD(0), in the Conditioning and Degradation conditions.
(e) TD error, computed as the reward plus the difference between the discounted value of the next state and the value of the current state.
(f) Mean normalized TD error of the Odor A response from 25 simulated experiments. Error bars indicate SD.