Dopamine Lab

Cue capture, omission dips, and reinforcement phenotypes

The dopamine module now teaches more than a single cueward shift. It frames temporal-difference learning as a way to compare blunted transfer, cue-dominant expectation, and brittle omission sensitivity without pretending this simple model is a literal disease simulator.

Teaching presets

Start from a learning phenotype, not just loose sliders

Each preset is a teaching lens for a different reinforcement story: clean transfer, cue capture, blunting, or brittle omission sensitivity.

Classical transfer

A clean teaching baseline where reward prediction error migrates from reward delivery toward the predictive cue over repeated trials.
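
The migration described above can be reproduced with a very small temporal-difference simulation. The sketch below is not the module's own code: the function name `simulate_td`, the five within-trial states, and the parameter values are assumptions chosen for illustration. It treats the cue as arriving unpredictably from a zero-value baseline, so the cue-time error equals the discounted value of the first post-cue state.

```python
def simulate_td(n_trials=60, n_states=5, alpha=0.2, gamma=1.0):
    """Minimal TD(0) sketch: one chain of states from cue onset to reward.

    All names and parameter values here are illustrative assumptions.
    """
    # values[k] is the learned value of state k; the extra final entry is
    # the terminal state after reward, which is never updated and stays 0.
    values = [0.0] * (n_states + 1)
    cue_errors, reward_errors = [], []
    for _ in range(n_trials):
        # The cue appears from a baseline whose value is 0, so the
        # prediction error at cue onset is gamma * V(first state).
        cue_errors.append(gamma * values[0])
        for k in range(n_states):
            reward = 1.0 if k == n_states - 1 else 0.0
            delta = reward + gamma * values[k + 1] - values[k]
            values[k] += alpha * delta
            if k == n_states - 1:
                reward_errors.append(delta)
    return cue_errors, reward_errors


cue_errors, reward_errors = simulate_td()
print(f"trial 1:  cue {cue_errors[0]:+.3f}, reward {reward_errors[0]:+.3f}")
print(f"trial 60: cue {cue_errors[-1]:+.3f}, reward {reward_errors[-1]:+.3f}")
```

On the first trial all of the error sits at reward delivery; by the last trial the reward is almost fully predicted and the error appears at the cue instead.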

Use this as the canonical reinforcement-learning scaffold before discussing disease, addiction, or motivational blunting.

This is a teaching baseline, not a literal patient phenotype.

Prediction error across a trial

Snapshot traces through learning

Markers: cue, reward
Traces: Novel reward, Early transfer, Late transfer, Reward omitted, Well learned

Anchor trials

Cue and reward peaks by checkpoint

Trials: T1, T12, T24, T28, T36
Series: Cue peak, Reward peak

Anchor trials let you compare the cue takeover directly against the omission trial dip instead of only reading one full trace.

Transfer balance

Cue takeover index

Cue error minus reward error; omission trial marked

The curve crosses zero around trial 11 and ends cue-dominant, which is the clean signature of transfer.

Learning phenotype

Brittle high-expectation learning

Volatile expectation, severe omission penalty

Cue value rises quickly, but the learned expectation is fragile: once the expected reward fails to appear, the negative prediction error is disproportionately deep.

Final cue response

0.454

How strongly the predictive cue now carries positive error.

Final reward response

+0.042

Positive means reward is still surprising; small or near-zero means value has shifted upstream.

Cue / reward ratio

9.08

A fast way to see whether the system is still reward-locked or already cue-dominant.

Shift trial

11

The first trial where cue response overtakes reward response.

Transfer index

+0.412

Positive values mean the cue has inherited more of the predictive burden.
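
As a worked check, the transfer index can be recomputed from the cue and reward peaks shown on the anchor cards, reading it as cue error minus reward error. The helper names below are hypothetical, and only the five anchor trials are available here, so the search for the first cue-dominant trial can only land on an anchor.

```python
# (cue peak, reward peak) per anchor trial, copied from the anchor cards.
anchors = {
    1: (0.000, 1.000),
    12: (0.132, 0.086),
    24: (0.344, 0.006),
    28: (0.397, -0.998),  # reward omitted on this trial
    36: (0.454, 0.042),
}


def transfer_index(cue_peak, reward_peak):
    """Cue error minus reward error; positive means cue-dominant."""
    return cue_peak - reward_peak


def first_cue_dominant(peaks_by_trial):
    """Earliest listed trial whose cue peak exceeds its reward peak."""
    for trial in sorted(peaks_by_trial):
        cue_peak, reward_peak = peaks_by_trial[trial]
        if cue_peak > reward_peak:
            return trial
    return None


print(f"{transfer_index(*anchors[36]):+.3f}")  # +0.412
print(first_cue_dominant(anchors))             # 12
```

The final value matches the +0.412 shown above. The first cue-dominant anchor is trial 12, which is consistent with a shift trial of 11 because trial 11 is not among the anchors.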

Omission dip

-0.998

How hard the system crashes when expected reward fails to appear.
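
The size of the dip follows directly from how much reward was expected. The sketch below is an illustration rather than the module's code: it assumes the value at reward time is updated by a plain delta rule, with a learning rate of 0.2 inferred from the trial 1 anchor card (reward value 0.200 after an error of +1.000) and the omission placed on trial 28 as on the cards. The function name is hypothetical.

```python
ALPHA = 0.2          # inferred from the trial 1 card, not stated on the page
OMISSION_TRIAL = 28  # the trial marked "Reward off" on the anchor cards


def reward_time_errors(n_trials=36, alpha=ALPHA, omission_trial=OMISSION_TRIAL):
    """Prediction error at reward time on each trial under a delta rule."""
    value = 0.0  # expected reward before any learning
    errors = {}
    for trial in range(1, n_trials + 1):
        reward = 0.0 if trial == omission_trial else 1.0
        delta = reward - value  # nothing follows the reward, so no bootstrap term
        errors[trial] = delta
        value += alpha * delta
    return errors


errors = reward_time_errors()
for trial in (1, 12, 24, 28, 36):
    print(trial, f"{errors[trial]:+.3f}")
# 1 +1.000
# 12 +0.086
# 24 +0.006
# 28 -0.998
# 36 +0.042
```

Under these assumptions the printed values agree with the reward peaks on the anchor cards: by trial 28 the expectation is about 0.998, so withholding the reward produces an error of about -0.998, and the small positive error on trial 36 reflects the expectation still recovering from the omission.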

Value function

Final trial expectation

Learning curve

Cue versus reward responses

Series: Cue, Reward; omission trial marked

Snapshot comparison

Trial-by-trial anchor cards

Snapshot        Trial  Reward  Cue peak  Reward peak  Cue value  Reward value
Novel reward    1      on       0.000    +1.000       0.004      0.200
Early transfer  12     on      +0.132    +0.086       0.152      0.931
Late transfer   24     on      +0.344    +0.006       0.364      0.995
Reward omitted  28     off     +0.397    -0.998       0.411      0.798
Well learned    36     on      +0.454    +0.042       0.467      0.966

Clinical lens

Volatile expectation

Helpful when teaching frustration sensitivity, brittle reward expectation, and the difference between strong prediction and stable control.

A temporal-difference learning model used as a teaching scaffold for dopamine-like reward-prediction error signals described by Wolfram Schultz and colleagues.

Behavioral readout

What a learner should notice

  • Predictive cues quickly dominate the response profile.
  • Omission produces a large negative dip because the learned reward expectation is near its ceiling when the reward fails to arrive.
  • Behavior would likely feel highly expectation-bound and abruptly disrupted by reward failure.

Differential traps

What this model should not make you overclaim

  • A large omission dip does not necessarily mean the model is healthier; it can mean the expectation is brittle.
  • Fast learning is not the same thing as stable learning.

Next questions

Useful follow-up experiments

  • Does lowering learning rate or increasing trace stability soften the omission penalty?
  • How much of the volatility is driven by discounting versus the learning rate itself?

Model notes

Four reminders for students

  • Unexpected reward produces a strong positive prediction error when the model has not yet assigned value to the cue.
  • With learning, value propagates backward toward the predictive cue, so positive error shifts earlier in time.
  • Once expectation is established, omitted reward generates a negative error around the expected reward time.
  • This model is deliberately explanatory rather than biologically exhaustive: it separates learning transfer, cue capture, and omission sensitivity without claiming to be a literal disease simulator.
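
The first three reminders can be read off the standard temporal-difference error. The symbols below follow the usual textbook notation and are an assumption here, since the page does not state its own: r_t is the reward at time t, V is the learned value, gamma is the discount factor, and alpha is the learning rate.

```latex
\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t),
\qquad
V(s_t) \leftarrow V(s_t) + \alpha\,\delta_t
```

With no learned value and a delivered reward, V(s_t) is near 0 and r_t = 1, so the error is near +1. Once V(s_t) is near 1 at the expected reward time, a delivered reward gives an error near 0, while an omitted reward (r_t = 0, nothing valuable afterward) gives an error near -1.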

Continue the loop

Use this with anatomy, plasticity, and tutoring

Brain Atlas

Post-clinical anatomical convergence

Basal Ganglia Loop Explorer

Movement-disorders circuitry

Synaptic Plasticity

Mechanistic learning theory

Neuro Tutor

Cross-module consult reasoning with explicit scoring