Author manuscript; available in PMC: 2017 Dec 6.
Published in final edited form as: Nature. 2016 Mar 23;531(7596):642–646. doi: 10.1038/nature17400

Extended Data Figure 3. Predictive validity of the logistic regression classifier.

a–c, The model was trained on two-thirds of the data and tested on the held-out one-third. The blue histogram indicates the chance distribution, determined by the model’s performance over a 1,000-fold shuffle of the held-out test data. The dashed line indicates cross-validation accuracy (CV) on held-out data. This calculation was performed for data from all rats (a; P < 0.001 by Monte Carlo simulation; CV is 24.3 s.d. outside the chance distribution), a balanced subset of data from risk-averse rats, such that approximately 50% of choices were safe and 50% were risky (b; P < 0.001 by Monte Carlo simulation; CV is 20.6 s.d. outside the chance distribution), and a balanced subset of data from risk-seeking rats (c; P < 0.001 by Monte Carlo simulation; CV is 8.5 s.d. outside the chance distribution). d–f, Receiver operating characteristic (ROC) curves derived from model performance on held-out test data across all rats (d; area under the curve (AUC) = 0.85), a balanced subset of data from risk-averse rats (e; AUC = 0.76), and a balanced subset of data from risk-seeking rats (f; AUC = 0.78). g, h, Histograms of run lengths for risk-averse rats (g) and risk-seeking rats (h). Blue bars indicate runs on the risky lever. Grey bars indicate runs on the safe lever. Insets show exceptionally long runs.
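The quantities reported in this legend can be illustrated with a minimal sketch. It is not the authors' analysis code. It assumes that the classifier's held-out predictions and scores are already available, and that the chance distribution is built by shuffling the held-out labels. The function names (`shuffle_test`, `roc_auc`, `run_length_histogram`), the +1 correction in the Monte Carlo P value, and the synthetic data in the usage section are all hypothetical choices made for illustration.

```python
import random
import statistics
from collections import Counter
from itertools import groupby


def accuracy(y_true, y_pred):
    """Fraction of trials where the predicted choice matches the actual choice."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def shuffle_test(y_true, y_pred, n_shuffles=1000, seed=0):
    """Compare held-out accuracy with a label-shuffled chance distribution.

    Returns (observed accuracy, distance from chance in s.d., Monte Carlo P).
    Assumption: chance is estimated by permuting the held-out labels.
    """
    rng = random.Random(seed)
    observed = accuracy(y_true, y_pred)
    shuffled = list(y_true)
    chance = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        chance.append(accuracy(shuffled, y_pred))
    z = (observed - statistics.fmean(chance)) / statistics.stdev(chance)
    # +1 correction (an assumption here) keeps the estimate above zero.
    p = (1 + sum(c >= observed for c in chance)) / (n_shuffles + 1)
    return observed, z, p


def roc_auc(y_true, scores):
    """AUC as the probability that a positive trial outscores a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def run_length_histogram(choices):
    """Count runs of consecutive identical choices, keyed by (choice, length)."""
    return Counter((c, sum(1 for _ in run)) for c, run in groupby(choices))


if __name__ == "__main__":
    # Synthetic stand-in data: balanced labels, predictions correct ~80% of the time.
    rng = random.Random(1)
    y_true = [i % 2 for i in range(300)]
    y_pred = [t if rng.random() < 0.8 else 1 - t for t in y_true]
    scores = [0.5 + (0.3 if p == 1 else -0.3) + rng.uniform(-0.2, 0.2) for p in y_pred]

    cv, z, p = shuffle_test(y_true, y_pred)
    print(f"CV accuracy = {cv:.3f}, {z:.1f} s.d. outside chance, P = {p:.4f}")
    print(f"AUC = {roc_auc(y_true, scores):.2f}")
    print(run_length_histogram(["risky", "risky", "safe", "risky", "safe", "safe"]))
```

With 1,000 shuffles, the smallest P value this estimator can return is 1/1001, which is consistent with reporting P < 0.001 when no shuffle matches the observed accuracy. The pairwise AUC is quadratic in the number of trials, which is acceptable for a sketch but would usually be replaced by a rank-based computation on large data sets.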