Virtual reality conditioned place preference using monetary reward

Emma Childs; Robert S Astur; Harriet de Wit

doi:10.1016/j.bbr.2017.01.019

Author manuscript; available in PMC: 2018 Mar 30.

Published in final edited form as: Behav Brain Res. 2017 Jan 17;322(Pt A):110–114. doi: 10.1016/j.bbr.2017.01.019

PMCID: PMC5335910 NIHMSID: NIHMS845711 PMID: 28108321

Abstract

Computerized tasks based on conditioned place preference (CPP) methodology offer the opportunity to study learning mechanisms involved in conditioned reward in humans. In this study, we examined acquisition and extinction of a CPP for virtual environments associated with monetary reward ($).

Healthy men and women (N=57) completed a computerized CPP task in which they controlled an avatar within a virtual environment. On day 1, subjects completed 6 conditioning trials in which one room was paired with high $ and another with low $. Acquisition of place conditioning was assessed by measuring the time spent in each room during an exploration test of the virtual environments and using self-reported ratings of room liking and preference. Twenty-four hours later, retention and extinction of CPP were assessed during 4 successive exploration tests of the virtual environments.

Participants exhibited a place preference for (spent significantly more time in) the virtual room paired with high $ over the one paired with low $ (p=0.015). They also reported that they preferred the high $ room (p<0.001) and liked it significantly more than the low $ room (p<0.001). However, these preferences were short-lived: 24h later subjects did not exhibit a behavioral or subjective preference for the high $ room.

These findings show that individuals exhibit transient behavioral and subjective preferences for a virtual environment paired with monetary reward. Variations on this task may be useful to study mechanisms and brain substrates involved in conditioned reward and to examine the influence of drugs upon appetitive conditioning.

Keywords: Virtual reality, conditioned place preference, monetary reward

Introduction

Conditioned place preference (CPP) is a paradigm that has been used for many decades to study mechanisms of drug reward and aversion in rodents [1]. In the model, two distinct environments are paired separately with administration of drug or placebo. After several pairing (conditioning) sessions, rats are given the opportunity to explore both environments. With drugs known to produce pleasurable effects in humans, laboratory animals tend to exhibit a preference for (spend more time in) the environment paired with that drug. Conversely, with drugs known to produce disagreeable effects in humans, rodents tend to exhibit an aversion to (spend less time in) the environment paired with that drug. Although widely used, the utility of the model has been questioned, in part because it is not known what aspects of the drug stimulus or environment become conditioned, and in part because it is not clear how the model is related to drug use in humans. Recently, the CPP paradigm has been translated to humans [2–4] in an attempt to provide answers to some of these questions. However, the equivalent human model requires sufficient space (at least 2 separate rooms) and considerable time commitment (6–10 separate visits). Thus, investigators have sought to establish more convenient paradigms using computer tasks based on CPP methodology using virtual environments.

In the first published virtual reality CPP study [5], individuals exhibited a conditioned place preference for (i.e., spent more time in) a house associated with pleasant consonant music versus a house with static noise. Further, the investigators showed that individuals avoided a house associated with unpleasant dissonant music versus a house with static noise. Astur and colleagues [6, 7] went on to show that individuals exhibited a conditioned place preference for a virtual environment associated with food over one without food. They also showed that the strength of place preference was greater after food deprivation and greater in women who were dieting. Most recently, Astur and colleagues [8] extended their model to show that individuals exhibited a CPP for an environment associated with points, i.e., secondary reinforcers with no intrinsic value. Finally, Radell et al. [9] used a modified virtual reality CPP task with contingent delivery of secondary reinforcers (golden eggs) to show that individuals with a preference for familiarity over unpredictability exhibited a bias toward a room associated with a high probability of reward. Together, these studies have demonstrated that it is possible to establish preferences for virtual environments associated with music, food and secondary reinforcers. A prime utility of these virtual reality approaches, aside from their ease of implementation in the human laboratory, is that they may be conducted in the scanning environment (e.g., fMRI, PET) in order to study the brain substrates involved in learning a CPP.

In the current experiment, we aimed to establish appetitive conditioning using a virtual reality CPP task with monetary reinforcement. Money is a generalized conditioned reinforcer that has been used extensively in behavioral tasks and maintains a diverse range of behaviors [10]. As a reinforcer, money is a robust reward. It is relatively independent of deprivation state, is easily quantified, is scalable, and does not interfere with ongoing behavior. Thus, establishing a virtual task with monetary reinforcement will optimize the versatility of the virtual CPP paradigm. The first aim of the study was to establish a preference for a virtual environment by pairing it with delivery of high monetary reward. We hypothesized that individuals would exhibit a subjective preference for the high reward room, as measured by ratings of room liking and preference as well as an objective preference as measured by time spent in the room. The second aim of the experiment was to assess retention of the preference for an extended period during a return visit to the laboratory 24h later. At this visit, we also tested persistence of the preference across several unrewarded tests. We hypothesized that individuals would retain their subjective and objective preference for the room previously paired with high monetary reward at the second visit and that it would diminish across repeated tests without reward.

Methods

Design

Men and women (aged 18–45) participated in this two-site study conducted at the University of Chicago and University of Illinois at Chicago. The study was approved by the Institutional Review Committees at each site and all subjects gave written informed consent before participation. Subjects were told that the purpose of the study was to investigate how people learn relationships. Subjects completed two experimental sessions conducted on consecutive days, in the laboratory. At the first session on day 1, subjects completed a virtual reality conditioning program in which high and low monetary reward were associated with distinct virtual environments. At the second experimental session on day 2, subjects completed the virtual reality task to assess retention of the learned association.

Virtual Conditioning Task

The virtual reality conditioning task was presented to participants on a Dell Optiplex computer with a 15”×15” monitor and accessory speakers for auditory feedback. The task was developed locally, based on the design of Astur et al. [6]. The virtual environment consisted of two distinct rooms (A and B) connected by a hallway (Figure 1). The rooms were equally sized and contained a similar number and variety of immobile, non-interactive objects (television, sofa, table, and bookcase) but were visually distinct in terms of the color of the walls, carpet and sofa, the pictures on the walls, and the location and design of the furniture. The computer screen presented subjects with a first-person view of the virtual environments, and they navigated about the virtual space using the keyboard arrow keys and mouse.

The task consisted of a Conditioning phase and a Testing phase. During the conditioning phase, subjects completed six separate conditioning trials (2 min each). During each trial subjects were confined to just one of the rooms (each trial began with the subject already in the room); thus, 3 trials took place in each room (in counterbalanced order, ABBAAB or BAABBA). Subjects were told that their task was to move about the rooms and to collect balloons as they appeared, signaled by the on-screen text “a balloon has appeared”. They collected balloons by approaching the balloon and jumping beneath it to touch it. They were also told that they could earn money during the task, which would be paid to them at the end of testing that day, and that their earnings were dependent on interactions with the environment and would be displayed on a counter at the bottom of the screen. In fact, monetary rewards were delivered randomly (signaled by an auditory cue) in $0.15 increments regardless of subjects’ movements within the virtual rooms or collection of balloons. All subjects accumulated $14.85 throughout conditioning and were paid $15.00 in total at the end of testing on day 1. During the trials, monetary reward accrued at a greater rate in one of the rooms, designated the High Reward Room ($4.35 delivered during each conditioning trial, total=$13.05, approximate reward rate=$0.15 every 4 s), than in the other room, the Low Reward Room ($0.60 delivered during each conditioning trial, total=$1.80, approximate reward rate=$0.15 every 30 s). The High and Low Reward rooms were assigned in a counterbalanced order.

During the testing phase, subjects completed preference tests (1 min each) in which they could explore both rooms, moving freely between them. Subjects were told that they would have 1 min to explore both of the virtual rooms. Preference tests began in the hallway and subjects were instructed to enter the rooms, after which they could not re-enter the hallway but could move between the two rooms via an interconnecting doorway (see Figure 1). No monetary rewards were delivered during preference tests and no balloons were presented.
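The reward figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the task software: the variable and function names are hypothetical, and it assumes that deliveries were spread over the full 2-min trial, so the interval it reports is a mean rather than the actual (random) delivery timing.

```python
# Reward schedule implied by the task description (amounts in cents to avoid float rounding).
INCREMENT_CENTS = 15      # each delivery was $0.15
TRIAL_SECONDS = 120       # each conditioning trial lasted 2 min
TRIALS_PER_ROOM = 3       # six trials, three in each room

PER_TRIAL_CENTS = {"high": 435, "low": 60}   # $4.35 and $0.60 per trial


def schedule(per_trial_cents):
    """Return (deliveries per trial, mean seconds between deliveries, room total in cents)."""
    deliveries = per_trial_cents // INCREMENT_CENTS
    mean_interval = TRIAL_SECONDS / deliveries   # assumes deliveries span the whole trial
    room_total = per_trial_cents * TRIALS_PER_ROOM
    return deliveries, mean_interval, room_total


summary = {room: schedule(cents) for room, cents in PER_TRIAL_CENTS.items()}
grand_total_cents = sum(total for _, _, total in summary.values())

for room, (deliveries, interval, total) in summary.items():
    print(f"{room}: {deliveries} deliveries/trial, one every {interval:.1f} s on average, "
          f"${total / 100:.2f} total")
print(f"both rooms: ${grand_total_cents / 100:.2f}")
```

Under these assumptions the high reward room corresponds to 29 deliveries per trial and the low reward room to 4, which matches the stated totals of $13.05 and $1.80 and the combined $14.85.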

Procedure

Figure 1 shows a timeline of procedures conducted at each experimental session.

Day 1: At the first experimental session, subjects began with a 60-s practice trial (P). This practice trial took place within a gray virtual space with no distinguishing features and served to acclimate subjects to the controls and moving within the virtual environment. Balloons also appeared during the practice trial so that participants could practice collecting them. Subjects then completed the six 2-min conditioning trials (C). At the end of the last trial, they completed a questionnaire to rate how much they liked each room and which room they preferred (see Dependent Measures). Subjects then completed a 1-min preference test (T₀) followed by a second questionnaire to rate their liking of and preference for the rooms. They were then paid $15 and allowed to leave.

Day 2: Subjects returned to the lab 24 h later for the second experimental session, at which they completed four 1-min preference tests (T₁–T₄). After the first (T₁) and last (T₄) tests, subjects completed the questionnaire to rate their liking of and preference for the rooms. Subjects were then debriefed about the aims of the study and paid for their participation.

Dependent Measures

Time spent: The computer program recorded the amount of time spent in each of the rooms during preference tests.
Subjective room liking and preference: The questionnaire consisted of a picture of each room associated with a Likert scale from 0 (“Dislike very much”) to 9 (“Like very much”). Relative preference for the rooms was assessed by a Likert scale below side-by-side pictures of the rooms, from 5 on the left-hand side (“prefer room on the left”) through 0 (“prefer neither”) to 5 on the right-hand side (“prefer room on the right”).
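
As an illustration of how these two measures could be derived, the sketch below sums dwell time per room from a log of room entries and maps the two-sided preference scale onto a single signed score. The log format, the function names, and the sign convention (positive = high reward room) are assumptions made for this example; the paper does not describe the program's internal data format.

```python
def time_per_room(samples, test_end):
    """Sum dwell time per location from (timestamp_s, location) samples sorted by time.

    Each sample marks the moment the avatar enters `location`; the dwell lasts
    until the next sample, or until `test_end` for the final one.
    """
    totals = {}
    following = samples[1:] + [(test_end, None)]
    for (start, location), (stop, _) in zip(samples, following):
        totals[location] = totals.get(location, 0.0) + (stop - start)
    return totals


def signed_preference(rating, side, high_room_side):
    """Map the 5-0-5 scale onto -5..+5, with positive values favoring the high reward room."""
    if rating == 0:
        return 0
    return rating if side == high_room_side else -rating


# Hypothetical log for one 60-s preference test (illustrative values, not study data).
log = [(0.0, "A"), (12.5, "B"), (40.0, "A")]
print(time_per_room(log, test_end=60.0))
print(signed_preference(3, side="left", high_room_side="right"))
```

A signed score of this kind is what makes a one-sample test against zero meaningful for the preference ratings.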

Data Analysis

Demographic characteristics of individuals tested at each site were compared using independent-samples t-tests (for continuous variables) and chi-squared analysis (for categorical variables). Site was included as a variable in later analyses. Data were analyzed separately for the two experimental sessions, to assess acquisition of conditioning (Day 1) and then extinction of conditioning (Day 2).

Acquisition of Conditioning: Day 1 data were analyzed to assess whether subjects acquired place conditioning for the high reward-paired room. The amount of time spent in each room during the preference test (T₀) was compared using a paired samples t-test. Subjective liking of the two rooms after conditioning (C) and T₀ was compared using a two factor (Room×Time) repeated measures analysis of variance (rmANOVA). Room preference after conditioning (C) and T₀ was compared using a one factor (Time) rmANOVA. Significant main effects of Time and interactions with Room were followed up by separate tests at each time point (paired t-test for room liking scores, one sample t-test for preference scores). We also assessed the relationship between subjective liking of and preference for the rooms after conditioning (C) and time spent in the rooms during T₀ using correlation analysis.
Extinction of Conditioning: Day 2 data were analyzed to assess extinction of place conditioning. The amount of time spent in each room across successive preference tests (T₁-T₄) was compared using two factor (Room×Time) rmANOVA. Similarly, liking of the two rooms after T₁ and T₄ was compared using a two factor (Room×Time) rmANOVA. Preference for the high reward room after T₁ and T₄ was analyzed using a one factor (Time) rmANOVA. Significant main effects of Time and interactions with Room were followed up by separate tests at each time point (paired t-test for room liking scores, one sample t-test for preference scores).

Data were analyzed using SPSS Statistics 22 for Windows. Differences were considered significant at p<0.05.
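
The follow-up tests named above are simple enough to write out. The sketch below is a standard-library illustration of the paired-samples and one-sample t statistics; it is not the SPSS procedure used in the study, it does not cover the rmANOVA, and the numbers in the example are toy values rather than study data.

```python
import math
from statistics import mean, stdev


def one_sample_t(scores, mu=0.0):
    """Return (t, df) for a one-sample t-test of `scores` against `mu`."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    se = stdev(scores) / math.sqrt(n)   # standard error of the mean
    return (mean(scores) - mu) / se, n - 1


def paired_t(x, y):
    """Return (t, df) for a paired-samples t-test on equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("samples must have the same length")
    # A paired test is a one-sample test on the within-subject differences.
    return one_sample_t([a - b for a, b in zip(x, y)])


# Toy values for illustration only: seconds in each room for four made-up subjects.
high = [35.0, 30.0, 41.0, 28.0]
low = [25.0, 27.0, 19.0, 30.0]
t, df = paired_t(high, low)
print(f"t({df}) = {t:.2f}")
```

Writing the paired test as a one-sample test on differences also shows why the same degrees of freedom, N−1, apply to both kinds of follow-up test.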

Results

Most participants were female (57.9%) and had a college degree (71.9%); 36.8% were white, and the mean age was 23.9±0.6 years. The demographic characteristics of participants tested at each site are shown in Table 1. Individuals tested at the University of Chicago were younger [mean difference=3.4±1.2 years, t(55)=2.9, p=0.005] and consumed less alcohol per week [mean difference=2.7±1.1 drinks, t(55)=2.5, p=0.016] than those tested at the University of Illinois at Chicago. Site was included as a variable in all later analyses but did not significantly influence the findings.

Table 1.

Demographic characteristics of participants tested at each site (UChicago=University of Chicago, UIC=University of Illinois at Chicago).

Characteristic	Site 1 (UChicago)	Site 2 (UIC)
N (male/female)	31 (14/17)	26 (10/16)
Age (years)	22.4 ± 0.6**	25.8 ± 1.0
Race (% White/Black/Asian/Other)	32/23/26/19	42/15/31/12
Caffeine (cups/wk)	10.6 ± 2.4	6.5 ± 1.5
Tobacco (cigarettes/wk)	5.5 ± 2.0	1.4 ± 1.3
Alcohol (drinks/wk)	1.5 ± 0.5*	4.2 ± 1.0
Cannabis (times/mo)	5.0 ± 3.3	0.5 ± 0.4

Asterisks indicate a significant difference between sites (independent-samples t-test; *p<0.05, **p<0.01).

Thirty individuals were assigned to receive high monetary reward in the red room and 27 were assigned to receive high monetary reward in the blue room. Assignment of the high or low reward room (Red vs. Blue) was also included as a variable in all analyses but did not significantly influence the findings.

1) Acquisition of Place Preference

Individuals spent significantly more time in the high reward room during the first preference test after conditioning [t(56)=2.5, p=0.015; Figure 2]. There was a significant Room×Time interaction [F(1,56)=14.9, p<0.001] on ratings of room liking (Figure 3): after conditioning, participants reported that they liked the high reward room significantly more than the low reward room [t(56)=4.9, p<0.001]; however, this difference was smaller, though still significant, after just one preference test [t(56)=2.1, p=0.04]. There was also a significant effect of Time [F(1,56)=9.6, p=0.003] on ratings of preference for the high reward room (Figure 4): after conditioning, participants exhibited a significant preference for the high reward room [t(56)=3.8, p<0.001], but this was no longer evident after the first preference test [t(56)=1.7, p=0.1]. Ratings of subjective liking of and preference for the rooms after conditioning were nominally related to time spent in the rooms during T₀, but the correlations did not reach statistical significance (subjective preference for and time spent in the high reward room: r=0.24, p=0.072).
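
For readers who want a sense of effect size, the reported statistics can be converted with two standard formulas: d_z = t/√n for a paired or one-sample t, and t = r√(n−2)/√(1−r²) for a correlation. The sketch below applies them to the values reported above. It assumes n=57 for every test and a Pearson correlation, neither of which is stated explicitly for each analysis, and the effect sizes are derived here rather than reported by the authors.

```python
import math

N = 57  # participants; paired and one-sample tests then have df = N - 1 = 56


def cohens_dz(t, n):
    """Standardized mean difference for a paired or one-sample design: d_z = t / sqrt(n)."""
    return t / math.sqrt(n)


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing a Pearson correlation against zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


print(f"time spent at T0:   d_z = {cohens_dz(2.5, N):.2f}")
print(f"liking after C:     d_z = {cohens_dz(4.9, N):.2f}")
print(f"preference after C: d_z = {cohens_dz(3.8, N):.2f}")
print(f"r = 0.24:           t({N - 2}) = {t_from_r(0.24, N):.2f}")
```

On these assumptions the behavioral effect at T₀ corresponds to a d_z of roughly 0.33, noticeably smaller than the effects on the subjective ratings taken immediately after conditioning.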

Figure 2. Time spent in the high and low reward rooms during each 1-min preference test (T). Asterisks indicate a significant difference between the rooms (paired samples t-test, *p<0.05).

Figure 3. Subjective liking ratings for the high and low reward rooms after conditioning (C) and preference (T) tests. Asterisks indicate a significant difference between the rooms (paired samples t-test, *p<0.05, ***p<0.001).

Figure 4. Subjective ratings of preference for the high reward room after conditioning (C) and preference (T) tests. Asterisks indicate a significant difference from zero (one-sample t-test, ***p<0.001).

2) Extinction of Place Preference

Conditioning, as measured by time spent in each room (Figure 2), subjective room liking (Figure 3) and preference for the high reward-paired room (Figure 4), was not retained on Day 2: no main effects of Room or interactions with Time were significant (all ps>0.1).

Discussion

In this study, we established a behavioral and subjective preference for a virtual environment associated with high monetary reward over one paired with low monetary reward among healthy individuals. Notably, this preference was evident both on the measure of time spent exploring the virtual environment after conditioning, and on liking and preference ratings of the rooms. However, the conditioned preference for the high monetary reward room was short-lived and 24h later when subjects returned to the lab, they no longer exhibited the preference. These results demonstrate the feasibility of using monetary reward to establish a place preference similar to that established in rodents using drugs. The results were comparable at each of the two sites, demonstrating the reliability of the task. Overall, our findings demonstrate that this is a reliable task that produces a transient conditioning effect that may be applied to investigate mechanisms of conditioned reward.

Our results support and extend those of previous studies of virtual reality tasks based on CPP methodology. In particular, Astur and colleagues recently used a computerized CPP task to establish a preference for a virtual room associated with secondary reinforcers (i.e., points; [8]). As in that study, here we show that we can establish a CPP for a virtual room associated with a different secondary reinforcer, in this case money. One difference from the study of Astur et al. [8] is that we were able to establish a preference using a counterbalanced room assignment procedure, i.e., the High Reward room was assigned randomly to subjects. In their study, Astur et al. established a CPP only when using a biased room assignment procedure, i.e., the room associated with secondary reinforcers was the room that subjects spent least time in and liked the least at a pre-test before conditioning trials were conducted. When they used a counterbalanced room assignment procedure, subjects did not exhibit a CPP. A possible explanation for the difference in results between the two studies is that in our study we used money to establish the CPP, a secondary reinforcer with intrinsic value, whereas Astur et al. used points with no intrinsic value. Thus, the greater intrinsic value of the reinforcer used in our study may have established a stronger conditioning effect.

In this study we assessed retention and extinction of the CPP at a second visit conducted 24h after conditioning. Previously, Astur et al. [6] conducted a virtual reality CPP study with food reward in which conditioning and testing were performed on separate days. In their study, food-deprived subjects exhibited a preference for a food-paired virtual room 24h after conditioning was performed. In contrast, the results of our study indicate that the conditioning effect was relatively brief and subjects did not exhibit the CPP 24h later. One difference between the studies is that, in our study we conducted a preference test on day 1 (to examine acquisition of conditioning) while Astur et al. did not. Thus, conducting a single preference test on day 1 may have been sufficient to extinguish the preference. An alternative explanation could be that in our study the rooms were associated with high vs. low reward, whereas in the Astur et al. study the rooms were associated with reward vs. no reward. Thus, the difference in relative reward between the two rooms could also account for the more rapid extinction seen in our study. Future studies will aim to assess if the CPP can be retained for a longer period if conditioning is strengthened, either by a greater number of conditioning sessions or with greater reward during conditioning. In addition, it would be interesting to see if the current CPP could be reinstated by priming or by another means e.g. stress.

Limitations of the current study include that there was no pre-test before the conditioning trials took place, thus it is not possible to assess whether initial room preferences influenced the conditioning effect, or indeed whether the apparatus was biased i.e., whether a significant proportion of subjects preferred a given room before conditioning. Pre-existing biases are unlikely to have played a major role because the conditioning effect on day 1 was not influenced by the particular room (red vs. blue) that was paired with high monetary reward. Future studies should conduct a pre-test to resolve these questions. A second limitation is that it is not possible to distinguish whether the room preference was produced by increased frequency of reward delivery (independent of overall quantity) or the overall greater magnitude of reward received in that room. Future studies that deliver rewards of differing magnitudes but at the same frequency between rooms will be able to answer this question. Finally, the current design does not allow us to determine whether preferences were no longer apparent on day 2 because of the preference test on day 1, or simply because of the passage of time. We will be able to distinguish between these possibilities in future studies, for example by not conducting any preference tests on day 1, or by conducting multiple preference tests on day 1 to assess extinction right after conditioning.

Overall, our findings indicate that it is possible to establish a preference for a virtual environment associated with monetary reward. As highlighted by others [5, 6, 9], virtual reality CPP tasks provide the opportunity to assess the brain substrates involved in conditioned reward. Importantly, this task could also be used to assess the effects of pharmacological treatments on both the acquisition and extinction of appetitive conditioning.

Highlights.

We examined acquisition and extinction of a conditioned place preference for virtual environments associated with monetary reward in healthy young adults.
Participants exhibited a place preference for (spent significantly more time in) a virtual environment associated with high monetary reward over one paired with low monetary reward.
Participants reported that they preferred the room paired with high monetary reward and liked it significantly more than the room paired with low monetary reward.
The behavioral and subjective place preferences were transient; participants did not exhibit a place preference for the high reward room 24h later.

Acknowledgments

The authors thank Francisco Meyer, Jacob Seiden, Joseph Lutz, and Samantha Kiel for technical assistance. Funding for this study was provided by the National Institute on Drug Abuse (grant number DA02812; de Wit) and the National Institute on Alcohol Abuse and Alcoholism (grant number R01AA022961; Childs).

References

1. Tzschentke TM. Measuring reward with the conditioned place preference (CPP) paradigm: update of the last decade. Addict Biol. 2007;12(3–4):227–462. doi: 10.1111/j.1369-1600.2007.00070.x.
2. Childs E, de Wit H. Amphetamine-induced place preference in humans. Biol Psychiatry. 2009;65(10):900–904. doi: 10.1016/j.biopsych.2008.11.016.
3. Childs E, de Wit H. Contextual conditioning enhances the psychostimulant and incentive properties of d-amphetamine in humans. Addict Biol. 2013;18(6):985–992. doi: 10.1111/j.1369-1600.2011.00416.x.
4. Childs E, de Wit H. Alcohol-induced place conditioning in moderate social drinkers. Addiction. 2016. doi: 10.1111/add.13540.
5. Molet M, Billiet G, Bardo MT. Conditioned place preference and aversion for music in a virtual reality environment. Behav Processes. 2013;92:31–35. doi: 10.1016/j.beproc.2012.10.001.
6. Astur RS, Carew AW, Deaton BE. Conditioned place preferences in humans using virtual reality. Behav Brain Res. 2014;267:173–177. doi: 10.1016/j.bbr.2014.03.018.
7. Astur RS, et al. Pavlovian conditioning to food reward as a function of eating disorder risk. Behav Brain Res. 2015;291:277–282. doi: 10.1016/j.bbr.2015.05.016.
8. Astur RS, et al. Conditioned place preferences in humans using secondary reinforcers. Behav Brain Res. 2016;297:15–19. doi: 10.1016/j.bbr.2015.09.042.
9. Radell ML, et al. The Personality Trait of Intolerance to Uncertainty Affects Behavior in a Novel Computer-Based Conditioned Place Preference Task. Front Psychol. 2016;7:1175. doi: 10.3389/fpsyg.2016.01175.
10. Hackenberg TD. Token reinforcement: a review and analysis. J Exp Anal Behav. 2009;91(2):257–286. doi: 10.1901/jeab.2009.91-257.
