Abstract
A fundamental implication of standard moral hazard models is overuse of low-value medical care because copays are lower than costs. In these models, the demand curve alone can be used to make welfare statements, a fact relied on by much empirical work. There is ample evidence, though, that people misuse care for a different reason: mistakes, or “behavioral hazard.” Much high-value care is underused even when patient costs are low, and some useless care is bought even when patients face the full cost. In the presence of behavioral hazard, welfare calculations using only the demand curve can be off by orders of magnitude or even be the wrong sign. We derive optimal copay formulas that incorporate both moral and behavioral hazard, providing a theoretical foundation for value-based insurance design and a way to interpret behavioral “nudges.” Once behavioral hazard is taken into account, health insurance can do more than just provide financial protection — it can also improve health care efficiency.
1. Introduction
Moral hazard is central to how we understand health insurance. Because the insured pay less for health care than it costs, they may overuse it (Arrow 1963; Pauly 1968; Zeckhauser 1970; Cutler and Zeckhauser 2000). In the standard moral hazard model, the demand curve alone is enough to quantify the inefficiency generated by insurance. We can draw welfare conclusions about changes in copays without measuring changes in health: if people optimize, health benefits equal copays at the margin. A large body of empirical work relies on this “sufficient statistic” property to make welfare calculations and policy recommendations, equating evidence of moral hazard with evidence of the price-sensitivity of demand for medical care (Feldstein 1973; Manning et al. 1987; Newhouse 1993; see Finkelstein, 2014 for a review). Yet, when it comes to health care choices, people may fail to optimize so perfectly. This paper develops a richer model of health insurance that allows people to make mistakes and implies that relying on demand data alone can lead to highly misleading welfare calculations.
Many patterns of health care use are hard to reconcile with a world in which moral hazard alone drives misutilization. Many patients underuse care with health benefits that substantially exceed costs (even accounting for possible side-effects or other non-monetary costs).1 Diabetes medications, for example, increase life span, reduce the risk of limb loss or blindness, and improve quality of life, but estimates of adherence are usually under 70% (DiMatteo 2004). There is similarly low adherence for medications that help manage other chronic conditions and for treatments such as prenatal and post-transplant care (van Dulmen et al. 2007; Osterberg and Blaschke 2005). Nor does moral hazard explain all overutilization: patients sometimes demand care that does not benefit them – or may even be harmful (Schwartz et al. 2014). For example, patients seek out antibiotics with clear risks and unclear benefits for ear infections (Spiro et al. 2006). It is hard to explain this kind of overuse solely by private benefits exceeding private costs.
This evidence is consistent with a simple narrative. People misuse care not just because the price is below the social marginal cost, but also because they make mistakes. We call this kind of misutilization behavioral hazard. Many psychologies can contribute to behavioral hazard. People may overweight salient symptoms (Bordalo, Gennaioli, and Shleifer 2012, 2013) such as back pain or underweight non-salient ones such as high blood pressure or high blood sugar (Osterberg and Blaschke 2005). They may be present-biased (Laibson 1997; O’Donoghue and Rabin 1999) and overweight the immediate costs of care, such as copays and hassle-costs of setting up appointments or filling prescriptions (Newhouse 2006). They may simply forget to take their medications or refill their prescriptions. Or they may have false beliefs about the efficacy of care (Pauly and Blavin 2008). Section 2 builds on Mullainathan, Schwartzstein and Congdon (2012) by introducing a model of behavioral hazard that nests such biases, as well as others within a broad class.
Behavioral hazard means that welfare calculations can no longer be made from demand data alone. Consider the “marginal” insurees—those who respond to a copay change. In the standard model, these consumers are trading off health benefits against the copay. Because they are optimizing, their indifference means these benefits equal the copay.2 But Section 3 shows that with behavioral hazard, this inference fails when insurees misvalue care. For example, we would not want to conclude falsely that diabetes medications are ineffective because a modest copay reduces adherence (e.g., based on Goldman et al.’s [2004] estimates), or that breast cancer patients place little value on conserving breast tissue because a modest copay induces them to switch from equally-effective breast-conserving lumpectomy to breast-removing mastectomy (e.g., based on Einav, Finkelstein, and Williams’s [2015] estimates).3 Behavioral hazard means that agents can be marginal in their choices even when health benefits far exceed the copay.
This is more than an abstract concern. First, we show that low-value and high-value care have surprisingly similar price elasticities. Second, we reexamine the results of a large-scale field experiment that eliminated some drug copays for recent heart attack victims and found large increases in drug use (Choudhry et al. 2011). Looking only at this demand response would suggest significant moral hazard and overuse of low-value drugs. But there were also substantial reductions in mortality and improvements in health. While traditional analysis would imply that eliminating drug copays led to a welfare cost, taking behavioral hazard into account implies a much larger welfare gain.
The fact that the demand curve is not a sufficient statistic also has implications for the optimal design of insurance. We show in Section 4 that the optimal copay formula now depends on both demand and health responses.4 This provides a formal foundation for “value-based insurance design” with lower cost-sharing for higher value care (Chernew, Rosen, and Fendrick 2007; Liebman and Zeckhauser 2008; and Chandra et al. 2010). Our model nests a more specific result of Pauly and Blavin (2008) that applies to the case of uninformed consumers. Perhaps surprisingly, we show that the health value of treatment should be taken into account even when behavioral hazard is unsystematic and averages to zero across the population, so long as it is variable. Once behavioral hazard is taken into account, health insurance does not just provide financial protection: it can also create incentives for more efficient treatment decisions.
Factoring in behavioral hazard can have a large effect. In Section 5 we compare the optimal copay when behavioral hazard is incorporated to that in the neo-classical model when it is not. The neo-classical model underestimates the optimal copay whenever behavioral hazard systematically drives people to overuse, and overestimates the optimal copay whenever behavioral hazard systematically drives people to underuse.5 In fact, we show that when behavioral hazard is extreme, the situations in which a neo-classical model generates particularly low copays are precisely those in which copays should be particularly high, and vice versa.
In addition to changing the calculus around optimal copays, our framework also has implications for the optimal use of nudges (such as defaults and reminders; Thaler and Sunstein 2009) to mitigate misuse or to calibrate the degree of behavioral hazard. Section 6 discusses this as well as other extensions of the basic analysis, including how we might estimate the degree of behavioral hazard when measuring health responses is difficult and what we might expect the market to deliver in equilibrium. While we focus on the patient side, clearly physicians also play an important role in determining the care that is ultimately received. We briefly discuss areas where combining patient and physician behavior into one framework could be fruitful (see also Frank [2004]). Section 7 concludes with a discussion of directions for future work.
2. A Model of Behavioral and Moral Hazard
2.1. Moral Hazard
We begin with a stylized model of health insurance. Consider an individual with wealth y. Insurance has price, or premium, P. When healthy, she has utility U(y − P) if she buys insurance. With probability q ∈ (0, 1), she can fall sick with a specific condition with varying degree of severity s that is her private information. For example, individuals may be afflicted with diabetes that varies in how much it debilitates. Assume s ~ F(s), where F has support on an interval $S = [\underline{s}, \bar{s}] \subset \mathbb{R}_+$ with $\underline{s} < \bar{s}$. Assume further that F(s) has strictly positive density f(s) on S. Severity is measured in monetary terms so that the sick agent receives utility U(y − P − s) absent treatment.
Treatment can lessen the impact of the disease. Treatment costs society c, and its benefit b(s; γ) depends on severity, where γ is a parameter that allows for heterogeneity across people in treatment benefits conditional on disease severity, and is also private information.6 The more severe the disease, the greater the benefits: b_s > 0. We assume b(0; γ) = 0 for all γ (the unaffected get no benefit) and b_s ≤ 1 (the treatment cannot make people better off than not having the disease). The benefits are put in monetary terms. It is efficient for some but not all of the sick to get treated: $b(\underline{s}; \gamma) < c < b(\bar{s}; \gamma)$ for all γ. We assume that the insured individual pays price or “copay” p for treatment. While the copay implicitly depends on the disease and treatment, it is independent of s and γ; we assume that both disease severity and treatment benefits cannot be contracted over because the insurer cannot perfectly measure them. The interpretation is that the copay is conditional on all information known to the insurer, but the individual may have some residual private information.7 In this way, we nest the traditional moral hazard model. An insured individual who receives treatment for her disease gets utility U(y − P − s + b(s; γ) − p).
We evaluate insurance contracts from the perspective of a benevolent social planner ranking contracts based on social welfare.8 Welfare as a function of the copay and the premium equals expected utility:
(1) $$W(p, P) = (1 - q)\,U(y - P) + q\,E\bigl[U\bigl(y - P - s + m(p)\,(b - p)\bigr)\bigr]$$
where m(p) ∈ {0, 1} represents an individual’s demand for care at a given price and equals 1 if and only if the person demands treatment. The first term is the utility if individuals do not get sick with a specific condition: they simply pay the premium. The second term is the utility if they do get sick: the expected utility (depending on disease severity and other stochastic parameters, described in more detail below) that includes the loss due to being sick (−s) as well as the benefits of care net of costs to individuals (b − p) for the times they choose to use care (m(p) = 1). We assume that insurance must be self-funding: P = P(p) = M(p)(c − p), where M(p) = qE[m(p)] equals the per-capita aggregate demand at a given copay. As a result, we can rewrite welfare solely as a function of the copay: with some abuse of notation, W(p) ≡ W(p, P(p)).
In this simple setup, the choice to receive treatment when insured is easy: the rational person gets treated whenever benefits exceed price, or b > p. This decision is the source of moral hazard. The value of insurance comes from setting the price below true cost, or p < c, but this subsidy distorts utilization: individuals should efficiently get treated whenever b > c (benefits exceed social costs), yet they get treated whenever b > p (benefits exceed private costs), generating inefficient utilization when c > b > p. Figure I provides an illustration, where individuals are arrayed on the line according to treatment benefits. Those to the right of the cost c should receive treatment and do so. Those to the left of the price p should not receive treatment and do not. The middle region represents the problem: those individuals should not receive treatment but they do. The price subsidy inherent to insurance is the source of misutilization: raising the price individuals face would diminish overutilization, but come at the cost of diminished insurance value.9
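The treatment rule and the break-even premium can be made concrete with a small numerical sketch. The Python below is illustrative only: the benefit grid, the cost of 60, the copay values, and q = 0.1 are hypothetical numbers, and aggregate demand is taken to be q times the share of the sick who are treated, which is an assumption of the sketch.

```python
def rational_demand(benefits, p):
    """Treatment indicators m(p) for rational patients: treat iff b > p."""
    return [1 if b > p else 0 for b in benefits]


def break_even_premium(benefits, p, c, q):
    """Premium P(p) = M(p) * (c - p), with M(p) = q * (share of the sick treated)."""
    m = rational_demand(benefits, p)
    aggregate_demand = q * sum(m) / len(m)
    return aggregate_demand * (c - p)


def moral_hazard_share(benefits, p, c):
    """Share of the sick who get treated even though c > b > p."""
    return sum(1 for b in benefits if p < b < c) / len(benefits)


# Hypothetical inputs: one sick person at each benefit level 0..99, c = 60, q = 0.1.
benefits = list(range(100))
for copay in (20, 60):
    print(copay,
          break_even_premium(benefits, copay, c=60, q=0.1),
          moral_hazard_share(benefits, copay, c=60))
```

In this sketch, setting the copay equal to cost removes the middle region of inefficient use, but the premium also falls to zero, so the individual bears the full cost of treatment when sick.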
2.2. Behavioral Hazard
There is, however, ample evidence of misutilization that is difficult to interpret as a rational person’s response to subsidized prices. We incorporate behavioral hazard through a simple modification of the original model. Instead of deciding by comparing true benefits to copays, evaluating whether b(s; γ) > p, people choose according to whether b(s; γ) + ε(s; θ) > p, where ε is positive in the case of positive behavioral hazard (for example, seeking an ineffective treatment for back pain) and negative in the case of negative behavioral hazard (for example, not adhering to effective diabetes treatment). The parameter θ allows for heterogeneity across people in the degree of behavioral hazard and is not observable to the insurer. We assume that b(s; γ) + ε(s; θ) is differentiable and strictly increasing in s for all (γ, θ). The parameters (γ, θ) are distributed independently from s, according to joint distribution G(γ, θ). We let Q(s, γ, θ) = F(s)G(γ, θ) denote the joint distribution of all the possibly stochastic parameters. All expectations are taken with respect to this distribution unless otherwise noted. When U is non-linear, it will be useful to consider a “normalized” version of the behavioral error, ε′,
which essentially puts ε in utility units. (Note that ε = ε′ for linear U, so we have the approximation ε ≈ ε′ if we take U to be approximately linear).
This formulation builds on Mullainathan, Schwartzstein and Congdon (2012) and implicitly captures a divide between preference as revealed by choice and utility as it is experienced, or between “decision utility” and “experienced utility” (Kahneman et al. 1997). In our framework, b − p affects the experienced utility of taking the action. Individuals instead choose as if b + ε − p affects this utility.
This framework nests behavioral models where people misbehave because of mistakes. What it is not designed to capture are models of non-standard preferences. For example, anticipation and anxiety may alter how individuals experience benefits (Koszegi 2003): benefits will vary depending on whether taking the action (such as getting an HIV test) leads to anxiety in anticipating the outcome. In these kinds of situations, the behavioral factor may not be a bias affecting ε, but rather a force that affects the mapping between outcomes (such as getting a diagnostic test) and benefits b.
Three examples of behavioral biases that our formulation nests are presented here and summarized in Table I: present-bias, symptom salience, and false beliefs.10
Table I:
| | Present-Bias | Symptom Salience | False Beliefs |
|---|---|---|---|
| Treatment Rule | −k(s; γ) + βθ · v(s) > p | b(αθv + μθn + o; γ) > p | b̂(s; γ, θ) > p |
| Expression for ε | −(1 − βθ)v(s) | b(αθv + μθn + o; γ) − b(s; γ) | b̂(s; γ, θ) − b(s; γ) |
Present-bias can be important because the benefits of medical care are often in the distant future while the costs appear now (Newhouse 2006). Take the canonical (β, δ) model of present-bias (Laibson 1997; O’Donoghue and Rabin 1999), where, for simplicity, δ = 1. Suppose each treatment is associated with an immediate cost but a delayed benefit. Specifically, b(s; γ) = −k(s; γ) + v(s), where k(s; γ) represents immediate costs, for example side effects, which can vary across the population even conditional on disease severity, and v(s) represents delayed benefits, which for simplicity are assumed to depend only on disease severity. The notation and language suggest that v > 0 and k > 0, but we also allow for v < 0 and k < 0, with benefits of treatment in the present and costs delayed. For example, taking a medication may lead to immediate benefits and more delayed side effects. While standard agents (for simplicity) are assumed not to discount future benefits, present-biased agents discount these benefits by factor β ∈ (0, 1). Instead of getting treated whenever b(s; γ) = −k(s; γ) + v(s) > p, present-biased agents get treated whenever −k(s; γ) + βθ · v(s) > p ⇔ b(s; γ) − (1 − βθ)v(s) > p, where here θ allows for heterogeneity in the degree to which people are present-biased. Defining εPB(s; θ) ≡ −(1 − βθ)v(s), the present-biased agent has a propensity to underuse treatment relative to what is privately optimal whenever εPB(s; θ) < 0 (corresponding to delayed treatment benefits, v > 0) and overuse whenever εPB(s; θ) > 0 (corresponding to delayed costs, v < 0).
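As a worked illustration of the present-bias case, the following Python sketch evaluates the standard and present-biased treatment rules side by side. The function names and all numerical values are hypothetical and chosen only to show the sign of εPB; they are not estimates from the literature.

```python
def eps_present_bias(v, beta):
    """Behavioral error for a present-biased agent: eps_PB = -(1 - beta) * v."""
    return -(1.0 - beta) * v


def treated_standard(k, v, p):
    """Standard agent: treat iff true benefit b = -k + v exceeds the copay."""
    return -k + v > p


def treated_present_biased(k, v, beta, p):
    """Present-biased agent: delayed component v is discounted by beta."""
    return -k + beta * v > p


# Delayed benefits (v > 0): immediate cost 5, delayed benefit 40, copay 20.
print(treated_standard(5, 40, 20))             # true benefit is 35
print(treated_present_biased(5, 40, 0.5, 20))  # decision benefit is 15
print(eps_present_bias(40, 0.5))               # negative error, underuse

# Delayed costs (v < 0): immediate benefit 50 (k = -50), delayed cost 40.
print(treated_standard(-50, -40, 20))             # true benefit is 10
print(treated_present_biased(-50, -40, 0.5, 20))  # decision benefit is 30
print(eps_present_bias(-40, 0.5))                 # positive error, overuse
```

In both cases the biased rule is equivalent to comparing b + εPB with the copay, which is the form used throughout the model.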
Symptom salience can be important. Individuals appear to overweight salient symptoms and underweight less salient ones (Osterberg and Blaschke 2005), driving overuse or underuse. For example, diabetics’ symptoms of elevated glucose levels are often not salient (Rubin 2005), and it is easy to undervalue the health benefits of taking a pill whose effects cumulate slowly over time. Patients at the symptomatic stages of HIV/AIDS are more likely to be adherent to their treatment regimens than patients at the asymptomatic stage (Gao et al. 2000). Most tuberculosis treatment regimens are at least six months long, but effective therapy leads to improved symptoms after the first four weeks and there is a concurrent drop-off in adherence. Pain, on the other hand, is clearly highly salient, and patients may overweight the current pain and seek expensive treatments with potential adverse effects in the future. Stories in the popular press highlight the role of symptom salience: a recent report noted the death of an uninsured patient with a tooth infection who was prescribed an antibiotic and a painkiller and who spent his limited resources to fill the painkiller prescription rather than the potentially life-saving antibiotic (Gann, ABC News 2011).
Economists in recent years have introduced rich models to study the impact of salience on behavior (e.g., Bordalo, Gennaioli and Shleifer 2012, 2013; Koszegi and Szeidl 2013). We use a modified version of DellaVigna’s (2009) empirical model of limited attention. Suppose the severity of symptoms is the sum of three components: the severity of highly visible or painful symptoms, v, the severity of opaque or non-painful symptoms, n, and other symptoms, o, or
(2) $$s = v + n + o.$$
The inattentive agent overweights the painful symptoms and underweights non-painful symptoms, so he acts not on true disease severity s, but on “decision severity”
(3) $$\hat{s} = \alpha_\theta v + \mu_\theta n + o,$$
where αθ ≥ 1 and μθ ≤ 1. The magnitudes |1 − αθ| and |1 − μθ| can be thought of as parameterizing the degree to which the agent misbehaves due to symptom salience, where he acts according to the standard model when αθ = μθ = 1. The person gets treated if
(4) $$b(\alpha_\theta v + \mu_\theta n + o;\ \gamma) > p.$$
Defining εSS(s; θ) = b(αθv + μθn + o; γ) − b(v + n + o; γ), where we assume the right-hand-side is constant in γ, the person has a propensity to underuse treatment relative to what is privately optimal whenever εSS(s; θ) < 0, where non-painful symptoms are sufficiently prominent (i.e., n > v(αθ − 1)/(1 − μθ) for μθ ≠ 1), and has a propensity to overuse treatment relative to what is privately optimal whenever εSS(s; θ) > 0, where painful symptoms are sufficiently prominent (i.e., v > n(1 − μθ)/(αθ − 1) for αθ ≠ 1).
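The symptom-salience error can be illustrated numerically. The sketch below is a minimal example under stated assumptions: the benefit function b(s) = 0.5·s is hypothetical (chosen so that benefits rise with severity at a slope no greater than 1), and the weights α = 1.5 and μ = 0.5 are arbitrary illustrative values.

```python
def decision_severity(v, n, o, alpha, mu):
    """Severity the inattentive agent acts on: alpha * v + mu * n + o."""
    return alpha * v + mu * n + o


def eps_symptom_salience(b, v, n, o, alpha, mu):
    """eps_SS = b(decision severity) - b(true severity), with s = v + n + o."""
    return b(decision_severity(v, n, o, alpha, mu)) - b(v + n + o)


def benefit(s):
    """Hypothetical benefit function used only for this illustration."""
    return 0.5 * s


# Mostly non-painful symptoms: n = 40 exceeds v * (alpha - 1) / (1 - mu) = 10.
print(eps_symptom_salience(benefit, v=10, n=40, o=0, alpha=1.5, mu=0.5))

# Mostly painful symptoms: v = 40 exceeds n * (1 - mu) / (alpha - 1) = 10.
print(eps_symptom_salience(benefit, v=40, n=10, o=0, alpha=1.5, mu=0.5))
```

The first call returns a negative error (a propensity to underuse) and the second a positive one (a propensity to overuse), matching the two threshold conditions stated in the text. Setting alpha = mu = 1 returns zero, the standard model.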
False beliefs can also play a role (e.g., Pauly and Blavin 2008).11 Tuberculosis patients may stop taking their antibiotics halfway through their drug regimen not just because salient symptoms have abated, but also because they believe the disease has disappeared. People may falsely attribute treatment benefits as well, such as when they buy an herbal medicine with no known efficacy.12 Instead of getting treated when b(s; γ) > p, agents with false beliefs get treated when b̂(s; γ, θ) > p, where b̂ is the decision benefit to getting treated. Defining εFB(s; θ) = b̂(s; γ, θ) − b(s; γ), which for simplicity we assume is constant in γ, the person with false beliefs has a propensity to underuse treatment whenever εFB(s; θ) < 0, where they undervalue treatment (b̂ < b), and has a propensity to overuse treatment whenever εFB(s; θ) > 0, where they overvalue treatment (b̂ > b).
2.3. Misutilization with Behavioral Hazard
No matter the psychological micro-foundation, behavioral hazard changes how we think about the demand for treatment. We illustrate this in Figure II. We have now added a second axis to form a square instead of a line, where the vertical axis represents b + ε, which can vary by individual. The horizontal line separates the region where b + ε > p, while the vertical line separates the region where b > c. We see the ranges of misutilization are no longer clear. The people in the bottom left corner (where b + ε < p and b < c) are efficient non-users. Those in the top right corner (where b + ε > p and b > c) are efficient users. But there are now three other regions.
The bottom right area is a region of underutilization. People fail to consume care in this region because b + ε < p, but the actual benefits exceed social cost. When there is behavioral hazard, underutilization is a concern, not just overutilization due to moral hazard. Examples such as the lack of adherence to drugs treating chronic conditions, like diabetes, hypertension, and high cholesterol, illustrate such underutilization, and Online Appendix Table 1 provides further examples and references.13
The top left area illustrates overutilization. In this area, benefits of care are below cost so b < c, and the efficient outcome is for the individual not to get treated. Yet because b + ε > p the behavioral agent receives care. This area can be broken down further, according to whether b + ε > c. When this inequality holds, decision benefits are above cost even though true benefits are below cost. In this case, overutilization will not be solved by setting price at true cost. Examples such as people demanding ineffective (or possibly harmful) antibiotics for sinus or ear infections, the overtreatment of prostate cancer, and the extremely high demand for MRIs for back pain may illustrate such overutilization. Finally, the area of overutilization when b + ε ≤ c illustrates traditional overutilization due to moral hazard.
Misutilization is not solely a consequence of health insurance when there is behavioral hazard. Underuse, not just overuse, is a concern, and overuse may not be eliminated by setting prices at true cost. We next turn to the implications of these findings for the interpretation of observed demand elasticities.
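The regions of Figure II can be summarized as a simple classification rule. The Python sketch below is illustrative: the region labels and the example numbers are ours, and ties (b + ε = p or b = c) are assigned to non-use and to inefficiency, respectively, as a convention of the sketch.

```python
def classify(b, eps, p, c):
    """Assign a person with true benefit b and behavioral error eps to a region."""
    uses = b + eps > p       # demand is driven by the decision benefit b + eps
    efficient = b > c        # efficiency depends on the true benefit b
    if uses and efficient:
        return "efficient use"
    if not uses and not efficient:
        return "efficient non-use"
    if efficient:
        return "underuse"
    # Remaining case: the person uses care although b <= c.
    if b + eps > c:
        return "overuse that persists at p = c"
    return "overuse due to moral hazard"


# Hypothetical values: copay p = 20 and cost c = 60.
for b, eps in [(80, 0), (10, 0), (80, -70), (30, 0), (30, 50)]:
    print(b, eps, classify(b, eps, p=20, c=60))
```

The last two cases show the distinction drawn above: the person with b = 30 and no error stops using care once the copay is raised to cost, while the person with ε = 50 does not.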
3. Moral Hazard Cannot be Inferred From the Demand Curve Alone
Behavioral hazard dramatically alters standard intuitions for how we think about the welfare impact of copay changes. Reducing a copay that is less than cost has two effects. First, it raises utility for people who are sick enough that they demand treatment, generating insurance value. Second, it may lead people to choose to consume more care. The welfare impact of this increase depends on the magnitude and direction of behavioral hazard.
For a simple illustration, assume people are risk-neutral and consider the effect of reducing the copay from cost (c) to zero in Panel (a) of Figure III, which compares the welfare impact of the change in utilization when there is only moral hazard to when there is also underuse from negative behavioral hazard. The dark grey area represents the standard deadweight loss triangle — the moral hazard cost of insurance. This area is positive because people who get treated only when the price is below marginal cost must have a willingness to pay below this cost. It is also greater the flatter the demand curve: more elastic demand means a greater moral hazard cost of insurance.
An often implicit assumption underlying the standard approach is that we can equate demand or willingness to pay with the true marginal benefit of treatment. Behavioral hazard drives a wedge between these objects. For example, Panel (a) illustrates the case where all people have a propensity to underuse because of negative behavioral hazard and share the same ε < 0. In this case, the marginal benefit curve lies above the demand curve and the vertical difference equals |ε|. When the magnitude of negative behavioral hazard (|ε|) is sufficiently large, the marginal benefit of treatment outweighs the marginal cost even when the copay equals zero. In this case, reducing the copay to zero no longer generates a welfare cost of increased utilization, but rather a welfare benefit equal to the light grey area in the figure. This area is greater the flatter is demand: more elastic demand now means a greater benefit of insurance.
Panel (b) illustrates the case where all people share the same ε > 0 and shows how overuse due to behavioral hazard has different implications than overuse due to moral hazard. In particular, consider raising the copay above cost. While absent behavioral hazard this would lead to the standard deadweight loss triangle equal to the dark grey area, with positive behavioral hazard it leads to a welfare gain equal to the light grey area. When people overuse due to behavioral hazard, failing to cover or even penalizing the use of treatments can be beneficial. We next formalize the intuitions from the graphical analysis.
3.1. Analysis
With behavioral hazard, the marginal person does not necessarily value treatment at the copay
Differentiate W with respect to p subject to the break-even constraint, and convert into a money metric by normalizing the increase in welfare by the welfare gain from increasing income by 1. The following proposition details the resulting formula:
Proposition 1. The welfare impact of a marginal copay change is given by
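(5) $$\frac{W'(p)}{\partial W / \partial y} = -M'(p)\,\bigl(c - p + \varepsilon_{\mathrm{avg}}(p)\bigr) - I(p)\,M(p),$$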
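where
$$I(p) = \frac{E[U'(C) \mid m = 1] - E[U'(C)]}{E[U'(C)]}$$
equals the insurance value to consumers (C = y − P − s + m · (b − p)), defined to equal 0 when M(p) = 0, and
$$\varepsilon_{\mathrm{avg}}(p) = E\bigl[\varepsilon'(s; \gamma, \theta) \mid b(s; \gamma) + \varepsilon(s; \theta) = p\bigr]$$
equals the average size of marginal behavioral hazard at copay p.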
Proof. All proofs are in Online Appendix B. ■
To interpret Proposition 1, first consider the standard model with just moral hazard, where εavg(p) = 0 for all p. In this case, the first term of (5), −M′(p)(c − p), represents the welfare gain from reducing moral hazard: it can be thought of as the number of people who are at the margin multiplied by the difference between the social cost and social value of their treatment—the marginal inefficiency—(c − p) > 0. Note that the sensitivity of demand, M′(p), is a sufficient statistic for measuring this gain, since the marginal social value is a known function of the copay when people are rational. The second term represents the reduction in insurance value for all treated individuals, where our assumptions guarantee that I(p) > 0 for all p > 0 when individuals are rational.
Behavioral hazard alters the first term because it changes who is at the margin: with behavioral hazard, the welfare impact of lower utilization equals −M′(p)(c − p + εavg(p)), which can be thought of as the number of people who are at the margin multiplied by the difference between the social cost and social value of their treatment, (c − (p − εavg(p))). As we saw in the graphical example above, the sign of this term becomes ambiguous. When behavioral hazard is on average positive at the margin, εavg(p) ≥ 0, this term is greater than with moral hazard alone: increasing the copay from an amount less than cost has an even greater benefit of decreasing overutilization. On the other hand, when behavioral hazard is on average negative at the margin, εavg(p) < 0, this term may be negative: increasing the copay can have the cost of increasing underutilization.14 15
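A short numerical sketch shows how the sign of the utilization term −M′(p)(c − p + εavg(p)) can flip. The Python below evaluates only this term and omits the insurance-value term; the demand slope, cost, copay, and values of εavg are hypothetical.

```python
def utilization_term(dM_dp, c, p, eps_avg):
    """Welfare effect of a marginal copay increase through utilization only:
    -M'(p) * (c - p + eps_avg(p)). The insurance-value term is omitted."""
    return -dM_dp * (c - p + eps_avg)


# Hypothetical values: demand slope M'(p) = -0.01, cost c = 60, copay p = 20.
slope, cost, copay = -0.01, 60.0, 20.0

print(utilization_term(slope, cost, copay, eps_avg=0.0))     # moral hazard only
print(utilization_term(slope, cost, copay, eps_avg=30.0))    # positive behavioral hazard
print(utilization_term(slope, cost, copay, eps_avg=-100.0))  # negative behavioral hazard
```

With no behavioral hazard the term is positive: raising the copay reduces overuse. Positive behavioral hazard makes it larger, and sufficiently negative behavioral hazard makes it negative, so that the same demand response signals a welfare loss from raising the copay.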
With behavioral hazard, demand responses do not measure the extent of moral hazard
Proposition 1 also formalizes the standard intuition that when there is merely moral hazard, the overall demand response is a powerful tool for measuring the welfare impact of the changes in utilization driven by copay changes. Indeed, −M′(p) · (c − p) is necessarily increasing in |M′(p)| when p < c. But it shows that with behavioral hazard, this composite response is harder to interpret: looking at demand responses alone may provide a misleading impression, since −M′(p)·(c − p+εavg(p)) is not necessarily increasing in |M′(p)|. A high response might indicate a great deal of moral hazard (and hence a cost of providing insurance), or could indicate a great deal of negative behavioral hazard or price-responsive underutilization (and hence an additional benefit to insurance).
In practice, researchers effectively ignore behavioral hazard by focusing on aggregate demand responses in calculating the welfare impact of copay changes. For example, researchers calculated a welfare loss of $291 per person from moral hazard in 1984 dollars based on evidence from the RAND Health Insurance Experiment suggesting a demand elasticity of roughly −.2 (Manning et al. 1987; Feldman and Dowd 1991). While recent economic research has questioned whether such a single elasticity can accurately summarize how people respond to changes in non-linear health insurance contracts (Aron-Dine, Einav, and Finkelstein 2013), there has been less emphasis on reexamining the basic assumption that the price sensitivity of demand meaningfully captures the degree of moral hazard. In a recent review article of developments in the study of moral hazard in health insurance since Arrow’s (1963) original article, Finkelstein (2014) equates evidence of moral hazard with evidence of the price sensitivity of demand for medical care. Our analysis suggests this can be misleading, as does a closer look at available evidence.
Table II summarizes evidence indicating that demand for “effective care” is often as elastic as demand for “ineffective care”. Analysis of the RAND Health Insurance Experiment found that cost-sharing induced the same 40% reduction in demand for beta blockers as it did for cold remedies, with reductions for drugs deemed “essential” on average quite similar to those for drugs deemed “less essential” (Lohr et al. 1986).16 Goldman et al. (2006) estimate that a $10 increase in copayments drives similar reductions in use of cholesterol-lowering medications among those with high risk (and thus presumably those with high health benefits) as those with much lower risk. A quasi-experimental study of the effects of small increases in copayments (rising from around $1 to around $8) among retirees in California by Chandra, Gruber, and McKnight (2010) suggests that HMO enrollees’ elasticity for “lifestyle drugs” such as cold remedies and acne medication is virtually the same as for acute care drugs such as anti-convulsants and critical disease management drugs such as beta blockers and statins, all clustered around −0.15 (unpublished details provided by authors). To take a particularly striking example, which we discuss in greater detail below, relatively small reductions in copayments even after an event as salient as a heart attack still produce improvements in adherence (Choudhry et al. 2011). The evidence strongly suggests that the degree of moral hazard cannot be inferred from aggregate demand responses.17
Table II:
Study | Price Change | Change in Use: Higher Value | Change in Use: Lower Value |
---|---|---|---|
Lohr et al. (1986) | Cost-sharing vs. none in RAND | 21% ↓ in use of highly effective care; 40% ↓ in beta blockers, 44% ↓ in insulin | 26% ↓ in less effective care; 6% ↓ in hayfever treatment, 40% ↓ in cold remedies, 31% ↓ in antacids |
Goldman et al. (2006) | $10 ↑ in copay (from $10 to $20) | Compliance with cholesterol meds among high risk ↓ from 62% to 53% | Compliance with cholesterol meds among low risk ↓ from 52% to 46%; medium ↓ from 59% to 49% |
Selby et al. (1996) | Introduction of $25–$35 ER copay | 9.6% ↓ in visits for emergency conditions | 21% ↓ in visits for non-emergency conditions |
Johnson et al. (1997) | ↑ from 50% coinsurance with $25 max to 70% coinsurance with $30 max | 40% ↓ in use of antiasthmatics; 61% ↓ in thyroid hormones | 40% ↓ in non-opiate analgesics; 22% ↓ in topical anti-inflammatories |
Tamblyn et al. (2001) | Introduction of 25% coinsurance, $100 deductible, $200–$750 max for Rx (elderly population) | 9.1% ↓ in essential drugs | 15.1% ↓ in non-essential drugs |
Chandra et al. (2010) | $7 ↑ in drug copay (from ~$1 to ~$8) | Elasticity of around .15 for acute care and chronic care Rx | Elasticity of around .15 for “lifestyle” Rx |
Sources: Authors’ summary of literature (see bibliography)
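As a rough check on the Goldman et al. (2006) row of Table II, the compliance figures can be converted into proportional declines. The sketch below uses only the numbers reported in the table; the function and variable names are illustrative.

```python
def proportional_decline(before: float, after: float) -> float:
    """Return the fractional drop from `before` to `after`."""
    return (before - after) / before


# Compliance rates (percent) before and after the $10 copay increase,
# as reported in the Goldman et al. (2006) row of Table II.
compliance = {
    "high risk": (62, 53),
    "medium risk": (59, 49),
    "low risk": (52, 46),
}

declines = {
    group: proportional_decline(before, after)
    for group, (before, after) in compliance.items()
}

for group, drop in declines.items():
    print(f"{group}: {drop:.1%} decline in compliance")
```

The proportional declines are of broadly similar size across risk groups, which is the pattern the text emphasizes.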
So how can we systematically distinguish between behavioral hazard and moral hazard? One method is to measure health responses.
With behavioral hazard, measuring health responses helps characterize who is at the margin
Let H(p) equal the aggregate level of health given copay p, which represents the expected value of disease severity post treatment decisions at copay level p in income-equivalent units. We have the following result:
Proposition 2. Consider a copay p at which demand is price-sensitive, so M′(p) < 0, and let U be linear. The welfare impact of a marginal copay change is
W′(p) = −M′(p) · (c − H′(p)/M′(p)).
Further, H′(p)/M′(p) = p if and only if εavg(p) = 0 and, more generally, εavg(p) = p − H′(p)/M′(p).
The first part of this proposition indicates that, all else equal, the welfare impact of a copay increase inversely depends on the marginal health value of care.18 This is true not only when there is behavioral hazard, but also in the rational model. Intuitively, a copay increase is less desirable when it discourages high-value care rather than low-value care. The second part clarifies why standard formulas for the welfare impact of copay changes are not expressed in terms of health responses: Absent behavioral hazard, we can equate the health response with the copay since being marginal reveals indifference. But it goes on to show that we cannot do this when there is the possibility of marginal behavioral hazard: Rather, we can infer the degree of marginal behavioral hazard from the deviation between the copay and the marginal health value of treatment.
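The second part of Proposition 2 can be turned into a small calculation. The sketch below takes estimates of the demand response M′(p) and the health response H′(p) as given and backs out the average marginal behavioral hazard, εavg(p) = p − H′(p)/M′(p). The function names and all numbers are hypothetical and serve only to illustrate the formula.

```python
def marginal_health_value(dH_dp: float, dM_dp: float) -> float:
    """Health value of the marginal treatment, H'(p) / M'(p)."""
    if dM_dp >= 0:
        raise ValueError("Proposition 2 requires price-sensitive demand, M'(p) < 0.")
    return dH_dp / dM_dp


def marginal_behavioral_hazard(p: float, dH_dp: float, dM_dp: float) -> float:
    """eps_avg(p) = p - H'(p) / M'(p), from the second part of Proposition 2."""
    return p - marginal_health_value(dH_dp, dM_dp)


# Hypothetical case: at a $10 copay, a $1 increase lowers the treated share by
# 0.02 and lowers aggregate health by $0.60 per person.
eps_underuse = marginal_behavioral_hazard(p=10.0, dH_dp=-0.60, dM_dp=-0.02)

# Rational benchmark: the marginal treatment is worth exactly the copay.
eps_rational = marginal_behavioral_hazard(p=10.0, dH_dp=-0.20, dM_dp=-0.02)

print(f"underuse case: eps_avg = {eps_underuse:.2f}")
print(f"rational case: eps_avg = {eps_rational:.2f}")
```

In the first case the marginal treatment is worth about $30 but is forgone at a $10 copay, so the implied behavioral hazard is negative. In the second case the health response matches the copay and the implied behavioral hazard is zero.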
In some of the cases described above, there are indications that the copay changes are associated with large health implications, providing further suggestive evidence for behavioral hazard in such cases. As summarized in Table III, the copay increase studied by Chandra et al. (2010) was associated with an increase in subsequent hospitalizations, and Hsu et al. (2006) similarly find that the imposition of a cap on Medicare drug benefits led to greater nonelective hospital use. Choudhry et al. (2011) find that providing post-heart attack medications for free is associated with a reduced rate of subsequent major vascular events to an extent that, as we will discuss below, is inconsistent with plausible parameters under the standard model.
Table III:
Study | Price Change | Use Change | Health Value [illustrative fact] |
---|---|---|---|
Chandra et al. (2010) | $7 ↑ in drug copay (from ~$1 to ~$8) | Elasticities: .15 for essential drugs; .23 for asthma meds, .12 for cholesterol meds, .22 for depression meds | Offsetting 6% ↑ in hospitalization |
Hsu et al. (2006) | Imposition of $1000 annual Rx cap | ↑ in non-adherence to antihypertensives, statins, diabetes drugs by ~30% | 13% ↑ in nonelective hospital use; 3% ↑ in high blood pressure (among hypertensives); 9% ↑ in high cholesterol (among hyperlipidemics); 16% ↓ in glycemic control (among diabetics) |
Lohr et al. (1986) | Cost-sharing vs. none in RAND | ↓ in use of insulin by 44%, beta blockers by 40%, antidepressants by 36% | [Consistent filling of diabetic med prescriptions ↓ hospitalization risk from 20–30% down to 13% (Sokol et al. 2005)] |
Selby et al. (1996) | Introduction of $25–$35 ER copay | 9.6% ↓ in visits for emergency conditions | Emergency conditions included coronary arrest, heart attack, appendicitis, respiratory failure, etc. |
Choudhry et al. (2011) | Elimination of Rx copays for post-heart attack patients | 4–6 percentage point ↑ in medication adherence | Rates of total major vascular events ↓ by 1.8 ppt, heart attacks by 1.1 ppt |
Sources: Authors’ summary of literature (see bibliography); additional unpublished detail provided by Chandra et al. Notes: Health value comes from same study when available. [“Illustrative facts” come from other studies.]
A challenge to using data on health responses to calibrate the degree of behavioral hazard is that the health response may be difficult to observe or to map to hedonic benefits. It may be possible to estimate how much a pill reduces mortality risk and translate this into (money-metric) utility; it may be more difficult to estimate the unpleasantness of side-effects or the inconvenience of treatment. In some instances, however, we may have enough information to confidently bound the unobservable component, in which case we can still say something about the sign and possibly the magnitude of behavioral hazard.19 This is more likely in the case of highly effective treatments with few side effects than in treatments with non-pecuniary costs that may be experienced quite differently across people (e.g., colonoscopies). Section 6 shows that good prior knowledge of the psychology underlying behavioral hazard can help estimate the marginal degree of behavioral hazard in the latter situations.
3.2. An Illustration
We illustrate the potential importance of taking behavioral hazard into account by further drawing on Choudhry et al.’s (2011) work on the effects of eliminating copays for recent heart attack victims.20 They randomly assigned patients discharged after heart attacks to a control group with usual coverage (with copayments in the $12-$20 range) or a treatment group with no copayments for statins, beta blockers, and ACE inhibitors (drugs of known efficacy), and tracked adherence rates and clinical outcomes over the next year. Faced with lower prices, consumers used more drugs: the full coverage group was significantly more adherent to their medications, using on average $106 more worth of cardiovascular-specific prescription drugs.
Under the moral hazard model, this fact alone tells us the health consequences of eliminating copays. Rational patients forgo only care with marginal value less than their out-of-pocket price. The average patient share under usual coverage in the Choudhry et al. (2011) data is about 25%, implying that the extra care consumed when copays are eliminated has a monetized health value of at most $.25 on the dollar. Given the $106 increase in spending, the moral hazard model then predicts a health impact of at most $106 · .25 = $26.50 per patient. This in turn implies a moral hazard welfare loss from eliminating copayments of at least $106(1 − .25) = $79.50 per person. In other words, the $106 increase in spending consists of at most $26.50 of health value plus at least $79.50 of excess utilization. This is the kind of exercise routinely performed with demand data.21
But Choudhry et al. (2011) collected data on health impacts, which we can use to gauge the performance of the moral hazard model by comparing the implied health benefits with the observed ones. The increase in prescription drug use was associated with significantly improved clinical outcomes: patients in the full coverage group had lower rates of vascular events (1.8 percentage points), myocardial infarction (1.1 percentage points), and death from cardiovascular causes (.3 percentage points). We apply the commonly used estimate of a $1 million value of a statistical life to the reduction in mortality to get a measure of the dollar value of health improvements.22 This implies that the .3 percentage point reduction in mortality from eliminating copays generates a value of $3,000 per patient (.003 × $1,000,000). This $3,000 improvement substantially exceeds the standard model’s prediction of $26.50, suggesting large negative behavioral hazard. Applying the traditional moral hazard calculus in this situation would imply that people place an unrealistically low valuation on their life and health.23
For welfare calculations, the theoretical analysis above highlights the need to use an estimate of the marginal private health benefit in the presence of behavioral hazard. As a rough back-of-the-envelope calculation, the $3,000 improvement in mortality minus the $106 increase in spending generates a surplus of $2,894 per person (a gross return of $28 per dollar spent). The presence of behavioral hazard thus reverses how we interpret the demand response to eliminating copayments: moral hazard implies a welfare loss, while behavioral hazard implies a gain that is over 30 times larger.24
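The back-of-the-envelope comparison above can be reproduced in a few lines. The sketch below uses only the figures quoted in the text ($106 spending increase, 25% patient share, .3 percentage point mortality reduction, $1 million value of a statistical life); the function name and dictionary keys are illustrative.

```python
def standard_vs_observed(spending_increase, patient_share, mortality_reduction, vsl):
    """Compare the moral hazard model's implied health value with the observed one."""
    # Under optimization, marginal care is worth at most the patient's share of cost.
    implied_health_value = spending_increase * patient_share
    # The remainder of the extra spending is counted as excess utilization.
    implied_welfare_loss = spending_increase * (1 - patient_share)
    # Observed health value: mortality reduction times the value of a statistical life.
    observed_health_value = mortality_reduction * vsl
    surplus = observed_health_value - spending_increase
    return {
        "implied_health_value": implied_health_value,
        "implied_welfare_loss": implied_welfare_loss,
        "observed_health_value": observed_health_value,
        "surplus": surplus,
        "gross_return_per_dollar": observed_health_value / spending_increase,
        "gain_relative_to_implied_loss": surplus / implied_welfare_loss,
    }


result = standard_vs_observed(
    spending_increase=106.0,
    patient_share=0.25,
    mortality_reduction=0.003,
    vsl=1_000_000,
)

for name, value in result.items():
    print(f"{name}: {value:,.2f}")
```

The last two entries correspond to the gross return of roughly $28 per dollar spent and to a welfare gain more than 30 times the size of the loss implied by the standard model.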
4. Implications for Optimal Copays
We have seen that behavioral hazard can influence whether changing copays from existing levels is good policy. This section describes some features of the optimal insurance plan when behavioral hazard is taken into account.
Consider again Equation (5), which gives us the welfare impact of a marginal copay increase. Setting this equal to zero yields a candidate for the optimal copay. To limit the number of cases, we focus attention on the standard situation where some but not all sick people are treated at the optimum: an optimal copay pB satisfies M′(pB) < 0 and M(pB) > 0. This is true under our assumptions, for example, when people are not too risk averse, i.e., when −U″/U′ is sufficiently small over the relevant range of C. For presentational simplicity, we also focus on the situation where the optimal copay is unique. Defining pmin = inf {p : M(p) < q} to equal the lowest copay where not every sick person demands treatment and pmax = sup{p : M(p) > 0} to equal the highest copay where some sick person demands treatment, we assume the following.
Assumption 1. The optimal copay is unique and satisfies pB ∈ (pmin, pmax).
Proposition 3. Assuming pB ≠ 0, the optimal copay satisfies
(c − pB)/pB = I/η − εavg/pB,    (6)
where η = −M′(p)p/M(p) equals the elasticity of demand for treatment, I the insurance value, and εavg the average size of marginal behavioral hazard, all evaluated at pB.
Proposition 3 expresses the optimal copay in terms of reduced-form elasticities as well as the degree of behavioral hazard and the curvature of the utility function. It says that, fixing insurance value and the cost of treatment, the optimal copay is increasing in the demand elasticity and the degree to which behavioral hazard is positive. This simple formula illustrates a number of ways in which behavioral hazard fundamentally changes how we think about optimal copays.
Optimal copays can substantially deviate from cost even when coverage generates little or no insurance value
A simple implication of Equation (6) is that health “insurance” can provide more than financial protection: it can also improve healthcare efficiency. Even when individuals are risk-neutral and there is no value to financial insurance (I = 0), Equation (6) indicates that the optimal copay can differ from cost to provide insurees with incentives for more efficient utilization decisions. In fact, when consumers are risk-neutral, the extent of behavioral hazard (at the margin) fully determines the optimal copay. In this case, the optimal copay formula reduces to pB = c+εavg(pB): the optimal copay acts like a Pigouvian tax to induce marginal insurees to fully internalize their “internality”. Unlike in the standard model, there is no clear incentive-insurance tradeoff.
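A minimal sketch of how such a copay rule could be evaluated, assuming Equation (6) takes the form (c − pB)/pB = I/η − εavg/pB, so that pB = (c + εavg)/(1 + I/η). This form is an assumption chosen to match the two special cases stated in the text: pB = c + εavg when I = 0, and the standard elasticity rule when εavg = 0. The sketch also treats I, η, and εavg as fixed numbers, whereas in the model they are evaluated at pB, so it should be read as one step of what is in general a fixed-point problem. The function name and inputs are hypothetical.

```python
def optimal_copay(c, eps_avg, insurance_value=0.0, elasticity=1.0):
    """Solve (c - p)/p = I/eta - eps_avg/p for p, holding I, eta, eps_avg fixed."""
    if elasticity <= 0:
        raise ValueError("The demand elasticity must be positive.")
    return (c + eps_avg) / (1.0 + insurance_value / elasticity)


cost = 100.0

# Risk-neutral insurees (I = 0): the copay is cost plus the marginal internality.
risk_neutral_underuse = optimal_copay(cost, eps_avg=-20.0)
risk_neutral_overuse = optimal_copay(cost, eps_avg=30.0)

# No behavioral hazard: the standard trade-off between insurance and incentives.
standard_model = optimal_copay(cost, eps_avg=0.0, insurance_value=0.5, elasticity=0.5)

# Both forces at once: negative behavioral hazard pushes the copay lower still.
combined = optimal_copay(cost, eps_avg=-20.0, insurance_value=0.5, elasticity=0.5)

print(risk_neutral_underuse, risk_neutral_overuse, standard_model, combined)
```

With no insurance value the copay moves one-for-one with the marginal behavioral hazard, which is the Pigouvian logic described above.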
Optimal copays can be extreme: It can be optimal to fully cover treatments that are ineffective for some insurees or not to cover treatments that benefit insurees
A related implication is that optimal copays can be more extreme than in a model with only moral hazard. Absent behavioral hazard, the optimal copay lies strictly between the value that provides full insurance (i.e., the value that makes I(p) = 0) and cost when insurees are risk averse and demand is elastic. Intuitively, without behavioral hazard, slightly raising the copay from the amount that provides full insurance has only a second order cost through reducing insurance value but a first order benefit through controlling moral hazard; slightly reducing the copay from cost has a second order cost through inducing moral hazard but a first order benefit through increasing insurance value. In the standard model, it cannot be optimal to deny coverage of treatments that benefit some risk averse individuals and it cannot be optimal to fully cover or subsidize treatments when people are price-sensitive at the full coverage copay.
Behavioral hazard alters these prescriptions. When behavioral hazard is sufficiently positive, the optimal copay can be above cost even when the individual is risk-averse: it can be good to let insurers discriminate against certain treatments, as suggested by Panel (b) of Figure III. When behavioral hazard is sufficiently negative, the optimal copay can be below the level that provides full financial protection, even if demand is price-sensitive at this copay: paying people to get treated can be optimal, as illustrated in Panel (a) of Figure III. In this spirit, some insurers have begun to experiment with paying patients to take their medications (Belluck 2010; Volpp et al. 2009).
Optimal copays depend on health value, not just demand elasticities
Optimal copays likely vary more across treatments than in a model with only moral hazard. The standard model says that, fixing insurance value, copays should be higher the larger the cost and elasticity of demand (Zeckhauser 1970), as can be seen from plugging εavg = 0 into Equation (6). That model suggests, for example, that copays should be lower for emergency care (where demand is less elastic) than for regular doctor’s office visits (where it is presumably more price sensitive). However, it also leads to some counterintuitive prescriptions: It suggests that copays should be similar across broad categories of drugs with similar price elasticities, even if they have very different efficacies.
Behavioral hazard alters these prescriptions as well. To see this, make the approximation ε(s; θ) ≈ ε′(s; θ) ∀ (s, θ) and plug εavg(p) ≈ p − H′(p)/M′(p) (Proposition 2 establishes that the second approximation follows from the first) into (6), yielding
(c − pB)/pB ≈ I/η + (|H′(pB)|/(pB|M′(pB)|) − 1).    (7)
From Equation (7), all else equal copays should be decreasing in the net return to the last private dollar spent on treatment, |H′(p)|/(p|M′(p)|) − 1, so the value of treatment now enters into the determination of the optimal copay insofar as it influences H′(p). For a given demand response to copays, copays should be lower when this demand response has greater adverse effects on health.
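To make the role of the health response concrete, the sketch below computes the net return to the last private dollar spent on treatment, |H′(p)|/(p|M′(p)|) − 1, for two hypothetical treatments that have the same demand response but different health responses. All numbers are illustrative assumptions, not estimates from the studies cited.

```python
def net_return_last_private_dollar(p, dH_dp, dM_dp):
    """|H'(p)| / (p * |M'(p)|) - 1, the net return that enters Equation (7)."""
    if p <= 0:
        raise ValueError("The copay must be positive.")
    if dM_dp == 0:
        raise ValueError("Demand must be price-sensitive, M'(p) != 0.")
    return abs(dH_dp) / (p * abs(dM_dp)) - 1.0


copay = 10.0
demand_response = -0.02  # identical for both hypothetical treatments

# Treatment A: marginal care is worth about what patients pay for it.
net_return_a = net_return_last_private_dollar(copay, dH_dp=-0.20, dM_dp=demand_response)

# Treatment B: marginal care is worth three times what patients pay for it.
net_return_b = net_return_last_private_dollar(copay, dH_dp=-0.60, dM_dp=demand_response)

print(f"treatment A net return: {net_return_a:.2f}")
print(f"treatment B net return: {net_return_b:.2f}")
```

A rule based on demand elasticities alone would treat the two treatments identically. Holding everything else fixed, the treatment with the larger net return warrants the lower copay.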
This connects to value-based insurance design proposals (Chernew, Rosen and Fendrick 2007) where, all else equal, cost sharing should be lower for higher value care. While the marginal rather than the average value of care appears in Equation (7), knowledge of the average health value of care can provide a useful signal about the marginal health value. Consider a case where the demand curve slopes down only because of behavioral hazard: Var(ε) > 0, but Var(b) = 0. Then the marginal individual at any copay where demand is price-sensitive must have a marginal health value equal to the average value b, which also can be expressed as (H(pmin)−H(pmax))/(M(pmin)−M(pmax)). (Recall that pmin equals the lowest copay where some of the sick do not demand treatment and pmax equals the largest copay where some people still demand treatment.) Generalizing this example to allow for heterogeneity in private benefits in addition to heterogeneity in behavioral hazard yields the following result.
Proposition 4. Assume U is linear, M′(c) ≠ 0, and the distribution Q(s, θ, γ) is such that b(s; γ) and ε(s; θ) are independently distributed according to symmetric and quasiconcave densities with Var(ε) > 0.
pB > c if and .
pB = c if and and .
pB < c if and and .
This shows that with behavioral hazard, the average value of care provides a useful signal for the optimal copay. So long as there is some variability in behavioral hazard across people and behavioral hazard does not systematically push people to privately overuse high-value treatments or privately underuse low-value treatments, then the optimal copay is above cost whenever the treatment is not socially beneficial on average and is below cost whenever the treatment is socially beneficial on average. Take the case where . The average value of care signals the expected direction of behavioral hazard at the margin, since, as is familiar from standard signal-extraction arguments, the marginal patient’s expected valuation lies between the copay (his “revealed” valuation if there is no behavioral hazard) and the unconditional average valuation (his valuation if being marginal were independent of true valuation).25 The marginal degree of behavioral hazard is then negative at copays below the expected value of treatment and positive at copays above the expected value of treatment. Returning to the example where Var(b) = 0, the marginal degree of behavioral hazard satisfies b + ε = p ⇒ ε = p − b, which clearly is negative if and only if the copay is below the expected value of treatment.
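The signal-extraction logic can be sketched under one extra assumption that is not in the text: b and ε are independent and normally distributed, which is one instance of symmetric and quasiconcave densities. A marginal patient satisfies b + ε = p, and the usual normal updating formula then gives the expected behavioral hazard at the margin. Function and parameter names are illustrative.

```python
def expected_marginal_hazard(p, mean_b, var_b, var_eps, mean_eps=0.0):
    """E[eps | b + eps = p] for independent normal b and eps (an assumed special case)."""
    if var_eps <= 0 or var_b < 0:
        raise ValueError("Need var_eps > 0 and var_b >= 0.")
    # Share of the variation in perceived value b + eps that is due to eps.
    weight = var_eps / (var_b + var_eps)
    return mean_eps + weight * (p - mean_b - mean_eps)


def expected_marginal_valuation(p, mean_b, var_b, var_eps, mean_eps=0.0):
    """E[b | b + eps = p], the marginal patient's expected true valuation."""
    return p - expected_marginal_hazard(p, mean_b, var_b, var_eps, mean_eps)


# Var(b) = 0: the marginal error is exactly p - b, as in the example in the text.
no_benefit_variation = expected_marginal_hazard(p=100.0, mean_b=150.0, var_b=0.0, var_eps=25.0)

# Equal variances: the expected valuation lies halfway between copay and average value.
hazard_below_mean = expected_marginal_hazard(p=100.0, mean_b=150.0, var_b=25.0, var_eps=25.0)
valuation_below_mean = expected_marginal_valuation(p=100.0, mean_b=150.0, var_b=25.0, var_eps=25.0)

# A copay above the average value flips the sign of the marginal behavioral hazard.
hazard_above_mean = expected_marginal_hazard(p=200.0, mean_b=150.0, var_b=25.0, var_eps=25.0)

print(no_benefit_variation, hazard_below_mean, valuation_below_mean, hazard_above_mean)
```

With mean-zero errors, the expected marginal behavioral hazard is negative at copays below the average value of treatment and positive at copays above it, matching the pattern described above.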
These results suggest that optimal copays should depend on the value of treatment in addition to the demand response. For example, we might expect that we should have high copays for procedures that are not recommended but sought by the patient nonetheless and low copays in situations where people have asymptomatic chronic diseases for which there are effective drug regimens. While advocated by some health researchers—for example, Chernew et al. (2007)—such differential cost-sharing is uncommon in practice; we return to some possible reasons in Section 6 below.26
5. The Pitfalls of Ignoring Behavioral Hazard
Behavioral hazard modifies the central insights of the standard model. The goal of this section is to give a sense of how important it is to take behavioral hazard into account – how wrong would the analyst be if he ignored behavioral hazard?
While the optimal copay, pB, satisfies W′(pB) = 0, where W′(p) is defined in Equation (5), a candidate for the “neo-classical optimal copay”, pN, satisfies the following condition.
Definition 1. pN is a candidate for the neo-classical optimal copay when
and (i) in a left neighborhood of pN, (ii) in a right neighborhood of pN, and (iii) at least one of the inequalities in (i) or (ii) is strict for some p in the relevant neighborhoods.
In other words, pN is a copay that an analyst applying the standard model to estimates of the demand and insurance value schedules, (M(·), I(·)), thinks could be optimal. The neo-classical optimal and true optimal copays will clearly coincide when ε(s; θ) = 0 ∀ s, θ. The direction of the deviation between these copays is also intuitive. As established in Online Appendix A, there is a welfare benefit to raising the copay from the neo-classical optimum whenever behavioral hazard is on average positive for people at the margin, and there is a welfare benefit to reducing the copay from the neo-classical optimum whenever behavioral hazard is on average negative for people at the margin.27 Less obvious, the deviation between the neoclassical optimal and true optimal copays can be huge:
Proposition 5. Suppose U is strictly concave, , and b(s; γ) = s ∀ (s, γ, θ).
If is sufficiently large then the neo-classical analyst believes pN = 0 is a candidate for the optimal copay but the optimal copay in fact satisfies pB ≥ c.
If is sufficiently low then the neo-classical analyst believes pN = c is a candidate for the optimal copay but the optimal copay in fact satisfies pB ≤ 0.
When behavioral hazard is extreme, the neo-classical optimal copay is exactly wrong: the situations in which the neo-classical analyst believes that copays should be really low are precisely those situations where copays should be really high and vice versa.28 In the case of very positive behavioral hazard, almost everybody gets treated at p ≈ c, so the neo-classical analyst thinks there is no benefit to controlling moral hazard but there is an insurance value to reducing copays, suggesting to him an optimal copay of at most zero. In reality, however, many people who demand treatment at p = c are inefficiently doing so, yielding a large benefit to controlling behavioral hazard by raising the copay above cost. So long as people are not extremely risk averse, a copay above cost is better than any copay below cost. In the case of very negative behavioral hazard, almost nobody gets treated at p ≈ c, so the neo-classical analyst sees a huge benefit to controlling moral hazard since nobody appears to value the treatment as much as it costs. So long as people are not extremely risk averse, the neo-classical analyst believes the copay should roughly equal cost. In reality, however, even at a copay of zero, people at the margin of getting treated have a benefit above cost. There is no benefit to controlling behavior by raising the copay above zero, but there is an insurance value cost, making the optimal copay at most zero.
An immediate corollary of Proposition 5 is the following:
Corollary 1. Suppose U is strictly concave, b(s; γ) = s, , and
For sufficiently large e: pB ≥ c or pB ≤ 0, where (i) pB ≥ c if pN = 0 (but not pN = c) is a candidate for the neo-classical optimal copay and (ii) pB ≤ 0 if pN = c (but not pN = 0) is a candidate for the neo-classical optimal copay.
This corollary essentially restates Proposition 5 to say that when behavioral hazard is extreme, knowing that the neo-classical analyst believes that the copay should be very low signals that it should be very high and knowing that he believes the copay should be very high signals that it should be very low. For example, when the neo-classical optimal copay is 0, i.e. full insurance, the optimal copay is above c, i.e., no insurance.29
For a numerical illustration, take the case where utility is quadratic, s is uniformly distributed, getting treated returns a person to full health, and the degree of behavioral hazard is constant across the population. Table IV details a resulting calculation for parameter values described in the notes. This example highlights several points. First, pB > pN whenever behavioral hazard is positive, and pB < pN whenever behavioral hazard is negative. Second, the optimal copay pB is increasing in the degree of behavioral hazard. Third, the neo-classical optimal copay pN is instead decreasing in the degree of behavioral hazard. Fourth, and as a result of the fact that pB and pN move in opposite directions as the degree of behavioral hazard moves away from 0, the deviation between pB and pN can be huge.30
Table IV:
Behavioral Hazard | Neo-classical Optimal Copay (pN) | Optimal Copay (pB) |
---|---|---|
Negative | 99.98 | .02 |
Zero | 97.95 | 97.95 |
Positive | 0 | 197.82 |
Notes: U(C) = αC − βC² for α = 7000, β = 1/2; for ; b(s; γ) = s for all (s, γ); and for all (s; θ). We use the following values for the calculations: y = 2500, q = .1, c = 100, and . There is a unique candidate for the neo-classical optimal copay in all cases. Note that since c = 100, the copay coincides with the coinsurance rate in percentage units.
These results illustrate that setting copays under the assumption that the demand response signals the degree of moral hazard leads to very wrong policy conclusions when behavioral hazard is extreme. The example of Choudhry et al. (2011) on eliminating copays for recent heart attack victims dramatically illustrates this for the case of negative behavioral hazard: given the sizable demand response to eliminating copayments for statins, beta blockers, and ACE inhibitors, a neoclassical analyst could mistakenly conclude that this reduces welfare. There are also examples consistent with mistaken conclusions in the other direction, where the traditional model suggests low copays because insurees exhibit little price-sensitivity, while incorporating behavioral hazard might suggest higher copays because the evidence signals persistent overuse. An example is the case of low price elasticities among the elderly for drugs deemed “inappropriate” for their conditions (Costa Font et al. 2011).
6. Further Issues and Extensions
6.1. Applying Information on Psychological Underpinnings
We have drawn out the implications of behavioral hazard generally, without distinguishing among various psychologies that could underlie it. This section describes two ways in which making such distinctions can be helpful in applied work. First, it can allow us to predict the degree of behavioral hazard in situations where measuring health responses is infeasible. Second, it can suggest new policy instruments that would usefully target specific psychologies.
When it is difficult to use evidence on health responses to measure the degree of behavioral hazard, knowledge of the psychology underlying behavioral hazard can be useful (Beshears et al. 2008, Mullainathan, Schwartzstein and Congdon 2012). For example, in the case of present-bias, knowing the degree to which treatment benefits or costs are delayed can predict behavioral hazard and thereby suggest which treatments should have higher or lower copays. Gruber and Koszegi (2001) follow this sort of approach in estimating the marginal internality for the case of cigarette purchase decisions.
Identifying the specific psychologies can also motivate the use of non-financial instruments or “nudges” to change behavior (Thaler and Sunstein 2008). There is substantial evidence, for example, that reminders or framing can affect utilization (Schroeder et al. 2004; Schedlbauer et al. 2010; Strandbygaard et al. 2010; Long et al. 2012).31 To incorporate such instruments into the framework, suppose there is a set of nudges available to the insurer, where a nudge is modeled as affecting demand through influencing the behavioral error ε, so for , we have ε = εn(s; θ). The direct cost to the insurer of nudge n is ψ(n) ≥ 0, where we suppose there is a “default nudge” with ψ(0) = 0. So far we have implicitly assumed that the insurer sets the default nudge and have notationally suppressed the relationship between the nudge and the behavioral error.
We can sometimes use responses to nudges to measure the magnitude of behavioral hazard, though we have to be careful in our interpretation. It is tempting to say that responses to reminders, for example, reveal the extent of inattention. But as Bordalo, Gennaioli and Shleifer (2015) point out, this inference may not be valid because people may overreact to such nudges. Nudges can be a useful tool for calibrating the degree of behavioral hazard when we have a precise sense for how nudges affect the error, that is, how εn varies in n. Proposition A.2 in Online Appendix A describes conditions under which the degree of marginal behavioral hazard can be calibrated by comparing the demand response to nudges to the demand response to prices.32 Heuristically, if we find that a nudge believed to lead to better decisions increases demand for treatment more than a $d decrease in the copay, that suggests that agents are undervaluing treatment by at least $d at the margin.33 Relatedly, if we have information that some people are likely more biased than others, we could in principle bound the degree of behavioral hazard by comparing the demand curves of the more and less biased groups.34
It is difficult to perform this exercise rigorously with existing data because few studies estimate the impact of nudges and copays simultaneously. However, the limited evidence on the effects of nudges on adherence suggests that behavioral hazard can be significant. For example, Long et al. (2012) compare peer mentoring and financial incentives to improve glucose control among African American veterans. While they did not measure impacts on drug adherence, they find that a peer mentoring program improved blood sugar control more than a $100-$200 incentive did, which is suggestive of negative behavioral hazard. (Also see Online Appendix C for evidence on the impact of nudges on hypertension.)
Nudges are also potentially useful policy tools for counteracting behavioral hazard when we know that they are reducing errors overall, since they can target behavioral hazard better than copays can. For example, if some fraction of the population exhibits negative behavioral hazard under the default nudge, ε0(s; θ) < 0 ∀s, while some fraction acts unbiased, then there does not exist a copay that leads to first-best utilization: while p = c leads to first-best utilization for the unbiased, it leads to underuse among those with negative behavioral hazard. Similarly, while some p < c may lead to efficient utilization in the population with negative behavioral hazard, it leads to overuse among the unbiased. On the other hand, if there is a “perfectly de-biasing nudge” n* that eliminates behavioral hazard, εn*(s; θ) = 0 ∀(s, θ), then using that nudge leads to first-best utilization when p = c.35 Of course, it is implausible that a perfectly de-biasing nudge exists, and it is unclear how effective many nudges are at reducing behavioral hazard. Studying the optimal mix of nudges and copays in the design of health insurance, taking such uncertainty into account, is an interesting topic for future research.
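The targeting argument can be illustrated with a small calculation. The sketch below assumes risk-neutral insurees, b(s) = s with s uniform on [0, 1], a treatment cost of 1/2, and a population split evenly between unbiased people and people with ε = −3/10. These parameter values and function names are hypothetical and chosen only to keep the arithmetic exact.

```python
from fractions import Fraction


def surplus(p, eps, c):
    """Surplus per sick person when s ~ U[0, 1], b = s, and care is demanded iff s + eps >= p."""
    # Severity threshold above which the person demands treatment, kept inside [0, 1].
    t = min(max(p - eps, Fraction(0)), Fraction(1))
    # Integral of (s - c) over s in [t, 1].
    return (1 - t * t) / 2 - c * (1 - t)


def mixed_surplus(p, eps_by_group, c):
    """Average surplus across equally sized groups that share the same copay."""
    return sum(surplus(p, eps, c) for eps in eps_by_group) / len(eps_by_group)


c = Fraction(1, 2)
default_nudge = [Fraction(0), Fraction(-3, 10)]  # unbiased group and biased group
debiasing_nudge = [Fraction(0), Fraction(0)]     # hypothetical perfectly de-biasing nudge

first_best = surplus(c, Fraction(0), c)

# Best single copay on a grid when only the copay is available.
grid = [Fraction(i, 100) for i in range(101)]
best_copay = max(grid, key=lambda p: mixed_surplus(p, default_nudge, c))
best_copay_surplus = mixed_surplus(best_copay, default_nudge, c)

# With the de-biasing nudge, p = c restores first-best utilization.
nudge_surplus = mixed_surplus(c, debiasing_nudge, c)

print(f"first best: {first_best}, best copay alone: {best_copay} -> {best_copay_surplus}")
print(f"de-biasing nudge with p = c: {nudge_surplus}")
```

Under these assumptions the best single copay lies between the levels that would suit each group separately and still falls short of the first best, while the de-biasing nudge combined with p = c attains it.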
6.2. Incorporating Testing Decisions
The existence of behavioral hazard can have system-level ripple effects, particularly if early utilization such as diagnostic testing has cascade effects for downstream care. In the traditional model, the additional information yielded by low-cost tests should only improve patient welfare, but with behavioral hazard, subsequent misbehavior can add large costs. We briefly extend the model to allow for a testing stage that reveals s, and suppose the person gets treated only if he is tested. To illustrate through a specific example, further suppose that U″ = 0, s ~ U[0, 1], c = 3/4, and ε = 3/4 for everybody. Without testing, nobody gets treated and welfare is −1/2 · q. With testing, everybody gets treated if p ≤ c, and welfare equals −3/4 · q < −1/2 · q. So tests have a substantially negative return if behavioral hazard is uncontrolled (p = pN = c) and it is better not to test: the return on testing equals (−3/4 − (−1/2)) · q = (−1/4) · q. But if behavioral hazard is perfectly counteracted (say, in this example, with copay pB = c + ε), then tests would have a positive return: the return on testing is then (−15/32 − (−1/2)) · q = (1/32) · q. So taking health responses into account in setting copays for treatment decisions can be doubly beneficial: not only does it lead to better decisions at the treatment stage, but it can feed back to better policy at the testing stage.
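The numbers in this example can be checked directly. The sketch below assumes, in line with the figures in the text, that treatment fully removes the health loss s (so b = s), that a tested person demands treatment whenever s + ε ≥ p, and that welfare is measured per sick person, that is, in units of q. Function names are illustrative.

```python
from fractions import Fraction


def welfare_with_testing(p, eps, c):
    """Expected welfare per sick person when a test reveals s ~ U[0, 1]."""
    # Severity threshold above which a tested person demands treatment.
    t = min(max(p - eps, Fraction(0)), Fraction(1))
    untreated_loss = t * t / 2    # integral of s over [0, t]
    treatment_cost = (1 - t) * c  # share treated times cost of care
    return -(untreated_loss + treatment_cost)


c = Fraction(3, 4)
eps = Fraction(3, 4)

# Without testing nobody is treated, so the expected loss is E[s] = 1/2.
welfare_no_test = -Fraction(1, 2)

# Behavioral hazard uncontrolled: copay equal to cost.
welfare_uncontrolled = welfare_with_testing(p=c, eps=eps, c=c)

# Behavioral hazard counteracted: copay equal to cost plus the error.
welfare_controlled = welfare_with_testing(p=c + eps, eps=eps, c=c)

return_uncontrolled = welfare_uncontrolled - welfare_no_test
return_controlled = welfare_controlled - welfare_no_test

print(f"return on testing at p = c:       {return_uncontrolled}")
print(f"return on testing at p = c + eps: {return_controlled}")
```

The two returns, −1/4 and 1/32 per sick person, match the figures in the paragraph above.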
The medical community has designed testing guidelines that implicitly acknowledge the imperfections in the downstream decision-making of patients and physicians alike. The “Choosing Wisely” initiative launched by internal medicine physicians makes blanket recommendations against some common screenings and tests, not because the tests themselves are cost-ineffective conditional on optimal downstream care, but explicitly because of the probability of downstream care that is likely to harm patient health but is delivered nonetheless (Morden et al. 2014). These tests yield useful information that should be acted upon for a subset of identifiable patients, but many other patients end up receiving care (whether because of the disutility of “doing nothing” or mistaken beliefs about efficacy) that is at best useless and at worst harmful, making the expected value of performing the test negative.36 This downstream sub-optimal care is of course the product of both patient and physician decision-making, and there may be ample parallel opportunity to redesign physician incentives to take into account the psychology of their decision-making (as well as train them to help take into account the behavioral decision-making of their patients).37
6.3. Analyzing What the Market Will Provide
Given the welfare benefits of counteracting behavioral hazard, one question is why existing plans do not seem to do so. As shown in Online Appendix Table 2, typical health insurance plans have a copay structure that varies little within broad categories of treatments (e.g., physician office visits, inpatient services, branded drugs, generic drugs), while counteracting behavioral hazard requires a more nuanced structure where copayments are a function of the health benefit associated with the care for a particular patient. In an earlier version of this paper (Baicker, Mullainathan and Schwartzstein 2013) we extend the model to consider what a competitive market will provide in equilibrium under the simplifying assumption of a representative insuree, which requires additional assumptions on the degree to which insurees are sophisticated about behavioral hazard. We show how even though optimal insurance could reduce behavioral hazard, market-provided insurance may not. If enrollees were sophisticated, perfectly predicting their behavioral hazard and insuring accordingly, then market-provided insurance would be optimally designed to counteract behavioral hazard. But with naive enrollees, insurers have less incentive to mitigate underuse since naive consumers will not fully value copays designed to counteract their biases.38 These problems are especially severe when insurers have nudges as well as copays available and when enrollees may be insured with them for limited time horizons (and thus insurers do not bear the full cost of current enrollees’ future health care use). In particular, if insurees do not appreciate the impact of nudges, then we would expect the insurer to supply nudges that minimize costs rather than maximize surplus.
Thus, with naive insurees, insurers only have an incentive to counteract biases when it saves them money. The gains from reducing copays for antidiabetics, beta blockers, and other care prone to negative behavioral hazard may not accrue to insurers, limiting their incentive to incorporate behavioral hazard into their copayment structures.39 For example, Beaulieu et al. (2003) estimate that investment in diabetes disease management produces very small monetary gains for insurers over a 10-year horizon, but would produce $30,000 per patient in improved quality of life and longevity.40 The potential for market failure suggests that government interventions may be welfare-improving over market outcomes even in the absence of selection or externalities.41 An important direction for future work is to develop a better understanding of how sophisticated consumers are about their own behavioral hazard, and how the government could intervene to help counteract behavioral hazard.
7. Discussion
There is ample evidence that moral hazard alone cannot explain the patterns of misutilization observed throughout the health care system. We build a framework for analyzing the relationship between insurance coverage and health care consumption that includes behavioral hazard. With only moral hazard, lowering copays increases the insurance value of a plan but reduces its efficiency by generating overuse. With the addition of behavioral hazard, lowering copays may both increase insurance value and increase efficiency by reducing underuse. Having an estimate of the demand response is no longer enough to set optimal copays; the health response needs to be considered as well. This provides a theoretical foundation for value-based insurance design, where copays should be lower both when price changes have small effects on demand and when they have large effects on health. Ignoring behavioral hazard can lead to welfare estimates that are both wrong in sign and off by an order of magnitude.
The finding that optimal insurance features depend crucially on how prices affect both demand and health highlights the value of having empirical estimates of both responses (see also Lee et al. 2013). There is limited data linking insurance to clinical outcomes—and performing an exhaustive list of experiments and calculations would certainly be daunting—but a small number of conditions account for a large share of health spending. Patients with circulatory system conditions like high blood pressure and high cholesterol are responsible for 17% of total health care spending; mental health conditions like depression for another 9%; respiratory conditions like asthma another 6%; and endocrine conditions like diabetes another 4% (Roehrig et al. 2009). Studies that link changes in price, demand, and health would thus be particularly valuable—and a limited number could go a long way.42
This framework is amenable to a number of extensions. While our model is static, it could be extended to consider how behavioral hazard affects optimal copays for preventive care, where initial underuse could generate expensive future overuse. Further exploration of consumer sophistication and intertemporal incentives for insurers may help explain insurance offerings and the potential role of public policy. Work could also draw out more nuanced implications for the use of nudges versus copayments. And while we highlight that the main results do not depend on the underlying psychological forces generating behavioral hazard, greater understanding of those forces could help estimate the degree of behavioral hazard in different settings and inform the optimal design of nudges. Finally, physicians are of course important drivers of the health care that is ultimately delivered, and, like patients, their choices are driven by both financial incentives and behavioral biases; future work could analyze the interplay of these channels.
Failing to incorporate behavioral hazard into models of health insurance can generate answers that are not only very wrong but also of substantial import for millions of people. Much of our health spending is on care that is sensitive to copay changes, and much of that care seems to have an impact on health that differs from what is implied by moral hazard alone. Simply assuming that the demand curve reveals the health value of care can generate misleading policy prescriptions; incorporating both moral hazard and behavioral hazard can inform insurance designs that better balance insurance protection with efficient resource use.
Acknowledgments
We thank Dan Benjamin, David Cutler, John Friedman, Drew Fudenberg, Ben Handel, Ted O’Donoghue, Matthew Rabin, Jesse Shapiro, Andrei Shleifer, Jonathan Skinner, Chris Snyder, Douglas Staiger, Glen Weyl, Heidi Williams, Danny Yagan, and the anonymous referees for helpful comments. Baicker thanks the National Institute on Aging (NIA), Grant Number P30-AG012810, and Schwartzstein thanks the NIA, Grant Number T32-AG000186, for financial support.
Footnotes
In principle, it is possible to argue that unobserved costs of care, such as side effects, drive what seems to be underuse. However, in practice, this argument is difficult to make for many of the examples we review. The underuse we focus on is very different from the underuse that can arise in dynamic moral hazard models. In such models, patients may underuse preventive care that generates monetary savings for the insurer (Goldman and Philipson 2007; Ellis and Manning 2007). Here, we focus on the underuse of care whose benefits outweigh costs to the consumer. For example, though the underuse of diabetes medications does generate future health care costs, the uninsurable private costs to the patient alone (e.g. higher mortality and blindness) make non-adherence likely to be a bad choice even if she is fully insured against future health care costs. We also abstract from underuse due to health externalities (e.g. the effect of vaccination on the spread of disease).
Technically, we can only equate the marginal private utility benefit with the copay, but presumably much of this benefit derives from the health effects.
The standard revealed preference assumption that Einav, Finkelstein and Williams (2015) use effectively assumes that the demand curve reveals the distribution of patients’ relative valuation for having a lumpectomy over a mastectomy. There is some suggestive evidence that challenges this assumption, for example that providing decision aids to inform breast cancer patients about the relevant trade-offs increases demand for the less invasive option (Waljee et al. 2007). One possibility is that patients may start from a false belief that the more invasive procedure is more effective at preventing cancer relapses.
Spinnewijn (2014) analyzes the optimal design of unemployment insurance when job-seekers have biased beliefs and similarly predicts that policies implementing standard sufficient statistics formulas become suboptimal when agents make errors.
The idea that the optimal copay is below the neo-classical optimal copay when behavioral hazard drives systematic underuse parallels findings on self-commitment devices for present-biased agents. For example, DellaVigna and Malmendier (2004) show that sophisticated present-biased agents value gym memberships that reduce the price of going to the gym below the social marginal cost, since this reduction counteracts internalities that result from the overweighting of immediate costs relative to long-term benefits.
As standard in the literature, we implicitly assume that consumers would face the social marginal cost of treatment without insurance, which abstracts from another rationale for why insurance coverage can be welfare-improving even for risk-neutral consumers: when treatment suppliers have market power, subsidizing treatment can bring copays closer to the social marginal cost of care (Lakdawalla and Sood 2009).
We focus on a single specific condition for presentational simplicity, but the analysis is qualitatively similar if the person can fall sick with different conditions and the specific condition she falls sick with is observable and verifiable to the insurer, so the insurer can set different copays across conditions.
See, e.g., Bernheim and Rangel (2009) for a discussion of a choice-based approach to recovering consistent welfare functions from inconsistent choice behavior.
For simplicity, we are assuming away income effects or issues of affordability. In a standard framework, insurance could lead to more efficient decisions insofar as it makes high-value, high-cost procedures affordable to consumers (Nyman 1999). However, in this framework, insurance cannot lead consumers to make more efficient decisions on the margin. Abstracting from income effects serves to highlight this well-known fact (Zeckhauser 1970). Also, many of the examples we focus on involve low cost treatments such as prescription drugs where any income effects are likely to be small.
As we discuss below, differentiating among these biases could help in designing non-price or behavioral interventions, but our focus is largely on the role of more standard price levers. Chetty (2009a) proposes a model of salience and taxation in a similar way: he derives empirically implementable formulas for the incidence and efficiency costs of taxation that are robust across positive theories for why agents may fail to incorporate taxes into choice. Our analysis is in the spirit of the “sufficient statistics” approach to public finance (Chetty 2009b), which develops formulas for the welfare consequences of policies that are functions of reduced-form elasticities.
False beliefs may result from a variety of factors. Patients may have incomplete information; they may have faulty mental models; they may not interpret evidence as Bayesians; they may be inattentive to available evidence. Section 6.1 highlights ways that distinguishing between such factors can be helpful, though we suspect that often a combination of factors is at play.
Estimates suggest that the majority of antibiotics prescribed for adult respiratory infections were for conditions where an antibiotic would not be helpful, such as for a viral infection (Gonzales et al. 2001)—although, as discussed below, this may be attributable to a combination of patient and physician psychology.
Underuse is of course not restricted to prescription drug non-adherence. Patients do not receive recommended care across a wide range of categories, with only 55 percent receiving recommended preventive care including screenings (e.g., colonoscopies) and follow-up care for conditions ranging from diabetes and asthma management to post-hip-fracture care (Ness et al. 2000; McGlynn et al. 2003; Denberg et al. 2005).
While not the focus of our analysis, with behavioral hazard the sign on the insurance value term is also ambiguous. In the standard model, the sick who demand treatment are worse off than the sick who do not, even post treatment, so long as p > 0. Since this may not hold with behavioral hazard, stronger conditions (for example, that q is sufficiently small) are necessary to guarantee that the people who demand treatment on average have higher marginal utility than those who do not and consequently that I(p) > 0 for p > 0.
Note that what matters for calculating the welfare impact of a marginal copay change is the average marginal size of behavioral hazard at copay p, εavg(p), rather than the average unconditional size, E[ε(s; θ)]. To see why, consider a situation where some people simply forget to get treated (e.g. forget a prescription refill) with some probability ϕ, but otherwise make an accurate cost-benefit calculation. In our framework, this can be captured by assuming that ε(s; θ) is very negative with probability ϕ and otherwise equals zero. While the average degree of behavioral hazard in this example can be quite negative, behavioral hazard does not influence who is at the margin, since anyone who responds to a copay change is someone who makes an accurate cost-benefit calculation. Indeed, in this case the marginal degree of behavioral hazard is zero.
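The forgetting example can be checked numerically. The following is a minimal sketch rather than the paper's own method: the uniform benefit distribution, the value −1000 standing in for a "very negative" ε, the band used to define who counts as marginal, and all function and parameter names are illustrative assumptions.

```python
import random


def simulate_forgetting(phi=0.3, copay=50.0, band=1.0, n=200_000, seed=0):
    """Compare the unconditional and marginal averages of epsilon.

    Assumptions (not from the paper): true benefits b are uniform on
    [0, 100]; forgetting is modeled as epsilon = -1000 with probability
    phi and epsilon = 0 otherwise; a person is "marginal" when the
    perceived value b + epsilon lies within `band` of the copay.
    """
    rng = random.Random(seed)
    forgetting_eps = -1000.0  # stand-in for a "very negative" epsilon
    eps_all = []
    eps_marginal = []
    for _ in range(n):
        b = rng.uniform(0.0, 100.0)
        eps = forgetting_eps if rng.random() < phi else 0.0
        eps_all.append(eps)
        # Only people whose perceived value is close to the copay would
        # change their decision after a small copay change.
        if abs(b + eps - copay) <= band:
            eps_marginal.append(eps)
    avg_unconditional = sum(eps_all) / len(eps_all)
    avg_marginal = sum(eps_marginal) / len(eps_marginal)
    return avg_unconditional, avg_marginal


avg_all, avg_marg = simulate_forgetting()
print(f"unconditional average epsilon: {avg_all:.1f}")
print(f"marginal average epsilon:      {avg_marg:.1f}")
```

Under these assumptions the unconditional average is strongly negative, while nobody who forgets is ever near the margin, so the marginal average is zero.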
While we have framed the analysis in terms of the insurer setting a copayment for a specific disease and treatment, we could re-interpret the model as being about an insurer who sets the same copayment across a set of treatments with common cost c. For example, we could think of the insurer as setting the copay for drugs within some formulary tier. Under this interpretation, γ indexes observable conditions that the insurer does not distinguish between in setting copays. The analysis would proceed in a similar fashion, but under this interpretation an analyst can disaggregate the demand response into the response for each condition γ, which can provide information on the degree to which the total response reflects some combination of behavioral hazard and moral hazard when there is a prior sense of the marginal value of different treatments.
It is important to note that there are also examples of behavior consistent with the traditional model of moral hazard, including from RAND and the decades since. Taubman et al. (2014), for example, show that gaining insurance coverage (and the associated drop in prices) increased emergency department visits particularly for less urgent or more discretionary conditions.
The assumption of linear utility simplifies the presentation by allowing us to abstract from the insurance value term. It also simplifies the relationship between εavg(p) and H′(p)/M′(p). Otherwise, εavg(p) ≈ p − H′(p)/M′(p) when U is approximately linear.
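To make the relationship concrete, the sketch below backs out the average marginal behavioral hazard from a demand response and a health response using εavg(p) = p − H′(p)/M′(p). It assumes linear utility and that H is measured in the same money-metric units as the copay; the function name and the numbers in the usage lines are hypothetical and are not estimates from the paper.

```python
def implied_marginal_behavioral_hazard(copay, dH_dp, dM_dp):
    """Return eps_avg(p) = p - H'(p) / M'(p).

    copay: the copay p at which the responses are measured.
    dH_dp: change in aggregate health (money-metric) per unit copay change.
    dM_dp: change in demand per unit copay change.
    """
    if dM_dp == 0:
        raise ValueError("demand does not respond to the copay; ratio undefined")
    marginal_health_value = dH_dp / dM_dp  # health value of the marginal treatment
    return copay - marginal_health_value


# Hypothetical inputs: raising the copay by $1 lowers demand by 0.02
# treatments per person and lowers health by $1 per person, so the
# marginal treatment is worth about $50 in health terms.
eps_avg = implied_marginal_behavioral_hazard(copay=10.0, dH_dp=-1.0, dM_dp=-0.02)
print(eps_avg)  # negative: the marginal patient undervalues the treatment
```

A negative value indicates that marginal patients forgo care whose health value exceeds the copay, which is the underuse case discussed in the text.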
We use this particular study because it measures not only demand responses, but also a rich variety of health responses. While the setting is admittedly quite specific, we believe the qualitative conclusions are illustrative for broader populations and treatments. Online Appendix C provides a stylized example using the case study of treatment for high blood pressure, though it is difficult to perform a rigorous analysis given data limitations.
Given the assumption that people have linear demand curves, we can derive a tighter lower bound on the welfare loss under the standard model. In this case, the moral hazard model implies a welfare loss of at least $106(1−.25/2) = $92.75 (see, e.g., Feldman and Dowd 1991).
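Spelling out the arithmetic in this bound, taking the two inputs (\$106 and 0.25) as given:

```latex
\[
  \$106 \times \left(1 - \frac{0.25}{2}\right)
  = \$106 \times 0.875
  = \$92.75 .
\]
```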
This calculation is admittedly crude, but provides an illustrative example. Estimates of the value of a statistical life clearly vary based on the age at which death is averted and the life expectancy gained—averting the death of a young healthy worker might be valued at $5 million—and mortality is only one aspect of the potential changes in health. While the estimated reduction in mortality is not statistically significant at conventional levels, the other health impacts are. We focus on the mortality reduction because it is easiest to monetize in this illustration.
It seems unlikely that the cost of unobserved side-effects of statins, beta blockers, and ACE-inhibitors is anywhere near $2894 for a given patient in a year, so taking these effects into account should not reverse the conclusion that eliminating copayments leads to a welfare gain.
As in basic moral hazard calculations, this analysis ignores substitution between treatments. In this example, total spending (prescription drug plus nondrug spending) went down by a small, non-statistically significant amount when copayments were eliminated on preventive medications after heart attack, as did insurer costs. Taking these non-significant offset effects at face value would imply that welfare goes up even before taking behavioral hazard into account (Glazer and McGuire 2012), though it raises a puzzle as to why private insurers did not reduce copays on their own. Even in this case, however, incorporating behavioral hazard substantially changes the analysis by providing a much stronger rationale for reducing the copay. More generally, evidence suggests that reducing copays on high-value care does not generate cost savings over short (1–3 year) horizons (Lee et al. 2013).
The assumptions that b and ε are independently distributed according to symmetric and non-degenerate quasiconcave distributions guarantee that E[b | b + ε = p] lies in between p and E[b] (see, e.g., Chambers and Healy 2012). In a different context, Spinnewijn (2014) similarly shows that even when people make mean-zero errors in deciding whether to purchase insurance (which are independent of true insurance value), a selection argument implies that the demand curve systematically overestimates the insurance value for the insured and systematically underestimates the insurance value for the uninsured.
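The "updating toward the signal" logic can be illustrated in closed form for one special case. The normal specification below is an assumption chosen for tractability (normal distributions are symmetric and quasiconcave), with mean-zero errors; the symbols $\mu_b$, $\sigma_b^2$, $\sigma_\varepsilon^2$ and $\lambda$ are introduced here for illustration only.

```latex
\[
  b \sim N(\mu_b, \sigma_b^2), \qquad
  \varepsilon \sim N(0, \sigma_\varepsilon^2), \qquad
  b \perp \varepsilon ,
\]
\[
  E[\, b \mid b + \varepsilon = p \,]
  = \mu_b + \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2}\,(p - \mu_b)
  = \lambda\, p + (1 - \lambda)\, \mu_b ,
  \qquad
  \lambda = \frac{\sigma_b^2}{\sigma_b^2 + \sigma_\varepsilon^2} \in (0, 1).
\]
```

Because $\lambda$ lies strictly between 0 and 1, the true benefit of the marginal patient is a weighted average of the copay $p$ and the mean benefit $\mu_b = E[b]$, so it falls strictly between the two whenever $p \neq \mu_b$.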
A potential concern is that insurees may have difficulty understanding or acting on insurance contracts that specify different copays across services. While ultimately an empirical question, we note that this is an issue in existing contracts as well and that insurers and providers have tools to highlight important copay differences. For example, participants in the Choudhry et al. (2011) study were informed by mail and phone that copays for certain drugs had been waived, and Choudhry et al. document a sizable demand response to the targeted copay changes.
A somewhat more subtle point can be seen by focusing on the case where behavioral hazard is either systematically positive or negative, meaning that ε(s; θ) (weakly) shares the same sign across (s, θ). In this context, Proposition A.1 in Online Appendix A implies that the optimal copay exceeds the neo-classical optimal copay so long as some marginal individuals exhibit positive behavioral hazard, as in this case εavg(p) > 0, and is below the neo-classical optimal copay so long as some marginal individuals exhibit negative behavioral hazard. Consider the case of positive behavioral hazard. Increasing the copay by a small amount from p = pN has the welfare benefit of counteracting the behavioral hazard of some individuals, and the welfare cost of raising the copay above the optimum for people who behave according to the standard model. This result says that the welfare benefit of counteracting behavioral hazard wins out. The intuition, similar to that in O’Donoghue and Rabin’s (2006) analysis of optimal sin taxes, is that since pN is the optimal copay for standard agents, any small change from p = pN only has a second-order cost on their welfare. On the other hand, since people with positive behavioral hazard are inefficiently using too much care at p = pN, a small reduction in the amount of care they receive has a first-order welfare benefit. While the presence of people who behave according to the standard model can impact the magnitude of the deviation between the optimal copay and the neo-classical optimum, it does not influence the direction of this deviation.
Strict concavity matters for this result. With linear utility the neo-classical analyst believes pN = c is a candidate for the optimal copay, independent of . The assumption that b(s; γ) = s simplifies matters by guaranteeing that there is always a non-negative candidate for the neo-classical optimal copay because it implies that a zero copay (rather than a negative one) maximizes insurance value when all the sick are treated.
We can also see that when behavioral hazard is extreme, there is always a candidate for the neo-classical optimal copay satisfying |pB − pN| ≥ c: the degree to which the optimal copay can vary in response to behavioral hazard is larger than the degree to which the neo-classical optimal copay can vary in response to more standard considerations, like the elasticity of demand or the degree of risk aversion. Indeed, without behavioral hazard, the optimal copay always lies in [0, c] under the assumption that b(s; γ) = s.
The case where provides an illustrative example of the last point. This is a situation where there is a lot of overuse due to behavioral hazard, and patients are reasonably risk averse. The analyst who looks for behavioral hazard will understand that copays should be really high to counteract overuse due to behavioral hazard: pB = 197.92, which is well above the cost of treatment, c = 100. The neo-classical analyst who believes that everybody accurately trades off costs and benefits in making treatment decisions will observe that everybody gets treated when the price is less than or equal to 99 and half the population gets treated when the price is 299/2. Since the cost of treatment is c = 100, it looks to the analyst like there is very little benefit to controlling moral hazard: almost everybody seems to value the treatment at more than its cost, and the extremely small fraction who do not still seem to value the treatment at 99% of its cost. On the other hand, since people are risk averse, there is a benefit to reducing copays. In fact, the marginal insurance benefit appears to exceed the marginal moral hazard cost at a copay of 99. Further, since the marginal moral hazard cost is zero at all lower copays (everybody is already getting treated), the neo-classical analyst believes the copay should go all the way down to zero when in fact optimally it should be almost double the cost!
To illustrate, Schedlbauer et al. (2010) find that nudges focusing on reminders and reinforcements were particularly promising ways to improve statin adherence, with four out of six trials reviewed producing significant increases in adherence ranging from 6–24 percentage points. Similarly, simplifying the dosing schedules for blood pressure medication can lead to a 10–20% increase in adherence (Schroeder et al. 2004).
Chetty, Looney, and Kroft (2009) similarly describe conditions under which the marginal internality resulting from inattention to non-salient taxes can be recovered by comparing the demand response to taxes to the demand response to prices.
It is not only important that the analyst have prior knowledge that the nudge leads to better decisions, but also that the nudge influences demand by impacting ε. To illustrate, suppose people forget to demand treatment with some probability ϕ that may be dependent on nudges (e.g., reminders) but is independent of prices. If there are no other biases, then the marginal degree of behavioral hazard with respect to the price lever is zero, since anybody who is marginal with respect to price is remembering and making an accurate cost-benefit calculation. However, we may still see a sizable demand response to reminders. So, in this case, we cannot infer the marginal degree of behavioral hazard by comparing nudge and demand responses. Allcott and Taubinsky (2014) make a related point that analyzing the demand response to nudges can give a misleading impression of the average marginal bias under certain forms of heterogeneity in the bias across the population.
This is reminiscent of the empirical strategies of Bronnenberg et al. (2015) and Handel and Kolstad (2015) who compare the demand curves (e.g., for branded drugs) of “experts” (e.g., pharmacists) to “non-experts” (e.g., average consumers).
It is tempting to suppose that nudging in a way that eliminates errors is always optimal when such nudging is possible and does not have direct costs. However, when people are risk averse then such debiasing can undermine broader social welfare in cases when it increases demand and, with it, the moral hazard cost of insurance (Mullainathan, Schwartzstein and Congdon 2012; Pauly and Blavin 2008).
Ding et al. (2011) and others have also calculated the economic costs associated with “incidental” findings — abnormalities detected in asymptomatic patients in the course of a separate evaluation.
One study found, for example, that physicians’ propensity to prescribe contraindicated antibiotics for their patients rose over the duration of their shift—a pattern attributed to physicians’ diminishing capacity to resist patient requests for prescriptions (Linder et al. 2014).
Insurers will face some incentive to counteract negative behavioral hazard since naive consumers over-estimate their demand in this case, which creates some benefit to reducing copays. But, in general, negative behavioral hazard will not be efficiently managed in equilibrium. The discussion of market-provided insurance connects with the literature on behavioral contract theory and industrial organization. Ellison (2006) and Spiegler (2011) provide nice reviews of this literature.
Nevertheless, insurers could increase profits by promoting adherence to medications and treatments that save money over a reasonably short horizon (relative to the typical tenure of their enrollees), and the model suggests that insurers will invest in encouraging care in such instances. For example, flu shots are often given for free and at enrollees’ convenience. Perhaps for a similar reason, insurers are also funding research into promoting adherence to certain treatments (Belluck 2010) and in-house programs aimed at improving adherence, such as Humana’s “RxMentor” or United’s “Refill and Save” programs. Aetna’s tracking suggests that its program targeting chronic disease patients has improved adherence (Sipkoff 2009). FICO, known for its widely-purchased credit score, is now even selling medication adherence scores to insurers, intended to predict how likely patients are to adhere (Parker-Pope 2011).
Additionally, while it could be efficient to provide negative prices (subsidies) to use high-value care (Volpp et al. 2009), practical considerations may make this infeasible. Current regulatory restrictions may also limit the ability of insurers to counteract behavioral hazard. For example, there are limits to the ability of insurers to offer plans with different copayments for the same drug to different patients, or for plans operating within Medicare to offer negative prices (cash incentives) to enrollees. As noted above, such plans might also be more complex.
We abstract from ex ante heterogeneity among consumers and issues of selection. Of course, adverse selection provides a rationale for government intervention even in the standard model. Sandroni and Squintani (2007, 2013), Jeleva and Villeneuve (2004), and Spinnewijn (2013, 2014) explore how ex ante heterogeneity in risk perceptions or overconfidence alter equilibrium insurance contracts, the relationship between risk and insurance coverage, and the welfare impact of various government policies, e.g., insurance mandates.
Even absent such multi-step evidence, however, we can make some back-of-the-envelope estimates of the potential benefits of optimal copayment design in a model with both moral hazard and behavioral hazard from the literature that looks at particular steps in this chain. This requires applying results from one particular step (e.g. the effect of a copay change on one measure of adherence, such as refilling prescriptions) generated from a particular marginal population to the next step (e.g. effect of a different measure of adherence, such as missed pills, on heart attacks) generated in a different setting with a different population, over a different time frame, etc. Online Appendix C provides a very stylized example using the case study of treatment for high blood pressure. High blood pressure afflicts 68 million adults in the U.S. (CDC Vital Signs 2011) and is an important driver of overall health care costs. Using existing estimates, we show that small reductions in copays increase compliance with anti-hypertensive therapy and that better compliance generates substantial health gains. This implies a large net return on the marginal social dollar spent on improving blood-pressure medication adherence and suggests that the failure of existing plans to address behavioral hazard could be generating large welfare costs. Rosen et al. (2005) perform a similar exercise to predict the cost-effectiveness of first-dollar coverage of ACE inhibitors for Medicare beneficiaries with diabetes.
Contributor Information
Katherine Baicker, Harvard University.
Sendhil Mullainathan, Harvard University.
Joshua Schwartzstein, Dartmouth College.
References
- Allcott Hunt and Taubinsky Dmitry. “The Lightbulb Paradox: Evidence from Two Randomized Experiments.” Working Paper, NYU (2014).
- Aron-Dine Aviva, Einav Liran, and Finkelstein Amy. “The RAND Health Insurance Experiment, Three Decades Later.” Journal of Economic Perspectives 27.1 (2013): 197–222.
- Arrow Kenneth J. “Uncertainty and the Welfare Economics of Medical Care.” The American Economic Review 53.5 (1963): 941–973.
- Baicker Katherine, Mullainathan Sendhil, and Schwartzstein Joshua. “Behavioral Hazard in Health Insurance.” National Bureau of Economic Research Working Paper w18468 (2013).
- Bailey CJ and Kodack Michael. “Patient Adherence to Medication Requirements for Therapy of Type 2 Diabetes.” International Journal of Clinical Practice 65.3 (2011): 314–322.
- Beaulieu Nancy, Cutler David M, Ho Katherine, Isham George, Lindquist Tammie, Nelson Andrew, and O’Connor Patrick. “The Business Case for Diabetes Disease Management for Managed Care Organizations.” Frontiers in Health Policy Research 9.1 (2006).
- Belluck Pam. “For Forgetful, Cash Helps the Medicine go Down.” The New York Times (2010).
- Bernheim B Douglas and Rangel Antonio. “Beyond Revealed Preference: Choice-Theoretic Foundations for Behavioral Welfare Economics.” Quarterly Journal of Economics 124.1 (2009): 51–104.
- Beshears John, Choi James J, Laibson David, and Madrian Brigitte C. “How are Preferences Revealed?” Journal of Public Economics 92.8 (2008): 1787–1794.
- Bordalo Pedro, Gennaioli Nicola, and Shleifer Andrei. “Salience Theory of Choice Under Risk.” The Quarterly Journal of Economics 127.3 (2012): 1243–1285.
- ———. “Salience and Consumer Choice.” Journal of Political Economy 121.5 (2013): 803–843.
- ———. “Memory, Attention, and Choice.” Mimeo, Harvard University (2015).
- Bronnenberg Bart J, Dube Jean-Pierre, Gentzkow Matthew, and Shapiro Jesse M. “Do Pharmacists Buy Bayer? Informed Shoppers and the Brand Premium.” Quarterly Journal of Economics (forthcoming).
- CDC. “High Blood Pressure and Cholesterol.” National Center for Chronic Disease Prevention and Health Promotion (2011).
- Chambers Christopher P and Healy Paul J. “Updating Toward the Signal.” Economic Theory 50.3 (2012): 765–786.
- Chandra Amitabh, Gruber Jonathan, and McKnight Robin. “Patient Cost-Sharing and Hospitalization Offsets in the Elderly.” The American Economic Review 100.1 (2010): 193–213.
- Chernew Michael E, Rosen Allison B, and Fendrick A Mark. “Value-Based Insurance Design.” Health Affairs 26.2 (2007): w195–w203.
- Chetty Raj. “The Simple Economics of Salience and Taxation.” Mimeo, Harvard University (2009a).
- ———. “Sufficient Statistics for Welfare Analysis: A Bridge Between Structural and Reduced-Form Methods.” Annual Review of Economics 1.1 (2009b): 451–488.
- Chetty Raj, Looney Adam, and Kroft Kory. “Salience and Taxation: Theory and Evidence.” American Economic Review 99.4 (2009): 1145–1177.
- Choudhry Niteesh K, Avorn Jerry, Glynn Robert J, Antman Elliott M, Schneeweiss Sebastian, Toscano Michele, Reisman Lonny, Fernandes Joaquim, Spettell Claire, Lee Joy L, et al. “Full Coverage for Preventive Medications After Myocardial Infarction.” New England Journal of Medicine 365.22 (2011): 2088–2097.
- Costa Font Joan and Gemmill Toyama Marin. “Does Cost Sharing Really Reduce Inappropriate Prescriptions Among the Elderly?” Health Policy 101.2 (2011): 195–208.
- Cutler David M and Zeckhauser Richard J. “The Anatomy of Health Insurance.” Handbook of Health Economics 1. (2000): 563–643. [Google Scholar]
- DellaVigna Stefano. “Psychology and Economics: Evidence from the Field.” Journal of Economic Literature 47.2 (2009): 315–372. [Google Scholar]
- DellaVigna Stefano and Malmendier Ulrike. “Contract Design and Self-Control: Theory and Evidence.” The Quarterly Journal of Economics 119.2 (2004): 353–402. [Google Scholar]
- Denberg Thomas D, Melhado Trisha V, Coombes John M, Beaty Brenda L, Berman Kenneth, Byers Tim E, Marcus Alfred C, Steiner John F, and Ahnen Dennis J. “Predictors of Nonadherence to Screening Colonoscopy.” Journal of General Internal Medicine 20.11 (2005): 989–995. [DOI] [PMC free article] [PubMed] [Google Scholar]
- DiMatteo M Robin. “Variations in Patients’ Adherence to Medical Recommendations: A Quantitative Review of 50 Years of Research.” Medical Care 42.3 (2004): 200–209. [DOI] [PubMed] [Google Scholar]
- Ding Alexander, Eisenberg Jonathan D, and Pandharipande Pari V. “The Economic Burden of Incidentally Detected Findings.” Radiologic Clinics of North America 49.2 (2011): 257–265. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Einav Liran, Finkelstein Amy, and Williams Heidi. “Paying on the Margin for Medical Care: Evidence from Breast Cancer Treatments.”, 2015. Mimeo, MIT. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Ellis Randall P and Manning Willard G. “Optimal Health Insurance for Prevention and Treatment.” Journal of Health Economics 26.6 (2007): 1128–1150. [DOI] [PubMed] [Google Scholar]
- Ellison Glenn. “Bounded Rationality in Industrial Organization.” Advances in Economics and Econometrics: Theory and Applications. eds. Blundell Richard, Newey Whitney, and Persson Torsten. Cambridge University Press, 2006. [Google Scholar]
- Feldman Roger and Dowd Bryan. “A New Estimate of the Welfare Loss of Excess Health Insurance.” The American Economic Review 81.1 (1991): 297–301. [PubMed] [Google Scholar]
- Feldstein Martin S. “The Welfare Loss of Excess Health Insurance.” The Journal of Political Economy 81.2 (1973): 251–280. [Google Scholar]
- Finkelstein Amy. Moral Hazard in Health Insurance. Columbia University Press, 2014. [Google Scholar]
- Frank Richard G. “Behavioral Economics and Health Economics.”, 2004. National Bureau of Economic Research Working Paper. [Google Scholar]
- Gann Carrie. “Man Dies From Toothache, Couldn’t Afford Meds.” ABC News. (2011). [Google Scholar]
- Gao Xin, Nau DP, Rosenbluth SA, Scott V, and Woodward C. “The Relationship of Disease Severity, Health Beliefs and Medication Adherence Among HIV Patients.” AIDS Care 12.4 (2000): 387–398. [DOI] [PubMed] [Google Scholar]
- Glazer Jacob and McGuire Thomas G. “A Welfare Measure of “Offset Effects” in Health Insurance.” Journal of Public Economics 96.5 (2012): 520–523. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goldman Dana and Philipson Tomas J. “Integrated Insurance Design in the Presence of Multiple Medical Technologies.” American Economic Review 97.2 (2007): 427–432. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Goldman Dana P, Joyce Geoffrey F, Escarce Jose J, Pace Jennifer E, Solomon Matthew D, Laouri Marianne, Landsman Pamela B, and Teutsch Steven M. “Pharmacy Benefits and the Use of Drugs by the Chronically Ill.” Journal of the American Medical Association 291.19 (2004): 2344–2350. [DOI] [PubMed] [Google Scholar]
- Goldman Dana P, Joyce Geoffrey F, Karaca-Mandic Pinar, et al. “Varying Pharmacy Benefits with Clinical Status: The Case of Cholesterol-Lowering Therapy.” American Journal of Managed Care 12.1 (2006): 21–28. [PubMed] [Google Scholar]
- Gonzales Ralph, Malone Daniel C, Maselli Judith H, and Sande Merle A. “Excessive Antibiotic Use for Acute Respiratory Infections in the United States.” Clinical Infectious Diseases 33.6 (2001): 757–762. [DOI] [PubMed] [Google Scholar]
- Gruber Jonathan and Koszegi Botond. “Is Addiction “Rational”? Theory and Evidence.” The Quarterly Journal of Economics 116.4 (2001): 1261–1303. [Google Scholar]
- Handel Benjamin R and Kolstad Jonathan T. “Health Insurance for “Humans”: Information Frictions, Plan Choice, and Consumer Welfare.” American Economic Review. (Forthcoming). [DOI] [PubMed] [Google Scholar]
- Hsu John, Price Mary, Huang Jie, Brand Richard, Fung Vicki, Hui Rita, Fireman Bruce, Newhouse Joseph P, and Selby Joseph V. “Unintended Consequences of Caps on Medicare Drug Benefits.” New England Journal of Medicine 354.22 (2006): 2349–2359. [DOI] [PubMed] [Google Scholar]
- Jarvik Jeffrey G, Hollingworth William, Martin Brook, Emerson Scott S, Gray Darryl T, Overman Steven, Robinson David, Staiger Thomas, Wessbecher Frank, Sullivan Sean D, et al. “Rapid Magnetic Resonance Imaging vs Radiographs for Patients with Low Back Pain: A Randomized Controlled Trial.” Journal of the American Medical Association 289.21 (2003): 2810–2818. [DOI] [PubMed] [Google Scholar]
- Jeleva Meglena and Villeneuve Bertrand. “Insurance Contracts with Imprecise Probabilities and Adverse Selection.” Economic Theory 23.4 (2004): 777–794. [Google Scholar]
- Johnson Richard E, Goodman Michael J, Hornbrook Mark C, and Eldredge Michael B. “The Impact of Increasing Patient Prescription Drug Cost Sharing on Therapeutic Classes of Drugs Received and on the Health Status of Elderly HMO Members.” Health Services Research 32.1 (1997): 103–122. [PMC free article] [PubMed] [Google Scholar]
- Kahneman Daniel, Wakker Peter P, and Sarin Rakesh. “Back to Bentham? Explorations of Experienced Utility.” The Quarterly Journal of Economics 112.2 (1997): 375–405. [Google Scholar]
- Koszegi Botond. “Health Anxiety and Patient Behavior.” Journal of Health Economics 22.6 (2003): 1073–1084. [DOI] [PubMed] [Google Scholar]
- Koszegi Botond and Szeidl Adam. “A Model of Focusing in Economic Choice.” The Quarterly Journal of Economics 128.1 (2012): 53–104. [Google Scholar]
- Laibson David. “Golden Eggs and Hyperbolic Discounting.” The Quarterly Journal of Economics 112.2 (1997): 443–477. [Google Scholar]
- Lakdawalla Darius and Sood Neeraj. “Innovation and the Welfare Effects of Public Drug Insurance.” Journal of Public Economics 93.3 (2009): 541–548. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lee Joy L, Maciejewski Matthew L, Raju Shveta S, Shrank William H, and Choudhry Niteesh K. “Value-Based Insurance Design: Quality Improvement but no Cost Savings.” Health Affairs 32.7 (2013): 1251–1257. [DOI] [PubMed] [Google Scholar]
- Liebman Jeffrey and Zeckhauser Richard. “Simple Humans, Complex Insurance, Subtle Subsidies.”, 2008. Mimeo, Harvard University. [Google Scholar]
- Linder Jeffrey A, Doctor Jason N, Friedberg Mark W, Nieva Harry Reyes, Birks Caroline, Meeker Daniella, and Fox Craig R. “Time of Day and the Decision to Prescribe Antibiotics.” JAMA Internal Medicine 174.12 (2014): 2029–2031. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Lohr Kathleen N, Brook Robert H, Kamberg Caren J, Goldberg George A, Leibowitz Arleen, Keesey Joan, Reboussin David, and Newhouse Joseph P. “Use of Medical Care in the RAND Health Insurance Experiment: Diagnosis- and Service-Specific Analyses in a Randomized Controlled Trial.” Medical Care 24.9 (1986): S1–S87. [PubMed] [Google Scholar]
- Long Judith A, Jahnle Erica C, Richardson Diane M, Loewenstein George, and Volpp Kevin G. “Peer Mentoring and Financial Incentives to Improve Glucose Control in African American Veterans: A Randomized Trial.” Annals of Internal Medicine 156.6 (2012): 416–424. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Manning Willard G, Newhouse Joseph P, Duan Naihua, Keeler Emmett B, and Leibowitz Arleen. “Health Insurance and the Demand for Medical Care: Evidence from a Randomized Experiment.” The American Economic Review 77.3 (1987): 251–277. [PubMed] [Google Scholar]
- McGlynn Elizabeth A, Asch Steven M, Adams John, Keesey Joan, Hicks Jennifer, DeCristofaro Alison, and Kerr Eve A. “The Quality of Health Care Delivered to Adults in the United States.” New England Journal of Medicine 348.26 (2003): 2635–2645. [DOI] [PubMed] [Google Scholar]
- Morden Nancy E, Colla Carrie H, Sequist Thomas D, and Rosenthal Meredith B. “Choosing Wisely-The Politics and Economics of Labeling Low-Value Services.” New England Journal of Medicine 370.7 (2014): 589–592. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Mullainathan Sendhil, Schwartzstein Joshua, and Congdon William J. “A Reduced-Form Approach to Behavioral Public Finance.” Annual Review of Economics 4. (2012): 511–540. [Google Scholar]
- Ness Reid M, Holmes Ann M, Klein Robert, and Dittus Robert. “Cost-Utility of One-Time Colonoscopic Screening for Colorectal Cancer at Various Ages.” The American Journal of Gastroenterology 95.7 (2000): 1800–1811. [DOI] [PubMed] [Google Scholar]
- Newhouse Joseph P. Free for All?: Lessons from the RAND Health Insurance Experiment. Harvard University Press, 1993. [Google Scholar]
- Newhouse Joseph P. “Reconsidering the Moral Hazard-Risk Avoidance Tradeoff.” Journal of Health Economics 25.5 (2006): 1005–1014. [DOI] [PubMed] [Google Scholar]
- Nyman John A. “The Value of Health Insurance: The Access Motive.” Journal of Health Economics 18.2 (1999): 141–152. [DOI] [PubMed] [Google Scholar]
- O’Donoghue Ted and Rabin Matthew. “Doing it Now or Later.” American Economic Review 89.1 (1999): 103–124. [Google Scholar]
- ———. “Optimal Sin Taxes.” Journal of Public Economics 90.10 (2006): 1825–1849. [Google Scholar]
- Osterberg Lars and Blaschke Terrence. “Adherence to Medication.” New England Journal of Medicine 353.5 (2005): 487–497. [DOI] [PubMed] [Google Scholar]
- Parker-Pope Tara. “Keeping Score on How You Take Your Medicine.” The New York Times. (2011). [Google Scholar]
- Pauly Mark V. “The Economics of Moral Hazard: Comment.” The American Economic Review 58.3 (1968): 531–537. [Google Scholar]
- Pauly Mark V and Blavin Fredric E. “Moral Hazard in Insurance, Value-Based Cost Sharing, and the Benefits of Blissful Ignorance.” Journal of Health Economics 27.6 (2008): 1407–1417. [DOI] [PubMed] [Google Scholar]
- Roehrig Charles, Miller George, Lake Craig, and Bryant Jenny. “National Health Spending by Medical Condition, 1996–2005.” Health Affairs 28.2 (2009): w358–w367. [DOI] [PubMed] [Google Scholar]
- Rosen Allison B, Hamel Mary Beth, Weinstein Milton C, Cutler David M, Fendrick A Mark, and Vijan Sandeep. “Cost-Effectiveness of Full Medicare Coverage of Angiotensin-Converting Enzyme Inhibitors for Beneficiaries with Diabetes.” Annals of Internal Medicine 143.2 (2005): 89–99. [DOI] [PubMed] [Google Scholar]
- Rubin Richard R. “Adherence to Pharmacologic Therapy in Patients with Type 2 Diabetes Mellitus.” The American Journal of Medicine 118.5 (2005): 27–34. [DOI] [PubMed] [Google Scholar]
- Sandroni Alvaro and Squintani Francesco. “Overconfidence, Insurance, and Paternalism.” The American Economic Review 97.5 (2007): 1994–2004. [Google Scholar]
- ———. “Overconfidence and Asymmetric Information: The Case of Insurance.” Journal of Economic Behavior & Organization 93. (2013): 149–165. [Google Scholar]
- Schedlbauer Angela, Davies Philippa, and Fahey Tom. “Interventions to Improve Adherence to Lipid Lowering Medication.” Cochrane Heart Group. (2010). [DOI] [PubMed] [Google Scholar]
- Schroeder Knut, Fahey Tom, and Ebrahim Shah. “How Can We Improve Adherence to Blood Pressure–Lowering Medication in Ambulatory Care?: Systematic Review of Randomized Controlled Trials.” Archives of Internal Medicine 164.7 (2004): 722–732. [DOI] [PubMed] [Google Scholar]
- Schwartz Aaron L, Landon Bruce E, Elshaug Adam G, Chernew Michael E, and McWilliams J Michael. “Measuring Low-Value Care in Medicare.” JAMA Internal Medicine 174.7 (2014): 1067–1076. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Selby Joe V, Fireman Bruce H, and Swain Bix E. “Effect of a Copayment on Use of the Emergency Department in a Health Maintenance Organization.” New England Journal of Medicine 334.10 (1996): 635–642. [DOI] [PubMed] [Google Scholar]
- Sipkoff Martin. “Improved Adherence Highlights Specialty Pharmacy’s Potential.” Managed Care 18.10 (2009): 17–18. [PubMed] [Google Scholar]
- Spiegler Ran. Bounded Rationality and Industrial Organization. Oxford University Press, 2011. [Google Scholar]
- Spinnewijn Johannes. “Insurance and Perceptions: How to Screen Optimists and Pessimists.” The Economic Journal 123.569 (2013): 606–633. [Google Scholar]
- ———. “Heterogeneity, Demand for Insurance and Adverse Selection.”, 2014. Mimeo, LSE. [Google Scholar]
- Spiro David M, Tay Khoon-Yen, Arnold Donald H, Dziura James D, Baker Mark D, and Shapiro Eugene D. “Wait-and-See Prescription for the Treatment of Acute Otitis Media: A Randomized Controlled Trial.” Journal of the American Medical Association 296.10 (2006): 1235–1241. [DOI] [PubMed] [Google Scholar]
- Strandbygaard Ulla, Thomsen Simon Francis, and Backer Vibeke. “A Daily SMS Reminder Increases Adherence to Asthma Treatment: A Three-Month Follow-up Study.” Respiratory Medicine 104.2 (2010): 166–171. [DOI] [PubMed] [Google Scholar]
- Tamblyn Robyn, Laprise Rejean, Hanley James A, Abrahamowicz Michael, Scott Susan, Mayo Nancy, Hurley Jerry, Grad Roland, Latimer Eric, Perreault Robert, et al. “Adverse Events Associated with Prescription Drug Cost-Sharing Among Poor and Elderly Persons.” Journal of the American Medical Association 285.4 (2001): 421–429. [DOI] [PubMed] [Google Scholar]
- Taubman Sarah L, Allen Heidi L, Wright Bill J, Baicker Katherine, and Finkelstein Amy N. “Medicaid Increases Emergency-Department Use: Evidence from Oregon’s Health Insurance Experiment.” Science 343.6168 (2014): 263–268. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Thaler Richard H and Sunstein Cass R. Nudge: Improving Decisions About Health, Wealth, and Happiness. Penguin, 2008. [Google Scholar]
- van Dulmen Sandra, Sluijs Emmy, van Dijk Liset, de Ridder Denise, Heerdink Rob, and Bensing Jozien. “Patient Adherence to Medical Treatment: A Review of Reviews.” BMC Health Services Research 7.1 (2007): 55. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Volpp Kevin G, Pauly Mark V, Loewenstein George, and Bangsberg David. “P4P4P: An Agenda for Research on Pay-for-Performance for Patients.” Health Affairs 28.1 (2009): 206–214. [DOI] [PMC free article] [PubMed] [Google Scholar]
- Waljee Jennifer F, Rogers Mary AM, and Alderman Amy K. “Decision Aids and Breast Cancer: Do They Influence Choice for Surgery and Knowledge of Treatment Options?” Journal of Clinical Oncology 25.9 (2007): 1067–1073. [DOI] [PubMed] [Google Scholar]
- Zeckhauser Richard. “Medical Insurance: A Case Study of the Tradeoff Between Risk Spreading and Appropriate Incentives.” Journal of Economic Theory 2.1 (1970): 10–26. [Google Scholar]