When Should There Be Vertical Choice in Health Insurance Markets?

Victoria R Marone; Adrienne Sabety

doi:10.1257/aer.20201073

Author manuscript; available in PMC: 2022 Jan 21.

Published in final edited form as: Am Econ Rev. 2022 Jan;112(1):304–342. doi: 10.1257/aer.20201073

PMCID: PMC8782442 NIHMSID: NIHMS1768818 PMID: 35068489

Abstract

We study the welfare effects of offering choice over coverage levels—“vertical choice”—in regulated health insurance markets. We emphasize that heterogeneity in efficient coverage level is not sufficient to motivate choice. When premiums cannot reflect individuals’ costs, it may not be in consumers’ best interest to select their efficient coverage level. We show that vertical choice is efficient only if consumers with higher willingness-to-pay have a higher efficient level of coverage. We investigate this condition empirically and find that as long as a minimum coverage level can be enforced, the welfare gains from vertical choice are either zero or economically small.

Keywords: risk protection, moral hazard, health insurance, D82, G22, I13

I. Introduction

Choice over vertically differentiated financial coverage levels—which we term “vertical choice”—is widely available in U.S. health insurance markets. A notable example is the metal-tiered plans (e.g., Gold, Silver, Bronze) offered on Affordable Care Act exchanges. In contrast, national health insurance schemes often offer only a single level of coverage. For example, Britons are automatically enrolled in the level of coverage provided by the National Health Service, without a choice. In both contexts, regulation plays a central role in determining the extent of vertical choice. But to date, the economics literature has provided limited guidance to regulators on this topic. This paper aims to fill that gap.

The basic argument in favor of vertical choice is the standard argument in favor of product variety: with more options, consumers can more closely match with their socially efficient product by revealed preference (Dixit and Stiglitz, 1977). This argument, however, relies critically on the condition that privately optimal choices align with socially optimal choices. In competitive markets in which costs are independent of consumers’ private valuations, this alignment is standard. But in markets with selection, like health insurance markets, this alignment may not be possible. In these markets, costs are inextricably related to private valuations, and asymmetric information (or regulation) prevents prices from reflecting marginal costs (Akerlof, 1970; Rothschild and Stiglitz, 1976). We show that even if health insurance markets are competitive, regulated, and populated by informed consumers, whether vertical choice can increase welfare is theoretically ambiguous.

Our welfare metric derives from a seminal literature on optimal insurance, which holds that the efficient level of coverage equates the marginal benefit of risk protection and the marginal social cost of utilization induced by insurance (Arrow, 1965; Pauly, 1968, 1974; Zeckhauser, 1970). We focus attention on how this central tradeoff between the “value of risk protection” and the “social cost of moral hazard” plays out on a consumer-by-consumer basis. The aim is to design a plan menu that reflects these social incentives by inducing consumers to select their efficient level of coverage. But in doing so, the designer must contend with private incentives: namely that consumers with higher willingness to pay for insurance will select higher coverage. The key challenge is that consumers with higher willingness to pay are not necessarily the consumers with a higher efficient coverage level. It is precisely this statement that captures the theoretical ambiguity of whether it is optimal to offer a vertical choice.

We consider the menu design problem facing a market regulator that can offer vertically differentiated plans and can set premiums.¹ The regulator’s objective is to maximize allocative efficiency with respect to consumers and plans. As is standard in employer-sponsored insurance and national health insurance schemes, the regulator need not break even plan by plan, nor in aggregate. If more than one plan is demanded from the regulator’s chosen menu, we say it has offered vertical choice. Extending the widely used graphical framework of Einav, Finkelstein and Cullen (2010), we show that the key condition determining whether the optimal menu features vertical choice is whether consumers with higher willingness to pay have a higher efficient level of coverage. The principal empirical focus of this paper is to determine whether this is likely to be true.

We begin by presenting a model of consumer demand for health insurance, building on Cardon and Hendel (2001) and Einav et al. (2013). The model has two stages. In the first, consumers make a discrete choice over plans under uncertainty about their health. In the second, upon realizing their health, consumers make a continuous choice of healthcare utilization. We use the model to show that willingness to pay for insurance can be partitioned into two parts: one that is both privately and socially relevant (the value of risk protection), and one that is only privately relevant (the expected reduction in out-of-pocket spending). Because a portion of the private value of insurance is a transfer, higher willingness to pay does not necessarily imply higher social surplus. For example, allocating higher coverage to a sick but risk-neutral consumer delivers her a private benefit, but generates no social benefit; more of her expected healthcare spending is simply shifted to others. If she consumes more healthcare than she values in response to higher coverage, the regulator would prefer she had lower coverage.

We estimate the model using data from the population of public school employees in Oregon. The data contain health insurance plan menus, plan choices, and the subsequent healthcare utilization of nearly 45,000 households over the period 2008 to 2013. Crucially for identification, we observe plausibly exogenous variation in the plan menus offered to employees. The variation is driven by the fact that plan menus are set independently by each of 187 school districts in the state, which in turn select plans from a common superset determined at the state level. In addition, we observe several coverage levels offered by the same insurer with the same provider network, providing isolated variation along our focal dimension. Our model incorporates observed and unobserved heterogeneity across households along the key dimensions of health status, propensity for moral hazard, and risk aversion. We use the model to recover the joint distribution of household types in this population.

Modeling the structural primitives that underlie demand and costs in the market allows us to evaluate these objects for any level of coverage a regulator might wish to consider. The key advantage of this strategy is that we can look beyond the set of contracts observed in our particular setting (Einav, Finkelstein and Cullen, 2010). This flexibility also raises the distinct design question of how “closely spaced” to permit contracts to be. Should the regulator permit choice over contracts differentiated by only $100 in deductible? We do not directly model the potential costs of this type of choice (such as fixed costs of offering contracts), but we do evaluate the potential benefits. As discussed below, in practice we find that even under ideal conditions, the returns to allowing very closely-spaced contracts are economically small.

Our estimates imply that all households have a fairly high efficient level of coverage, ranging between a high-deductible contract (with a $10,000 deductible and full coverage thereafter) and full insurance. Contracts outside this range can be ruled out from the optimal menu (as they deliver lower social surplus for every household). Within this range, we find that households with higher willingness to pay are primarily motivated by a greater expected reduction in out-of-pocket spending, rather than by a greater value of risk protection. These households are highly likely to spend past $10,000, and therefore face little out-of-pocket cost uncertainty under any contract in the relevant range of coverage levels. Although there are competing factors, this negative relationship between willingness to pay and “relevant risk” quantitatively dominates. As a result, we find at best only a weak relationship between willingness to pay and efficient coverage level.

We solve for the optimal menu under a baseline requirement that contracts be no closer than $2,500 out-of-pocket maximum intervals. We find that the optimal menu consists of a single contract. Introducing any other contract, at any price, leads to over- or under-insurance (on average) among households that would choose the alternative. We then increase the permissible density of contracts by a factor of 10 (to $250 out-of-pocket maximum intervals). Here, we find that it is efficient to offer a vertical choice: the optimal menu features four contracts, clustered around the original optimal single contract. However, because social surplus is quite flat across coverage levels near the optimum, the welfare gains are small. Offering a choice increases welfare by only $5 per household per year relative to what is achieved by a single contract.

It is important to emphasize that these results may not reflect the value of vertical choice in all health insurance markets. Indeed, robustness analysis in Section V reveals that doubling average risk aversion results in the optimal menu featuring vertical choice, even in the “sparse” contract space. When risk protection accounts for a larger part of the variation in willingness to pay for higher coverage, the revelation of private information (through choice) becomes more valuable. On the other hand, there are some reasons to think that our findings may be reflected in other settings. The negative relationship between willingness to pay and “relevant risk” is a central driver of our results, and this relationship follows from a pair of factors: (i) variation in willingness to pay is primarily driven by consumers’ information about their upcoming health needs, and (ii) the lowest relevant level of coverage is reasonably high. The first factor implies that the highest willingness-to-pay consumers are the sickest, and the second implies that these consumers would face little out-of-pocket cost uncertainty even in the lowest relevant coverage level. These factors seem particularly plausible in settings in which contracts span a short time, and in which exposure to substantial out-of-pocket spending risk is not efficient for anyone.

In the last part of the paper (Section VI), we evaluate the welfare and distributional implications of the optimal plan menu relative to a status quo with vertical choice. Relative to an outcome with vertical choice, the optimal menu (the single contract) increases welfare by $315 per household per year. But these gains are not shared evenly in the population. Sicker and larger households fare best under the single contract, while healthier and smaller households fare best under vertical choice. Our results suggest that one reason for the persistence of vertical choice in settings such as employer-sponsored insurance could be to limit redistribution across these groups.

Beyond the work noted above, our theoretical approach is most closely related to Azevedo and Gottlieb (2017), who also model demand for health insurance in a setting with vertically differentiated contracts and multiple dimensions of consumer heterogeneity. While their focus is on competitive equilibria, their numerical simulations also consider optimal pricing. They document that under certain distributions of consumer types, offering choice is optimal, while under others it is not. Our paper focuses directly on why this is the case, and brings to bear an empirical approach that permits substantially more flexibility in the distribution of consumer types.

Our paper also closely relates to work that evaluates allocational efficiency in health insurance markets (Cutler and Reber, 1998; Lustig, 2008; Carlin and Town, 2008; Dafny, Ho and Varela, 2013; Kowalski, 2015; Tilipman, 2018), and more specifically to the growing literature on menu design in these markets.² In the context of insurer choice, Bundorf, Levin and Mahoney (2012) investigate the optimal allocation of consumers to insurers, and find that it cannot be achieved by uniform pricing. Our paper is similar in spirit (and in findings), but focuses instead on the financial dimension of insurance. In this context, Ericson and Sydnor (2017) also consider the question of whether choice is welfare-improving. A key difference of our work is that we consider a setting in which contract characteristics are endogenous and premiums are exogenous, as opposed to the reverse. In a similar spirit, Ho and Lee (2021) study optimal menu design from the perspective of an employer. Like us, they find that the gains from offering a choice over coverage levels are small. Our contribution relative to these papers is to provide a conceptual characterization of when choice over financial coverage levels is and is not valuable. We view this characterization as a tool that can be used to reexamine the design of existing health insurance markets through a new lens. Our empirical analysis demonstrates the relevance of the prediction that vertical choice may not be valuable, and links it to the distribution of fundamentals (risk aversion, propensity for moral hazard, and distributions of health outcomes) in a population.

Finally, we view our work as complementary to the large literature documenting the fact that consumers have difficulty optimizing over health insurance products (Abaluck and Gruber, 2011, 2016; Ketcham et al., 2012; Handel and Kolstad, 2015; Bhargava, Loewenstein and Sydnor, 2017), which has recently also focused on ways in which consumers can be nudged into doing so (Abaluck and Gruber, 2016, 2017; Gruber et al., 2019; Bundorf et al., 2019; Samek and Sydnor, 2020). Importantly, if privately and socially optimal allocations do not align, more diligent consumers may just as well lead to less desirable outcomes (as is found by Handel, 2013). A central aim of the present paper is to inform the design of health insurance markets in such a way that better-informed consumers always lead to better allocations.

The paper proceeds as follows. Section II presents our theoretical model and derives the objects relevant to describe private and social incentives. Section III describes our data and the variation it provides. Section IV presents the empirical implementation of our model. Section V presents the model estimates and main results. Section VI evaluates welfare and distributional outcomes. Section VII concludes.

II. Theoretical Framework

II.A. Model

We consider a model of a health insurance market in which consumers are heterogeneous along multiple dimensions and the set of traded contracts is endogenous. We assume that premiums may not vary with consumer characteristics, that claims may be contingent only on healthcare utilization, and that each consumer will select a single contract.³

We denote a set of potential contracts by X = {x₀, x₁, …, x_n}, where x₀ is a null contract that provides no insurance. Within X, contracts are vertically differentiated by the financial level of coverage provided. Consumers are characterized by type θ = {F, ψ, ω}, where F is a distribution over potential health states, $ψ \in ℝ_{++}$ is a risk aversion parameter, and ω is a parameter that governs consumer preferences for healthcare utilization (and ultimately captures the degree of moral hazard). A population is defined by a distribution G(θ).

Demand for Health Insurance and Healthcare Utilization.

Consumers are subject to a stochastic health state l, drawn from their distribution F. Given their health state, consumers decide the money amount $m \in ℝ_{+}$ of healthcare utilization (“spending”) to consume, a decision which in part depends on their insurance contract. Contracts are characterized by an increasing and concave out-of-pocket cost schedule $c_{x} : ℝ_{+} \to ℝ_{+}$ , where c_x(m) ≤ m ∀ m.
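To make the cost schedule concrete, the following Python sketch implements one common shape of c_x: a deductible, a coinsurance arm, and an out-of-pocket maximum. The three-arm design, the function name, and all parameter values are illustrative assumptions rather than the contracts studied in the paper. The schedule is increasing and concave whenever the coinsurance rate lies in [0, 1] and the out-of-pocket maximum is at least the deductible, and it never exceeds total spending.

```python
def out_of_pocket(m, deductible, coinsurance, oop_max):
    """Out-of-pocket cost c_x(m) for total spending m under an assumed
    three-arm contract: full price below the deductible, a coinsurance
    share above it, and nothing beyond the out-of-pocket maximum."""
    if m < 0:
        raise ValueError("spending must be non-negative")
    if not 0.0 <= coinsurance <= 1.0 or oop_max < deductible:
        raise ValueError("parameters would break monotonicity or concavity")
    below = min(m, deductible)                       # marginal price of 1
    above = coinsurance * max(m - deductible, 0.0)   # marginal price = coinsurance
    return min(below + above, oop_max)               # marginal price of 0 past the cap


# Hypothetical contract: $1,000 deductible, 20% coinsurance, $3,000 cap.
for spending in (500, 2_000, 50_000):
    print(spending, out_of_pocket(spending, 1_000, 0.2, 3_000))
```

The marginal price falls from 1 to the coinsurance rate to 0 as spending rises, which is what makes the schedule concave.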

Consumers value healthcare spending m and residual income y. Preferences are represented by u_ψ(y + b(m; l, ω)), where b is a money-metric valuation of healthcare utilization, and u_ψ and b(·; l, ω) are each strictly increasing and concave. Upon realizing their health state, consumers choose their healthcare utilization by trading off its benefit with its out-of-pocket cost: m*(l, ω, x) = argmax_m (b(m; l, ω) − c_x(m)). Privately optimal utilization implies indirect benefit b*(l, ω, x) = b(m*(l, ω, x); l, ω) and indirect out-of-pocket cost $c_{x}^{*}(l, ω, x) = c_{x}(m^{*}(l, ω, x))$. Before the health state is realized, expected utility is given by

$$U(x, p, θ) = E\left[u_{ψ}\left(\hat{y} - p - c_{x}^{*}(l, ω, x) + b^{*}(l, ω, x)\right) \mid l \sim F\right], \tag{1}$$

where p is the contract premium and $\hat{y}$ is initial income.

Private vs. Social Incentives.

Absent insurance, consumers pay the full cost of healthcare utilization, m. Socially optimal healthcare utilization therefore coincides with privately optimal utilization absent insurance.⁴ The difference between privately optimal spending m*(l, ω, x) and socially optimal spending m*(l, ω, x₀) determines the social cost of insurance. Since insurance reduces the price consumers pay for healthcare, m*(l, ω, x) typically exceeds m*(l, ω, x₀). We refer to this induced utilization as “moral hazard spending.”⁵ A consumer’s net payoff from moral hazard spending is given by

$$v(l, ω, x) = \underbrace{b^{*}(l, ω, x) - b^{*}(l, ω, x_{0})}_{\text{Benefit of moral hazard spending}} - \underbrace{\left(c_{x}^{*}(l, ω, x) - c_{x}^{*}(l, ω, x_{0})\right)}_{\text{Out-of-pocket cost of moral hazard spending}},$$

where b*(l, ω, x₀) is the indirect benefit of uninsured behavior, and $c_{x}^{*}(l, ω, x_{0})$ is the out-of-pocket cost of uninsured behavior at insured prices. Note that since any change in behavior is voluntary, v(l, ω, x) is weakly positive.
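As a worked illustration of why v is weakly positive, suppose (purely as an assumption for this example, in the spirit of the quadratic specification in Einav et al. (2013)) that $b(m; l, ω) = (m - l) - \frac{1}{2ω}(m - l)^{2}$ and that contract x charges a constant out-of-pocket price $c \in [0, 1]$ per dollar of spending, so that the null contract corresponds to c = 1. This quadratic form is increasing only up to m = l + ω, which covers every optimal choice below. The first-order condition $1 - (m - l)/ω = c$ then gives:

```latex
\[
\begin{aligned}
m^{*}(l, \omega, x) &= l + \omega(1 - c), \qquad m^{*}(l, \omega, x_{0}) = l, \\
b^{*}(l, \omega, x) - b^{*}(l, \omega, x_{0}) &= \omega(1 - c) - \tfrac{1}{2}\,\omega(1 - c)^{2}, \\
c_{x}^{*}(l, \omega, x) - c_{x}^{*}(l, \omega, x_{0}) &= c\,\omega(1 - c), \\
v(l, \omega, x) &= \tfrac{1}{2}\,\omega(1 - c)^{2} \;\ge\; 0 .
\end{aligned}
\]
```

Under these assumptions moral hazard spending equals ω(1 − c), and the net payoff from it is zero only at the null contract (c = 1).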

Calculations in Appendix A.1 show that if u_ψ features constant absolute risk aversion, willingness to pay for contract x relative to the null contract x₀ can be expressed as⁶

$$\mathrm{WTP}(x, θ) = \underbrace{E_{l}\left[c_{x_{0}}^{*}(l, ω, x_{0}) - c_{x}^{*}(l, ω, x_{0})\right]}_{\text{Expected reduction in out-of-pocket cost holding behavior fixed}} + \underbrace{E_{l}\left[v(l, ω, x)\right]}_{\text{Expected payoff from moral hazard spending}} + \underbrace{Ψ(x, θ)}_{\text{Value of risk protection}}. \tag{2}$$

Willingness to pay is composed of three terms: the expected reduction in out-of-pocket cost holding behavior fixed (at uninsured behavior), the expected payoff from moral hazard spending, and the value of risk protection. The first term captures the transfer from the consumer to the insurer of the expected healthcare spending liability that exists even absent moral hazard. It will be an equal and opposite cost to the insurer. The second and third terms, in contrast, depend on consumers’ preferences and are relevant to social welfare. Consumers partially value the additional healthcare they consume when they have higher coverage, as well as the ability to smooth consumption across health states.

Insurer costs are given by k_x(m), where m = k_x(m) + c_x(m). A reduction in out-of-pocket cost is an increase in insurer cost, so $c_{x_{0}}^{*}(l, ω, x_{0}) - c_{x}^{*}(l, ω, x_{0}) = k_{x}^{*}(l, ω, x_{0})$. The social surplus generated by allocating a consumer of type θ to contract x (relative to allocating the same consumer to the null contract) is the difference between WTP(x, θ) and expected insured cost $E_{l}[k_{x}^{*}(l, ω, x)]$, which after simplifying is:

$$\mathrm{SS}(x, θ) = \underbrace{Ψ(x, θ)}_{\text{Value of risk protection}} - \underbrace{E_{l}\left[k_{x}^{*}(l, ω, x) - k_{x}^{*}(l, ω, x_{0}) - v(l, ω, x)\right]}_{\text{Social cost of moral hazard}}. \tag{3}$$

Because the insurer is risk neutral, it bears no extra cost from uncertain payoffs. If there is moral hazard, the consumer’s value of her expected healthcare spending falls below its cost, generating a welfare loss from insurance. The welfare loss equals the portion of the expected increase in healthcare spending that is not valued.
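Equations (2) and (3) can be checked numerically in a stylized case. The sketch below assumes CARA utility, a quadratic benefit function (an assumption in the style of Einav et al. (2013), not a specification stated in this section), a linear contract with out-of-pocket price c per dollar of spending, and a hypothetical three-point health distribution. It measures the value of risk protection as the reduction in the CARA risk premium, which is one reading of Ψ consistent with equation (2). All function names and numbers are illustrative.

```python
import math

def m_star(l, omega, c):
    """Privately optimal spending under the assumed quadratic benefit
    and a constant out-of-pocket price c (c = 1 is the null contract)."""
    return l + omega * (1.0 - c)

def benefit(m, l, omega):
    """Assumed money-metric benefit b(m; l, omega)."""
    return (m - l) - (m - l) ** 2 / (2.0 * omega)

def mean(values, probs):
    return sum(p * v for v, p in zip(values, probs))

def cara_ce(payoffs, probs, psi):
    """Certainty equivalent under CARA utility u(z) = -exp(-psi * z)."""
    return -math.log(mean([math.exp(-psi * z) for z in payoffs], probs)) / psi

def decompose(states, probs, psi, omega, c):
    """Terms of eq. (2) and both routes to social surplus in eq. (3)."""
    m_ins = [m_star(l, omega, c) for l in states]    # insured behavior
    m_un = [m_star(l, omega, 1.0) for l in states]   # uninsured behavior
    # Net payoff from moral hazard spending, state by state.
    v = [(benefit(mi, l, omega) - benefit(mu, l, omega)) - (c * mi - c * mu)
         for mi, mu, l in zip(m_ins, m_un, states)]
    transfer = mean([mu - c * mu for mu in m_un], probs)
    mh_payoff = mean(v, probs)
    # State payoffs net of premium; income is normalized to zero (CARA).
    z_ins = [benefit(mi, l, omega) - c * mi for mi, l in zip(m_ins, states)]
    z_un = [benefit(mu, l, omega) - mu for mu, l in zip(m_un, states)]
    wtp = cara_ce(z_ins, probs, psi) - cara_ce(z_un, probs, psi)
    # Risk protection: fall in the risk premium (mean minus certainty equivalent).
    rp_ins = mean(z_ins, probs) - cara_ce(z_ins, probs, psi)
    rp_un = mean(z_un, probs) - cara_ce(z_un, probs, psi)
    risk_protection = rp_un - rp_ins
    insurer_cost = mean([(1.0 - c) * mi for mi in m_ins], probs)
    social_cost_mh = mean([(1.0 - c) * (mi - mu) - vi
                           for mi, mu, vi in zip(m_ins, m_un, v)], probs)
    return {
        "transfer": transfer,
        "moral_hazard_payoff": mh_payoff,
        "risk_protection": risk_protection,
        "wtp": wtp,
        "ss_direct": wtp - insurer_cost,              # WTP minus insurer cost
        "ss_eq3": risk_protection - social_cost_mh,   # right-hand side of eq. (3)
    }

# Hypothetical consumer: three health states (dollars) and their probabilities.
result = decompose([0.0, 2000.0, 10000.0], [0.6, 0.3, 0.1],
                   psi=5e-4, omega=1000.0, c=0.2)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this parameterization the transfer term enters willingness to pay but drops out of social surplus, so the two routes to SS agree up to floating-point error.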

The socially optimal contract for each type of consumer optimally trades off the value of risk protection and the social cost of moral hazard: $x^{\mathrm{eff}}(θ) = \operatorname{argmax}_{x \in X} \mathrm{SS}(x, θ)$. Given premium vector $p = \{p_{x}\}_{x \in X}$, the privately optimal contract optimally trades off private utility and premium: $x^{*}(θ, p) = \operatorname{argmax}_{x \in X} (\mathrm{WTP}(x, θ) - p_{x})$.

Supply and Regulation.

We suppose contracts are supplied by a regulator, which can observe the distribution of consumer types and can set premiums on all contracts except x₀, which has zero premium. The regulator need not break even on any given contract, nor in aggregate. It can effectively remove any non-null contract from the set of contracts on offer by setting a premium of infinity. It can effectively remove x₀ from offer by setting the premium of any non-null contract to zero. This simple model of supply is isomorphic to a more complicated model involving perfect competition among private insurers and a regulator that can strategically tax or subsidize contracts. Precisely such a model is formalized in Section 6 of Azevedo and Gottlieb (2017).

The regulator sets premiums p in order to maximize social welfare, given by

$$W(p) = \int \mathrm{SS}(x^{*}(θ, p), θ) \, dG(θ).$$

Our question is whether, or when, the regulator’s solution will involve vertical choice. That is, we ask whether the optimal feasible allocation features enrollment in more than one contract.

II.B. Graphical Analysis

We characterize the answer graphically for the case of a market with only two potential contracts. This case conveys the basic intuition and can be depicted easily using the graphical framework introduced by Einav, Finkelstein and Cullen (2010).

First, it is useful to recognize that moral hazard, risk aversion, and consumer heterogeneity are necessary conditions for vertical choice to be efficient. If there were not moral hazard, the highest coverage contract would be socially optimal for all consumers, and the optimal menu would involve only this contract. If there were not risk aversion, the same would be true with the lowest coverage contract. If there were not consumer heterogeneity, all consumers would again have the same socially optimal contract, and the optimal menu would again feature a single contract. In the following, we explore the more interesting (and more realistic) cases in which consumers do not all have the same socially optimal contract.

Two Contract Example.

Suppose there are two potential contracts, x_H and x_L, where x_H provides higher coverage than x_L. Figure 1 depicts the market for x_H in two populations. If a consumer does not choose x_H, they receive x_L; x₀ is excluded by setting p_L to zero. As x_H provides higher coverage, WTP(x_H, θ) ≥ WTP(x_L, θ) for all consumers. Each panel shows the demand curve D for contract x_H, representing marginal willingness to pay for x_H relative to x_L. The vertical axis plots the marginal premium p = p_H − p_L at which the contracts are offered. The horizontal axis plots the fraction q of consumers that choose x_H.

Each panel also shows the marginal cost curve MC and the marginal social surplus curve SS. The marginal cost curve measures the expected cost of insuring consumers under x_H relative to x_L: $E_{l}[k_{x_{H}}^{*}(l, ω, x_{H}) - k_{x_{L}}^{*}(l, ω, x_{L})]$. Because consumers with the same willingness to pay can have different costs, MC represents the average marginal cost among all consumers at a particular point on the horizontal axis (a particular level of marginal willingness to pay). The social surplus curve SS plots the vertical difference between D and MC, or equivalently, the average value of SS(x_H, θ) − SS(x_L, θ) among all consumers at a particular point on the horizontal axis.

Though vertical differentiation implies D and MC must be weakly positive, the presence of moral hazard means that SS need not be. It is possible for consumers to be over-insured. Moreover, our precondition that all consumers do not have the same socially optimal contract guarantees that in both populations, marginal social surplus will be positive for some consumers and negative for others.⁷ The key difference between populations G^A(θ) and G^B(θ) is whether high or low willingness-to-pay consumers have a higher efficient level of coverage. In population G^A(θ), marginal social surplus is increasing in marginal willingness to pay. The premium p* can therefore sort consumers with on-average positive SS into x_H, and on-average negative SS into x_L. In population G^B(θ), meanwhile, such a premium does not exist.

In population G^B(θ), any interior allocation results in some amount of “backward sorting,” meaning that there is a group of consumers enrolled in x_H who would be more efficiently enrolled in x_L, and vice versa. Consequently, any allocation with enrollment in both contracts is dominated by an allocation with enrollment in only one. No sorting dominates backward sorting because it is always possible to prevent “one side” of the backward sort. To see this, consider the allocation $\tilde{q}$ at the point where SS intersects zero. Any allocation to the right of $\tilde{q}$ strictly dominates, as more consumers with positive marginal social surplus now enroll in x_H. The same logic applies to the left of $\tilde{q}$ . The only allocations that cannot easily be ruled out as suboptimal are the endpoints, at which all consumers enroll in the same contract. In the example shown, the integral of SS is negative, meaning that the population would on average be over-insured in x_H. p* is therefore anything high enough to induce all consumers to choose x_L.
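The sorting logic in the two populations can be sketched with a small discrete example. The code below is a simplified illustration, not the paper's procedure. Each hypothetical consumer is a pair (marginal willingness to pay, marginal social surplus) for x_H relative to x_L, consumers choose x_H whenever their marginal willingness to pay is at least the marginal premium, and the regulator searches over premiums for the one that maximizes total surplus. All numbers are invented.

```python
def best_premium(types):
    """Search marginal premiums for the surplus-maximizing allocation.

    types: list of (marginal_wtp, marginal_ss) pairs for x_H relative to x_L.
    Returns (premium, share choosing x_H, total surplus relative to
    placing everyone in x_L).
    """
    # Setting the premium at any consumer's WTP, or at infinity (nobody
    # buys x_H), spans every allocation a single premium can induce.
    candidates = sorted({w for w, _ in types}) + [float("inf")]
    best = None
    for p in candidates:
        takers = [s for w, s in types if w >= p]   # consumers who choose x_H
        welfare = sum(takers)
        share = len(takers) / len(types)
        if best is None or welfare > best[2]:
            best = (p, share, welfare)
    return best


# Population A (hypothetical): marginal surplus rises with willingness to pay.
pop_a = [(100, -30), (200, -10), (300, 10), (400, 30)]
# Population B (hypothetical): marginal surplus falls with willingness to pay,
# and sums to a negative number, as in the example described in the text.
pop_b = [(100, 20), (200, 5), (300, -15), (400, -40)]

print("A:", best_premium(pop_a))
print("B:", best_premium(pop_b))
```

In the first population the best premium is interior and both contracts are chosen. In the second, every interior premium sorts consumers backward, so the search returns the corner in which nobody chooses x_H.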

Remarks.

The limitation of choice as a screening mechanism is directly related to the idea that a single (community-rated) price may not be able to efficiently sort consumers that vary in cost (Einav, Finkelstein and Levin, 2010; Glazer and McGuire, 2011; Bundorf, Levin and Mahoney, 2012; Shepard, 2016; Geruso, 2017). Consumers select a contract based on the available consumer surplus, CS = WTP − p, while efficiency relies on a comparison with costs, SS = WTP − MC. When CS and SS diverge (when p ≠ MC), the efficiency of choice turns on whether they are at least positively related. If they are not, choice can only result in some degree of “backward sorting.”

In the simple case of two contracts and a social surplus curve that crosses zero at most once, vertical choice is efficient if and only if it crosses from above. In a more general case with multiple potential contracts and arbitrary social surplus curves, this necessary and sufficient condition is still directly informative. If consumers all have the same socially optimal contract (or more plausibly, if the same contract is socially optimal at all levels of willingness to pay), there will be no crossing in the upper envelope of social surplus curves, and the optimal menu will feature this single contract. If instead there is crossing in the upper envelope of social surplus curves, one must assess whether the higher-coverage contracts cross from above, or in other words, whether or not choice would lead to backward sorting.

Taken together, the procedure for evaluating the efficiency of vertical choice can be summarized by a test for the condition of whether consumers with higher willingness to pay have a higher efficient coverage level, where we emphasize that higher, in both instances, is to be evaluated strictly. This condition itself is complex. It is both theoretically ambiguous and, by our own assessment, not obvious. If healthy consumers change their behavior more in response to insurance, as is suggested by findings in Brot-Goldberg et al. (2017), this would tend toward positively aligning willingness to pay and efficient coverage level. If healthy consumers are more risk averse, as is suggested by findings in Finkelstein and McGarry (2006), this would tend toward negatively aligning them.

There is a question of what characteristics drive variation in willingness to pay, and in turn how those characteristics determine the efficient level of coverage. The net result depends on the joint distribution of expected health spending, uncertainty in health spending, risk aversion, and moral hazard in the population. Moreover, it depends on how these primitives map into marginal willingness to pay and marginal insurer cost across nonlinear insurance contracts, as are common around the world and present in the empirical setting we study. Ultimately, whether consumers with higher private valuations of higher coverage also generate a higher social value from higher coverage is an open empirical question.

III. Empirical Setting

III.A. Data

Our data are derived from the employer-sponsored health insurance market for public school employees in Oregon between 2008 and 2013 (OEBB, 2018). The market is operated by the Oregon Educators Benefit Board (OEBB), which administers benefits for the employees of Oregon’s 187 school districts. Each year, OEBB contracts with insurers to create a state-level “master list” of plans and associated premiums that school districts can offer to their employees. During our time period, OEBB contracted with three insurers, each of which offered a selection of plans. School districts then independently select a subset of plans from the state-level menu and set an “employer contribution” toward plan premiums. Between 2008 and 2010, school districts could offer at most four plans; after 2010, there was no limit, but many still offered only a subset.

The data contain employees’ plan menus, realized plan choices, plan characteristics, and medical and pharmaceutical claims for all insured individuals. We observe detailed demographic information about employees and their families, including age, gender, zip code, health risk score, family type, and employee occupation type.⁸ An employee’s plan menu consists of a plan choice set and plan prices. Plan prices consist of the subsidized premium, potential contributions to a Health Reimbursement Arrangement (HRA) or a Health Savings Account (HSA), and potential contributions toward a vision or dental insurance plan.⁹

The decentralized determination of plan menus provides a plausibly exogenous source of variation in both prices and choice sets. While all plan menus we observe are quite generous, in that the plans are generally high-coverage and are highly subsidized, there is substantial variation across districts in the range of coverage levels offered and in the exact nature of the subsidies.¹⁰ Moreover, school districts can vary plan menus by family type and occupation type, resulting in variation both within and across districts. Plan menu decisions are made by benefits committees consisting of district administrators and employees, and subsidy designs are influenced by bargaining agreements with local teachers’ unions. Between 2008 and 2013, we observe 13,661 unique combinations of year, school district, family type, and occupation type, resulting in 7,835 unique plan menus.

Plan Characteristics.

During our sample period, OEBB contracted with three insurers: Kaiser, Moda, and Providence. Kaiser offered HMO plans that require enrollees to use only Kaiser healthcare providers and obtain referrals for specialist care. Moda and Providence offered PPO plans with broad provider networks. Each insurer used a single provider network and offered multiple plans. Within insurer, plans were differentiated only by financial coverage level.

Table 1 summarizes the state-level master list of plans made available by OEBB in 2009. The average employee premium represents the average annual premium employees would have had to pay for each plan. The full premium reflects the per-employee premium paid to the insurer. This premium varies formulaically by family type; the one shown is for an employee plus spouse. The difference between the employee premium and the full premium is the contribution by the school district. Plan cost-sharing features vary by whether the household is an individual (the employee alone) or a family (anything else). The deductible and out-of-pocket maximum shown are for a family household.¹¹

Table 1.

Plan Characteristics, 2009

Plan	Actuarial Value	Avg. Employee Premium ($)	Full Premium ($)	Deductible ($)	OOP Max. ($)	Market Share

Kaiser - 1	0.97	688	10,971	0	1,200	0.07
Kaiser - 2	0.96	554	10,485	0	2,000	0.11
Kaiser - 3	0.95	473	10,163	0	3,000	< 0.01
Moda - 1	0.92	1,594	12,421	300	500	0.27
Moda - 2	0.89	1,223	11,839	300	1,000	0.05
Moda - 3	0.88	809	11,174	600	1,000	0.11
Moda - 4	0.86	621	10,702	900	1,500	0.10
Moda - 5	0.82	428	9,912	1,500	2,000	0.13
Moda - 6	0.78	271	8,959	3,000	3,000	0.04
Moda - 7	0.68	92	6,841	3,000	10,000	0.01
Providence - 1	0.96	2,264	13,217	900	1,200	0.07
Providence - 2	0.95	1,995	12,895	900	2,000	0.02
Providence - 3	0.94	1,825	12,683	900	3,000	0.01


Notes: The table shows the state-level master list of plans available in 2009. Actuarial value is the ratio of the sum of insured spending across all households to the sum of total spending across all households. The average employee premium is taken across all employees, even those who did not choose a particular plan. The full premium reflects the premium negotiated by OEBB and the insurer; the one shown is for an employee plus spouse. The deductible and out-of-pocket maximum shown are for in-network services for a family household.

As a way to summarize and compare plan coverage levels, we construct each plan’s actuarial value. This measure reflects the share of total population spending that would be insured under a given plan.¹² Full insurance would have an actuarial value of one; less generous plans have lower actuarial values. In later years, the distribution of coverage levels looks qualitatively similar, with the notable exception that Providence was no longer available in 2012 and 2013. Table A.1 provides corresponding information for the plans offered in other years.
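The actuarial value calculation described above can be sketched in a few lines. This is a minimal illustration, not the paper's procedure: the three-arm plan design (deductible, coinsurance, out-of-pocket maximum), the coinsurance rate, and the simulated spending draws are all assumptions. The deductibles and out-of-pocket maximums loosely echo Moda - 1 and Moda - 7 in Table 1, but the output will not reproduce the table's values. The sketch also holds spending fixed across plans, so it abstracts from moral hazard.

```python
import numpy as np

def out_of_pocket(total, deductible, coinsurance, oop_max):
    """Out-of-pocket cost under a hypothetical three-arm plan design."""
    total = np.asarray(total, dtype=float)
    below = np.minimum(total, deductible)                       # paid in full below the deductible
    above = coinsurance * np.maximum(total - deductible, 0.0)   # coinsurance share above it
    return np.minimum(below + above, oop_max)                   # capped at the OOP maximum

def actuarial_value(total, deductible, coinsurance, oop_max):
    """Ratio of summed insured spending to summed total spending."""
    total = np.asarray(total, dtype=float)
    oop = out_of_pocket(total, deductible, coinsurance, oop_max)
    return (total.sum() - oop.sum()) / total.sum()

# Illustrative spending draws (NOT the OEBB claims data).
rng = np.random.default_rng(0)
spending = rng.lognormal(mean=8.5, sigma=1.2, size=10_000)

# Hypothetical 20 percent coinsurance rate; Table 1 does not report one.
av_generous = actuarial_value(spending, 300.0, 0.2, 500.0)
av_lean = actuarial_value(spending, 3_000.0, 0.2, 10_000.0)
```

Because the same population is used for every plan, differences in the resulting values reflect plan design alone rather than differences in who enrolls.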

Household Characteristics.

We restrict our analysis sample to households in which the oldest member is not older than 65, the employee is not retired, and all members are enrolled in the same plan for the entire year. Further, because a prior year of claims data is required to estimate an individual’s prospective health risk score, we require that households have one year of data prior to inclusion; this means our sample begins in 2009. These restrictions leave us with 44,562 households, representing 117,934 individuals. Table A.2 provides additional details on sample construction.

There is a clear bifurcation of our sample between Kaiser and non-Kaiser households. That is, 78 percent of households always chose either Moda or Providence, 19 percent always chose Kaiser, and only 3 percent at some point switched between the two groups. This pattern is not necessarily surprising. Kaiser offers a substantially different type of insurance product, and persistent consumer preference heterogeneity along this dimension would be a reasonable expectation. That said, modeling the choice over insurer type somewhat distracts from our focus on choice over financial coverage level. We therefore take advantage of this division in the data and conduct our primary analysis on the set of households that never enrolled with Kaiser. We consider the full sample in a robustness analysis in Section V.C.

Table 2 provides summary statistics on our panel of households. The first column describes the full sample, while the second column describes the subset of households that never enrolled in a Kaiser plan. Focusing on the non-Kaiser sample, 49 percent of households have children, and 74 percent of households are “families” (anything other than the employee alone). The average employee is age 47.9, and the average enrollee (employees and their covered dependents) is age 40.4. Households on average have 2.6 enrollees.

Table 2.

Household Summary Statistics

	Full Sample	Excluding Kaiser

Number of households	44,562	34,606
Number of enrollees	117,934	92,244
Pct. of households with children	0.49	0.49
Enrollees per household, mean (med.)	2.57 (2)	2.60 (2)
Enrollee age, mean (med.)	39.8 (37.8)	40.4 (38.7)
Premiums
Employee premium ($), mean (med.)	880 (0)	843 (0)
Full premium ($), mean (med.)	11,500 (11,801)	11,582 (11,801)
Household healthcare spending
Total spending ($), mean (med.)	10,754 (4,620)	11,689 (5,173)
Out-of-pocket ($), mean (med.)	1,694 (1,093)	2,054 (1,540)
Switching (pct. of household-years)
Forced to switch plan	0.20	0.21
Forced to switch insurer	0.01	0.02
Unforced, switched plan	0.17	0.20
Unforced, switched insurer	0.03	0.03


Notes: Enrollees are employees plus their covered dependents. Sample statistics are calculated across all years, 2009–2013. Premiums statistics are for households’ chosen plans, as opposed to for all possible plans. Sample medians are shown in parentheses.

Employees received large subsidies toward the purchase of health insurance. The average household paid only $843 per year for their chosen plan; the median household paid nothing. Meanwhile, the average full premium paid to insurers was $11,582, meaning that the average household received an employer contribution of $10,739. Households had average out-of-pocket spending of $2,054 and average total healthcare spending of $11,689.

Households were highly likely to remain in the same plan and with the same insurer they chose the previous year. That said, OEBB could adjust the state-level master list of available plans, and school districts could adjust choice sets over time. When such adjustments removed a household’s prior choice, the household was forced to switch: this occurred in 21 percent of household-years for plans and in 2 percent for insurers. When the prior choice was available, 20 percent of household-years voluntarily switched plans and only 3 percent voluntarily switched insurers. The presence of both forced and unforced switching is important in our empirical model for identifying the extent of “inertia” in households’ choice of plan and insurer.

III.B. Variation in Plan Menus

For the purposes of the present research, the two most important features of our setting are the isolated variation along the dimension of coverage level and the plausibly exogenous variation in plan menus. Variation in coverage level exists primarily among the plans offered by Moda. Variation in plan menus stems from the decentralized determination of employee health benefits. Both are central to identification of our empirical model.

To provide a sense of this variation, Figure 2 shows the relationship between healthcare spending and plan actuarial value (AV) for households that chose Moda in 2009. In the left panel, households are grouped by their chosen plan. The plot shows average spending among households in each of the seven Moda plans, weighting each plan by enrollment. Unsurprisingly, households that enrolled in more generous plans had higher spending, reflecting adverse selection, moral hazard, or both.

The right panel groups households by their plan menu. It plots the actuarial value that an average household would be most likely to choose if offered a given plan menu, against the average spending of all households presented with that menu. This measure of plan menu generosity captures two facts: a level of coverage can be chosen only if it is offered, and it is more likely to be chosen if it is cheaper.¹³ Each point on the plot represents the set of plan menus that share the same predicted actuarial value. Points are then weighted by the number of households represented. The resulting pattern indicates that households that were offered a more generous plan menu had higher spending. The patterns in both panels persist when we control for observables, suggesting the presence of adverse selection on unobservables, and of moral hazard.

Identification of our structural model proceeds in much the same way as the above arguments. A key identifying assumption is that plan menus are independent of household unobservables, conditional on household observables. An important threat to identification is that school districts chose plan menu generosity in response to unobservable information about employees that would also drive healthcare spending. To the extent that districts with unobservably sicker households provided more generous health benefits, this would lead us to overstate the extent of moral hazard.¹⁴

We investigate this possibility by attempting to explain plan menu generosity with observable household characteristics, in particular health. We argue that if plan menus were not responding to observable information about household health, it is unlikely that they were responding to unobservable information. We find this argument compelling because we almost certainly have better information on household health (through health risk scores) than did school districts at the time they made plan menu decisions. Table A.5 presents this exercise. Conditional on family type, we find no correlation between plan menu generosity and household risk score. Appendix B.2 describes these results in greater detail. It also presents additional tests for what does explain variation in plan menus. We find that, among other things, plan menu generosity is higher for certain union affiliations, lower for substitute teachers and part-time employees, decreasing in district average house price index, and decreasing in the percentage of registered Republicans in a school district. None of these relationships are inconsistent with our understanding of the process by which district benefits decisions are made.

We exploit this identifying variation within our structural model, but can also use it in a more isolated way to produce reduced-form estimates of moral hazard. Appendix B.3 presents an instrumental variables analysis using two-stage least squares. The estimates yield a moral hazard “elasticity” that can be directly compared with others in the literature. We estimate that the elasticity of demand for healthcare spending with respect to its average end-of-year out-of-pocket cost is −0.27, broadly similar to the benchmark estimate of −0.2 from the RAND experiment (Manning et al., 1987; Newhouse, 1993). We also find suggestive evidence of heterogeneity in moral hazard effects, which is an important aspect of our structural model and of our research question.

IV. Empirical Model

IV.A. Parameterization

We parameterize household utility and the distribution of health states, allowing us to represent our theoretical model fully in terms of data and parameters to estimate. We extend the theoretical model to account for the fact that in our empirical setting, there are multiple insurers, consumers are households consisting of individuals, a dollar in premiums may be valued differently than a dollar in out-of-pocket spending, and consumers make repeated plan choices over time.

Household Utility.

Following Cardon and Hendel (2001) and Einav et al. (2013), we parameterize the value of healthcare spending to be quadratic in its distance from the health state. Household k’s valuation of spending level m given health state realization l is given by

b (m; l, ω_{k}) = (m - l) - \frac{1}{2 ω_{k}} {(m - l)}^{2},

(4)

where ω_k governs the curvature of the benefit of spending and, ultimately, the degree to which optimal spending varies across coverage levels. Given out-of-pocket cost function c_jt(m) for plan j in year t, privately optimal healthcare spending is $m_{j t}^{*} (l, ω_{k}) = \arg\max_{m} (b (m; l, ω_{k}) - c_{j t} (m))$.¹⁵

This parameterization is attractive because it produces reasonable predicted behavior under nonlinear insurance contracts, and is tractable enough to be used inside an optimization routine.¹⁶ Additionally, ω_k can be usefully interpreted as the incremental spending induced by moving a household from no insurance to full insurance. Substituting for m*, we denote the benefit of optimal utilization as $b_{j t}^{*} (l, ω_{k})$ and the associated out-of-pocket cost as $c_{j t}^{*} (l, ω_{k})$. Households face uncertainty in payoffs only through uncertainty in $b_{j t}^{*} (l, ω_{k}) - c_{j t}^{*} (l, ω_{k})$.
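To make the spending rule concrete, the following sketch solves for privately optimal spending under equation (4) by grid search. The three-arm out-of-pocket function, the dollar grid, and the restriction to nonnegative spending are illustrative assumptions, not the routine used in estimation. Under a linear contract with marginal out-of-pocket price p, the first-order condition of equation (4) gives m* = l + ω(1 − p), which the examples recover.

```python
import numpy as np

def benefit(m, l, omega):
    """Equation (4): quadratic valuation of spending m given health state l."""
    d = m - l
    return d - d**2 / (2.0 * omega)

def out_of_pocket(m, deductible, coinsurance, oop_max):
    """Hypothetical three-arm cost-sharing schedule (an assumption for illustration)."""
    return np.minimum(
        np.minimum(m, deductible) + coinsurance * np.maximum(m - deductible, 0.0),
        oop_max,
    )

def optimal_spending(l, omega, deductible, coinsurance, oop_max, step=1.0):
    """Grid-search argmax of benefit minus out-of-pocket cost over m >= 0."""
    # The benefit peaks at l + omega and cost is nondecreasing in m,
    # so the optimum can never lie above l + omega.
    upper = max(l + omega, 0.0)
    grid = np.arange(0.0, upper + step, step)
    net = benefit(grid, l, omega) - out_of_pocket(grid, deductible, coinsurance, oop_max)
    return grid[np.argmax(net)]

# Health state of $2,000 and omega of $1,000 (illustrative values).
m_full = optimal_spending(2000.0, 1000.0, 0.0, 0.0, 0.0)      # full insurance
m_none = optimal_spending(2000.0, 1000.0, 1e9, 0.0, 1e9)      # effectively no insurance
m_coins = optimal_spending(2000.0, 1000.0, 0.0, 0.2, 1e9)     # 20 percent coinsurance
```

In this example the gap between spending under full insurance and under no insurance equals ω, matching the interpretation given in the text.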

Household k in year t derives the following expected utility from plan choice j:

U_{k j t} = \int - \exp (- ψ_{k} z_{k j t} (l)) d F_{k f t} (l),

(5)

where ψ_k is a coefficient of absolute risk aversion, z_kjt is the payoff associated with realization of health state l, and F_kft is the distribution of health states faced if the plan belongs to insurer f(j). The payoff associated with health state realization l is given by

z_{k j t} (l) = - p_{k j t} + α^{O O P} (b_{j t}^{*} (l, ω_{k}) - c_{j t}^{*} (l, ω_{k})) + δ_{k j}^{f (j)} + γ_{k j t}^{i n e r t i a} + β X_{k j t} + σ_{ϵ} ϵ_{k j t},

(6)

where p_kjt is the household’s plan premium (net of the employer contribution); $b_{j t}^{*} (l, ω_{k}) - c_{j t}^{*} (l, ω_{k})$ is the payoff from optimal utilization measured in units of out-of-pocket dollars; $δ_{k j}^{f (j)}$ are insurer fixed effects interacted with household observables; $γ_{k j t}^{i n e r t i a}$ are a set of fixed effects for both the plan and the insurer a household was enrolled in the previous year, interacted with household observables; and X_kjt is a set of additional covariates that can affect household utility.¹⁷ The payoff z_kjt is measured in units of premium dollars. Out-of-pocket costs may be valued differently from premiums through parameter α^OOP. Finally, ϵ_kjt represents a household-plan-year idiosyncratic shock, with magnitude σ_ϵ to be estimated. We assume these shocks are independent and distributed Type 1 Extreme Value, and that households chose the plan that maximized expected utility from among the set of plans $J_{k t}$ available to them: $j_{k t}^{*} = \arg\max_{j \in J_{k t}} U_{k j t}$.
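It may help to see why this specification delivers logit choice probabilities. The derivation below is a sketch based only on equations (5) and (6); the certainty-equivalent notation CE and the shock-free payoff \bar{z} are introduced here for illustration and are not the paper's notation. It assumes ψ_k > 0 and a standard Type 1 Extreme Value shock, and it conditions on the household's type.

```latex
% Shock-free payoff: equation (6) without the idiosyncratic term.
\bar{z}_{kjt}(l) = z_{kjt}(l) - \sigma_{\epsilon}\,\epsilon_{kjt}

% The shock does not depend on l, so it factors out of the integral in (5):
U_{kjt} = -\exp\!\left(-\psi_k \sigma_{\epsilon}\,\epsilon_{kjt}\right)
          \int \exp\!\left(-\psi_k \bar{z}_{kjt}(l)\right) dF_{kft}(l)

% Certainty equivalent of the shock-free payoff (illustrative notation):
CE_{kjt} = -\frac{1}{\psi_k}
           \log \int \exp\!\left(-\psi_k \bar{z}_{kjt}(l)\right) dF_{kft}(l)
\quad\Longrightarrow\quad
U_{kjt} = -\exp\!\left(-\psi_k \left(CE_{kjt} + \sigma_{\epsilon}\,\epsilon_{kjt}\right)\right)

% U is increasing in CE + sigma * epsilon, so the choice rule and the
% implied choice probabilities, conditional on household type, are:
j^{*}_{kt} = \arg\max_{j \in J_{kt}} \left( CE_{kjt} + \sigma_{\epsilon}\,\epsilon_{kjt} \right),
\qquad
\Pr\!\left(j^{*}_{kt} = j\right)
  = \frac{\exp\!\left(CE_{kjt} / \sigma_{\epsilon}\right)}
         {\sum_{i \in J_{kt}} \exp\!\left(CE_{kit} / \sigma_{\epsilon}\right)}
```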

Distribution of Health States.

We assume that individuals face a lognormal distribution of health states, and that households face the sum of health state draws across all individuals in the household. Because there is no closed-form expression for the distribution of the sum of draws from lognormal distributions, we represent a household’s distribution of health states using an approximating lognormal distribution. We derive the parameters of the approximating distribution using the Fenton-Wilkinson method. This novel means of modeling the household distribution of health states allows us to fully exploit the large amount of heterogeneity in household composition that exists in our data. It also allows us to closely fit observed spending distributions using a smaller number of parameters than would be required if demographic covariates were aggregated to the household level. Our method is to estimate individuals’ health state distributions, allowing parameters to vary with individual-level demographics. Appendix C.1 provides additional details.

An individual i faces uncertain health state ${\tilde{l}}^{i}$ , which has a shifted lognormal distribution with support (−κ_it, ∞):

\log ({\tilde{l}}^{i} + κ_{i t}) ~ N (μ_{i t}, σ_{i t}^{2}) .

The shift is included to capture a mass of individuals with zero spending. If κ_it is positive, negative health states are permitted, which may imply zero spending. Parameters μ_it, σ_it, and κ_it are in turn projected onto individual demographics (such as health risk score), which can vary over time.

A household k faces uncertain health state $\tilde{l}$ , which has a shifted lognormal distribution with support $(- κ_{k t}, \infty) : \log (\tilde{l} + κ_{k t}) ~ N (μ_{k t}, σ_{k t}^{2})$ . Under the approximation, household-level parameters μ_kt, σ_kt, and κ_kt are a function of individual-level parameters μ_it, σ_it, and κ_it. Variation in μ_kt, σ_kt, and κ_kt across households, as well as within households over time, arises from variation in household composition: the number of individuals and each individual’s demographics. In addition to this observable heterogeneity, we incorporate unobserved heterogeneity in household health through parameter μ_kt. Households can in this way hold private information about their health that can drive both plan choices and spending outcomes.

Finally, we introduce an additional set of parameters ϕ_f to serve as “exchange rates” for monetary health states across insurers. These parameters are intended to capture differences in total healthcare spending that are driven by differences in provider prices across insurers, conditional on health state.¹⁸ For example, the same physician office visit might lead to different amounts of total spending across insurers simply because each insurer paid the physician a different price. We do not want such variation to be attributed to differences in underlying health. Our approach is to estimate insurer-level parameters that multiply realized health states, transforming them from underlying “quantities” of healthcare utilization into the monetary spending amounts we observe in the claims data. We model a household’s money-metric health state l as the product of an insurer-level “price” multiplier ϕ_f and the underlying “quantity” health state $\tilde{l}$ , where $\tilde{l}$ is lognormally distributed depending only on household characteristics. Taken together, the distribution F_kft is defined by

l = ϕ_{f} \tilde{l}, \log (\tilde{l} + κ_{k t}) ~ N (μ_{k t}, σ_{k t}^{2}) .
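The role of the shift parameter can be illustrated with a short simulation. This is a sketch under assumed parameter values, not estimates from the paper, and the function names are hypothetical. It draws money-metric health states l = ϕ_f · l̃ and compares the simulated share of non-positive states with the analytic value Φ((log κ − μ)/σ), which holds when κ > 0.

```python
import math
import numpy as np

def prob_nonpositive_state(mu, sigma, kappa):
    """P(l <= 0) for a shifted lognormal with shift kappa; zero if kappa <= 0."""
    if kappa <= 0.0:
        return 0.0
    z = (math.log(kappa) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))  # standard normal CDF

def draw_health_states(mu, sigma, kappa, phi, size, rng):
    """Draw l = phi * l_tilde, where log(l_tilde + kappa) ~ N(mu, sigma^2)."""
    l_tilde = rng.lognormal(mean=mu, sigma=sigma, size=size) - kappa
    return phi * l_tilde

# Illustrative parameters (assumptions); phi > 0 rescales but never flips the sign.
rng = np.random.default_rng(1)
draws = draw_health_states(mu=0.0, sigma=1.0, kappa=1.0, phi=1.063, size=200_000, rng=rng)
share_nonpositive = float(np.mean(draws <= 0.0))
analytic_share = prob_nonpositive_state(mu=0.0, sigma=1.0, kappa=1.0)
```

A non-positive health state is only a candidate for zero spending: realized spending also depends on ω_k and on the plan's cost sharing.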

IV.B. Identification

Our aim is to recover the joint distribution across households of willingness to pay, risk protection, and the social cost of moral hazard associated with different levels of coverage. Variation in these objects arises from variation in either household preferences (the risk-aversion and moral-hazard parameters) or in households’ distributions of health states. Our primary identification concerns are (i) distinguishing preferences from private information about health, (ii) distinguishing taste for out-of-pocket spending (α^OOP) from risk aversion, and (iii) identifying heterogeneity in the risk-aversion and moral-hazard parameters. We provide informal identification arguments addressing each concern.

We first explain how ω, which captures moral hazard, is distinguished from unobserved variation in μ_kt, which captures adverse selection on unobservables. In the data, there is a strong positive correlation between plan generosity and total healthcare spending (see Figure 2a). A large part of this relationship can be explained by observable household characteristics, but even conditional on observables, there is still residual positive correlation. This residual correlation could be attributable to either the effect of lower out-of-pocket prices driving utilization (moral hazard) or private information about health affecting both utilization and coverage choice (adverse selection). The key to distinguishing between these explanations is the variation in plan menus.

Both within and across school districts, we observe similar households facing different menus of plans. As a result, some households are more likely to choose higher coverage only because of the plan menu they are offered. The amount of moral hazard is identified by the extent to which households facing more generous plan menus also have higher healthcare spending. On the other hand, we also observe similar households facing similar menus of plans, but still making different plan choices. This variation identifies the degree of private information about health, as well as the magnitude of the idiosyncratic shock σ_ϵ. Conditional on observables and the predicted effects of moral hazard, if households that inexplicably choose more generous coverage also inexplicably realize higher healthcare spending, this variation in plan choice will be attributed to private information about health. Any residual unexplained variation in plan choice will be attributed to the idiosyncratic shock.

Both risk aversion (ψ) and the relative valuation of premiums and out-of-pocket spending (α^OOP) affect households’ taste for higher coverage, but do not affect healthcare spending. To distinguish between them, we rely on cases in which observably different households face similar plan menus. Risk aversion is identified by the degree to which households’ taste for higher coverage is positively related to uncertainty in out-of-pocket spending, holding expected out-of-pocket spending fixed. α^OOP is identified by the rate at which households trade off premiums with expected out-of-pocket spending, holding uncertainty in out-of-pocket spending fixed.

Unlike the preceding arguments, identification of unobserved heterogeneity in risk aversion and the moral hazard parameter relies on the panel nature of our data. Plan menus, household characteristics, and plan characteristics change over time. We therefore observe the same households making choices under different circumstances. If we had a large number of observations for each household and sufficient variation in circumstances, the preceding arguments could be applied household by household, and we could nonparametrically identify the distribution of ψ and ω in the population. In reality, we have at most five observations for each household. We ask less of the data by assuming that the unobserved heterogeneity is normally distributed. The variance and covariance of the unobserved components of household types are identified by the extent to which different households consistently act in different ways. For example, if some households consistently make choices that reflect high risk aversion and other (observationally equivalent) households consistently make choices that reflect low risk aversion, this will be interpreted as unobserved heterogeneity in risk aversion.

IV.C. Estimation

We project the parameters of the individual health state distributions μ_it, σ_it, and κ_it onto time-varying individual demographics:

μ_{i t} = β^{μ} X_{i t}^{μ}, σ_{i t} = β^{σ} X_{i t}^{σ}, κ_{i t} = β^{κ} X_{i t}^{κ} .

(7)

Covariate vectors $X_{i t}^{μ}$ , $X_{i t}^{σ}$ , and $X_{i t}^{κ}$ contain indicators for the 0–30th, 30–60th, 60–90th, and 90–100th percentiles of individual health risk scores each year. $X_{i t}^{μ}$ and $X_{i t}^{κ}$ also contain a linear term in risk score, separately for each percentile group. $X_{i t}^{μ}$ also contains an indicator for whether the individual is a female between the ages of 18 and 35 and for whether the individual is under 18 years old.

Using the derivations shown in Appendix C.1, the parameters of households’ health state distributions are a function of individual-level parameters:

σ_{k t}^{2} = \log [1 + {[\sum_{i \in I_{k}} \exp (μ_{i t} + \frac{σ_{i t}^{2}}{2})]}^{- 2} \sum_{i \in I_{k}} (\exp (σ_{i t}^{2}) - 1) \exp (2 μ_{i t} + σ_{i t}^{2})],

{\bar{μ}}_{k t} = - \frac{σ_{k t}^{2}}{2} + \log [\sum_{i \in I_{k}} \exp (μ_{i t} + \frac{σ_{i t}^{2}}{2})],

κ_{k t} = \sum_{i \in I_{k}} κ_{i t},

(8)

where $I_{k}$ represents the set of individuals in household k. Private information about health is reflected in normally distributed unobservable heterogeneity in μ_kt. The household-specific mean of μ_kt is given by ${\bar{μ}}_{k t}$ , and its variance is given by $σ_{μ}^{2}$ . A large $σ_{μ}^{2}$ means that households appear to have substantially more information about their health than the econometrician.
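Equation (8) is a moment-matching rule: the approximating lognormal has the same mean and variance as the sum of the individual lognormals, treated as independent. The sketch below implements the three formulas directly; the function name and the example parameter values are hypothetical, and this is not the authors' code.

```python
import numpy as np

def fenton_wilkinson(mu_i, sigma_i, kappa_i):
    """Household-level (mu_bar, sigma, kappa) from individual parameters, per equation (8)."""
    mu_i = np.asarray(mu_i, dtype=float)
    sigma_i = np.asarray(sigma_i, dtype=float)
    kappa_i = np.asarray(kappa_i, dtype=float)

    # Mean and variance of each individual's (unshifted) lognormal.
    means = np.exp(mu_i + sigma_i**2 / 2.0)
    variances = (np.exp(sigma_i**2) - 1.0) * np.exp(2.0 * mu_i + sigma_i**2)

    # Match the first two moments of the sum with a single lognormal.
    sigma2_k = np.log(1.0 + variances.sum() / means.sum() ** 2)
    mu_bar_k = -sigma2_k / 2.0 + np.log(means.sum())
    kappa_k = kappa_i.sum()  # shifts simply add up
    return mu_bar_k, np.sqrt(sigma2_k), kappa_k

# Example: a three-person household with illustrative individual parameters.
mu_hh, sigma_hh, kappa_hh = fenton_wilkinson(
    mu_i=[0.2, -0.1, 0.5], sigma_i=[1.0, 1.2, 0.8], kappa_i=[0.1, 0.2, 0.05]
)
```

For a one-person household the function returns the individual's own parameters, which is a convenient sanity check on the formulas.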

We assume that μ_kt, ω_k, and log(ψ_k) are jointly normally distributed:

[\begin{matrix} μ_{k t} \\ ω_{k} \\ \log (ψ_{k}) \end{matrix}] ~ N ([\begin{matrix} {\bar{μ}}_{k t} \\ β^{ω} X_{k}^{ω} \\ β^{ψ} X_{k}^{ψ} \end{matrix}], [\begin{matrix} σ_{μ}^{2} \\ σ_{ω, μ}^{2} & σ_{ω}^{2} \\ σ_{ψ, μ}^{2} & σ_{ω, ψ}^{2} & σ_{ψ}^{2} \end{matrix}]) .

(9)

There is both observed (through the mean vector) and unobserved (through the covariance matrix) heterogeneity in each parameter. Covariates $X_{k}^{ω}$ and $X_{k}^{ψ}$ include an indicator for whether the household has children and a constant.¹⁹

We model inertia at both the plan and insurer level: $γ_{k j t}^{i n e r t i a} = γ_{k}^{p l a n} 1_{k t, j = j (t - 1)} + γ_{k}^{i n s} 1_{k t, f = f (t - 1)}$ . We allow $γ_{k}^{p l a n}$ to vary by whether a household has children. To capture whether sicker households face higher barriers to switching insurers (and therefore provider networks), we allow $γ_{k}^{i n s}$ to vary linearly with household risk score. Insurer fixed effects $δ_{k}^{f (j)}$ can vary by household age and whether a household has children, and we allow the intercepts to vary by geographic region in order to capture the relative attractiveness of insurer provider networks across different parts of the state (as well as other sources of geographical heterogeneity in insurer preferences).²⁰ We normalize the fixed effect for Moda to be zero. As the parameters of individual health state distributions can vary freely, the “provider price” parameters require normalization: ϕ_Moda is normalized to one.

We estimate the model via maximum likelihood. Our estimation approach follows Revelt and Train (1998) and Train (2009), with the distinction that we model a discrete/continuous choice. Our construction of the discrete/continuous likelihood function follows Dubin and McFadden (1984). The likelihood function for a given household is the conditional density of its observed sequence of total healthcare spending, given its observed sequence of plan choices. We use Gaussian quadrature to approximate the jointly normal distribution of unobserved heterogeneity, as well as to approximate the lognormal distributions of household health states. Additional details on the estimation procedure are provided in Appendix C.2.
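As an illustration of the quadrature step, the sketch below approximates an expectation over a normally distributed variable using Gauss-Hermite nodes and weights from NumPy. The helper name, the number of nodes, and the parameter values are assumptions; the actual estimation routine is described in Appendix C.2 and may differ. The change of variables y = μ + √2·σ·x maps the Hermite weight exp(−x²) into the normal density.

```python
import numpy as np

def normal_expectation(g, mu, sigma, n_nodes=20):
    """Approximate E[g(Y)] for Y ~ N(mu, sigma^2) by Gauss-Hermite quadrature."""
    # Nodes and weights for integrals of the form: integral of exp(-x^2) f(x) dx.
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    y = mu + np.sqrt(2.0) * sigma * x
    return np.sum(w * g(y)) / np.sqrt(np.pi)

# Example: the mean of a lognormal variable exp(Y), which has the
# closed form exp(mu + sigma^2 / 2), is recovered by the quadrature.
approx_mean = normal_expectation(np.exp, mu=0.3, sigma=0.8)
exact_mean = np.exp(0.3 + 0.8**2 / 2.0)
```

The same helper can be applied dimension by dimension to integrate over jointly normal unobserved heterogeneity, at the cost of a node grid that grows with the number of dimensions.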

V. Results

V.A. Model Estimates

Table 3 presents our parameter estimates. Column 3 presents our primary specification, as described in the previous section. Columns 1 and 2 present simpler specifications that are useful in understanding and validating the model. The table excludes insurer fixed effects and health state distribution parameters; these can be found in Table A.8.

Table 3.

Parameter Estimates

	(1)		(2)		(3)
Variable	Parameter	Std. Err.	Parameter	Std. Err.	Parameter	Std. Err.

Employee Premium ($000s)	−1.000^†		−1.000^†		−1.000^†
Out-of-pocket spending, –α^OOP	−1.628	0.023	−1.661	0.024	−1.469	0.019
HRA/HSA contributions, α^HA	0.255	0.021	0.259	0.020	0.259	0.020
Vision/dental contributions, α^VD	1.341	0.024	1.302	0.022	1.209	0.021
Plan inertia intercept, γ^plan	4.763	0.060	4.431	0.056	4.630	0.063
Plan inertia * 1[Children], γ^plan	−0.129	0.039	−0.102	0.037	−0.138	0.038
Insurer inertia intercept, γ^ins	2.605	0.107	2.509	0.102	2.413	0.097
Insurer inertia * Risk score, γ^ins	−0.074	0.083	−0.120	0.078	−0.037	0.080
Narrow net. plan, ν^NarrowNet	−2.440	0.155	−2.286	0.145	−2.334	0.151
Providence utiliz. multiplier, ϕ_P	1.022	0.018	1.072	0.017	1.063	0.002
Risk aversion intercept, β^ψ	−0.706	0.046	−1.018	0.059	−0.251	0.052
Risk aversion * 1[Children], β^ψ	0.005	0.031	−0.367	0.083	−0.361	0.050
Moral hazard intercept, β^ω					1.028	0.038
Moral hazard * 1[Children], β^ω					0.671	0.008
Std. dev. of private health info., σ_μ	0.683	0.002	0.331	0.064	0.225	0.005
Std. dev. of log risk aversion, σ_ψ	0.701	0.062	1.140	0.012	0.833	0.021
Std. dev. of moral hazard, σ_ω					0.281	0.013
Corr(μ, ψ), ρ_μ,ψ	0.130	0.018	−0.365	0.049	0.227	0.005
Corr(ψ, ω), ρ_ψ,ω					−0.137	0.042
Corr(μ, ω), ρ_μ,ω					0.062	0.017
Scale of idiosyncratic shock, σ_ε	2.313	0.025	2.160	0.023	2.116	0.024

Insurer * {Region, Age, 1[Child.]}	Yes		Yes		Yes
Observable heterogeneity in health			Yes		Yes
Number of observations	451,268		451,268		451,268


Notes: The table presents estimates for selected parameters; Table A.8 presents the remaining estimates. Standard errors are derived from the analytical Hessian of the likelihood function. Column 3 presents our primary estimates, while columns 1 and 2 present alternative specifications. All models are estimated on an unbalanced panel of 34,606 households, 11 plans, and 5 years. The utilization multiplier for Moda (ϕ_M) is normalized to one.

† By normalization.

Column 1 presents a version of the model in which there is no moral hazard and no observable heterogeneity in individuals’ health. That is, ω is fixed at zero, and we do not allow μ_it, σ_it, or κ_it to vary with individual demographics. Unobservable heterogeneity in household health (through σ_μ) is still permitted. In column 2, we introduce observable heterogeneity in health. A key difference across columns 1 and 2 is the magnitude of the adverse selection parameter σ_μ, which falls by more than half. When rich observable heterogeneity in health is introduced to the model, the estimated amount of unobservable heterogeneity in health falls substantially. In column 3, we introduce moral hazard. Here, an important difference is the increase in the estimated amount of risk aversion. With moral hazard as an available explanation, the model can explain a larger part of the dispersion in spending for observably similar households. This implies that households are facing less uncertainty in their health state than previously thought, and that more risk aversion is necessary to explain the same plan choices. Because estimated risk aversion increases, the relative valuation of premiums and out-of-pocket costs (α^OOP) falls.

Using column 3, we estimate an average moral hazard parameter ω of $1,001 among individual households and $1,478 among families.²¹ Recall that ω represents the additional total spending induced by lowering marginal out-of-pocket cost from one to zero. Our estimates imply that moving a household from a plan where their health state was below the deductible to a plan where their health state would put them past the out-of-pocket maximum would increase total spending by 15.8 percent of mean spending for individuals and 11.4 percent for families.

Our estimates imply a mean (median) coefficient of absolute risk aversion of 0.92 (0.85).²² Put differently, to make households indifferent between (i) a payoff of zero and (ii) an equal-odds gamble between gaining $100 and losing $X, the mean (median) value of $X in our population is $91.7 ($92.1).²³ We note, however, that our estimates of risk aversion are with respect to both financial risk and risk in the value derived from healthcare utilization (through $b_{j t}^{*}$ ), so they are not directly comparable to estimates that consider only financial risk. The standard deviation of the uncertain portion of payoffs $(b_{j t}^{*} - c_{j t}^{*})$ with respect to the distribution of health states is $1,152 on average across household-plan-years. The standard deviation of out-of-pocket costs alone $(c_{j t}^{*})$ is on average $1,280. To avoid a normally distributed lottery with mean zero and standard deviation $1,152 ($1,280), the median household would be willing to pay $489 ($544).
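The gamble interpretation follows from a one-line calculation under constant absolute risk aversion. With utility −exp(−ψz), indifference between a payoff of zero and an equal-odds gamble over +g and −X requires exp(ψX) = 2 − exp(−ψg). The sketch below evaluates this at a single value of ψ; it assumes ψ is measured per thousand dollars, as the premium normalization in Table 3 suggests, and the function name is hypothetical. It does not reproduce the population mean of X, which averages over the estimated distribution of ψ.

```python
import math

def indifference_loss(psi, gain):
    """Loss X at which a CARA agent is indifferent between 0 and a 50/50 gamble over +gain and -X."""
    # 0.5 * (-exp(-psi * gain)) + 0.5 * (-exp(psi * X)) = -1  =>  solve for X.
    return math.log(2.0 - math.exp(-psi * gain)) / psi

# Median risk aversion of 0.85 and a $100 gain, with amounts in $000s (assumed units).
x_median_dollars = 1000.0 * indifference_loss(psi=0.85, gain=0.1)  # roughly 92 dollars
```

A risk-neutral agent would accept any loss up to the full $100, so the gap between X and $100 is a simple gauge of the implied risk aversion.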

The importance of unobserved heterogeneity varies for health, risk aversion, and moral hazard.²⁴ Once we account for the full set of household observables and moral hazard, the estimated amount of private information about health is fairly small: Unobserved heterogeneity in μ_kt accounts for only 11 percent of the total variation in μ_kt across household-years. On the other hand, unobserved heterogeneity in risk aversion accounts for 93 percent of its total variation across households. Unobserved heterogeneity in the moral hazard parameter accounts for 18 percent of its total variation.

Conditional on observables, we find that households that are idiosyncratically risk averse are also idiosyncratically less prone to moral hazard (ρ_ψ,ω < 0) and also tend to have private information that they are unhealthy (ρ_μ,ψ > 0). We find that households with private information that they are unhealthy are also idiosyncratically more prone to moral hazard (ρ_μ,ω > 0). Accounting for both unobservable and observable variation, our estimates imply that households’ expected health state $E [\tilde{l}]$ has a correlation of 0.22 with risk aversion, and a correlation of 0.25 with the moral hazard parameter. The risk aversion and moral hazard parameters have a correlation of only 0.01. Figure A.3 plots the full joint distribution of these three key dimensions of household type.

Our estimates imply substantial disutility from switching insurer or plan. The average disutility from switching insurer is $2,408, and from switching plans (but not insurers) is $4,562. We estimate that insurer inertia is decreasing in household risk score, and that plan inertia is on average $138 lower for households with children. The exceptionally large magnitudes of our inertia coefficients reflect, in large part, the infrequency with which households switch plan or insurer, as shown in Table 2. Only 3 percent of household-years ever voluntarily switch insurer, and only 20 percent of household-years ever voluntarily switch plan.

Finally, the estimates in column 3 indicate that households weight out-of-pocket expenditures 46.9 percent more than plan premiums. We believe this could be driven by a variety of factors, including (i) household premiums are tax deductible, while out-of-pocket expenditures are not, and (ii) employee premiums are very low (at the median, zero), perhaps rendering potential out-of-pocket costs in the thousands of dollars relatively more salient. We also find that households value a dollar in HSA/HRA contributions on average 75 percent less than a dollar of premiums. This is consistent with substantial hassle costs associated with these types of accounts, as documented by Reed et al. (2009) and McManus et al. (2006).

Model Fit.

We conduct two procedures to evaluate model fit, corresponding to the two stages of the model. First, we compare households’ predicted plan choices with those observed in the data. Figure 3 displays the predicted and observed market shares for each plan, pooled across all years in our sample. Shares are matched exactly at the insurer level due to the presence of insurer fixed effects, but are not matched exactly plan by plan. Predicted choice probabilities over plans within an insurer are driven by plan prices, inertia, and households’ valuation of different levels of coverage through their expectation of out-of-pocket spending, their value of risk protection, and their value of healthcare utilization. Given the relative inflexibility of the model with respect to household plan choice within an insurer, the fit is quite good.

Figure 3. — *Notes*: The figure shows predicted and observed market shares at the plan level. All years are pooled, so an observation is a household-year. Predicted shares are calculated using the estimates in column 3 of Tables 3 and A.8.

Second, we compare the predicted distributions of households’ total healthcare spending to the distributions of total healthcare spending observed in the data. In a given year, each household faces a predicted distribution over health states and, due to moral hazard, a corresponding plan-specific distribution of total healthcare spending. To construct the predicted distribution of total spending in a population of households, we take a random draw from each household’s predicted spending distribution corresponding to their chosen plan. Figure 4 presents kernel density plots of the predicted and observed distributions of total healthcare spending. We assess fit separately by tertile of household risk score. Vertical lines in each plot represent the mean of the respective distribution. Overall, average total healthcare spending is observed to be $11,689 and predicted to be $11,632. The standard deviation of total healthcare spending is observed to be $20,803 and predicted to be $20,174. The spending distributions fit well both overall and in subsamples of households, reflecting our flexible approach to modeling household health state distributions.
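The one-draw-per-household construction described above can be illustrated with a small simulation. This is only a sketch: a plain lognormal stands in for each household's plan-specific spending distribution, and the parameter ranges are hypothetical rather than taken from the estimates.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical households: each has its own (mu, sigma) for log spending
# under its chosen plan. These are NOT the paper's estimates.
households = [(rng.uniform(7.5, 9.5), rng.uniform(0.8, 1.4)) for _ in range(5000)]

# One random draw per household from that household's own predicted distribution.
draws = [rng.lognormvariate(mu, sigma) for mu, sigma in households]

# Population moments of the simulated distribution, to set against observed moments.
print(f"mean: {statistics.mean(draws):,.0f}")
print(f"sd:   {statistics.pstdev(draws):,.0f}")
```

The pooled draws form one realization of the population spending distribution, whose mean and standard deviation can then be compared with their observed counterparts.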

Figure 4. — *Notes*: The figure shows kernel density plots of the predicted and observed distributions of total healthcare spending on a log scale, separately by tertile of household health risk score, conditional on predicted/observed spending greater than $10. All years are pooled, so an observation is a household-year. Vertical lines represent the mean of the respective distribution. Predicted distributions are based on estimates from column 3 in Tables 3 and A.8. Overall, the observed probability of household spending less than $10 is 2.9 percent, and the predicted probability is 2.8 percent.

V.B. Evaluating Vertical Choice

We now construct the ingredients needed to evaluate the optimal plan menu: each household’s willingness to pay for different levels of coverage, and the social surplus generated by allocating each household to different levels of coverage. We first specify the contracts under consideration.

Potential Contracts.

We consider concave, piecewise linear contracts that are vertically differentiated and well-ordered by coverage level.²⁵ While our numerical simulations consider all coverage levels between the null contract and full insurance, we limit attention in our graphical analysis to the range of coverage levels that are ultimately relevant given our parameter estimates. The lowest level of coverage we consider graphically is a contract with a deductible and out-of-pocket maximum of $10,000. The highest level of coverage remains full insurance. We begin by considering five contracts spanning this range, and refer to them as Catastrophic, Bronze, Silver, Gold, and full insurance. The contracts’ actuarial values are 0.53, 0.61, 0.72, 0.86, and 1.00. Their out-of-pocket cost functions are depicted in Figure A.4a.²⁶ We revisit the specification of potential contracts in Section V.C.
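To make the contract shape concrete, the following sketch writes a concave, piecewise linear out-of-pocket cost function with a deductible, a coinsurance rate, and an out-of-pocket maximum. The function name `oop_cost` is hypothetical, and the Gold-like coinsurance rate and out-of-pocket maximum are illustrative assumptions; only the $10,000 Catastrophic parameters and the $1,125 Gold deductible appear in the text.

```python
def oop_cost(m, deductible, coinsurance, oop_max):
    """Out-of-pocket cost at total spending m under a deductible,
    a coinsurance rate above the deductible, and an out-of-pocket maximum."""
    below = min(m, deductible)                    # paid in full up to the deductible
    above = coinsurance * max(m - deductible, 0)  # consumer share past the deductible
    return min(below + above, oop_max)            # capped at the out-of-pocket maximum

# Lowest contract considered graphically: deductible = out-of-pocket maximum = $10,000.
catastrophic = dict(deductible=10_000, coinsurance=0.0, oop_max=10_000)
# Hypothetical Gold-like contract: only the $1,125 deductible comes from the text.
gold_like = dict(deductible=1_125, coinsurance=0.2, oop_max=3_000)

for m in (500, 5_000, 20_000):
    print(m, oop_cost(m, **catastrophic), oop_cost(m, **gold_like))
```

The marginal out-of-pocket cost falls from one, to the coinsurance rate, to zero as spending rises, which is what makes the function concave.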

Willingness to Pay.

We make several simplifications to our empirical model in order to map it from the setting in Oregon back to our theoretical model, maintaining parameterizations and the estimated distribution of consumer types. To start, we put aside intertemporal variation in household health and focus on the first year each household appears in the data. We also use the provider price parameter ϕ = 1 (corresponding to that of Moda). This leaves each household with a single type θ_k = {F_k, ψ_k, ω_k}, where F_k is a shifted lognormal distribution described by parameters {μ_k, σ_k, κ_k}. With respect to payoffs (equation 6), we (i) hold all non-financial features fixed, so any insurer fixed effects cancel; (ii) suppose households choose from the new menu of contracts for the first time, removing any effects of inertia; (iii) set α^OOP to one so that premiums and out-of-pocket costs are valued one-for-one; and (iv) assume the idiosyncratic shock is not welfare-relevant.²⁷

With attention restricted to the dimension of coverage level, we can use equation 2 to express willingness to pay under our parameterization:²⁸

$$
WTP(x, \theta_k) = \underbrace{E_l\left[c_{x_0}(l) - c_x(l)\right]}_{\substack{\text{Expected reduction in out-of-pocket} \\ \text{cost holding behavior fixed}}} + \underbrace{E_l\left[\frac{\omega_k}{2}\left(1 - c_x'(m^*(l, \omega_k, x))\right)^2\right]}_{\substack{\text{Expected payoff from moral} \\ \text{hazard spending}}} + \underbrace{\Psi(x, \theta_k)}_{\substack{\text{Value of risk} \\ \text{protection}}}.
$$

As before, willingness to pay is composed of three parts: the “transfer” of expected out-of-pocket costs holding behavior fixed (at uninsured behavior), the expected payoff from moral hazard spending, and the value of risk protection. Recall that only the latter two components are relevant to social welfare.

Figure 5 presents the distribution of willingness to pay among family households.²⁹ Whereas our point of reference in the two-contract example was x_L, our reference contract now is the Catastrophic contract. We hereinafter refer to “willingness to pay” for a given contract, but emphasize that this is marginal willingness to pay with respect to this particular reference. Figure 5, as well as the figures that follow, is composed of connected binned scatter plots: Households are ordered on the horizontal axis according to their willingness to pay, those at each percentile are binned together, and the average value of the vertical axis variable is plotted for each bin.³⁰ These 100 points are then connected with a line. The left panel shows the willingness to pay curves for our candidate contracts. As contracts are vertically differentiated, all households are willing to pay more for higher coverage. The highest willingness-to-pay households are willing to pay $10,000 more for full insurance than for the Catastrophic contract. Figure A.5 provides demographic information about households across the distribution of willingness to pay. Higher willingness-to-pay households tend to be older, have more family members, be more risk averse, and most strikingly, have higher expected healthcare spending.
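The connected binned scatter construction described above can be sketched in a few lines. The helper `binned_means` and the data are hypothetical, and equal-sized bins stand in for percentiles.

```python
def binned_means(wtp, y, n_bins=100):
    """Order households by willingness to pay, split them into n_bins
    equal-sized groups, and return the mean of y within each group."""
    order = sorted(range(len(wtp)), key=lambda i: wtp[i])
    n = len(order)
    means = []
    for b in range(n_bins):
        members = order[b * n // n_bins:(b + 1) * n // n_bins]
        if members:  # a bin can be empty when there are fewer households than bins
            means.append(sum(y[i] for i in members) / len(members))
    return means

# Hypothetical data: 1,000 households whose outcome rises with willingness to pay.
wtp = [float(i) for i in range(1000)]
y = [2.0 * w for w in wtp]
curve = binned_means(wtp, y)
print(len(curve), curve[0], curve[-1])
```

Plotting the returned values against the bin index and joining them with a line gives a curve of the kind shown in the figures.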

The right panel shows, for one contract, the decomposition of willingness to pay. We find that the transfer represents the majority of willingness to pay for most households, but that this varies across the distribution of willingness to pay. For households with low willingness to pay, about half is made up by the transfer. For households with high willingness to pay, nearly all is made up by the transfer. The highest willingness-to-pay households are willing to pay $7,500 more for Gold than for Catastrophic only in order to avoid paying an additional $7,500 in out-of-pocket costs. Importantly, this means that allocating them to higher coverage generates almost no additional social surplus.

Social Surplus.

As in equation 3, the social surplus generated by allocating a household to a given contract is the difference between willingness to pay and expected insurer cost, which under our parameterization is equal to:

$$
SS(x, \theta_k) = \underbrace{\Psi(x, \theta_k)}_{\substack{\text{Value of risk} \\ \text{protection}}} - \underbrace{E_l\left[\frac{\omega_k}{2}\left(1 - c_x'(m^*(l, \omega_k, x))\right)^2\right]}_{\substack{\text{Social cost} \\ \text{of moral hazard}}}. \tag{10}
$$

The value of risk protection varies in the population to the extent there is variation in risk aversion and in the amount of uncertainty about out-of-pocket costs. The social cost of moral hazard varies in the population to the extent there is variation in the moral hazard parameter and in consumers’ expected marginal out-of-pocket cost.
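Because social surplus is defined as willingness to pay minus expected insurer cost, subtracting the social surplus expression from the willingness-to-pay expression recovers the expected insurer cost implied by the two formulas. The step below is a reader's check that uses only the symbols already defined; it is not an additional result, and it reads the implied cost as measured relative to the reference contract $x_0$.

```latex
\begin{aligned}
WTP(x,\theta_k) - SS(x,\theta_k)
 &= E_l\!\left[c_{x_0}(l) - c_x(l)\right]
  + 2\,E_l\!\left[\tfrac{\omega_k}{2}\bigl(1 - c_x'(m^*(l,\omega_k,x))\bigr)^2\right] \\
 &= E_l\!\left[c_{x_0}(l) - c_x(l)\right]
  + E_l\!\left[\omega_k\bigl(1 - c_x'(m^*(l,\omega_k,x))\bigr)^2\right].
\end{aligned}
```

On this reading, the insurer bears the transfer plus a moral hazard term of $\omega_k (1 - c_x')^2$ in expectation. The household values half of that term, and the remaining half is the social cost of moral hazard in equation 10.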

To understand the contribution of each of these components to the overall relationship between willingness to pay and social surplus, we first plot them separately. Figure 6a shows the distribution across households of the marginal value of risk protection generated by each contract, relative to the Catastrophic contract. We find that the majority of the social welfare gains from more generous coverage are driven by households with intermediate levels of willingness to pay. This pattern is driven by the concavity of the contracts we consider. High willingness-to-pay households are likely to realize health states that put them above the out-of-pocket maximum of every contract, leaving them little uncertainty about out-of-pocket costs. Among the top fifth of households by willingness to pay, the probability of spending more than $10,000, even without moral hazard, is 65 percent. Figure A.6 shows the spending distributions faced by households at different levels of willingness to pay. Variation in out-of-pocket uncertainty only becomes meaningful for households for whom much of the density of their spending distribution lies in the range $0-$10,000, within which marginal out-of-pocket cost varies across contracts.

Figure 6b shows the distribution of the marginal social cost of moral hazard. It provides two important insights. First, high willingness-to-pay households on average barely change their behavior across this range of coverage levels.³¹ For similar reasons as with risk protection, the majority of the social cost of more generous coverage is driven by households with lower willingness to pay. The second insight is that the Gold contract recovers about half of the social cost of moral hazard induced by full insurance. The $1,125 deductible is high enough to deter excess spending, but low enough to sacrifice only a small amount of risk protection.

Finally, Figure 7 shows the resulting social surplus curves, equal to the vertical differences between the curves in Figures 6a and 6b. The social surplus curves represent the average social surplus achieved by allocating all households at a given percentile of willingness to pay to a given contract, relative to allocating them to the Catastrophic contract. Since households can be screened only by their willingness to pay, the figure permits a direct assessment of the optimal menu.

First, note that all curves lie everywhere above zero, meaning the Catastrophic contract (and any lower level of coverage) should be unambiguously excluded from the optimal menu. Any lower level of coverage can be ruled out because its social surplus curve will lie everywhere below that of the Catastrophic contract (cf. Proposition 3 in Appendix A.2). Among the remaining contracts, the social surplus curves of Bronze, Silver, and full insurance lie everywhere below that of the Gold contract, which delivers higher average social surplus at every level of willingness to pay. Households with higher willingness to pay should therefore not have a higher level of coverage than households with lower willingness to pay: they should, on average, have the same level of coverage.³² It follows that offering choice over these contracts is not efficient in this population. Numerical optimization confirms this result. The optimal menu consists of only the Gold contract, and this allocation achieves social surplus (relative to allocating all households to Catastrophic) equal to the integral of the Gold social surplus curve: $1,514 per household.
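The dominance argument can be expressed as a simple check on binned social surplus curves. The sketch below uses hypothetical curves over five willingness-to-pay bins (not the estimated curves) and a hypothetical helper, `dominant_contract`; it covers only the case in which a single contract is pointwise best.

```python
def mean(xs):
    return sum(xs) / len(xs)

def dominant_contract(curves):
    """Return the contract whose surplus curve is weakly highest at every
    willingness-to-pay bin, or None if no single contract dominates."""
    names = list(curves)
    n_bins = len(curves[names[0]])
    for name in names:
        if all(curves[name][p] >= curves[other][p]
               for other in names for p in range(n_bins)):
            return name
    return None

# Hypothetical surplus curves (dollars, relative to the reference contract)
# over five willingness-to-pay bins; these are NOT the paper's estimates.
curves = {
    "Bronze": [300, 600, 700, 400, 100],
    "Silver": [500, 1100, 1300, 800, 200],
    "Gold":   [600, 1500, 1900, 1200, 300],
    "Full":   [200, 1200, 1700, 1100, 250],
}

best = dominant_contract(curves)
print(best, mean(curves[best]) if best else None)
```

When one curve dominates, the average height of that curve across equal-sized bins plays the role of the integral, giving per-household surplus from the single-contract menu. When no curve dominates, the function returns `None` and a fuller search over menus would be needed.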

V.C. Robustness

More Contracts.

A natural question is how the optimal menu would change if more contracts were available. Figure 7 indicates that the Silver contract is everywhere too little coverage, and that full insurance is everywhere too much coverage, but it says nothing about the potential gains of offering additional contracts within this range. We explore this question by expanding the number of contracts in the Silver-to-full insurance range from one (the Gold contract) to 20. The out-of-pocket cost functions for this denser set of potential contracts are depicted in Figure A.4b.

We find that when efficient coverage level can be measured more finely, high willingness-to-pay households do have a slightly higher efficient level of coverage. In a small neighborhood of the Gold contract, it is therefore efficient to offer a choice. The optimal menu features four contracts.³³ This allocation achieves social surplus, relative to allocating all households to the Catastrophic contract, of $1,528 per household. This represents a gain of $14 over what is achieved by the Gold contract alone, and of only $5 over what can be achieved by a single contract in the denser set.

Different Contracts.

We next explore whether our results are robust to alternative contract designs. We have so far considered one particular design, as depicted in panels (a) and (b) of Figure A.4. We now consider three alternatives, as depicted in panels (c)-(e). These are: (c) removing deductibles, (d) removing the coinsurance region, and (e) extending the coinsurance region. Within each alternate contract design, we consider a set of five vertically differentiated contracts.

We solve for the optimal menu within each new set of contracts. These results are presented in Table A.9. We find that among contracts without a deductible and without a coinsurance region, the optimal menu again features a single contract. For much the same reasons as this was true among the original contracts, higher willingness-to-pay consumers do not have a higher efficient level of coverage. We also find that among contracts with an extended coinsurance region, the optimal menu does feature vertical choice. Because it takes longer to reach the out-of-pocket maximum, households are less likely to hit it, and high willingness-to-pay households face much more variation in risk across contracts.

Our findings suggest that the contract dimension most relevant to the question of vertical choice is the stop-loss point, i.e., the level of total spending at which the out-of-pocket maximum is reached. Namely, our results suggest that if a regulator wanted consumers to pay on the margin for only a short time (a low stop-loss point), vertical choice may not offer meaningful welfare gains. If instead a regulator wanted consumers to pay on the margin for a long time (a high stop-loss point), our results suggest it may be useful to offer consumers a choice.³⁴

Different Consumers.

We next explore the robustness of our findings to different populations of consumers. We do this in two ways: (i) by re-estimating our model in the full sample of households that includes Kaiser enrollees, and (ii) by adjusting individual parameter estimates to establish their isolated effects on results.

As discussed in Section III.A, we exclude Kaiser enrollees from our primary analysis sample in order to focus on the vertical choice across coverage levels, as opposed to the horizontal choice across plan types (HMO vs. PPO). Kaiser enrollees are on average slightly younger and healthier, and 3 percent of households did at one point switch between a Kaiser and non-Kaiser plan. We investigate how these factors impact our results by re-estimating our model using the full sample of households. Table A.10 presents these parameter estimates. Figure A.8 presents the corresponding willingness to pay and social surplus curves. Though the shapes and levels of the resulting social surplus curves are slightly different than under our original estimates, our qualitative results and the underlying mechanisms are unchanged. The optimal menu remains the Gold contract alone.

Second, we explore how specific perturbations of our parameter estimates affect our results. We explore nine cases, including raising and lowering the moral hazard parameter, raising and lowering risk aversion, and increasing heterogeneity in risk aversion and the moral hazard parameter. We also present three cases in which households vary only in their preferences: risk aversion and/or the moral hazard parameter. Our findings are summarized in Table A.11.³⁵ For each case, the table shows the percent of households enrolled in each contract under the optimal menu. Intuitively, we find that the optimal menu is more likely to feature a choice when risk aversion plays a larger role in driving variation in willingness to pay. At the extreme, when households vary only in their risk aversion, nearly perfect screening is possible as private and social incentives are directly aligned.

Table A.11 also reports the welfare gains available from a denser contract space, and whether or not a choice would be efficient in that context. We find that while choice is almost always efficient in the denser contract space, the welfare gains available are consistently small (never exceeding $16 per household per year). In the extreme case in which households vary only in their moral hazard parameter, private and social incentives are directly misaligned, and choice is not efficient even among the dense set of contracts. Across all nine cases, the welfare gains from vertical choice relative to what can be achieved by a single contract do not exceed $10 per household. In a broad neighborhood of our parameter estimates, the efficiency loss from forgoing vertical choice is therefore either zero or economically small.

VI. Counterfactual Pricing Policies

Returning to our focal set of metal-tier contracts and the estimated distribution of consumer types, we compare outcomes under five pricing policies: (i) regulated pricing with community rating, (ii) regulated pricing with type-specific prices, (iii) competitive pricing with community rating and a mandate, (iv) competitive pricing with type-specific prices and a mandate, and (v) premiums to support vertical choice. Under regulated pricing, premiums are set to maximize social surplus. Under competitive pricing, premiums are endogenous and must equal average costs on a plan-by-plan basis. A mandate enforces a minimum level of coverage at the Catastrophic contract. Under premiums to support vertical choice, premiums are set to support the availability of (read: enrollment in) every contract.

We consider two pricing policies, (ii) and (iv), in which premiums can vary by consumer attributes. If observable dimensions of household type are predictive of efficient coverage level, tailoring plan menus to observables may improve allocations. We divide households into four groups: childless households under age 45, childless households over age 45, households under age 45 with children, and households over age 45 with children.³⁶ We use age and whether the household has children because these are used in practice on ACA exchanges and are also important observables with which the parameters of our model may vary.

VI.A. Welfare Outcomes

Table 4 summarizes outcomes under each of these five pricing policies. It shows the percent of households Q enrolled in each contract at the optimal feasible allocation, the percent of first-best social surplus achieved, and the expected per-household insurer cost AC among households enrolled in each contract (in thousands of dollars). We benchmark outcomes against the first best allocation of households to contracts (as depicted in Figure A.7), which cannot be supported by prices unless premiums can vary by all aspects of consumer type. The first best allocation generates $1,542 in social surplus per household, relative to allocating all households to the Catastrophic contract. Expected total healthcare spending per household at the first best allocation is $12,400, and expected insurer cost per household is $10,351.

Table 4.

Outcomes Under Alternative Pricing Policies

				Potential Contracts
	Policy	% of First Best Surplus		Full	Gold	Silver	Bronze	Catastrophic
*	First best	1.000	Q:	0.06	0.75	0.19	< 0.01	–
			AC:	18.35	9.43	11.48	39.18	–
(i)	Regulated pricing with community rating	0.982	Q:	–	1.00	–	–	–
			AC:	–	10.62	–	–	–
(ii)	Regulated pricing with type-specific prices	0.989	Q:	–	0.98	0.02	–	–
			AC:	–	10.71	0.75	–	–
(iii)	Competitive pricing with community rating	0.000	Q:	–	–	–	–	1.00
			AC:	–	–	–	–	6.30
(iv)	Competitive pricing with type-specific prices	0.075	Q:	–	–	0.05	–	0.95
			AC:	–	–	4.95	–	6.41
(v)	Premiums to support vertical choice	0.796	Q:	0.01	0.07	0.63	0.28	0.01
			AC:	61.04	31.91	8.47	1.75	0.28

Notes: The table summarizes outcomes under five pricing policies as well as the first best allocation, among the 25,636 family households. Q represents the percent of households enrolled in each contract. AC represents average expected insurer cost (in thousands of dollars) among households enrolled in each contract. Social surplus is measured relative to the Catastrophic contract. At the first best allocation, social surplus is $1,542 per household and expected insurer cost is $10,351 per household.

Policy (i) is the baseline policy considered in this paper, in which the regulator can set premiums but is restricted to community rating. As indicated by Figure 7, it is welfare maximizing to offer only the Gold contract.³⁷ Interestingly, although 25 percent of households are misallocated, this policy is almost as efficient as the first best allocation. That is, the ability to perfectly discriminate among consumers would increase welfare by only $28 per household per year. Driving this result is the fact that social welfare is quite flat across the top contracts, and particularly so among the households who are misallocated under policy (i). Among these households, the social surplus at stake between the Silver and Gold contracts is on average only $26; among all households, it is $112.
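The figures in this comparison can be cross-checked against Table 4 with simple arithmetic. The sketch below takes its inputs from the table and its notes; treating the Bronze share reported as "< 0.01" as zero is an assumption made only for this check.

```python
first_best_surplus = 1542.0   # dollars per household, from the table notes
share_policy_i = 0.982        # share of first-best surplus under policy (i)

surplus_policy_i = share_policy_i * first_best_surplus
gap = first_best_surplus - surplus_policy_i
print(round(surplus_policy_i), round(gap))

# Enrollment-weighted average insurer cost at the first best (thousands of dollars).
# The Bronze share is reported only as "< 0.01" and is treated as zero here (assumption).
q = {"Full": 0.06, "Gold": 0.75, "Silver": 0.19, "Bronze": 0.0}
ac = {"Full": 18.35, "Gold": 9.43, "Silver": 11.48, "Bronze": 39.18}
weighted_ac = sum(q[k] * ac[k] for k in q)
print(round(weighted_ac, 2))
```

The first line of output corresponds to the $1,514 surplus of the Gold-only menu and the $28 gap to the first best. The weighted average cost comes out close to the reported $10,351 per household, with the small difference attributable to rounding in the table.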

Because pricing policy (i) is almost as efficient as the first best outcome, there is little scope for improvement by varying prices by consumer type. Even so, under policy (ii) we do find that allowing the regulator to discriminate can improve allocational efficiency by a small amount. To households under age 45 with children, it is efficient to offer a choice between Gold and Silver. To the other three sets of households, it is still efficient to offer only Gold. It becomes possible to productively offer lower coverage to young households with children because the other households, to whom it is not efficient to provide such low coverage, can now be excluded.

Policy (iii) considers competitive pricing with community rating and a mandated minimum level of coverage at the Catastrophic contract. We calculate the competitive equilibrium proposed by Azevedo and Gottlieb (2017). We find that a separating equilibrium above minimum coverage cannot be supported in this population, and the market unravels. Though choice is permitted, the market cannot deliver it. Policy (iv) allows the market to be segmented. We find that among young households with children, a separating equilibrium between the Silver and Catastrophic contracts can be supported. The other three market segments unravel.

The first four policies are natural benchmarks, but none turn out to feature the same degree of vertical choice that is observed in many U.S. health insurance markets, including the market we study. A major difference between these real markets and our benchmark policies is that the former feature a complex set of taxes and subsidies that affect consumer premiums in ways not replicated by our benchmark policies. To mimic this status quo outcome, policy (v) implements premiums that can support enrollment in every contract. We target enrollment shares that match the true metal-tier shares observed on ACA exchanges in 2018.³⁸ Because households with intermediate willingness to pay (for whom social surplus increases steeply at low coverage levels; see Figure 7) now choose Silver instead of Bronze or Catastrophic, this allocation substantially increases welfare relative to the competitive outcome.

VI.B. Distributional Outcomes

The population faces an unavoidable healthcare spending bill of $11,723 per household. It is unavoidable because it arises even if all households have the minimum allowable coverage (Catastrophic). While full insurance provides the benefit of additional risk protection, it also raises the population’s healthcare spending bill by 8 percent due to moral hazard, to $12,695 per household.

The spending bill is funded by a combination of out-of-pocket costs and insured costs. Insured costs are in turn funded by premiums and, to the extent optimal premiums imply an aggregate deficit, by taxes. Different coverage levels imply different divisions between out-of-pocket costs and insured costs. For example, if all households had Catastrophic coverage, in expectation 47 percent of the spending bill would be paid out-of-pocket, and 53 percent would be insured. If all households had full insurance, 100 percent of the spending bill would be insured. There are therefore large differences across policies in the source of funding for the population healthcare spending bill, and thereby in how evenly the spending bill is shared in the population.

Figure 8 shows distributional outcomes under three of our candidate policies: (i) regulated pricing (“All Gold”), (iii) competitive pricing (“All Catastrophic”), and (v) premiums to support vertical choice (“Vertical Choice”). The left panel shows the distribution of healthcare spending bills across households. Each household’s healthcare spending bill equals the premium plus expected out-of-pocket cost associated with their chosen contract, plus any tax assessed on all consumers. For simplicity (and because we lack information on income), we assess taxes equally across households. Households are again ordered on the horizontal axis according to their willingness to pay. Under “All Catastrophic,” there is a premium-plus-tax of $6,298. The highest willingness-to-pay households then also pay an expected out-of-pocket cost of $9,708, implying a healthcare spending bill of $16,006. The lowest willingness-to-pay households pay an expected out-of-pocket cost of only $1,500, implying a healthcare spending bill of only $7,798. When the population has higher coverage, as under the other two pricing policies, the healthcare spending bill is shared more evenly in the population.
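The spending-bill figures for the “All Catastrophic” policy follow directly from the definition given above. The short sketch below reproduces the arithmetic using the values quoted in the text; the function name `spending_bill` is hypothetical.

```python
premium_plus_tax = 6298  # dollars, "All Catastrophic"

def spending_bill(expected_oop, premium_plus_tax=premium_plus_tax):
    """Household healthcare spending bill: premium-plus-tax plus expected out-of-pocket cost."""
    return premium_plus_tax + expected_oop

print(spending_bill(9708))  # highest willingness-to-pay households
print(spending_bill(1500))  # lowest willingness-to-pay households
```

The gap between the two bills equals the gap in expected out-of-pocket costs, since the premium-plus-tax is common to all households under this policy.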

The right panel evaluates the distribution of consumer surplus, incorporating preferences over risk and healthcare utilization in addition to just spending outcomes. In typical markets, consumer surplus is measured relative to the absence of a product. As we enforce a minimum level of coverage, here it is measured relative to the absence of a better product. Under each policy, consumer surplus is the difference between a household’s marginal willingness to pay for their chosen plan, and the marginal premium-plus-tax associated with that choice. The integral of each consumer surplus curve equals the social welfare generated by that policy, relative to the “All Catastrophic” outcome. The difference between the “All Gold” consumer surplus curve in Figure 8b and the Gold contract’s social surplus curve in Figure 7 is that the former shows who receives the surplus, while the latter shows who generates it. The integrals of the two curves are the same.

Figure 8b depicts a classic feature of insurance markets with adverse selection. The optimal feasible allocation (“All Gold”) results in higher coverage and greater social welfare gains, while the competitive outcome (“All Catastrophic”) results in lower coverage but a more even distribution of welfare gains. At the competitive outcome, no one is made worse off than they were absent the market. Regulatory intervention can offer substantial efficiency gains, at the cost of making some households worse off.

Dynamic Considerations.

These static gains from trade, and the distribution thereof, are evaluated at a point in time at which households are aware of their endowed type, θ. In the spirit of Hendren (2020) and Ghili et al. (2020), we can also consider welfare from the perspective of an “unborn” consumer, who, prior to participating in our spot insurance market, faces a lottery over types.³⁹ Note that a consumer’s type θ uniquely determines their willingness to pay, and thus their position on the horizontal axes of Figure 8. Instead of considering a lottery over types, we can therefore directly consider the lottery over levels of willingness to pay. Under each policy, the lottery over types faced by an unborn consumer is equivalent to the uniformly distributed lottery over consumer surplus outcomes shown in Figure 8b.

The question then becomes where to normalize utility across consumers. In Figure 8b, we have assumed consumers are equally well off absent the market. But from “behind the veil of ignorance,” it may be more natural to assume they are equally well off when fully insured. Such a renormalization would be reflected in Figure 8b by rotating the consumer surplus curves counter-clockwise, until an “All full insurance” consumer surplus curve were horizontal (as depicted in Figure A.9). In this case, it becomes clear that the “All Gold” policy delivers the least-risky distribution of surplus in the population, consistent with the intuition that higher coverage provides greater dynamic risk protection (Handel, Hendel and Whinston, 2015). Among the three candidate policies, the “All Gold” policy therefore delivers the most efficiency and the most equity in the spot market.⁴⁰

VII. Conclusion

This paper presents a framework for evaluating optimal menus of coverage levels in regulated health insurance markets. Our framework incorporates consumer heterogeneity along multiple dimensions, endogenous healthcare utilization, and menus of nonlinear insurance contracts among which traded contracts are endogenous. We show how willingness to pay for insurance can be decomposed into a component that is only privately relevant and a component that is also socially relevant, the latter of which gives insurance value beyond its role as a redistributive tool. We emphasize how the privately relevant, redistributive component plays a central role in determining feasible allocations. When premiums must be uniform, it may not be possible to align the private incentive to maximize one’s own transfer with the social incentive to mitigate residual uncertainty. The presence of moral hazard means that the problem is more complicated than simply mandating full insurance for all.

We show that the efficiency of vertical choice hinges on whether consumers with higher willingness to pay have higher efficient levels of coverage. In reverse, this condition implies that a lowest-coverage plan should only be offered if the lowest willingness-to-pay consumers are the intended recipients. In the setting we study, we find that lowest willingness-to-pay consumers are sufficiently risk averse, and facing sufficient risk, to warrant coverage at least as high as the Silver contract. At the other end, our key condition implies that a highest-coverage plan should only be offered if the highest willingness-to-pay consumers are the intended recipients. The highest coverage we consider is full insurance, and we find that it would be more efficient for the highest willingness-to-pay consumers to have lower coverage. Between these bounds, we find that private values for higher coverage are not strongly positively correlated with social values, and thus that offering a choice cannot provide economically meaningful welfare gains. We also find that the welfare stakes of misallocation are low. Relative to what can be achieved by a single contract, the ability to perfectly screen consumers would increase welfare by only $28 per household per year.

An important limitation of this paper is that we assume the socially optimal level of healthcare utilization is the level a consumer would choose absent insurance. If healthcare providers charge supracompetitive prices, or if there are positive externalities of healthcare utilization, it may well be the case that using insurance to induce additional utilization is desirable. In addition, important considerations our model does not address arise when consumers face liquidity constraints (Ericson and Sydnor, 2018) or are protected from large losses by limited liability in addition to by insurance (Gross and Notowidigdo, 2011; Mahoney, 2015). Finally, a central simplification in our model is that healthcare is a homogenous good over which consumers choose only the quantity to consume. In reality, healthcare is multidimensional, and the time and space over which utilization decisions are made are complex. We see the extension of our model in these directions as a fruitful avenue for future research.

Supplementary Material

Appendix

NIHMS1768818-supplement-Appendix.pdf (2MB, pdf)

Acknowledgments

We are grateful to Vivek Bhattacharya, David Cutler, Leemore Dafny, David Dranove, Amy Finkelstein, Tal Gross, Igal Hendel, Gaston Illanes, Matthew Leisten, Thomas McGuire, Matt Notowidigdo, Chris Ody, Rob Porter, Elena Prager, Mar Reguant, Bill Rogerson, Mark Shepard, Amanda Starc, Bob Town, Tom Wiseman, and Gabriel Ziegler, as well as to the co-editor, Liran Einav, and three anonymous referees for advice and suggestions that greatly benefited this research. We also thank many participants in seminars at Northwestern, Kellogg, BI Norwegian Business School, University of Chicago, Princeton, MIT, Washington University in St. Louis, Yale SOM, Rochester Simon, NYU, MIT Sloan, Chicago Booth, Wisconsin, UT Austin, NBER Summer Institute, and the 2018 ASHEcon Conference for helpful comments. Finally, we thank Jason Abaluck and Jon Gruber for access to the data and for their support of this research project.

Appendix A Derivations and Proofs

A.1. Derivation of Willingness to Pay

The expected utility of a type-θ consumer with initial income $\hat{y}$ for contract x at premium p is given by U(x, p, θ), as defined in Equation 1 and repeated here:

U (x, p, θ) = E_{l} [u_{ψ} (\hat{y} - p - c_{x}^{*} (l, ω, x) + b^{*} (l, ω, x))] .

The corresponding certainty equivalent CE(x, p, θ) solves u_ψ(CE(x, p, θ)) = U(x, p, θ). It can be expressed as:

\begin{aligned}
CE(x, p, θ) &= u_{ψ}^{-1} (U(x, p, θ)) \\
&= EV(x, θ) + \hat{y} - p + u_{ψ}^{-1} (U(x, p, θ)) - EV(x, θ) - \hat{y} + p \\
&= EV(x, θ) + \hat{y} - p - RP(x, p, θ),
\end{aligned}

where $E V (x, θ) + \hat{y} - p$ is the expected payoff and RP(x, p, θ) is the risk premium associated with the lottery. In particular,

\begin{aligned}
EV(x, θ) &= E_{l} [b^{*}(l, ω, x) - c_{x}^{*}(l, ω, x)] = E_{l} [b^{*}(l, ω, x_{0}) - c_{x}^{*}(l, ω, x_{0}) + v(l, ω, x)], \text{ and} \\
RP(x, p, θ) &= EV(x, θ) + \hat{y} - p - u_{ψ}^{-1} (U(x, p, θ)),
\end{aligned}

(A.1)

where as before $v (l, ω, x) = b^{*} (l, ω, x) - b^{*} (l, ω, x_{0}) - (c_{x}^{*} (l, ω, x) - c_{x}^{*} (l, ω, x_{0}))$ . A consumer’s willingness to pay for contract x relative to the null contract x₀ is equal to $\tilde{p}$ that solves:

\begin{aligned}
CE(x, \tilde{p}, θ) &= CE(x_{0}, 0, θ) \\
EV(x, θ) + \hat{y} - \tilde{p} - RP(x, \tilde{p}, θ) &= EV(x_{0}, θ) + \hat{y} - RP(x_{0}, 0, θ) \\
\tilde{p} &= EV(x, θ) - EV(x_{0}, θ) + RP(x_{0}, 0, θ) - RP(x, \tilde{p}, θ) .
\end{aligned}

To obtain a closed-form expression for willingness to pay, we assume constant absolute risk aversion, and thus that the risk premium RP does not depend on residual income.¹ In this case, marginal willingness to pay for contract x relative to the null contract is given by:

W T P (x, θ) = E V (x, θ) - E V (x_{0}, θ) + R P (x_{0}, θ) - R P (x, θ) = E_{l} [c_{x_{0}}^{*} (l, ω, x_{0}) - c_{x}^{*} (l, ω, x_{0}) + v (l, ω, x)] + Ψ (x, θ),

where Ψ(x, θ) = RP(x₀, θ) − RP(x, θ). If the null contract provides a riskier distribution of payoffs than contract x, Ψ(x, θ) will be positive.
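As a numerical illustration of why constant absolute risk aversion yields a closed form, the following Python sketch computes the risk premium of a discrete payoff lottery under $u_{ψ}(z) = -\exp(-ψ z)$ and checks that it is unchanged when every payoff is shifted by a constant, such as income net of the premium. The lottery, the value of ψ, and the function names are hypothetical choices made only for illustration; they are not taken from our estimates.

```python
import math

def certainty_equivalent(payoffs, probs, psi):
    """Certainty equivalent of a discrete lottery under u(z) = -exp(-psi * z)."""
    expected_utility = sum(p * -math.exp(-psi * z) for z, p in zip(payoffs, probs))
    return -math.log(-expected_utility) / psi

def risk_premium(payoffs, probs, psi):
    """Expected payoff minus certainty equivalent."""
    expected_payoff = sum(p * z for z, p in zip(payoffs, probs))
    return expected_payoff - certainty_equivalent(payoffs, probs, psi)

# Hypothetical lottery over payoffs (thousands of dollars) and risk aversion.
payoffs = [0.0, -1.0, -5.0]
probs = [0.7, 0.2, 0.1]
psi = 0.5

rp_base = risk_premium(payoffs, probs, psi)
# Shift every payoff by a hypothetical residual income y_hat - p = 40 - 3.
rp_shifted = risk_premium([z + 37.0 for z in payoffs], probs, psi)

# The two values agree up to floating-point error.
print(rp_base, rp_shifted)
```

Because the shift cancels, the risk premium can be written as RP(x, θ) without the premium argument, which is the simplification used in the expression for WTP(x, θ).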

A.2. Definitions and Proofs

Assumptions.

Consider the model in Section II.A. Suppose contracts x ∈ X are characterized by increasing, continuous, and concave out-of-pocket cost functions $c_{x} : ℝ_{+} \to ℝ_{+}$ , where c_x(m) ≤ m ∀ m and which are differentiable almost everywhere with $c_{x}^{'} \in [0, 1]$ , where $c_{x}^{'}$ denotes the derivative wherever it exists. Suppose consumers have type $θ = (F, ω, ψ) \in Δ^{c} (ℝ) \times ℝ_{+ +} \times ℝ_{+ +} = : Θ$ .² Given health state realization $l \in ℝ$ , contract premium p, and initial income $\hat{y}$ , suppose consumers value healthcare spending $m \in ℝ_{+}$ according to $u_{ψ} (\hat{y} - p + b (m; l, ω) - c_{x} (m))$ , where $b (m; l, ω) = (m - l) - \frac{1}{2 ω} {(m - l)}^{2}$ and where u_ψ(x) = − exp(−ψx).

Under these assumptions, social surplus is given by SS(x, θ) = Ψ(x, θ) − SCMH(x, θ), where

Ψ (x, θ) = R P (x_{0}, θ) - R P (x, θ)

where

RP(x, θ) = ψ^{-1} \log (\underset{l \sim F}{E} [\exp (-ψ (z_{x}(l, θ) - {\bar{z}}_{x}(θ)))]),

S C M H (x, θ) = \underset{l ~ F}{E} [\frac{ω}{2} {(1 - c_{x}^{'} (m^{*} (l, ω, x)))}^{2}],

and where $z_{x}(l, θ) = \hat{y} - p + b(m^{*}(l, ω, x); l, ω) - c_{x}(m^{*}(l, ω, x))$ and ${\bar{z}}_{x}(θ) = E_{l \sim F} [z_{x}(l, θ)]$. Appendix C.2 solves for privately optimal spending $m^{*}(l, ω, x) = {argmax}_{m} (b(m; l, ω) - c_{x}(m))$ when contracts are piecewise linear with a deductible, coinsurance rate, and out-of-pocket maximum. As m* never falls on a kink, $c_{x}^{'}(m^{*})$ always exists. The indirect benefit from privately optimal spending is given by $b(m^{*}(l, ω, x); l, ω) = \frac{ω}{2} (1 - {(c_{x}^{'}(m^{*}(l, ω, x)))}^{2})$. Willingness to pay is given by $WTP(x, θ) = {\bar{z}}_{x}(θ) - {\bar{z}}_{x_{0}}(θ) + Ψ(x, θ)$.
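The closed forms above can be checked numerically in the simplest case of a linear contract, $c_{x}(m) = c \cdot m$. This is an assumption made here only to keep the example short, since the contracts we study are piecewise linear. The sketch below compares the first-order-condition solution for m* with a brute-force grid search, and verifies both the indirect benefit $\frac{ω}{2}(1 - c^{2})$ and the fact that spending beyond the health state, net of its benefit, equals $\frac{ω}{2}(1 - c)^{2}$, the term inside the expectation defining SCMH. All parameter values are hypothetical.

```python
def benefit(m, l, omega):
    """b(m; l, omega) = (m - l) - (m - l)^2 / (2 * omega)."""
    d = m - l
    return d - d * d / (2.0 * omega)

def optimal_spending(l, omega, coins):
    """Solve the first-order condition 1 - (m - l) / omega = coins, with m >= 0."""
    return max(0.0, l + omega * (1.0 - coins))

# Hypothetical health state, moral hazard parameter, and coinsurance rate.
l, omega, coins = 2.0, 1.5, 0.2

m_star = optimal_spending(l, omega, coins)
b_star = benefit(m_star, l, omega)
b_closed_form = omega / 2.0 * (1.0 - coins ** 2)

# Brute-force check of the argmax on a fine spending grid.
grid = [i * 0.001 for i in range(10001)]
m_grid = max(grid, key=lambda m: benefit(m, l, omega) - coins * m)

# Resource cost of the extra spending minus the benefit it generates.
net_cost = (m_star - l) - b_star
scmh_closed_form = omega / 2.0 * (1.0 - coins) ** 2

print(m_star, m_grid, b_star, b_closed_form, net_cost, scmh_closed_form)
```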

Definitions.

We say that a given contract is “higher coverage” than another if it provides both a higher certainty equivalent payoff WTP(x, θ) as well as greater risk protection Ψ(x, θ). This notion of coverage level is slightly stronger than what is implied by vertical differentiation alone. We use it because it allows our model to have the following desirable properties:

(i) the value of risk protection is increasing in coverage level;
(ii) the social cost of moral hazard is increasing in coverage level;
(iii) efficient coverage level is increasing in risk aversion;
(iv) efficient coverage level is decreasing in the moral hazard parameter.

Definitions 1 and 2 formalize the distinction between vertical differentiation and coverage level ordering. Propositions 1 and 2 provide the conditions on contracts that yield each ordering. Briefly, vertical differentiation requires only a relation on contracts’ level of out-of-pocket costs, while coverage level ordering (as defined) also requires a relation on contracts’ marginal out-of-pocket costs. A higher-coverage contract must have an out-of-pocket cost function that is everywhere below and everywhere flatter than that of a lower-coverage contract.

Implications.

The most important reason we use this notion of coverage level is that it allows extrapolation of social surplus across coverage levels. Namely, it implies that social surplus is single peaked in coverage level. Proposition 3 states this formally. Single-peakedness allows one to infer, for example, that if a given contract is less-than-socially-optimal coverage for all households, the same would be true of any lower level of coverage.

Proofs are provided below. Of the four stated properties of the model, property (i) is true by definition, property (ii) is established in the proof of Proposition 3, and properties (iii) and (iv) are proved in Lemmas 2 and 3, respectively.

Definition 1.

Contracts x′, x ∈ X are vertically differentiated (with x′ preferred) if and only if WTP(x′, θ) ≥ WTP(x, θ) ∀ θ ∈ Θ.

Definition 2.

Given x′, x ∈ X, contract x′ is higher coverage than contract x if and only if WTP(x′, θ) ≥ WTP(x, θ) ∀ θ ∈ Θ and Ψ(x′, θ) ≥ Ψ(x, θ) ∀ θ ∈ Θ. We denote this relationship by writing x′ ≥ x.

Proposition 1.

Contracts x′, x ∈ X are vertically differentiated (with x′ preferred) if and only if $c_{x^{'}} (m) \leq c_{x} (m) \forall m$ .

Proposition 2.

Given x′, x ∈ X, contract x′ is higher coverage than contract x if and only if $c_{x^{'}} (m) \leq c_{x} (m) \forall m$ and $c_{x^{'}}^{'} (m) \leq c_{x}^{'} (m)$ almost everywhere.

Proposition 3.

Social surplus is single peaked in coverage level. That is, fixing θ ∈ Θ and x, x′, x″ ∈ X where x ≤ x′ ≤ x″: if SS(x″, θ) ≥ SS(x′, θ), then SS(x′, θ) ≥ SS(x, θ).

Proof of Proposition 1: Contracts x′, x ∈ X are vertically differentiated (with x′ preferred) if and only if $c_{x^{'}} (m) \leq c_{x} (m) \forall m$ .

Fix θ ∈ Θ. Let $Z_{x} = : z_{x} (l, θ)$ be the random payoff induced by health state distribution F. At any health state l, lower out-of-pocket costs deliver higher payoffs:

\begin{aligned}
Z_{x} = z_{x}(l, θ) &= \hat{y} - p + b(m^{*}(l, ω, x); l, ω) - c_{x}(m^{*}(l, ω, x)) \\
&\leq \hat{y} - p + b(m^{*}(l, ω, x); l, ω) - c_{x^{'}}(m^{*}(l, ω, x)) \\
&\leq \hat{y} - p + b(m^{*}(l, ω, x^{'}); l, ω) - c_{x^{'}}(m^{*}(l, ω, x^{'})) \\
&= z_{x^{'}}(l, θ) = Z_{x^{'}},
\end{aligned}

where the second inequality holds by the optimality of $m^{*}(l, ω, x^{'})$. [⟸] $Z_{x^{'}}$ therefore first-order stochastically dominates $Z_{x}$, and the result follows because u_ψ is increasing. [⟹] If $c_{x^{'}} (\tilde{m}) > c_{x} (\tilde{m})$ for some $\tilde{m}$, the first inequality fails to hold for consumer type $\tilde{ω}$ at health state realization $\tilde{l}$ at which $m^{*} (\tilde{l}, \tilde{ω}, x) = \tilde{m}$. Such a consumer type exists for any $\tilde{m}$ we might choose because as ω approaches zero, privately optimal utilization approaches the health state, meaning any m can be approached arbitrarily closely. As c_x is continuous, if $c_{x^{'}} (\tilde{m}) > c_{x} (\tilde{m})$, the same will be true in a neighborhood of $\tilde{m}$. A consumer with health state distribution $\tilde{F}$ degenerate on $\tilde{l}$ would strictly prefer contract x. By continuity, a consumer with a health state distribution that is sufficiently concentrated at $\tilde{l}$ would also prefer contract x. □

Proof of Proposition 2: Contract x′ is higher coverage than contract x if and only if $c_{x^{'}} (m) \leq c_{x} (m) \forall m$ and $c_{x^{'}}^{'} (m) \leq c_{x}^{'} (m)$ almost everywhere.

By Proposition 1, $c_{x^{'}} (m) \leq c_{x} (m) \forall m$ is necessary and sufficient for the contracts to be vertically differentiated. It remains to show that $c_{x^{'}}^{'} (m) \leq c_{x}^{'} (m)$ almost everywhere is necessary and sufficient for Ψ(x′, θ) ≥ Ψ(x, θ). Fix θ ∈ Θ. Let ${\dot{Z}}_{x} = : z_{x} (l, θ) - {\bar{z}}_{x} (θ)$ be the mean-zero random payoff induced by health state distribution F. Differentiating ${\dot{Z}}_{x}$ with respect to the health state realization l:

\frac{d {\dot{Z}}_{x}}{d l} = \frac{\partial b}{\partial l} (m^{*} (l, ω, x); l, ω) \leq \frac{\partial b}{\partial l} (m^{*} (l, ω, x^{'}); l, ω) = \frac{d {\dot{Z}}_{x^{'}}}{d l} \leq 0.

That is, the payoff is weakly decreasing in the health state, and is doing so faster for contract x than for contract x′. The first equality holds by the envelope theorem. Because $\frac{\partial^{2} b}{\partial l \partial m} = ω^{- 1} \geq 0$, the first inequality holds as long as m*(l, ω, x) is increasing in x. The second inequality holds because $\frac{\partial b}{\partial l} = ω^{- 1} (m^{*} (l, ω, x) - l) - 1 \leq 0$, or in other words, because moral hazard spending never exceeds ω.³ [⟸] Lemma 1 shows that m*(l, ω, x) is increasing in x as long as $c_{x^{'}}^{'} (m) \leq c_{x}^{'} (m)$. ${\dot{Z}}_{x}$ is therefore a mean-preserving spread of ${\dot{Z}}_{x^{'}}$, so ${\dot{Z}}_{x^{'}}$ would be preferred by any risk-averse expected utility maximizer: $E_{l \sim F} [u_{ψ} ({\dot{Z}}_{x^{'}})] \geq E_{l \sim F} [u_{ψ} ({\dot{Z}}_{x})]$. The result follows because –ψ⁻¹ log(–x) is increasing. [⟹] If $c_{x^{'}} (\tilde{m}) > c_{x} (\tilde{m})$ for some $\tilde{m}$, the first inequality fails to hold for consumer type $\tilde{ω}$ at health state realization $\tilde{l}$ at which $m^{*} (\tilde{l}, \tilde{ω}, x) = \tilde{m}$. Such a consumer type exists for any $\tilde{m}$ we might choose because as ω approaches zero, privately optimal utilization approaches the health state, meaning any m can be approached arbitrarily closely. As c_x is continuous, if $c_{x^{'}} (\tilde{m}) > c_{x} (\tilde{m})$, the same will be true in a neighborhood of $\tilde{m}$. At $\tilde{l}$, the payoff would therefore be decreasing faster in the health state under contract x′ than under contract x, and x would provide strictly more risk protection to a consumer with health state distribution $\tilde{F}$ sufficiently concentrated around $\tilde{l}$. □

Proof of Proposition 3. Social surplus is single peaked in coverage level. That is, fixing θ ∈ Θ and x, x′,x″ ∈ X where x ≤ x′ ≤ x″: if SS(x″, θ) ≥SS(x′, θ), then SS(x′, θ) ≥SS(x, θ).

Let ${\tilde{c}}_{x} (l) = c_{x} (m^{*} (l, ω, x))$ be the indirect out-of-pocket cost function for consumer type ω under contract x.⁴ As θ is fixed throughout the proof, we omit ω as an argument in ${\tilde{c}}_{x} (l)$. Similarly, let ${\tilde{c}}_{x}^{'} (l) = c_{x}^{'} (m^{*} (l, ω, x))$ be the indirect marginal out-of-pocket cost function. Note that because m*(l, ω, x) is increasing in x (see Lemma 1) and contracts are concave, ${\tilde{c}}_{x^{''}}^{'} (l) \leq {\tilde{c}}_{x^{'}}^{'} (l) \leq {\tilde{c}}_{x}^{'} (l)$ wherever these derivatives exist.

Next, for each contract x ∈ {x, x′, x″}, calculate the cutoff values of the health state l that determine which segment of the piecewise linear out-of-pocket cost function the consumer of type θ will choose. Appendix C.2 describes this procedure and provides formulas for the cutoffs. As the contracts we consider have at most three segments, each contract has at most three cutoffs: one at which positive healthcare utilization begins and two separating the segments of the out-of-pocket cost function.⁵ Considering the three cutoff values of our three candidate contracts simultaneously, the space of health states (the real line) is divided into at most 10 regions. Denote these regions by $\{R_{r}\}_{r = 1}^{10}$, where $R_{r} = (l_{r}^{l b}, l_{r}^{u b})$ and $l_{r}^{u b} = l_{r + 1}^{l b}$.⁶ The lower bound of the first region is −∞ and the upper bound of the final region is ∞. For each contract x in each region R_r, out-of-pocket costs are linear in the health state, and so can be written ${\tilde{c}}_{x, r} (l) = γ_{x, r} + l {\tilde{c}}_{x, r}^{'}$, with intercept $γ_{x, r}$ and slope ${\tilde{c}}_{x, r}^{'}$. As before, higher coverage contracts are flatter: ${\tilde{c}}_{x^{''}, r}^{'} \leq {\tilde{c}}_{x^{'}, r}^{'} \leq {\tilde{c}}_{x, r}^{'} \forall r$.

Extend this notation to the consumer’s payoff z_x(l, θ). Omitting θ, the payoff in region r under contract x can now be written:

\begin{aligned}
z_{x}(l) &= \hat{y} - p_{x} + \frac{ω}{2} (1 - {({\tilde{c}}_{x}^{'}(l))}^{2}) - {\tilde{c}}_{x}(l) \\
&= \hat{y} - p_{x} + \frac{ω}{2} (1 - {\tilde{c}}_{x, r}^{' 2}) - γ_{x, r} - {\tilde{c}}_{x, r}^{'} l, \quad l \in R_{r} .
\end{aligned}

The payoff is linear in the health state with slope and intercept determined by the relevant segment of the indirect out-of-pocket cost function. To isolate the effects of level from the effects of slope, it is useful to express the payoff in terms of differences from its mean in a given region. To this end, write:

z_{x} (l) = {\bar{z}}_{x, r} - {\tilde{c}}_{x, r}^{'} (l - {\bar{l}}_{r}), l \in R_{r}

where ${\bar{l}}_{r} = E_{l ∣ R_{r}} [l]$ is the conditional expectation of the health state in region r with respect to the consumer’s health state distribution F, and ${\bar{z}}_{x, r} = z_{x} ({\bar{l}}_{r})$ is the conditional expectation of the payoff. Note that because higher coverage contracts deliver everywhere higher payoffs (see proof of Proposition 1): ${\bar{z}}_{x^{''}, r} \geq {\bar{z}}_{x^{'}, r} \geq {\bar{z}}_{x, r} \forall r$. Each contract is now fully characterized by the payoff function it generates, which in turn is fully described by its mean and slope in each region: $\{{\bar{z}}_{x, r}, {\tilde{c}}_{x, r}^{'}\}_{r = 1}^{10}$. Higher coverage contracts generate both higher and flatter payoffs in every region. Expressing the payoff function in this way allows us to think about changing a contract’s slope while holding its expected payoff fixed, and vice versa.

We now proceed in two steps. We first show that the social cost of moral hazard SCMH(x, θ) is increasing and “convex” in coverage level. As coverage level itself has no cardinal interpretation, the idea of convexity is applicable with respect to the slope of contracts’ indirect out-of-pocket cost functions ${\tilde{c}}_{x, r}^{'}$. We then show that the value of risk protection Ψ(x, θ) is increasing and “concave” in coverage level, where the idea of concavity is again applicable with respect to ${\tilde{c}}_{x, r}^{'}$. Note that the tradeoff between risk protection and moral hazard operates entirely through the slope of the out-of-pocket cost function. The level of out-of-pocket costs impacts only the value of risk protection, and does so monotonically. As SS(x, θ) = Ψ(x, θ) − SCMH(x, θ), these two steps imply SS(x, θ) is itself concave in the slope of the out-of-pocket cost function. Single-peakedness in coverage level follows from the fact that this slope is monotonic in coverage level.

1. SCMH(x, θ) is increasing and “convex” in coverage level.

First, split the expectation between the defined regions, omitting θ as an argument:

S C M H (x) = \underset{l ~ F}{E} [\frac{ω}{2} {(1 - {\tilde{c}}_{x}^{'} (l))}^{2}] = \sum_{r = 1}^{10} π_{r} [\frac{ω}{2} {(1 - {\tilde{c}}_{x, r}^{'})}^{2}],

where $π_{r} = \Pr (l \in R_{r} ∣ l ~ F)$ is the probability of realizing a health state in region R_r. Taking the derivative with respect to the slope of the indirect out-of-pocket cost function in a given region:

\frac{d S C M H (x)}{d {\tilde{c}}_{x, r}^{'}} = - π_{r} ω (1 - {\tilde{c}}_{x, r}^{'}) \leq 0.

As SCMH(x) is decreasing in ${\tilde{c}}_{x, r}^{'}$ in any region, it is increasing in coverage level. Taking the second derivative:

\frac{d^{2} SCMH(x)}{d {\tilde{c}}_{x, r}^{' 2}} = π_{r} ω \geq 0.

The social cost of moral hazard is therefore convex in the slope of the indirect out-of-pocket cost function ${\tilde{c}}_{x, r}^{'}$: as the slope falls (that is, as coverage rises), the social cost of moral hazard increases at an increasing rate. It is unaffected by changes in ${\bar{z}}_{x, r}$.

2. Ψ(x, θ) is increasing and “concave” in coverage level.

First, split the expectation between the defined regions, omitting θ as an argument:

\begin{aligned}
Ψ(x) &= RP(x_{0}) - ψ^{-1} \log (E_{l} [\exp (-ψ (z_{x}(l) - {\bar{z}}_{x}))]) \\
&= RP(x_{0}) - ψ^{-1} \log (\sum_{r = 1}^{10} π_{r} E_{l ∣ R_{r}} [\exp (-ψ (z_{x}(l) - {\bar{z}}_{x}))])
\end{aligned}

Taking the derivative with respect to the slope of the indirect out-of-pocket cost function in a given region:

\frac{d Ψ (x)}{d {\tilde{c}}_{x, r}^{'}} = {(E_{l} [\exp (- ψ z_{x} (l))])}^{- 1} π_{r} E_{l ∣ R_{r}} [\exp (- ψ z_{x} (l)) ({\bar{l}}_{r} - l)] \leq 0.

Because the function exp(−ψx) is convex and the payoffs z_x(l) are decreasing in the health state, worse-than-average health states $(l \geq {\bar{l}}_{r})$ receive more weight than better-than-average health states $(l \leq {\bar{l}}_{r})$ , and the expression is nonpositive. Taking the second derivative:

\frac{d^{2} Ψ(x)}{d {\tilde{c}}_{x, r}^{' 2}} = ψ [{(\frac{π_{r} E_{l ∣ R_{r}} [\exp (-ψ z_{x}(l)) ({\bar{l}}_{r} - l)]}{E_{l} [\exp (-ψ z_{x}(l))]})}^{2} - (\frac{π_{r} E_{l ∣ R_{r}} [\exp (-ψ z_{x}(l)) {({\bar{l}}_{r} - l)}^{2}]}{E_{l} [\exp (-ψ z_{x}(l))]})] \leq 0.

The first term is the squared conditional expectation of $({\bar{l}}_{r} - l)$ . The second term is the conditional expectation of ${({\bar{l}}_{r} - l)}^{2}$ . Because x² is convex, the expression is nonpositive by Jensen’s inequality. □
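To illustrate single-peakedness, the following Python sketch evaluates SS(x, θ) = Ψ(x, θ) − SCMH(x, θ) along a family of linear contracts $c_{x}(m) = c \cdot m$ ordered from lowest coverage (c = 1) to highest coverage (c = 0). It relies on several assumptions that are not part of the proof: contracts are linear rather than piecewise linear, the null contract is taken to be c = 1, health states are nonnegative so that optimal spending is interior, and the health state distribution and the values of ψ and ω are hypothetical. Under a linear contract the payoff varies with the health state only through $-c \cdot l$, so the risk premium reduces to $ψ^{-1} \log E[\exp(ψ c (l - \bar{l}))]$.

```python
import math

def risk_premium(coins, states, probs, psi):
    """CARA risk premium when payoffs vary with the health state as -coins * l."""
    mean_l = sum(p * l for l, p in zip(states, probs))
    mgf = sum(p * math.exp(psi * coins * (l - mean_l)) for l, p in zip(states, probs))
    return math.log(mgf) / psi

def social_surplus(coins, states, probs, psi, omega):
    """SS = Psi - SCMH, measured relative to the assumed null contract coins = 1."""
    risk_protection = risk_premium(1.0, states, probs, psi) - risk_premium(coins, states, probs, psi)
    scmh = omega / 2.0 * (1.0 - coins) ** 2
    return risk_protection - scmh

def is_single_peaked(values):
    """True if the sequence weakly rises up to its maximum and weakly falls after it."""
    peak = values.index(max(values))
    rising = all(a <= b for a, b in zip(values[:peak], values[1:peak + 1]))
    falling = all(a >= b for a, b in zip(values[peak:], values[peak + 1:]))
    return rising and falling

# Hypothetical health state distribution (thousands of dollars) and preferences.
states = [0.0, 1.0, 10.0]
probs = [0.6, 0.3, 0.1]
psi, omega = 0.3, 2.0

# Coinsurance rates ordered from lowest to highest coverage.
coins_grid = [1.0, 0.8, 0.6, 0.4, 0.2, 0.0]
surplus = [social_surplus(c, states, probs, psi, omega) for c in coins_grid]

print(list(zip(coins_grid, surplus)))
print(is_single_peaked(surplus))
```

In this example social surplus rises with coverage up to an interior coinsurance rate and falls thereafter, because the marginal value of risk protection shrinks while the marginal social cost of moral hazard grows.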

Lemma 1.

Healthcare utilization is increasing in coverage level.

Proof. Fix $l \in ℝ$ , $ω \in ℝ_{+ +}$ , and x, x′ ∈ X where x ≤ x′. Optimal utilization $m^{*} (l, ω, x) = {argmax}_{m} (b (m; l, ω) - c_{x} (m))$ . Consider $m, m^{'} \in ℝ_{+}$ where m ≤ m′:

b (m^{'}; l, ω) - c_{x^{'}} (m^{'}) - [b (m^{'}; l, ω) - c_{x} (m^{'})] = c_{x} (m^{'}) - c_{x^{'}} (m^{'}) \geq c_{x} (m) - c_{x^{'}} (m) = b (m; l, ω) - c_{x^{'}} (m) - [b (m; l, ω) - c_{x} (m)],

where the inequality holds because $c_{x^{'}} (m) \leq c_{x} (m)$ and $c_{x^{'}}^{'} (m) \leq c_{x}^{'} (m)$ guarantee that c is submodular in m and x. The objective b(m; l, ω) − c_x(m) is therefore supermodular and standard monotone comparative statics imply m*(l, ω, x) is increasing in x. □
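Lemma 1 can be illustrated numerically with two piecewise linear contracts, each described by a deductible, a coinsurance rate, and an out-of-pocket maximum. In the sketch below the higher-coverage contract is everywhere below and everywhere flatter than the lower-coverage one, and optimal spending is found by brute-force search over a grid rather than by the closed-form cutoffs of Appendix C.2. The contract parameters, the value of ω, and the health states are hypothetical.

```python
def out_of_pocket(m, deductible, coins, oop_max):
    """Piecewise linear out-of-pocket cost: deductible, then coinsurance, then a cap."""
    return min(min(m, deductible) + coins * max(m - deductible, 0.0), oop_max)

def benefit(m, l, omega):
    """b(m; l, omega) = (m - l) - (m - l)^2 / (2 * omega)."""
    d = m - l
    return d - d * d / (2.0 * omega)

def optimal_spending(l, omega, contract, grid):
    """Brute-force argmax of b(m; l, omega) - c_x(m) over a spending grid."""
    return max(grid, key=lambda m: benefit(m, l, omega) - out_of_pocket(m, *contract))

grid = [i * 0.01 for i in range(1501)]

# Hypothetical contracts: (deductible, coinsurance rate, out-of-pocket maximum).
lower_coverage = (1.0, 0.4, 3.0)
higher_coverage = (0.5, 0.2, 1.5)
omega = 2.0

results = [
    (l,
     optimal_spending(l, omega, lower_coverage, grid),
     optimal_spending(l, omega, higher_coverage, grid))
    for l in (0.0, 0.5, 3.0, 5.0)
]

for l, m_low, m_high in results:
    print(l, m_low, m_high)
```

At each health state considered, spending under the higher-coverage contract is at least as large as under the lower-coverage contract, as the lemma predicts.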

Lemma 2.

Efficient coverage level is increasing in risk aversion.

Proof. Fix θ ∈ Θ. Efficient coverage level $x^{e f f} = {argmax}_{x} (R P (x_{0}, F, ω, ψ) - R P (x, F, ω, ψ) - S C M H (x, F, ω))$. As the insurer is risk-neutral, the social cost of moral hazard is unaffected by ψ. Differentiating RP(x, F, ω, ψ) with respect to ψ:

\frac{d R P (x, θ)}{d ψ} = - ψ^{- 1} [R P (x, θ) + {(\underset{l ~ F}{E} [\exp (- ψ {\dot{Z}}_{x})])}^{- 1} \underset{l ~ F}{E} [\exp (- ψ {\dot{Z}}_{x}) {\dot{Z}}_{x}]],

where ${\dot{Z}}_{x} = : z_{x} (l, θ) - {\bar{z}}_{x} (θ)$ . The first term in the brackets, RP(x, θ), is shown to be decreasing in x in Proposition 2. The second term represents a weighted average of deviations from mean payoffs, where the weights correspond to the utility weight at that realization. As ${\dot{Z}}_{x}$ becomes less risky as x increases (see proof of Proposition 2), this term is also decreasing in x. $\frac{d S S (x, θ)}{d ψ}$ is therefore increasing in x, and standard monotone comparative statics imply x^eff is increasing in ψ. □

Lemma 3.

Efficient coverage level is decreasing in the moral hazard parameter.

Proof. Fix θ ∈ Θ. Efficient coverage level $x^{e f f} = {argmax}_{x} (Ψ (x, θ) - S C M H (x, θ))$, where $S C M H (x, θ) = E_{l \sim F} [\frac{ω}{2} {(1 - c_{x}^{'} (m^{*} (l, ω, x)))}^{2}]$. Differentiating SCMH(x, θ) with respect to ω:

\frac{d SCMH(x, θ)}{d ω} = \underset{l \sim F}{E} [\frac{1}{2} {(1 - c_{x}^{'} (m^{*} (l, ω, x)))}^{2}] \geq 0.

Note that contracts are piecewise linear and $c_{x}^{'} \in [0, 1]$. Because m*(l, ω, x) is increasing in x (see Lemma 1) and contracts are concave, $c_{x}^{'} (m^{*} (l, ω, x))$ is decreasing in x and $\frac{d S C M H (x, θ)}{d ω}$ is increasing in x. $\frac{d S S (x, θ)}{d ω}$ is therefore decreasing in x, and standard monotone comparative statics imply x^eff is decreasing in ω. □

Appendix B Additional Analysis

B.1. Estimation of Plan Cost-sharing Features

A crucial input to our empirical model is the cost-sharing function of each plan. While Table 1 describes plans using the deductible and in-network out-of-pocket maximum, plans are in reality characterized by a much more complex set of payment rules, including copayments, specialist visit coinsurance, out-of-network fees, and fixed charges for emergency room visits. To structurally model moral hazard, we make the huge simplification that healthcare is a homogenous good over which the consumer chooses only the quantity to consume. We then model this decision as being based in part on out-of-pocket cost. To that end, our empirical model requires as an input a univariate function that maps total healthcare spending into out-of-pocket cost.

A natural choice might be to use the deductible, nonspecialist coinsurance rate, and in-network out-of-pocket maximum. However, in our setting, the out-of-pocket cost function described by these features does not correspond well to what we observe in the claims data. In particular, we often observe out-of-pocket spending amounts that exceed plans’ in-network out-of-pocket maximum. Because of this, we take a different approach.

We define plan cost-sharing functions by three parameters: a deductible, a coinsurance rate, and an out-of-pocket maximum. Taking the true deductibles as given (since these correspond well to the data), we estimate a coinsurance rate and an out-of-pocket maximum that minimize the sum of squared residuals between predicted and observed out-of-pocket cost. We observe realized total healthcare spending for each household in the claims data. Predicted out-of-pocket cost is calculated by applying the deductible and supposed coinsurance rate and out-of-pocket maximum. “Observed” out-of-pocket cost is either observed directly in the claims data (if a household chose that plan) or else calculated counterfactually. We carry out this procedure separately for each plan, year, and family status (individual or family).⁷ Figure A.1 shows an example of the data and estimates for a particular plan: Moda - 3 for individual households in 2012. Table A.3 presents the estimated cost-sharing features for all plans in all years.
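A minimal sketch of this least-squares step is given below. It is not our estimation code: it assumes a simple grid search over candidate coinsurance rates and out-of-pocket maximums, assumes that the out-of-pocket maximum includes spending below the deductible, and uses synthetic spending data generated from a hypothetical plan so that the recovered parameters can be checked. The function names and all numbers are illustrative.

```python
def out_of_pocket(total, deductible, coins, oop_max):
    """Predicted out-of-pocket cost given total spending and three plan parameters."""
    return min(min(total, deductible) + coins * max(total - deductible, 0.0), oop_max)

def fit_cost_sharing(totals, observed, deductible, coins_grid, oop_max_grid):
    """Grid search for the (coinsurance, OOP max) pair minimizing squared residuals."""
    best = None
    for coins in coins_grid:
        for oop_max in oop_max_grid:
            ssr = sum(
                (out_of_pocket(t, deductible, coins, oop_max) - o) ** 2
                for t, o in zip(totals, observed)
            )
            if best is None or ssr < best[0]:
                best = (ssr, coins, oop_max)
    return best

# Synthetic data: total spending and the out-of-pocket cost implied by a
# hypothetical plan with a 500 deductible, 20% coinsurance, and 2,000 OOP maximum.
totals = [0.0, 200.0, 500.0, 1500.0, 4000.0, 8000.0, 20000.0]
observed = [out_of_pocket(t, 500.0, 0.2, 2000.0) for t in totals]

coins_grid = [i / 100 for i in range(5, 55, 5)]          # 0.05, 0.10, ..., 0.50
oop_max_grid = [1000.0 + 250.0 * i for i in range(13)]   # 1000, 1250, ..., 4000

# The deductible is taken as given; only the other two parameters are estimated.
ssr, coins_hat, oop_max_hat = fit_cost_sharing(totals, observed, 500.0, coins_grid, oop_max_grid)
print(ssr, coins_hat, oop_max_hat)
```

With real claims data the residuals would not be zero, and the procedure would be repeated for each plan, year, and family status.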

B.2. Variation in Plan Menu Generosity

Measuring Plan Menu Generosity.

We want to measure the likelihood that a household would choose generous health insurance coverage when presented with a particular plan menu. We refer to this measure as “plan menu generosity.” At a simple level, if plan menus consisted of only a single plan, the assignment to higher coverage would obviously constitute a “more generous menu” than the assignment to lower coverage. But plan menus in our setting are more complex. They contain multiple plans and many possible permutations of plan choice sets, and plans vary by their actuarial value, the identity of their insurer, their associated employee premium, and their potential HSA/HRA and vision/dental contribution. All of these factors likely influence households’ plan choices.

In order to construct usable measures of plan menu generosity, we transform these multi-dimensional objects using a conditional logit model that excludes all household observables. This specification allows us to predict the probability that a given household would choose a given plan when presented with a given plan menu as if the household had been acting like the average household in the data. Variation in the resulting predicted choice probabilities is driven only by variation in plan menus, and not by variation in (observed or unobserved) household characteristics.

Abstracting from the dimension of time for now, we define plan_jk as an indicator for the plan j chosen by household k. We estimate the following conditional logit model:

{plan}_{j k} = \underset{j \in J_{d}}{argmax} (α p_{j d} + α^{V D} p_{j d}^{V D} + α^{H A} p_{j d}^{H A} + ν_{j} + ϵ_{j k}),

(B.1)

where $J_{d}$ is the set of plans available in the school district-family type-occupation type combination d (to which household k belongs), p_jd is the employee premium, $p_{j d}^{V D}$ is the vision/dental subsidy, and $p_{j d}^{H A}$ is the HSA/HRA contribution. Plan characteristics are captured nonparametrically by plan fixed effects ν_j. All household-specific determinants of plan choice are contained in the error term ϵ_jk. Estimated parameters are presented in Table A.4, separately for each year of our data. As expected, households dislike premiums, prefer higher HSA/HRA and vision/dental subsidies, and prefer higher-coverage plans to lower-coverage plans.

We use the choice probabilities implied by Equation B.1 to construct our measures of plan menu generosity. Given plan menu ${menu}_{d} \equiv \{p_{j d}, p_{j d}^{V D}, p_{j d}^{H A}, ν_{j}\}_{j \in J_{d}}$, we denote the predicted probability that plan j is chosen as ρ_jd.⁸ Our measures of plan menu generosity are the probability a household would choose a given insurer and the expected actuarial value of a household’s plan choice conditional on insurer, respectively given by:

ρ_{f d} = \sum_{j \in J_{d}^{f}} ρ_{j d}, {\hat{A V}}_{f d} = \sum_{j \in J_{d}^{f}} (\frac{ρ_{j d}}{ρ_{f d}}) A V_{j},

(B.2)

where $J_{d}^{f}$ is the set of plans in menu_d offered by insurer f.
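The two measures in Equation B.2 can be sketched as follows. The example assumes the standard conditional logit form, in which ϵ_jk is i.i.d. type I extreme value so that choice probabilities are a softmax of the deterministic part of utility in Equation B.1. The menu, the coefficient values, the actuarial values, and the insurer label "Insurer B" are hypothetical and are not the estimates reported in Table A.4.

```python
import math

def choice_probabilities(menu, alpha, alpha_vd, alpha_ha):
    """Conditional logit probabilities rho_jd for every plan in a menu."""
    utilities = [
        alpha * plan["premium"] + alpha_vd * plan["vd"] + alpha_ha * plan["ha"] + plan["nu"]
        for plan in menu
    ]
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(u - top) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def menu_generosity(menu, probs, insurer):
    """Return (rho_fd, AV_hat_fd): insurer choice probability and expected AV given insurer."""
    members = [i for i, plan in enumerate(menu) if plan["insurer"] == insurer]
    rho_f = sum(probs[i] for i in members)
    av_hat = sum(probs[i] / rho_f * menu[i]["av"] for i in members)
    return rho_f, av_hat

# Hypothetical menu: premiums and subsidies in thousands of dollars, nu is the plan fixed effect.
menu = [
    {"insurer": "Moda", "premium": 1.2, "vd": 0.0, "ha": 0.0, "nu": 0.8, "av": 0.88},
    {"insurer": "Moda", "premium": 0.4, "vd": 0.1, "ha": 0.5, "nu": 0.2, "av": 0.78},
    {"insurer": "Insurer B", "premium": 0.9, "vd": 0.1, "ha": 0.0, "nu": 0.5, "av": 0.85},
]

# Hypothetical coefficients with the expected signs.
probs = choice_probabilities(menu, alpha=-1.0, alpha_vd=0.5, alpha_ha=0.3)
rho_moda, av_moda = menu_generosity(menu, probs, "Moda")
rho_b, av_b = menu_generosity(menu, probs, "Insurer B")

print(probs)
print(rho_moda, av_moda, rho_b, av_b)
```

Because no household characteristics enter the utilities, two households facing the same menu receive the same predicted probabilities, so variation in these measures comes only from variation in menus.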

Explaining Plan Menu Generosity.

Because the majority of the variation in coverage level lies within Moda, we focus on explaining plan menu generosity using the predicted actuarial value among Moda plans. We first compare plan menu generosity to observed household health (see Table A.5). We can in all years reject the hypothesis that household risk scores are correlated with plan menu generosity, conditional on family structure. We also find that plan menus are consistently most generous for single employee coverage and least generous for employee plus family coverage. This pattern is consistent with our understanding of OEBB’s benefit structure, and is common in employer-sponsored health insurance.

We further explore which covariates, in addition to family structure, can explain variation in plan menu generosity. Table A.6 presents three additional regressions of predicted actuarial value on employee-level covariates (part-time versus full-time status, occupation type, and union affiliation), as well as on school district-level covariates (home price index and percent of Republicans).⁹ Employees are either part-time or full-time. There are eight mutually exclusive employee occupation types; the regressions omit the type “Licensed Administrator.” There are five mutually exclusive union affiliations, and employees may not be affiliated with a union; the regressions omit the non-union category. We calculate the average home price index (HPI) in a school district by taking the average zip-code level home price index across employees’ zip-code of residence.¹⁰ Pct. Republican measures the percent of households in a school district that are registered as Republicans as of 2016.¹¹

We find that plan menus are less generous for part-time employees, are substantially less generous for substitute teachers, and are more generous for employees at community colleges. Certain union affiliations are also predictive of more or less generous plan menus. Across school districts, plan menu generosity is decreasing in both the logged home price index and the percent of registered Republicans.

B.3. Reduced-form Estimates of Moral Hazard

While our primary sample consists of data from 2009–2013, we conduct our reduced-form analysis of moral hazard using only data from 2008.¹² The OEBB marketplace began operating in 2008, so that year all employees chose from this set of plans for the first time. This “active choice” year permits us to look cleanly at how plan choices and healthcare spending depended on plan menus without also having to account for how prior-year plan menus affected current-year plan choices. While our structural model will capture these dynamics, we feel they are better avoided at this stage.

We estimate how plan menus—choice sets and prices—affect plan choices, and in turn how plan choices affect total healthcare spending, as described by Equations (B.3) and (B.4):

{plan}_{k} = f ({menu}_{d}, X_{k}, ξ_{k}),

(B.3)

y_{k} = g ({plan}_{k}, X_{k}, ξ_{k}) .

(B.4)

Here, plan_k represents the plan chosen by household k, menu_d represents the plan menu available to the school district-family type-occupation type combination d (to which household k belongs), X_k are observable household characteristics, ξ_k are unobservable household characteristics, and y_k is total healthcare spending. Because household characteristics appear in both equations, the standard challenge in estimating the effect of plan_k on y_k is that a household’s chosen plan is correlated with its unobservable characteristics ξ_k. Our identifying assumption is that plan menus are independent of household unobservables ξ_k conditional on household observables X_k.

We parameterize plan_k to be an indicator variable for the identity of the insurer and a continuous variable for the plan actuarial value. We then parameterize Equation B.4 according to

\log (y_{k}) = δ_{f} 1_{f (k) = f} + γ \log (1 - AV_{j (k)}) 1_{f (k) = \text{Moda}} + β X_{k} + ξ_{k},

(B.5)

where $1_{f(k)=f}$ is an indicator for the insurer chosen by household k and $AV_{j(k)}$ is the actuarial value of the plan chosen by household k. The parameter δ_f represents insurer-specific treatment effects on total spending.¹³ Our parameter of interest is γ, which represents the responsiveness of total spending to plan generosity, holding the insurer fixed (at Moda).¹⁴ We follow the literature in formulating the model so that γ represents the elasticity of total healthcare spending with respect to the average out-of-pocket price per dollar of total spending.¹⁵

We estimate Equation B.5 using two-stage least squares, instrumenting for the chosen insurer ($1_{f(k)=f}$) and actuarial value ($AV_{j(k)}$) using menu_d. As instruments, we use the measures of plan menu generosity constructed in Appendix B.2. Namely, we instrument for $1_{f(k)=f}$ using ρ_fd and for $\log (1 - AV_{j(k)}) 1_{f(k)=\text{Moda}}$ using $\log (1 - {\hat{AV}}_{d, \text{Moda}}) ρ_{d, \text{Moda}}$. Table A.7 reports the estimates. We report only the coefficient of interest (γ), but all specifications also contain insurer fixed effects, as well as controls for household risk score and family structure. The first column presents the parameters estimated without instruments, and the second column presents the instrumental variables estimates. Comparing the coefficients in columns 1 and 2, we find that moral hazard explains 46 percent of the observed relationship between plan generosity and total healthcare spending. Our overall estimate of the elasticity of demand for healthcare spending in the population is −0.27. The standard benchmark estimate from the RAND health insurance experiment is −0.2 (Manning et al., 1987; Newhouse, 1993).
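To illustrate the logic of the instrumental variables comparison, the sketch below runs ordinary least squares and two-stage least squares on simulated data. Everything in it is an assumption made for illustration: the data-generating process, the variable names, and the single instrument are hypothetical and are not the OEBB data or the exact specification of Equation B.5. In the simulation, unobservably sicker households select more generous plans, so the uninstrumented coefficient overstates the spending response.

```python
import numpy as np

def two_stage_least_squares(y, x_endog, z_instr, controls):
    """Just-identified 2SLS; returns the coefficient on the endogenous regressor."""
    W = np.column_stack([np.ones_like(y), controls])  # constant and exogenous controls
    Z = np.column_stack([W, z_instr])                 # instrument matrix
    X = np.column_stack([W, x_endog])                 # regressor matrix
    # First stage: project the regressors on the instruments.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Second stage: regress the outcome on the projected regressors.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0][-1]

rng = np.random.default_rng(0)
n = 20_000
gamma_true = -0.27                   # hypothetical true elasticity
risk = rng.normal(size=n)            # observable health (control)
xi = rng.normal(size=n)              # unobservable health
menu = rng.normal(size=n)            # menu generosity (instrument)
# Chosen log(1 - AV): sicker households pick more generous (lower-price) plans.
log_price = menu - xi + rng.normal(size=n)
log_spend = gamma_true * log_price + 0.5 * risk + xi + rng.normal(size=n)

X_ols = np.column_stack([np.ones(n), risk, log_price])
gamma_ols = np.linalg.lstsq(X_ols, log_spend, rcond=None)[0][-1]
gamma_iv = two_stage_least_squares(log_spend, log_price, menu, risk)
```

Under these assumptions the instrumented estimate should sit near the simulated elasticity, while the uninstrumented estimate is more negative because it also picks up selection.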

Heterogeneity.

Columns 3 and 4 of Table A.7 introduce heterogeneity in γ by household health. For each household type (individual or family), we classify households into quartiles based on household risk score, where Q_n denotes the quartile of risk (Q₄ is highest risk). We construct separate instruments for each of the eight household types by estimating the logit model in Equation B.1 for only that subsample of households. We find noisy but large differences in γ across household risk quartiles and between individual and family households.

Variation in γ could reflect either heterogeneity in the intensity of treatment (extent of exposure to varying marginal prices of healthcare across plans), or heterogeneity in treatment effect (different responsiveness to varying marginal prices of healthcare across plans), or both. While this analysis cannot distinguish between these two effects, we find suggestive evidence that the heterogeneity at least in part reflects differential treatment intensity. The remainder of this section presents an analysis that compares the realized spending outcomes of households in different risk quartiles with the variation in plan cost-sharing features that gives rise to different end-of-year marginal out-of-pocket prices. We find that the household types for which we estimate higher γ are also more likely to be exposed to varying marginal out-of-pocket costs. Distinguishing variation in treatment intensity from variation in treatment effect is an important advantage of our structural model.

Appendix C Estimation Details

C.1. Fenton-Wilkinson Approximation

Because there is no closed-form solution for the distribution of the sum of lognormal random variables, the Fenton-Wilkinson approximation is widely used in practice.¹⁶ Under this approximation, the distribution of the sum of draws from independent lognormal distributions can be represented by a lognormal distribution. The parameters of the approximating distribution are chosen such that its first and second moments match the corresponding moments of the true distribution of the sum of lognormals. In our application, the sum of lognormals is the household’s health state distribution, and the lognormals being summed are the individuals’ health state distributions. An individual’s health state ${\tilde{l}}^{i}$ is assumed to have a shifted lognormal distribution:

\log ({\tilde{l}}^{i} + κ_{i}) ~ N (μ_{i}, σ_{i}^{2}) .

All parameters may vary over time (since individual demographics vary over time), but t subscripts are omitted here for simplicity. The moment-matching conditions for the distribution of the household-level health state $\tilde{l}$ are:

E (\tilde{l} + κ_{k}) = \sum_{i \in I_{k}} E ({\tilde{l}}^{i} + κ_{i}),

(C.1)

\mathrm{Var} (\tilde{l} + κ_{k}) = \sum_{i \in I_{k}} \mathrm{Var} ({\tilde{l}}^{i} + κ_{i}),

(C.2)

E (\tilde{l}) = \sum_{i \in I_{k}} E ({\tilde{l}}^{i}),

(C.3)

where $I_{k}$ is the set of individuals in household k. Equation C.1 sets the mean of the household’s distribution equal to the sum of the means of each individual’s distribution. Equation C.2 matches the variance. Because we have a third parameter to estimate (the shift, κ_k), we use a third moment-matching condition to match the first moment of the unshifted distribution, shown in Equation C.3.

Under the approximating assumption that $\tilde{l} + κ_{k}$ is distributed lognormally, and substituting the analytical expressions for the mean and variance of a lognormal distribution, these equations become:

\exp (μ_{k} + \frac{σ_{k}^{2}}{2}) = \sum_{i \in I_{k}} \exp (μ_{i} + \frac{σ_{i}^{2}}{2}),

(\exp (σ_{k}^{2}) - 1) \exp (2 μ_{k} + σ_{k}^{2}) = \sum_{i \in I_{k}} (\exp (σ_{i}^{2}) - 1) \exp (2 μ_{i} + σ_{i}^{2}),

\exp (μ_{k} + \frac{σ_{k}^{2}}{2}) - κ_{k} = \sum_{i \in I_{k}} [\exp (μ_{i} + \frac{σ_{i}^{2}}{2}) - κ_{i}] .

Given a guess of the parameters to be estimated (the individual-level parameters), this leaves three equations in three unknowns, and we can solve for the household-level parameters. The solutions for μ_k, $σ_{k}^{2}$ , and κ_k are:

σ_{k}^{2} = \log [1 + {[\sum_{i \in I_{k}} \exp (μ_{i} + \frac{σ_{i}^{2}}{2})]}^{- 2} \sum_{i \in I_{k}} (\exp (σ_{i}^{2}) - 1) \exp (2 μ_{i} + σ_{i}^{2})]

μ_{k} = - \frac{σ_{k}^{2}}{2} + \log [\sum_{i \in I_{k}} \exp (μ_{i} + \frac{σ_{i}^{2}}{2})]

κ_{k} = \sum_{i \in I_{k}} κ_{i}

Given these algebraic solutions for the parameters of a household’s health state distribution, we can work backward to estimate which individual-level parameters best explain the observed data on individual-level demographics and household-level healthcare spending. A key advantage of using this approximation instead of simply simulating the true distribution of the sum of lognormals is that we can use quadrature to integrate the distributions of health states, thereby limiting the number of support points needed for numerical integration.
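The closed-form solutions for μ_k, σ_k², and κ_k can be sketched in a few lines of code. The function name and the example parameter values below are hypothetical; only the formulas come from the moment-matching solutions above.

```python
import math

def household_health_params(mu, sigma, kappa):
    """Fenton-Wilkinson moment matching for a sum of shifted lognormals.

    mu, sigma, kappa are sequences of individual-level parameters (mu_i,
    sigma_i, kappa_i). Returns (mu_k, sigma2_k, kappa_k) for the household.
    """
    # Sum of individual means of the unshifted lognormal parts.
    mean_sum = sum(math.exp(m + s ** 2 / 2) for m, s in zip(mu, sigma))
    # Sum of individual variances.
    var_sum = sum((math.exp(s ** 2) - 1) * math.exp(2 * m + s ** 2)
                  for m, s in zip(mu, sigma))
    sigma2_k = math.log(1 + var_sum / mean_sum ** 2)
    mu_k = -sigma2_k / 2 + math.log(mean_sum)
    kappa_k = sum(kappa)
    return mu_k, sigma2_k, kappa_k

# Hypothetical two-person household.
mu_k, sigma2_k, kappa_k = household_health_params(
    mu=[0.3, -0.2], sigma=[0.7, 1.1], kappa=[0.1, 0.05]
)
```

A quick sanity check on the formulas is that a one-person household returns the individual's own parameters unchanged.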

C.2. Estimation Algorithm

We estimate the model using a maximum likelihood approach similar to that described by Revelt and Train (1998) and Train (2009), with the appropriate extension to a discrete/continuous choice model in the style of Dubin and McFadden (1984). The maximum likelihood estimator selects the parameter values that maximize the conditional probability density of households’ observed total healthcare spending, given their plan choices.

The model contains four dimensions of unobservable heterogeneity: risk aversion, household health, the moral hazard parameter, and the type 1 extreme value (T1-EV) idiosyncratic shock. The last we can integrate analytically, but the first three we must integrate numerically; we denote these as $β_{k t} = \{ψ_{k}, μ_{k t}, ω_{k}\}$ . We denote the full set of parameters to be estimated as θ, which, among other things, contains the parameters of the distribution of β_kt. Given a guess of θ, we simulate the distribution of β_kt using Gaussian quadrature with 27 support points, yielding simulated points β_kts(θ) = {ψ_ks, μ_kts, ω_ks}, as well as weights W_s.¹⁷,¹⁸ For each simulation draw s, we then calculate the conditional density at households’ observed total healthcare spending and the probability of households’ observed plan choices.

Conditional Probability Density of Healthcare Spending.

We have data on realized healthcare spending m_kt for each household and year. We aim to construct the distribution of healthcare spending for each household-year implied by the model and guess of parameters. We start by constructing individual-level health state distribution parameters μ_it, σ_it, and κ_it from θ and individual demographics, as described in Equation 7. We then construct household-level health state distribution parameters μ_kts, σ_kt, and κ_kt using the formulas in Equation 8 and the draws β_kts(θ). The model predicts that upon realizing their health state l, households choose total healthcare spending m by trading off the benefit of healthcare utilization against its out-of-pocket cost. Specifically, accounting for the fact that negative health states may imply zero spending, the model predicts optimal healthcare spending $m_{j t}^{*} (l, ω_{k s}) = \max (0, ω_{k s} (1 - c_{j t}^{'} (m^{*})) + l)$ if household k were enrolled in plan j in year t. Inverting the expression, the health state realization l_kjts that would have given rise to observed spending m_kt under moral hazard parameter ω_ks is given by

l_{k j t s} : \begin{cases} l_{k j t s} < 0 & \text{if } m_{k t} = 0, \\ l_{k j t s} = m_{k t} - ω_{k s} (1 - c_{j t}^{'} (m_{k t})) & \text{if } m_{k t} > 0. \end{cases}

Household health state is distributed according to

l = ϕ_{f} \tilde{l}, \qquad \log (\tilde{l} + κ_{k t}) \sim N (μ_{k t s}, σ_{k t}^{2}) .

There are two possibilities to consider. First, if m_kt is equal to zero, the implied health state realization l_kjts is negative. Given monetary health state realization l_kjts, the implied “quantity” health state realization is equal to ${\tilde{l}}_{k j t s} = ϕ_{f}^{- 1} l_{k j t s}$ , where f is the insurer offering plan j. Since ϕ_f > 0, the probability of observing l_kjts < 0 is the probability of observing ${\tilde{l}}_{k j t s} + κ_{k t} \leq κ_{k t}$ . Second, if m_kt is greater than zero, it is useful to define $λ_{k j t s} = ϕ_{f}^{- 1} l_{k j t s} + κ_{k t}$ , which itself is distributed lognormally (no shift). The density of m_kt in this case is given by the density of λ_kjts. Taken together, the probability density of total healthcare spending m conditional on plan, parameters, and household observables X_kt is given by $f_{m} (m_{k t} ∣ c_{j t}, β_{k t s}, θ, X_{k t}) = P (m = m_{k t} ∣ c_{j t}, β_{k t s}, θ, X_{k t})$ , where

f_{m} (m_{k t} ∣ c_{j t}, β_{k s}, θ, X_{k t}) = \begin{cases} Φ (\frac{\log (κ_{k t}) - μ_{k t}}{σ_{k t}}) & \text{if } m_{k t} = 0, \\ ϕ_{f}^{- 1} Φ^{'} (\frac{\log (λ_{k j t s}) - μ_{k t}}{σ_{k t}}) & \text{if } m_{k t} > 0, \end{cases}

and Φ(·) is the standard normal cumulative distribution function. For a given guess of parameters, there are certain values of m_kt for which the probability density is zero. In order to rationalize the data at all possible parameter guesses, in practice we use a convolution of f_m(m_kt | c_jt, β_ks, θ, X_kt) and a uniform distribution over the range [−1e-75, 1e75].¹⁹

Probability of Plan Choices.

We next calculate the probability of a household’s observed plan choice. Given θ and β_kts, we simulate the distribution of health states l_kjtsd using D = 30 support points:

l_{k j t s d} = ϕ_{f} (\exp (μ_{k t s} + σ_{k t} Z_{d}) - κ_{k t}) .

where Z_d is a vector of points that approximates a standard normal distribution using Gaussian quadrature, and W_d (to be used soon) are the associated weights. We then calculate the privately optimal healthcare spending choice m_kjtsd associated with each potential health state realization.
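The construction of health state support points can be sketched as follows. This is a minimal illustration that assumes Gauss-Hermite quadrature for the standard normal (the text says only "Gaussian quadrature"); the function name and the parameter values are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def health_state_points(mu, sigma, kappa, phi=1.0, n_points=30):
    """Support points l_d = phi * (exp(mu + sigma * Z_d) - kappa) and weights W_d."""
    # Nodes and weights for the weight function exp(-x^2 / 2).
    nodes, weights = hermegauss(n_points)
    # Rescale so the weights sum to one (standard normal probabilities).
    weights = weights / np.sqrt(2.0 * np.pi)
    states = phi * (np.exp(mu + sigma * nodes) - kappa)
    return states, weights

# Hypothetical household-level parameters.
states, weights = health_state_points(mu=0.5, sigma=1.0, kappa=0.2, phi=1.1)
expected_state = float(np.sum(weights * states))
```

With these weights, the quadrature mean of the support points should reproduce the analytical mean of a shifted lognormal, ϕ_f (exp(μ + σ²/2) − κ), to high precision.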

Plans in our empirical setting are characterized by a deductible D, a coinsurance rate C, and an out-of-pocket maximum O. Marginal out-of-pocket costs c′(m) equal 1 in the deductible region, C in the coinsurance region, and 0 in the out-of-pocket maximum region. Denote the boundary between the coinsurance region and the out-of-pocket maximum region (the “stop loss” level of total spending) by A = C⁻¹ (O − D(1 − C)). Privately optimal spending falls into one of these three regions depending on the realization of the health state l and the moral hazard parameter ω. The relevant cutoff values for the health state are

Z_{1} = D - ω (1 - C) / 2,

Z_{2} = O - ω / 2,

Z_{3} = A - ω (1 - C / 2),

where Z₁ ≤ Z₂ ≤ Z₃ so long as O ≥ D and C ∈ [0, 1]. There are two types of plans to consider. If D and A are sufficiently far apart (there is a sufficiently large coinsurance region), then only the cutoffs Z₁ and Z₃ matter, and it may be optimal to be in any of the three regions, depending on where the health state is relative to those two cutoff values. If D and A are close together, it will never be optimal to be in the coinsurance region (better to burn right through it and into the free healthcare of the out-of-pocket maximum region), and the cutoff Z₂ will determine whether the deductible or out-of-pocket maximum region is optimal. If the realized health state is negative, optimal spending will equal zero. In sum:

\text{If } A - D > ω / 2 : \quad m^{*} = \begin{cases} \max (0, l) & l \leq Z_{1}, \\ l + ω (1 - C) & Z_{1} < l \leq Z_{3}, \\ l + ω & Z_{3} < l; \end{cases}

\text{If } A - D \leq ω / 2 : \quad m^{*} = \begin{cases} \max (0, l) & l \leq Z_{2}, \\ l + ω & Z_{2} < l . \end{cases}

A graphical example (of the case in which the coinsurance region is sufficiently large) is shown in Figure A.2b. All plans in our empirical setting have A − D > ω/2 at reasonable values of ω.
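The piecewise rule for privately optimal spending translates directly into code. The sketch below follows the cutoffs Z₁, Z₂, and Z₃ as defined above; the function name and the plan parameters in the example are hypothetical, and a strictly positive coinsurance rate is assumed so that the stop-loss level A is well defined.

```python
def optimal_spending(l, omega, D, C, O):
    """Privately optimal total spending m* for health state l.

    D: deductible, C: coinsurance rate (assumed > 0), O: out-of-pocket maximum,
    omega: moral hazard parameter. All monetary values share the same units.
    """
    A = (O - D * (1 - C)) / C          # stop-loss level of total spending
    z1 = D - omega * (1 - C) / 2       # deductible vs. coinsurance region
    z2 = O - omega / 2                 # deductible vs. out-of-pocket maximum region
    z3 = A - omega * (1 - C / 2)       # coinsurance vs. out-of-pocket maximum region
    if A - D > omega / 2:              # coinsurance region is large enough to matter
        if l <= z1:
            return max(0.0, l)
        if l <= z3:
            return l + omega * (1 - C)
        return l + omega
    if l <= z2:                        # coinsurance region is never optimal
        return max(0.0, l)
    return l + omega

# Hypothetical plan (in thousands of dollars): D = 1, C = 0.2, O = 3, omega = 1.
m_just_below_deductible = optimal_spending(0.8, omega=1.0, D=1.0, C=0.2, O=3.0)
```

In this example the health state 0.8 lies above Z₁ = 0.6, so the household spends 0.8 + ω(1 − C) = 1.6 and ends up in the coinsurance region even though its health state is below the deductible.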

With distributions of privately optimal total healthcare spending $m_{k j t s d}^{*}$ in hand for each household, plan, year, and draw of β_ks, we can calculate households’ expected utility from enrolling in each potential plan. We construct the numerical approximation to Equation 5 using the quadrature weights W_d:

U_{k j t s} = - \sum_{d = 1}^{D} W_{d} \cdot \exp (- ψ_{k} z_{k j t s} (l_{k j t s d})),

where the monetary payoff z is calculated as in Equation 6. To avoid numerical issues arising from double-exponentiation, we estimate the model in certainty-equivalent units of U_kjts:

U_{k j t s}^{C E} = {\bar{z}}_{k j t s} - \frac{1}{ψ_{k}} \log (\sum_{d = 1}^{D} W_{d} \cdot \exp (- ψ_{k} (z_{k j t s} (l_{k j t s d}) - {\bar{z}}_{k j t s}))),

where ${\bar{z}}_{k j t s} = E_{d} [z_{k j t s} (l_{k j t s d})]$ . Another reason for estimating the model in certainty equivalents is that it becomes simple to denominate the logit error term in dollars rather than in utils. This ensures that our choice model is “monotone,” in the sense that the probability of preferring a less-risky plan is everywhere increasing in risk aversion; see Apesteguia and Ballester (2018) for a full treatment of this issue.
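The certainty-equivalent transformation can be sketched as below. The function name and the numerical inputs are hypothetical; the formula is the one given above, with payoffs centered at their quadrature mean before exponentiating.

```python
import numpy as np

def certainty_equivalent(z, weights, psi):
    """Certainty equivalent of CARA expected utility over payoffs z.

    z: payoffs at each support point, weights: quadrature weights summing to
    one, psi: coefficient of absolute risk aversion (psi > 0).
    """
    z = np.asarray(z, dtype=float)
    w = np.asarray(weights, dtype=float)
    z_bar = np.sum(w * z)  # expected payoff
    # Centering at z_bar keeps the exponent moderate for typical payoffs.
    return z_bar - np.log(np.sum(w * np.exp(-psi * (z - z_bar)))) / psi

# Hypothetical two-point payoff lottery.
ce = certainty_equivalent(z=[-4.0, -1.0], weights=[0.5, 0.5], psi=0.8)
```

Algebraically the centered expression equals −(1/ψ) log Σ_d W_d exp(−ψ z_d), so the centering changes only the numerical behavior, not the value. For a risk-averse household the certainty equivalent lies below the expected payoff.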

Choice probabilities, conditional on β_kts, are given by the standard logit formula:

L_{k j t s} = \frac{\exp (U_{k j t s}^{C E} / σ_{ϵ})}{\sum_{i \in J_{k t}} \exp (U_{k i t s}^{C E} / σ_{ϵ})} .
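A minimal sketch of the logit choice probabilities, with certainty equivalents scaled by σ_ϵ, is shown below. The function name and inputs are hypothetical. Subtracting the maximum before exponentiating is a standard numerical safeguard and is an implementation choice of this sketch, not something stated in the text.

```python
import numpy as np

def choice_probabilities(u_ce, sigma_eps):
    """Logit probabilities over plans given certainty equivalents u_ce."""
    v = np.asarray(u_ce, dtype=float) / sigma_eps
    v = v - v.max()          # does not change the ratio; avoids overflow
    expv = np.exp(v)
    return expv / expv.sum()

# Hypothetical certainty equivalents for three plans (thousands of dollars).
probs = choice_probabilities([-3.2, -3.0, -3.5], sigma_eps=0.5)
```

Because the certainty equivalents are in dollars, σ_ϵ is also in dollars: a smaller value makes choices more sensitive to dollar differences between plans.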

Likelihood Function.

The numerical approximation to the likelihood of the sequence of choices and healthcare spending amounts for a given household is given by

L L_{k} = \sum_{j = 1}^{J} d_{k j t} \sum_{s = 1}^{S} W_{s} \prod_{t = 1}^{T} f_{m} (m_{k t} ∣ θ, β_{k t s}, c_{j t}, X_{k t}) L_{k j t s} .

where d_kjt = 1 if household k chose plan j in year t and zero otherwise. The log-likelihood function for parameters θ is

L L (θ) = \sum_{k = 1}^{K} \log (L L_{k}) .

C.3. Recovering Household-specific Types

We assume that household types β_kt(θ) = {ψ_k, μ_kt, ω_k} are distributed multivariate normal with observable heterogeneity in the mean vector, according to Equation 9. After estimating the model and obtaining $\hat{θ}$ , we want to use each household’s observed outcomes (plan choices and healthcare spending amounts) to back out which type they are likely to be. Let $g (β ∣ \hat{θ})$ denote the population distribution of types. Let $h (β ∣ \hat{θ}, y)$ denote the density of β conditional on parameters $\hat{θ}$ and a sequence of observed plan choices and healthcare spending amounts y. Using what Revelt and Train (2001) term the “conditioning of individual tastes” method, we recover households’ posterior distribution of β using Bayes’ rule:

h (β ∣ \hat{θ}, y) = \frac{p (y ∣ β) g (β ∣ \hat{θ})}{p (y ∣ \hat{θ})} .

Taking the numerical approximations, $p (y ∣ \hat{θ})$ is simply the household-specific likelihood function LL_k for an observed sequence of plan choices and spending amounts; $g (β ∣ \hat{θ})$ is the quadrature weights W_s on each simulated point; and p(y|β) is the conditional household likelihood function LL_ks:

L L_{k s} = \sum_{j = 1}^{J} d_{k j t} \prod_{t = 1}^{T} f_{m} (m_{k t} ∣ θ, β_{k s}, c_{j t}, X_{k t}) L_{k j t s} .

Taken together, the numerical approximation to each household’s posterior distribution of unobserved heterogeneity is given by

h_{k s} (β ∣ \hat{θ}, y_{k}) = \frac{L L_{k s} \cdot W_{s}}{L L_{k}},

where $\sum_{s} h_{k s} (β ∣ \hat{θ}, y_{k}) = 1$ .

For the purposes of examining total variation in types across households (accounting for both observed and unobserved heterogeneity), we assign each household the expectation of their type with respect to their posterior distribution.
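The posterior weights and the resulting posterior expectations can be sketched as follows. The function names and the numerical values are hypothetical; the sketch takes the conditional likelihoods LL_ks and quadrature weights W_s as given and uses the fact that LL_k is their weighted sum.

```python
import numpy as np

def posterior_type_weights(ll_ks, w_s):
    """Posterior weights h_ks = LL_ks * W_s / LL_k over simulated types s."""
    joint = np.asarray(ll_ks, dtype=float) * np.asarray(w_s, dtype=float)
    return joint / joint.sum()   # joint.sum() equals LL_k

def posterior_mean(values_s, h_ks):
    """Expectation of a type-specific quantity under the posterior weights."""
    return float(np.sum(np.asarray(values_s, dtype=float) * h_ks))

# Hypothetical household with three simulated types.
h = posterior_type_weights(ll_ks=[0.2, 0.1, 0.4], w_s=[0.25, 0.5, 0.25])
psi_k = posterior_mean([1.0, 2.0, 3.0], h)
```

The same `posterior_mean` step applies to any type-specific quantity, such as willingness to pay or social surplus evaluated at each simulated type.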

We also use the household-specific distributions over types to calculate expected quantities of interest for each household. In particular, we calculate WTP_kjt and SS_kjt as

W T P_{k j t} = \sum_{s} h_{k s} (β ∣ \hat{θ}, y_{k}) W T P_{k j t s},

S S_{k j t} = \sum_{s} h_{k s} (β ∣ \hat{θ}, y_{k}) S S_{k j t s},

Joint Distribution of Household Types.

We investigate the distribution implied by our primary estimates in column 3 of Tables 3 and A.8. For each household, we first calculate the expectation of their type with respect to their posterior distribution of unobservable heterogeneity:

ψ_{k} = \sum_{s} h_{k s} (β ∣ \hat{θ}, y_{k}) ψ_{k s},

ω_{k} = \sum_{s} h_{k s} (β ∣ \hat{θ}, y_{k}) ω_{k s} .

In place of μ_kt, a more relevant measure of household health is the expected health state, i.e., expected total spending absent moral hazard. Using the expectation of a shifted lognormal variable and price parameter ϕ =1, the expected health state ${\bar{l}}_{k t}$ is given by

{\bar{l}}_{k t} = \sum_{s} h_{k s} (β ∣ \hat{θ}, y_{k}) (\exp (μ_{k t s} + \frac{σ_{k t}^{2}}{2}) - κ_{k t}) .

To limit our focus to one type for each household, we look at ${\bar{l}}_{k t}$ for the first year each household appears in the data. Figure A.3 presents the joint distribution of household types along the dimensions of risk aversion, moral hazard parameter, and expected health state. We measure the expected health state on a log scale for readability.

Footnotes

¹

By market regulator, we mean the entity that administers a particular health insurance market: in employer-sponsored insurance, this is the employer; in Medicare, it is the Centers for Medicare and Medicaid Services; under a national health insurance scheme, it is the government.

²

We also note the close relationship between our paper and recent work by Landais et al. (2021) on unemployment insurance and Hendren, Landais and Spinnewijn (2021) on social insurance more broadly. Like us, these papers consider the value of offering a choice from the perspective of a social planner that can set prices.

³

It may not be possible to vary premiums with consumer attributes if consumers have private information (Cardon and Hendel, 2001), or it may not be desirable to do so to prevent exposing consumers to reclassification risk (Handel, Hendel and Whinston, 2015). Otherwise, the market could be partitioned according to observable characteristics, and each submarket could be considered separately.

⁴

Importantly, this is true only if m represents the true cost of healthcare provision and if there are not externalities associated with healthcare utilization, as we assume here.

⁵

Following convention, we use the term “moral hazard” to describe the scenario at hand, in which there is elastic demand for the insured good and a state that is not contractible. Note that this is not a problem of hidden action, but rather of hidden information. A fuller discussion of this (ab)use of terminology in the health insurance literature can be found in Section I.B of Einav et al. (2013), as well as in the dialogue between Pauly (1968) and Arrow (1968).

⁶

The single role of constant absolute risk aversion is to ensure that the value of risk protection, and thereby social surplus, is invariant to the contract premium.

⁷

Note that as SS represents an average, this condition does not itself guarantee that the social surplus curve will cross zero. Since it is necessary for SS to cross zero for vertical choice to be optimal, we focus our two examples on cases in which that occurs. If SS did not cross zero, a single plan would be on-average optimal at every level of willingness to pay, and the optimal menu would feature a single contract.

⁸

Individual risk scores are calculated based on prior-year medical diagnoses and demographics using Johns Hopkins ACG Case-Mix software. This software uses diagnostic information contained in past claims data as well as demographic information to predict future healthcare spending. See, for example, Brot-Goldberg et al. (2017); Carlin and Town (2008); or Handel and Kolstad (2015) for more in-depth explanation of the software and examples of its use in economic research.

⁹

Decisions about HSA/HRA and vision/dental contributions are also made independently by school districts. An HRA is a notional account that employers can use to reimburse employees’ uninsured medical expenses on a pre-tax basis; balances expire at the end of the year or when the employee leaves the employer. An HSA is a financial account maintained by an external broker to which employers or employees can make pre-tax contributions. Data on employer premium contributions and savings account contributions were hand-collected via surveys of each school district. Additional details on the data collection process can be found in Abaluck and Gruber (2016).

¹⁰

The majority of school districts used either a fixed dollar contribution or a percentage contribution, but the levels of the contribution varied widely. Other districts used a fixed employee contribution. In addition, the districts’ policies for how “excess” contributions were treated varied; in some cases, contribution amounts in excess of the full plan premium could be “banked” by the employee in an HSA or HRA, or else put toward the purchase of a vision or dental insurance plan.

¹¹

Many other cost-sharing details determine plan coverage level. For the purposes of our empirical model, we estimate the coinsurance rate and out-of-pocket maximum that best fit the relationship between out-of-pocket spending and total spending observed in the claims data. This procedure is described in Appendix B.1.

¹²

We evaluate out-of-pocket spending for each household in each plan, and then divide average insured spending by average total spending across all households for each plan. We evaluate counterfactual out-of-pocket spending using the “claims calculator” developed for this setting by Abaluck and Gruber (2016).

¹³

We construct this measure using a conditional logit model of household plan choice. This model and the resulting measure of plan menu generosity are described in detail in Appendix B.2.

¹⁴

The relationship could also run the other way: households could move across school districts, or select a district initially, based on the available health benefits. Such selection could again result in unobservably sicker households obtaining more generous health benefits. To the extent that observable health factors are correlated with unobservable factors that would drive this relationship, the analysis that follows is also relevant to this concern.

¹⁵

Note that c_jt is indexed by t because cost-sharing parameters vary within a plan across years. It also varies by household type (individual versus family), but we omit an additional index to save on notation. With a linear out-of-pocket cost function with coinsurance rate c and nonnegative health states: m* = ω(1 − c) + l and $b^{*} = \frac{ω}{2} (1 - c^{2})$ . Appendix C.2 provides solutions when contracts are piecewise linear and negative health states are permitted.

¹⁶

The model predicts, for example, that if a consumer realizes a health state just under the deductible, she will take advantage of the proximity to cheaper healthcare and consume a bit more (putting her into the coinsurance region). Figure A.2 provides a depiction of optimal spending behavior predicted by this model.

¹⁷

X_kjt includes HRA or HSA contributions, HA_kjt; vision and dental plan contributions, VD_kjt; and a fixed effect $ν_{j t}^{N a r r o w N e t}$ for one plan (Moda −2) that had a narrow provider network in 2011 and 2012. The associated parameters for health account and vision/dental contributions are α^HA and α^VD, respectively.

¹⁸

Provider prices are a well-documented source of heterogeneity in total healthcare spending across insurers (Cooper et al., 2018), and these differences are often modeled to be linear in utilization (Gowrisankaran, Nevo and Town, 2015; Ghili, 2016; Ho and Lee, 2017; Liebman, 2018). Of course, ϕ_f may also capture other differences across insurers, such as care management protocols or provider practice patterns.

¹⁹

Household-level covariates are fixed over time as follows. If a household has children in some years but not others, we assign it to its modal status. Household risk score is calculated as the mean risk score of all individuals in a household across all years. Household age is the mean age of all adults across all years.

²⁰

We divide the state into three regions, based on groups of adjacent Hospital Referral Regions (HRRs): the Portland and Salem HRRs in northwest Oregon (containing 55 percent of households); the Eugene and Medford HRRs in southwest Oregon (32 percent of households); and the Bend, Spokane, and Boise HRRs in eastern Oregon (13 percent of households). For more information and HRR maps, see http://www.dartmouthatlas.org/data/region (Dartmouth Atlas Project, n.d.).

²¹

For comparison, the average ω estimated by Einav et al. (2013) is $1,330, in a sample of households with average total healthcare spending of $5,283. In our sample, average total spending is $6,339 for individuals and $12,954 for families.

²²

Note that we measure monetary variables in thousands of dollars. Dividing our estimated coefficients of absolute risk aversion by 1,000 makes them comparable to estimates that use risk measured in dollars.

²³

A risk-neutral household would have $X equal to $100, and an infinitely risk-averse household would have $X equal to $0. Using the same example, Handel (2013) reports a mean $X of $91.0; Einav et al. (2013) report a mean $X of $84.0; and Cohen and Einav (2007) report a mean $X of $76.5.

²⁴

Following Revelt and Train (2001), we derive each household’s posterior type distribution using Bayes’ rule, conditioning on their observed choices and the population distribution. For the purposes of examining total variation in types across households (accounting for both observed and unobserved heterogeneity), we assign each household the expectation of their type with respect to their posterior distribution. This procedure is described in detail in Appendix C.3.

²⁵

Coverage level ordering requires that contracts are well-ordered in the amount of risk protection provided. See Appendix A.2 for a definition of coverage level ordering and the conditions on contracts that imply this ordering.

²⁶

The contracts’ deductibles, coinsurance rates, and out-of-pocket maximums are: $10,000, -, $10,000 for Catastrophic; $5,846, 40%, $7,500 for Bronze; $3,182, 27%, $5,000 for Silver; and $1,125, 15%, $2,500 for Gold.

²⁷

As our model allows for rich heterogeneity in preferences over financially differentiated contracts, we are comfortable with the interpretation that any remaining determinants of plan choice contained in ϵ can be considered “mistake-making” (e.g., Handel and Kolstad, 2015) or “monkey-on-the-shoulder tastes” (Akerlof and Shiller, 2015), and so can be omitted from the social welfare calculation. In our counterfactuals, we suppose consumers have access to a tool that perfectly aids them in expressing their true preferences. Our question is whether, for this dimension of choice, such a tool is needed.

²⁸

See Appendix A.2 for the expression of the value of risk protection.

²⁹

We focus on family households because families make up 75 percent of the sample and because our set of potential contracts is chosen to mimic the coverage levels typically offered to families. Our results among individual households are qualitatively unchanged.

³⁰

Households are in fact ordered by willingness to pay for full insurance, but the ordering is nearly identical across contracts. The consistent willingness-to-pay ordering of households across contracts is what permits a graphical analysis of multiple contracts analogous to the two-contract example in Figure 1. See Geruso et al. (2019) for a detailed discussion of this point.

³¹

We note that this finding is closely related to the embedded assumption that moral hazard will not be expressed as long as end-of-year marginal out-of-pocket cost does not vary across contracts. While there is substantial empirical evidence that consumers do respond to spot prices (e.g. Aron-Dine et al., 2015; Dalton, Gowrisankaran and Town, 2020), here we do not find evidence of moral hazard among high-risk households (see Table A.7). If the data did suggest a moral hazard response among these households, the model would load the effect onto the moral hazard parameter ω, compensating a weak treatment with a strong treatment effect.

³²

Although Gold is the efficient contract at every level of willingness to pay, it is not the efficient contract for every household. Figure A.7 shows the heterogeneity in households’ efficient contracts.

³³

The four contracts are the Gold contract (actuarial value 0.86) and the three next-less-generous contracts (actuarial values 0.84, 0.83, and 0.81). At the optimal feasible allocation, 28 percent of households choose Gold, and 34 percent, 37 percent, and 1 percent of households choose the next three contracts respectively. The optimal single contract in the dense set is the 0.83 actuarial value contract.

³⁴

Evaluating a regulator’s choice between these options is no longer a question of vertical choice. Though our estimates suggest that a lower stop-loss point is more efficient, we acknowledge that there are important considerations our model may not capture. For example, consumers may inefficiently restrict utilization in response to even moderate marginal out-of-pocket costs (Aron-Dine et al., 2015; Dalton, Gowrisankaran and Town, 2020), which may pull in favor of a lower stop-loss point, or they may benefit from smoothing out-of-pocket spending within a year (Ericson and Sydnor, 2018; Hong and Mommaerts, 2021), which may pull in favor of a higher stop-loss point.

³⁵

We present fairly large perturbations, changing our estimates by a factor of 2, in order to show cases in which our results do vary. Smaller changes to our parameter estimates, e.g., raising and lowering mean risk aversion by up to 30 percent, do not affect our results.

³⁶

Among family households, 6 percent are childless and under age 45, 27 percent are childless and over age 45, 52 percent have children and are under age 45, and 15 percent have children and are over age 45.

³⁷

This allocation is implementable because the regulator need not break even in aggregate. The Gold contract can be provided for free, and the deficit of $10,619 per household can be funded by taxing incomes (here, at zero cost of public funds). We note that if the regulator did need to break even in aggregate, vertical choice would likely be efficient. The focus would shift to ensuring low-WTP consumers were not left out of the market entirely, even if that induced some high-WTP consumers to select lower-than-efficient coverage. See Azevedo and Gottlieb (2017) and Geruso et al. (2019) for a full treatment of a setting in which the regulator must break even in aggregate.

³⁸

Shares are from Kaiser Family Foundation and are available at https://www.kff.org/health-reform/state-indicator/marketplace-plan-selections-by-metal-level. We map Platinum coverage to full insurance. Premiums that can support these shares are $7,059 for full insurance, $4,594 for Gold, $2,173 for Silver, $375 for Bronze, and $0 for Catastrophic, resulting in an aggregate deficit of $6,856 per household.

³⁹

The interpretation that consumers face a lottery over all elements of type, including preferences, is consistent with Harsanyi (1953, 1955). We take this approach because it permits a simple informal analysis, but refer the reader to Eden (2020) for an alternative potential approach.

⁴⁰

Given the chosen normalization, maximal equity is achieved by allocating all households to full insurance. Maximal (static) efficiency, meanwhile, is achieved by allocating all households to Gold. A regulator placing some weight on each of these objectives may want to offer a vertical choice between the Gold contract and full insurance.

¹

In Equation A.1, $\hat{y} - p$ cancels out completely. This assumption is most reasonable when marginal premiums between relevant plans are small relative to initial income.

²

$Δ^{c} (ℝ)$ denotes the set of continuous probability measures on the Borel σ-algebra of $ℝ$ .

³

Note that this statement would not be true under the “multiplicative” specification of preferences proposed by Einav et al. (2013) and used in Ho and Lee (2021). In that case, $\frac{\partial b}{\partial l}$ becomes positive at a certain health state level, and the payoff z_x(l, θ) begins increasing in the health state. The conditions given in Proposition 2 would therefore not be sufficient to guarantee coverage level ordering in that context.

⁴

The line labelled c* in Figure A.2 represents the function ${\tilde{c}}_{x} (l)$ in that example.

⁵

The proof extends trivially to piece-wise linear out-of-pocket functions with a different number of segments.

⁶

As we have assumed F is continuously distributed, there is zero mass on region boundaries.

⁷

So that the cost-sharing estimates are not affected by large outliers, we drop observations where out-of-pocket spending was above $20,000 or total healthcare spending was above $100,000.

⁸

Formally: $ρ_{j d} = \frac{\exp (U_{j d})}{\sum_{g \in J_{d}} \exp (U_{g d})}$ , where $U_{j d} = α p_{j d} + α^{V D} p_{j d}^{V D} + α^{H A} p_{j d}^{H A} + ν^{j}$ .
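The expression above is a standard multinomial logit share. A minimal sketch of how such shares could be computed follows; the function names, the plan labels, and all numerical values are hypothetical and are not the paper's estimates.

```python
import math


def plan_utility(alpha, p, alpha_vd, p_vd, alpha_ha, p_ha, nu):
    """U_jd = alpha * p_jd + alpha_VD * p_jd^VD + alpha_HA * p_jd^HA + nu_j."""
    return alpha * p + alpha_vd * p_vd + alpha_ha * p_ha + nu


def logit_shares(utilities):
    """Map plan -> utility into plan -> exp(U_j) / sum_g exp(U_g)."""
    u_max = max(utilities.values())  # subtract the max for numerical stability
    exp_u = {j: math.exp(u - u_max) for j, u in utilities.items()}
    total = sum(exp_u.values())
    return {j: e / total for j, e in exp_u.items()}


if __name__ == "__main__":
    # Hypothetical coefficients and prices for three plans in one district.
    utilities = {
        "plan_a": plan_utility(-0.5, 2.0, -0.1, 0.5, -0.2, 0.3, 1.0),
        "plan_b": plan_utility(-0.5, 1.0, -0.1, 0.5, -0.2, 0.3, 0.4),
        "plan_c": plan_utility(-0.5, 0.5, -0.1, 0.0, -0.2, 0.0, 0.0),
    }
    for plan, share in logit_shares(utilities).items():
        print(plan, round(share, 3))
```

Subtracting the maximum utility before exponentiating leaves the shares unchanged and only guards against overflow.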

⁹

Possible employee occupation types are licensed administrator, non-licensed administrator, classified, community college non-instructional, community college faculty, confidential, licensed, substitute, and superintendent. “Licensed” refers to the possession of a teaching license. Within each type, an employee can be either full-time or part-time. Possible family types are employee only; employee and spouse; employee and child(ren); and employee, spouse, and child(ren).

¹⁰

We use 5-digit zip-code-level home price indices from Bogin, Doerner and Larson (2019).

¹¹

Data on percent of registered voters by party is available at the county level (Oregon Elections Division, 2019). We construct school-district-level measures by taking the average over employees’ county of residence.

¹²

The cost-sharing features of 2008 plans are presented in Table A.1; they are very similar to the plans offered in 2009. We apply the same sample construction criteria to our 2008 sample, except that households must be present for one prior year.

¹³

These may be due to “supply side” effects arising from differences in provider prices, provider networks, or care management practices, or to “demand side” effects arising from differences in average plan generosity.

¹⁴

We do not try to estimate a moral hazard elasticity among the plans offered by Kaiser and Providence because there is so little variation in coverage level.

¹⁵

To accommodate the fact that 2 percent of households have zero spending, we add 1 to total spending.

¹⁶

See Fenton (1960), and for a summary, Cobb, Rumí and Salmerón (2012).
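For readers unfamiliar with the method, the Fenton-Wilkinson idea is to approximate a sum of log-normal random variables by a single log-normal with the same mean and variance. The sketch below assumes independent summands, which is a simplification made here for illustration and may differ from the specification used in the estimation; the function name is hypothetical.

```python
import math


def fenton_wilkinson(mus, sigmas):
    """Moment-matched log-normal approximation to a sum of independent log-normals.

    Each summand is exp(N(mu_i, sigma_i^2)). Returns (mu, sigma) of the
    approximating log-normal, chosen to match the sum's mean and variance.
    """
    mean = sum(math.exp(m + 0.5 * s * s) for m, s in zip(mus, sigmas))
    var = sum((math.exp(s * s) - 1.0) * math.exp(2.0 * m + s * s)
              for m, s in zip(mus, sigmas))
    sigma_sq = math.log(1.0 + var / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma_sq
    return mu, math.sqrt(sigma_sq)


if __name__ == "__main__":
    # Hypothetical parameters for three summands.
    mu, sigma = fenton_wilkinson([0.0, 0.5, 1.0], [1.0, 0.8, 0.6])
    print(mu, sigma)
```

With a single summand the approximation returns that summand's own parameters, which is a quick sanity check on the moment-matching formulas.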

¹⁷

Note that the mean vector of β_kts is a fixed function of θ and household demographics.

¹⁸

We use the Matlab program qnwnorm to implement this method, with three points in each dimension of unobserved heterogeneity. The program can be obtained as part of Mario Miranda and Paul Fackler’s CompEcon Toolbox; for more information, see Miranda and Fackler (2002).
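To illustrate the kind of rule involved, the following is a sketch of three-point Gauss-Hermite quadrature for normally distributed heterogeneity, written in Python rather than Matlab. It does not reproduce the qnwnorm interface; the function names are hypothetical, and the dimensions are assumed independent for simplicity. For a standard normal, the three-point nodes are 0 and ±√3 with weights 2/3 and 1/6.

```python
import itertools
import math

# Three-point Gauss-Hermite rule expressed for a standard normal variable.
NODES = (-math.sqrt(3.0), 0.0, math.sqrt(3.0))
WEIGHTS = (1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0)


def normal_grid(means, sds):
    """Tensor-product grid of (point, weight) pairs for independent normals."""
    grid = []
    for idx in itertools.product(range(3), repeat=len(means)):
        point = tuple(m + s * NODES[i] for m, s, i in zip(means, sds, idx))
        weight = math.prod(WEIGHTS[i] for i in idx)
        grid.append((point, weight))
    return grid


def expect(f, means, sds):
    """Approximate E[f(X)] for X ~ N(means, diag(sds^2))."""
    return sum(w * f(*x) for x, w in normal_grid(means, sds))


if __name__ == "__main__":
    # E[x^2] for x ~ N(1, 2^2) is 1 + 4 = 5; a three-point rule is exact here.
    print(expect(lambda x: x ** 2, [1.0], [2.0]))
    # Two dimensions give 3^2 = 9 grid points.
    print(len(normal_grid([0.0, 0.0], [1.0, 1.0])))
```

With three points per dimension the grid grows as 3^k in the number of dimensions k, which is why a small number of points per dimension keeps the integration tractable.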

¹⁹

We have experimented with varying these bounds and found that this does not affect parameter estimates as long as the uniform density is sufficiently small.

References

Apesteguia Jose, and Ballester Miguel A. 2018. “Monotone Stochastic Choice Models: The Case of Risk and Time Preferences.” Journal of Political Economy, 126(1): 74–106.
Bogin Alexander, Doerner William, and Larson William. 2019. “Local House Price Dynamics: New Indices and Stylized Facts.” Real Estate Economics, 47(2): 365–398.
Cobb Barry, Rumí Rafael, and Salmerón Antonio. 2012. “Approximating the Distribution of a Sum of Log-normal Random Variables.” Proceedings of the 6th European Workshop on Probabilistic Graphical Models, PGM 2012.
Dubin Jeffrey A., and McFadden Daniel L. 1984. “An Econometric Analysis of Residential Electric Appliance Holdings and Consumption.” Econometrica, 52(2): 345–362.
Einav Liran, Finkelstein Amy, Ryan Stephen P., Schrimpf Paul, and Cullen Mark R. 2013. “Selection on moral hazard in health insurance.” American Economic Review, 103(1): 178–219.
Fenton L. F. 1960. “The sum of log-normal probability distributions in scatter transmission systems.” IRE Transactions on Communication Systems, 8: 57–67.
Ho Kate, and Lee Robin. 2021. “Health Insurance Menu Design for Large Employers.”
Manning Willard G., Newhouse Joseph P., Duan Naihua, Keeler Emmett B., and Leibowitz Arleen. 1987. “Health Insurance and the Demand for Medical Care: Evidence from a Randomized Experiment.” The American Economic Review, 77(3): 251–277.
Miranda Mario J., and Fackler Paul L. 2002. Applied Computational Economics and Finance. MIT Press.
Newhouse Joseph. 1993. “Free for All? Lessons from the RAND Health Insurance Experiment.” Cambridge, MA. Harvard University Press.
Oregon Elections Division. 2019. “Voter Registration Data.” https://data.oregon.gov/api/views/6a4f-ecbi.
Revelt David, and Train Kenneth. 1998. “Mixed Logit with Repeated Choices: Households’ Choices of Appliance Efficiency Level.” The Review of Economics and Statistics, 80(4): 647–657.
Revelt David, and Train Kenneth. 2001. “Customer-Specific Taste Parameters and Mixed Logit: Households’ Choice of Electricity Supplier.”, (0012001).
Train Kenneth. 2009. Discrete Choice Methods with Simulation: Second Edition. Cambridge University Press.

References

Abaluck Jason, and Gruber Jonathan. 2011. “Choice Inconsistencies among the Elderly: Evidence from Plan Choice in the Medicare Part D Program.” American Economic Review, 101(4): 1180–1210.
Abaluck Jason, and Gruber Jonathan. 2016. “Evolving Choice Inconsistencies in Choice of Prescription Drug Insurance.” American Economic Review, 106(8): 2145–84.
Abaluck Jason, and Gruber Jonathan. 2017. “Improving the Quality of Choices in Health Insurance Markets.” NBER Working Paper 22917.
Akerlof George A. 1970. “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism.” Quarterly Journal of Economics, 84(3): 488–500.
Akerlof George A., and Shiller Robert J. 2015. Phishing for Phools: The Economics of Manipulation and Deception. Princeton University Press.
Aron-Dine Aviva, Einav Liran, Finkelstein Amy, and Cullen Mark. 2015. “Moral Hazard in Health Insurance: Do Dynamic Incentives Matter?” The Review of Economics and Statistics, 97(4): 725–741.
Arrow Kenneth J. 1965. “Uncertainty and the Welfare Economics of Medical Care: Reply (The Implications of Transaction Costs and Adjustment Lags).” The American Economic Review, 55(1/2): 154–158.
Arrow Kenneth J. 1968. “The Economics of Moral Hazard: Further Comment.” The American Economic Review, 58(3): 537–539.
Azevedo Eduardo M., and Gottlieb Daniel. 2017. “Perfect Competition in Markets With Adverse Selection.” Econometrica, 85(1): 67–105.
Bhargava Saurabh, Loewenstein George, and Sydnor Justin. 2017. “Choose to Lose: Health Plan Choices from a Menu with Dominated Option.” The Quarterly Journal of Economics, 132(3): 1319–1372.
Brot-Goldberg Zarek C., Chandra Amitabh, Handel Benjamin R., and Kolstad Jonathan T. 2017. “What does a Deductible Do? The Impact of Cost-Sharing on Health Care Prices, Quantities, and Spending Dynamics.” The Quarterly Journal of Economics, 132(3): 1261–1318.
Bundorf M. Kate, Levin Jonathan, and Mahoney Neale. 2012. “Pricing and welfare in health plan choice.” American Economic Review, 102(7): 3214–3248.
Bundorf M. Kate, Polyakova Maria, Stults Cheryl, Meehan Amy, Klimke Roman, Pun Ting, Chan Albert Solomon, and Tai-Seale Ming. 2019. “Machine-Based Expert Recommendations And Insurance Choices Among Medicare Part D Enrollees.” Health Affairs, 38(3): 482–490.
Cardon James H., and Hendel Igal. 2001. “Asymmetric Information in Health Insurance: Evidence from the National Medical Expenditure Survey.” The RAND Journal of Economics, 32(3): 408–427.
Carlin Caroline, and Town Robert. 2008. “Sponsored Health Plans.” ReVision, 0–67.
Cohen Alma, and Einav Liran. 2007. “Estimating Risk Preferences from Deductible Choice.” American Economic Review, 97(3): 745–788.
Cooper Zack, Craig Stuart V., Gaynor Martin, and Van Reenen John. 2018. “The Price Ain’t Right? Hospital Prices and Health Spending on the Privately Insured.” The Quarterly Journal of Economics, 134(1): 51–107.
Cutler David M., and Reber Sarah J. 1998. “Paying For Health Insurance: The Tradeoff Between Competition and Adverse Selection.” The Quarterly Journal of Economics, 113(2): 433–466.
Dafny Leemore, Ho Kate, and Varela Mauricio. 2013. “Let Them Have Choice: Gains from Shifting Away from Employer-Sponsored Health Insurance and toward an Individual Exchange.” American Economic Journal: Economic Policy, 5(1): 32–58.
Dalton Christina M., Gowrisankaran Gautam, and Town Robert J. 2020. “Salience, myopia, and complex dynamic incentives: Evidence from Medicare Part D.” Review of Economic Studies, 87(2): 822–869.
Dartmouth Atlas Project. n.d. “Interactive Maps: Hospital Referral Regions.”
Dixit Avinash K., and Stiglitz Joseph E. 1977. “Monopolistic Competition and Optimum Product Diversity.” The American Economic Review, 67(3): 297–308.
Dubin Jeffrey A., and McFadden Daniel L. 1984. “An Econometric Analysis of Residential Electric Appliance Holdings and Consumption.” Econometrica, 52(2): 345–362.
Eden Maya. 2020. “Welfare Analysis with Heterogeneous Risk Preferences.” Journal of Political Economy, 128(12): 000–000.
Einav Liran, Finkelstein Amy, and Levin Jonathan. 2010. “Beyond Testing: Empirical Models of Insurance Markets.” Annual Review of Economics, 2(1): 311–336.
Einav Liran, Finkelstein Amy, Ryan Stephen P., Schrimpf Paul, and Cullen Mark R. 2013. “Selection on moral hazard in health insurance.” American Economic Review, 103(1): 178–219.
Einav Liran, Finkelstein Amy N., and Cullen Mark. 2010. “Estimating Welfare in Insurance Markets Using Variation in Prices.” The Quarterly Journal of Economics, CXV(3).
Ericson Keith Marzilli, and Sydnor Justin R. 2018. “Liquidity Constraints and the Value of Insurance.” National Bureau of Economic Research; Working Paper 24993.
Ericson Keith Marzilli, and Sydnor Justin. 2017. “The Questionable Value of Having a Choice of Levels of Health Insurance Coverage.” Journal of Economic Perspectives, 31(4): 51–72.
Finkelstein Amy, and McGarry Kathleen. 2006. “Multiple Dimensions of Private Information: Evidence from the Long-Term Care Insurance Market.” American Economic Review, 96(4): 938–958.
Geruso Michael. 2017. “Demand heterogeneity in insurance markets: Implications for equity and efficiency.” Quantitative Economics, 8(3): 929–975.
Geruso Michael, Layton Timothy J., McCormack Grace, and Shepard Mark. 2019. “The Two Margin Problem in Insurance Markets.” National Bureau of Economic Research; Working Paper 26288.
Ghili Soheil. 2016. “Network Formation and Bargaining in Vertical Markets: The Case of Narrow Networks in Health Insurance.”
Ghili Soheil, Handel Benjamin R., Hendel Igal, and Whinston Michael D. 2020. “The Welfare Effects of Long-Term Health Insurance Contracts.” National Bureau of Economic Research; Working Paper 23624.
Glazer Jacob, and McGuire Thomas G. 2011. “Gold and Silver health plans: Accommodating demand heterogeneity in managed competition.” Journal of Health Economics, 30(5): 1011–1019.
Gowrisankaran Gautam, Nevo Aviv, and Town Robert. 2015. “Mergers When Prices Are Negotiated: Evidence from the Hospital Industry.” American Economic Review, 105(1): 172–203.
Gross Tal, and Notowidigdo Matthew J. 2011. “Health insurance and the consumer bankruptcy decision: Evidence from expansions of Medicaid.” Journal of Public Economics, 95(7): 767–778.
Gruber Jonathan, Handel Ben, Kolstad Jonathan, and Kina Sam. 2019. “Managing Intelligence: Skilled Experts and AI in Markets for Complex Products.”
Handel Benjamin, Hendel Igal, and Whinston Michael. 2015. “Equilibria in Health Exchanges: Adverse Selection vs. Reclassification Risk.” Econometrica, 83(4): 1261–1313.
Handel Benjamin R. 2013. “Adverse Selection and Inertia in Health Insurance Markets: When Nudging Hurts.” American Economic Review, 103(7): 2643–2682.
Handel Benjamin R., and Kolstad Jonathan T. 2015. “Health Insurance for ‘Humans’: Information Frictions, Plan Choice, and Consumer Welfare.” American Economic Review, 105(8): 2449–2500.
Harsanyi John. 1953. “Cardinal Utility in Welfare Economics and in the Theory of Risktaking.” Journal of Political Economy, 61.
Harsanyi John C. 1955. “Cardinal Welfare, Individualistic Ethics, and Interpersonal Comparisons of Utility.” Journal of Political Economy, 63(4): 309–321.
Hendren Nathaniel. 2020. “Measuring Ex-Ante Welfare in Insurance Markets.” Review of Economic Studies.
Hendren Nathaniel, Landais Camille, and Spinnewijn Johannes. 2021. “Choice in Insurance Markets: A Pigouvian Approach to Social Insurance Design.” National Bureau of Economic Research; Working Paper 27842.
Ho Kate, and Lee Robin. 2017. “Insurer Competition in Health Care Markets.” Econometrica, 85(2): 379–417.
Ho Kate, and Lee Robin. 2021. “Health Insurance Menu Design for Large Employers.”
Hong Long, and Mommaerts Corina. 2021. “Time Aggregation in Health Insurance Deductibles.” National Bureau of Economic Research; Working Paper 28430.
Ketcham Jonathan D., Lucarelli Claudio, Miravete Eugenio J., and Roebuck M. Christopher. 2012. “Sinking, Swimming, or Learning to Swim in Medicare Part D.” American Economic Review, 102(6): 2639–73.
Kowalski Amanda E. 2015. “Estimating the tradeoff between risk protection and moral hazard with a nonlinear budget set model of health insurance.” International Journal of Industrial Organization, 43: 122–135.
Landais Camille, Nekoei Arash, Nilsson Peter, Seim David, and Spinnewijn Johannes. 2021. “Risk-Based Selection in Unemployment Insurance: Evidence and Implications.” American Economic Review, Forthcoming.
Liebman Eli. 2018. “Bargaining in Markets with Exclusion: An Analysis of Health Insurance Networks.”
Lustig Joshua. 2008. “The Welfare Effects of Adverse Selection in Privatized Medicare.” Department of Economics, Institute for Business and Economic Research, UC Berkeley, Department of Economics, Working Paper Series.
Mahoney Neale. 2015. “Bankruptcy as Implicit Health Insurance.” American Economic Review, 105(2): 710–46.
Manning Willard G., Newhouse Joseph P., Duan Naihua, Keeler Emmett B., and Leibowitz Arleen. 1987. “Health Insurance and the Demand for Medical Care: Evidence from a Randomized Experiment.” The American Economic Review, 77(3): 251–277.
McManus Margaret A., Berman Stephen, McInerny Thomas, and Tang Suk-fong. 2006. “Weighing the Risks of Consumer-Driven Health Plans for Families.” Pediatrics, 117(4): 1420–1424.
Newhouse Joseph. 1993. “Free for All? Lessons from the RAND Health Insurance Experiment.” Cambridge, MA. Harvard University Press.
OEBB. 2018. “Health Insurance Benefits Data.” Oregon Educators Benefit Board.
Pauly Mark V. 1968. “The Economics of Moral Hazard: Comment.” American Economic Review, 58(3): 531–537.
Pauly Mark V. 1974. “Overinsurance and Public Provision of Insurance: The Roles of Moral Hazard and Adverse Selection.” The Quarterly Journal of Economics, 88(1): 44–62.
Reed Mary, Fung Vicki, Price Mary, Brand Richard, Benedetti Nancy, Derose Stephen F., Newhouse Joseph P., and Hsu John. 2009. “High-Deductible Health Insurance Plans: Efforts To Sharpen A Blunt Instrument.” Health Affairs, 28(4): 1145–1154.
Revelt David, and Train Kenneth. 1998. “Mixed Logit with Repeated Choices: Households’ Choices of Appliance Efficiency Level.” The Review of Economics and Statistics, 80(4): 647–657.
Revelt David, and Train Kenneth. 2001. “Customer-Specific Taste Parameters and Mixed Logit: Households’ Choice of Electricity Supplier.”, (0012001).
Rothschild Michael, and Stiglitz Joseph. 1976. “Equilibrium in Competitive Insurance Markets: An Essay on the Economics of Imperfect Information.” The Quarterly Journal of Economics, 90(4): 629–649.
Samek Anya, and Sydnor Justin R. 2020. “Impact of Consequence Information on Insurance Choice.” National Bureau of Economic Research; Working Paper 28003.
Shepard Mark. 2016. “Hospital Network Competition and Adverse Selection: Evidence from the Massachusetts Health Insurance Exchange.” National Bureau of Economic Research; Working Paper 22600.
Tilipman Nicholas. 2018. “Narrow Physician Networks, Switching Costs, and Product Variety in Employer Markets.”
Train Kenneth. 2009. Discrete Choice Methods with Simulation: Second Edition. Cambridge University Press.
Zeckhauser Richard. 1970. “Medical Insurance: A Case Study of the Tradeoff Between Risk Spreading and Appropriate Incentives.” Journal of Economic Theory, 2(1): 10–26.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

Appendix

NIHMS1768818-supplement-Appendix.pdf (2MB, pdf)

