Proceedings of the National Academy of Sciences of the United States of America. 2020 Jan 14;117(4):1917–1923. doi: 10.1073/pnas.1905355117

Predicting high-risk opioid prescriptions before they are given

Justine S Hastings, Mark Howison, Sarah E Inman
PMCID: PMC6994994  PMID: 31937665

Significance

We describe a hypothetical preventative policy solution to address the opioid crisis using an integrated administrative database developed in collaboration with the State of Rhode Island. Machine learning algorithms trained on observations of past opioid prescription accurately predict adverse opioid-related outcomes among Medicaid recipients even before their initial opioid prescription is written. Although these models are limited to individuals who have been selected for opioid prescription, they suggest a feasible path forward for using administrative data to inform prescription risk. Under the assumption that the cost of diverting individuals from opioid therapy to an alternative therapy is homogenous across individuals, we simulate a hypothetical policy for restricting opioid prescriptions based on risk that is likely net-beneficial given current cost estimates.

Keywords: opioids, evidence-based policy, predictive modeling, machine learning, administrative data

Abstract

Misuse of prescription opioids is a leading cause of premature death in the United States. We use state government administrative data and machine learning methods to examine whether the risk of future opioid dependence, abuse, or poisoning can be predicted in advance of an initial opioid prescription. Our models accurately predict these outcomes and identify particular prior nonopioid prescriptions, medical history, incarceration, and demographics as strong predictors. Using our estimates, we simulate a hypothetical policy which restricts new opioid prescriptions to only those with low predicted risk. The policy’s potential benefits likely outweigh costs across demographic subgroups, even for lenient definitions of “high risk.” Our findings suggest new avenues for prevention using state administrative data, which could aid providers in making better, data-informed decisions when weighing the medical benefits of opioid therapy against the risks.


Prescription opioids rank among the highest in terms of potential for dependence, abuse, and poisoning. In 2016, more Americans under the age of 50 y died from drug overdoses than from car crashes or gun violence, a trend driven by increases in opioid overdoses (1).

However, opioids may also be an important therapy for those who suffer from chronic pain. The majority of those prescribed opioids do not experience adverse outcomes; a survey of studies of opioid use found that rates of misuse, abuse, and addiction averaged between 8% and 12% (2). This rate is, however, higher than an early (and widely cited) claim that less than 1% of hospitalized patients receiving narcotics developed an addiction (3).

Moreover, many of those suffering from adverse outcomes were introduced to opioids through a legitimate opioid prescription. One study of 6 y of medical and pharmacy claims found that 79.9% of opioid abusers had a prescription prior to their first abuse diagnosis (4). Of the opioid abusers who did not themselves have a prior prescription, 50.8% had a family member with a prior prescription.

Given the risks and long-term consequences of adverse outcomes following legitimate opioid prescriptions, many providers now report a lack of confidence in managing their patients’ chronic pain through opioid therapy (5). Providers could benefit from better information on the risks of initiating a patient on opioid therapy, especially when that patient has never received an opioid prescription before.

Prior studies have identified risk factors for opioid abuse and dependence through descriptive analysis and statistical modeling of both medical claims and electronic health records (6–10), and two studies have also evaluated the predictive performance of such models (11, 12). However, these studies focus on individuals already persistently receiving opioid therapy and describe patterns of opioid use which are indicative of dependency and misuse within this subpopulation. Previous research has not yet developed a predictive model that is applicable to the larger population of recipients of opioid therapy and that uses only data known about individuals before a prescription is given.

In this study, we use integrated administrative data to estimate models of adverse opioid-related outcomes for Medicaid enrollees in Rhode Island and conduct policy simulations of restricting opioid prescriptions to only those with low predicted risk. By some estimates, the opioid epidemic created $5.5 billion in additional health care costs to the Medicaid program nationally in 2013 (13). Estimating our model on state administrative data provides an avenue for state policymakers to predict the risk associated with prescribing opioids to Medicaid enrollees, which could be used to inform providers’ treatment decisions.

Materials and Methods

We use deidentified administrative records from a research data lake we helped build for the State of Rhode Island to support science- and data-driven policy (14). The data lake is housed in a secure enclave, and personally identifiable information has been removed and replaced with anonymous identifiers so that researchers with approved access can join and analyze records associated with the same individual across data sources while preserving anonymity (15). Because this study does not involve data that are both identifiable and private, Brown University’s Institutional Review Board does not classify it as research with human subjects. The database includes Medicaid records from 2005 to 2017 and data on major social benefit and insurance programs, employment, incarceration, and criminal history.

We construct a panel dataset of 80,768 individuals who received an opioid prescription or injection according to the Medicaid claims records between 2006 and 2012 (16). There are 400,024 distinct Medicaid enrollees in this period. Further details and descriptive statistics are in SI Appendix, section 2 and Table S1.

We define an adverse opioid-related outcome as receiving a diagnosis of opioid dependence, abuse, or poisoning* or receiving treatment for an opioid use disorder in the 5 y following initial prescription. SI Appendix, Fig. S1 shows the cumulative frequency of adverse outcomes from the time of initial prescription, which peaks at 5.7% by year 5.

We construct variables from observations in the 12 mo prior to when an individual receives an opioid prescription. These include 84 variables for demographics, incarceration, citations, arrests, car crashes, wages, unemployment rates, household composition, and payments received from social benefit and insurance programs.

We construct 327 variables from Medicaid claims and enrollment records, including summary counts of the number of distinct diseases, chronic conditions, and procedures. The pharmacy claims data include 39,805 distinct drug product codes, and we use a pharmacological classification to consolidate these into prior prescriptions indicators for 262 drug categories. There are 8,494 distinct diagnosis codes and 6,507 distinct procedure codes observed in the claims data. We intend to simultaneously reduce the dimensionality of these variables and estimate the underlying latent structure of the occurrence of the codes. One approach is to use the preexisting hierarchical structure of the codes: to use, for instance, the fact that all ICD-9 codes starting with 303, 304, or 305 relate to use of psychoactive substances. However, this constrains the model to nest codes in ways that may or may not be helpful for our predictive modeling purposes. For example, is it the case that codes 305.0 (nondependent alcohol abuse) and 305.2 (nondependent cannabis abuse) are together more likely to predict our outcome, as a combined measure of nondependent substance abuse? Or is 305.0 together with 303.0 (acute alcoholic intoxication) and 303.9 (other and unspecified alcohol dependence) a broader measure of alcohol use? Because we do not know a priori how to optimally nest the codes, we instead use natural language-processing topic-modeling techniques to consolidate the codes into 50 topics, based on the text descriptions and frequencies of the codes. For example, the 10 most frequent words in topic no. 39 are “hand sprain lateral closed fracture foot minimum examination ankle views.” The variable for topic no. 39 measures how strongly this combination of diagnoses and procedures for hand, foot, and ankle injuries is represented in each individual’s medical history. Details on the topic modeling implementation appear in SI Appendix, section 3.

Finally, we construct 890 interaction terms from the 84 non-Medicaid and 327 Medicaid variables, for a total of 1,301 variables. We consider interactions among demographics, between the Medicaid summary counts and all non-Medicaid variables and between payments received from social benefit and insurance programs and all non-Medicaid variables.

We estimate predictive models using machine-learning algorithms that search over variables and functions of those variables to maximize out-of-sample predictive fit. We fit three kinds of models: a regularized regression, an ensemble, and a neural network. These models vary in complexity (17). For example, the prediction function from a regularized regression is a linear combination of explanatory variables whose regression weights are algorithmically selected from a set of variables and functions of those variables predetermined by the researcher. Neural networks can approximate any function, potentially delivering tighter predictive fit. However, their prediction functions are algorithmically determined layers of functions of covariates and are therefore more difficult to summarize or understand.

For the regularized regression, we use a bootstrapped LASSO (BOLASSO) with 100 bootstrap replicates to avoid arbitrary variable selection among highly correlated subsets of variables (18) and a post-BOLASSO regression on the subset of variables that are consistently selected among the 100 bootstrap replicates. For the ensemble model, we average the predictions across the 100 bootstrap replicates from the BOLASSO. For the neural network, we use a recurrent neural network which can explicitly model the time dependence of the variables (19). In all models, data were split at the beginning of the study into randomly sampled training, validation, and testing sets using the ratio 50:25:25. We report the results of model predictions on the testing set (the “hold-out” sample), which was withheld from analysis prior to the preparation of this paper. SI Appendix, section 4 contains details on model implementation.
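
The idea of keeping only consistently selected variables can be illustrated with a minimal sketch. It assumes each bootstrap replicate has already produced the set of variables with nonzero LASSO coefficients. The function name `consistent_variables`, the `min_fraction` threshold, and the toy variable names are hypothetical and are not taken from the paper or its analysis code.

```python
from collections import Counter


def consistent_variables(selections, min_fraction=1.0):
    """Return the variables selected in at least `min_fraction` of replicates.

    `selections` is a list of sets, one per bootstrap replicate, holding the
    names of variables with nonzero LASSO coefficients in that replicate.
    The threshold is an illustrative assumption, not the paper's setting.
    """
    if not selections:
        return set()
    # Count how many replicates selected each variable.
    counts = Counter(name for selected in selections for name in selected)
    needed = min_fraction * len(selections)
    return {name for name, n in counts.items() if n >= needed}


# Toy example with three replicates (hypothetical variable names).
replicates = [
    {"prior_arrest", "benzodiazepine_rx", "age_65_plus"},
    {"prior_arrest", "benzodiazepine_rx"},
    {"prior_arrest", "benzodiazepine_rx", "topic_39"},
]
stable = consistent_variables(replicates)
```

With the default threshold, only variables chosen in every replicate survive, so a variable that is picked arbitrarily in place of a correlated neighbor in some replicates is dropped.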

We use the model predictions to describe the potential costs and benefits of a hypothetical policy that identifies high-risk individuals before their initial prescription, prevents those prescriptions, and also prevents their adverse outcomes. Such a hypothetical policy is supported by recent findings that predictive screening tools for opioid use disorder help primary care providers improve clinical outcomes (20) and by a growing movement advising clinicians to consider patient risk before initiating opioid therapy (21). It also has similarities to the Centers for Disease Control’s Patient Review and Restriction Program for limiting opioid prescriptions (22).

We define two potential costs. Let \(C_{A,i}\) denote the cost to an individual and to society of an adverse outcome for person \(i\) and \(C_{D,i}\) denote the “diversion cost” \(i\) experiences when diverted from an opioid therapy to an alternative therapy. This could include assignment to alternative therapies or to an opioid prescription regimen with a shorter duration and closer monitoring by and communication with a health care professional. Assuming the prescription restriction policy successfully imposes diversion costs and prevents adverse outcomes for \(i\) at a rate \(\alpha_i\), it will save the cost \(\alpha_i (C_{A,i} - C_{D,i})\) for each true positive (\(TP_i\)) who is predicted as high risk and would have had an adverse outcome. False positive individuals (\(FP_i\)) accrue \(C_{D,i}\) because they are incorrectly classified as high risk and prevented from obtaining an opioid prescription. The policy misses the potential savings of \(C_{A,i}\) for an individual \(i\) who is a false negative, someone who is incorrectly classified as low risk but has an adverse outcome. However, there is no net change since these costs would accrue in the absence or presence of the policy. Finally, true negative individuals are predicted as low risk, do not have an adverse outcome, and accrue neither cost.

The net benefit of the hypothetical prescription restriction policy for person \(i\), therefore, is \(TP_i\,\alpha_i (C_{A,i} - C_{D,i}) - FP_i\,C_{D,i}\). It is positive when \(\alpha_i TP_i / (FP_i + \alpha_i TP_i) > C_{D,i} / C_{A,i}\). This captures the tradeoff between model accuracy (the probability that \(i\) is a true positive, defined as \(TP_i / (FP_i + TP_i)\), adjusted in our setting for the prevention efficacy \(\alpha_i\)) and \(i\)’s “cost ratio” \(C_{D,i} / C_{A,i}\). If the diversion cost for \(i\), \(C_{D,i}\), is low relative to the adverse outcome cost \(C_{A,i}\), then it will be beneficial to intervene at a lower risk threshold and accept a lower degree of classification accuracy and/or a lower diversion efficacy rate of \(\alpha_i\). We can use this framework to illustrate hypothetical policy tradeoffs and to measure fairness across marginalized subpopulations.
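
The break-even condition follows from the net-benefit expression in two rearrangement steps. The derivation below uses only the symbols defined in the text and assumes \(C_{A,i} > 0\) and \(FP_i + \alpha_i TP_i > 0\), so that dividing by these quantities preserves the inequality.

```latex
\begin{align*}
TP_i\,\alpha_i\,(C_{A,i} - C_{D,i}) - FP_i\,C_{D,i} &> 0 \\
\alpha_i\,TP_i\,C_{A,i} &> (FP_i + \alpha_i\,TP_i)\,C_{D,i} \\
\frac{\alpha_i\,TP_i}{FP_i + \alpha_i\,TP_i} &> \frac{C_{D,i}}{C_{A,i}}
\end{align*}
```

When \(\alpha_i = 1\), the left-hand side reduces to \(TP_i / (FP_i + TP_i)\), the share of true positives among those predicted as high risk.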

Data Availability.

Data are available through individual data-sharing agreements with each of the following Rhode Island agencies and municipal police departments: RI Department of Corrections, RI Department of Labor and Training, RI Executive Office of Health and Human Services, RI State Police, Central Falls Police Department, Cranston Police Department, Cumberland Police Department, Middletown Police Department, Narragansett Police Department, Providence Police Department, Warwick Police Department, and Woonsocket Police Department. Email hhipnas2020@ripl.org for information on how to request data for replication from the respective state agencies. Analysis code is available from GitHub at https://github.com/ripl-org/predict-opioids.

Results

Predictive Performance.

A common metric for assessing the performance of a machine-learning model is the area under the receiver-operating characteristic curve (AUC), which measures the probability that, given two randomly chosen individuals with different outcomes, the model will correctly assign a higher risk to the individual with the adverse outcome. A perfect classifier has an AUC of 1, and a classifier that chooses at random has an AUC of 0.5. Our models achieve AUCs of 0.778 (95% CI 0.762 to 0.790) for the BOLASSO, 0.786 (95% CI 0.771 to 0.797) for the LASSO ensemble, and 0.801 (95% CI 0.785 to 0.812) for the neural network. SI Appendix, Fig. S2 shows that for all models, the top three deciles of predicted risk have a higher fraction of true outcomes than the full sample base outcome rate of 0.057. In our case, the less-transparent, more-complex neural network does not deliver significant gains in predictive performance.
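
The pairwise interpretation of the AUC can be made concrete with a minimal sketch that compares every individual who had the adverse outcome against every individual who did not. The function name `pairwise_auc` and the toy scores are hypothetical; counting tied scores as half a win is a common convention and is an assumption here, not something stated in the paper.

```python
def pairwise_auc(scores, outcomes):
    """Fraction of (adverse, non-adverse) pairs ranked correctly by score.

    `scores` are predicted risks and `outcomes` are 1 (adverse) or 0.
    Ties count as half a correct ranking (an assumed convention).
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one individual with each outcome")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: two adverse outcomes among five individuals.
scores = [0.9, 0.8, 0.3, 0.6, 0.2]
outcomes = [1, 0, 1, 0, 0]
auc = pairwise_auc(scores, outcomes)  # 4 of the 6 pairs are ranked correctly
```

This quadratic-time loop is only meant to mirror the definition; rank-based formulas give the same value more efficiently on large samples.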

Consistent Predictors.

Fig. 1 shows the distribution of odds ratios from the post-BOLASSO regression for the 51 variables which the BOLASSO model selected as the strongest, consistent predictors from the full set of 1,301 variables across the 100 bootstrap replicates. BOLASSO helps to identify consistent covariates, avoiding arbitrary choices among highly correlated pairs. While the coefficients on the selected variables do not necessarily have a causal interpretation, they pick up factors which are strong predictors among observables. For example, observed claims for routine preventative health (e.g., Fig. 1, topics 4 and 10) may themselves lower risk through increased or more frequent interactions with medical professionals, or they may proxy for attention to personal health or responsibility which is the true unobserved underlying factor that reduces risk. The primary purpose of our post-BOLASSO regression is to identify the strongest predictors which may point us in the direction of potential underlying mechanisms for further study.

Fig. 1.

Odds ratios from the post-BOLASSO regression. Those <0.9 and >1.1 are labeled. A complete regression table is available in SI Appendix, Table S2.

The two variables with the largest odds ratios (indicating increased risk) are related to crime: release from prison and an indicator for an arrest. Individuals released from prison in the prior year are estimated as 119% more likely to develop an adverse outcome if given an initial prescription (odds ratio of 2.19), all else equal, and those with an arrest in the prior year are 76% more likely to do so (odds ratio of 1.76). The next three variables with the largest odds ratios are prior prescriptions for benzodiazepines (1.51), centrally acting muscle relaxants (1.39), and opiate agonists (1.36). Opioid agonists, such as cough syrups and mild painkillers, may have small dosages of an opioid ingredient (SI Appendix, Table S3), but are not considered strong enough for chronic opioid therapy and therefore not classified as or considered to be opioids. Benzodiazepines are relaxants used to treat, for example, alcohol withdrawal, anxiety, and panic disorders.

Variables with the smallest odds ratios (indicating decreased risk) were age 65+ y (0.13, indicating an almost complete, 87% reduction in risk), Hispanic ethnicity (0.41), age 55 to 64 y (0.43), African-American race (0.50), and missing marital status (0.51). Because we use modal marital status across all administrative sources, the missing indicator is likely a proxy for individuals who are enrolled only in Medicaid and not in other programs where marital status is reported.

Twenty-four of the strongest predictors are derived from Medicaid records. These include enrollment in managed care, number of unique Medicaid IDs, summary counts of distinct procedures and CCS diseases, total pharmacy payments, and three indicators for prior prescriptions. The remainder of the Medicaid predictors are diagnosis/procedure topics. Some of the significant themes among the selected topics with positive coefficients are drug/alcohol screening, back pain and injury, sprains and strains, contusions, psychotherapy, and depression and anxiety; and those with negative coefficients are asthma/allergies, breast cancer, gynecological examination, cholesterol screening, eyeglasses, dental evaluations, and intellectual disability.

Cost–Benefit Analysis.

Whether the prescription diversion policy delivers benefits overall depends on whether it delivers benefits for those denied prescriptions. This in turn depends on how the parameters \(\alpha_i\), \(C_{D,i}\), and \(C_{A,i}\) covary with \(TP_i\). Assume for simplicity that \(\alpha\), \(C_D\), and \(C_A\) do not vary across individuals and that \(\alpha = 1\). Fig. 2 shows the break-even cost ratio \(C_D / C_A\) at which the policy is cost neutral using predictive risk from the neural network model, with the green line assuming a diversion rate \(\alpha = 1\) and homogeneous diversion and adverse outcome costs (\(C_{D,i} = C_D\), \(C_{A,i} = C_A\)). In the top risk decile, the break-even ratio is 0.233: It is net beneficial to recommend against opioid prescriptions for individuals in the top decile if \(C_D\) is less than 23.3% of \(C_A\). It is net beneficial to intervene with the entire population if \(C_D\) is less than 5.7% of \(C_A\).

Fig. 2.

The break-even cost ratio for three values of the efficacy rate α. The break-even cost ratio is the point at which the hypothetical policy becomes cost neutral. If the diversion cost is less than this ratio times the adverse outcome cost (estimated at $450,000), then the policy will be net beneficial. Lower diversion costs are required to make the policy net beneficial among lower risk scores. Error bars indicate the 95% confidence interval calculated from 100 bootstrap replicates.

The existing literature provides guidance on reasonable estimates for \(C_D\) and \(C_A\), and we detail the calculation for an estimate of \(C_A\) of $450,000 (2010 dollars) in SI Appendix, section 5 and Table S4. Diversion costs are more difficult to quantify. They may include lost productivity due to chronic pain after receiving an alternative therapy, or they may include lost time due to requirements for more frequent monitoring of high-risk individuals by prescribing physicians. The economic cost of pain in the United States is conservatively estimated at $560 to $635 billion, with a value of lost productivity from $299 to $335 billion (23). Treating pain compassionately is a moral imperative for physicians, who must balance protecting those experiencing chronic pain with the significant risk of harm that opioids can cause individuals, their families, and their communities (24). However, recent research suggests that opioid therapy may not be more effective at pain relief than nonopioid therapy in both the short and the long term. A randomized trial comparing opioid therapy to nonopioid therapy for acute short-term pain found similar levels of pain relief between the two treatments (25), and observational studies also show no advantage for opioid treatment in terms of pain relief, with some patients on higher-potency opioids reporting more psychological impairment than those on lower-potency opioids (26, 27).

Estimates of costs of time are often calculated and utilized in the transportation literature. The value of time (VOT) has been estimated using stated-preference surveys as well as using revealed preference methodologies (28, 29). Typical VOT estimates are on the order of $30/h (30). Using a 2,000-h work year, the VOT estimate would correspond to a $60,000 annual loss in productivity if diversion costs resulted in loss of 1 y of full-time VOT.

This suggests that \(C_D\) is likely lower than the $104,400 break-even cost (23.2% of $450,000) for the top risk decile predicted by our model at \(\alpha = 1\); $104,400 is above the 86th percentile of the annual earnings distribution in the United States in 2017 (31). Thus, a low risk threshold that maximizes true positives at the cost of increased false positives could be optimal. These findings support a growing belief among some within the medical community that the risks of opioid prescription outweigh the benefits in many cases of prescription outside of cancer or palliative care (32).

A benefit of structuring our cost–benefit analysis in terms of the cost ratio is that a risk threshold can be reevaluated as better data on these costs become available or as knowledge about opioid dependency improves. For example, the cost–benefit analysis represented by the green line in Fig. 2 assumes perfect prevention of dependency for predicted high-risk individuals as a result of the policy (α=1). Individuals may still get access to an opioid through prescriptions given to friends and family. Approximately 10.7% of dependents (50.8% of 21.1% who did not themselves have a prescription) claim friends and family as the source of their first opioid (4), and diversion may still fail if those who do not receive a prescription subsequently borrow pills from others. An α of 0.893 (the red line in Fig. 2) would assume that 10.7% of people go on to seek opioids from a friend or family, and true positives would then develop a dependency. In this case, the break-even costs for the top decile would be $95,400 (21.2% of $450,000).
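
The effect of imperfect prevention on the break-even ratio can be sketched numerically. The sketch assumes homogeneous costs, so that the break-even ratio for a group depends only on its share of true positives and on α. The function name `break_even_ratio` is hypothetical, and the input of 0.232 is the top-decile ratio implied by the $104,400 break-even cost at α=1 (23.2% of $450,000).

```python
def break_even_ratio(true_positive_share, alpha=1.0):
    """Break-even diversion-to-adverse cost ratio for a flagged group.

    `true_positive_share` is TP / (TP + FP) within the group and `alpha`
    is the prevention efficacy. Dividing the numerator and denominator of
    alpha*TP / (FP + alpha*TP) by (TP + FP) gives the expression below.
    """
    if not 0.0 <= true_positive_share <= 1.0:
        raise ValueError("true_positive_share must lie in [0, 1]")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    prevented = alpha * true_positive_share
    false_positive_share = 1.0 - true_positive_share
    total = false_positive_share + prevented
    return prevented / total if total > 0 else 0.0


ADVERSE_COST = 450_000  # estimated adverse outcome cost from the text (2010 dollars)

perfect = break_even_ratio(0.232, alpha=1.0)     # equals the true-positive share
adjusted = break_even_ratio(0.232, alpha=0.893)  # about 0.212 after leakage
break_even_dollars = round(adjusted, 3) * ADVERSE_COST
```

Rounding the adjusted ratio to 21.2% before multiplying by $450,000 gives approximately the $95,400 figure quoted above; the unrounded ratio gives a slightly higher dollar amount.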

Furthermore, high-risk individuals who are diverted to alternative therapies could have a higher rate of seeking and obtaining alternative opioid sources (i.e., \(\alpha_i\) and \(TP_i\) are negatively correlated). This may occur, for example, if opioid addiction is rational. Rational addiction models (33) predict that those seeking doctor prescriptions for opioids may be rationally seeking them prior to their first prescription to form an addiction as a fully informed, forward-looking, rational decision. Therefore, while restricting opioids may raise the cost of acquiring them and decrease the total number of prescribed opioids, diversion effectiveness may still be imperfect if those seeking prescriptions are making a rational choice and are therefore more likely to obtain opioids and develop a dependency even without a prescription.

To explore whether rational addiction may drive first-time prescriptions for opioids, we examine data on adverse outcomes as a function of patients’ degree of knowledge that they are receiving an opioid. We use the fact that patients may receive opioids through epidural or intravenous injections during inpatient procedures. Under the assumption that these opioid recipients were less likely to be informed they were receiving an opioid than those receiving and filling a prescription from a physician, we would expect fewer adverse outcomes from opioids received through inpatient procedures than through prescriptions in a rational addiction framework. We find that, when used as an explanatory variable for dependency while controlling flexibly for observable characteristics, an indicator for opioid injection is not significantly different from zero (SI Appendix, section 6 and Tables S5 and S6), suggesting that rational addiction may not be driving opioid prescription demand among those receiving their first prescription. Indeed, many researchers point out that informed, rational addiction decisions may be applicable to drugs like nicotine (34), but may not apply to mind-altering drugs or drugs whose effects are not widely known. In the case of opioids, there is evidence that the risks of prescription opioids and their long-term effects were not widely known to the public (35).

Our framework provides a way to adapt and evaluate policy by adjusting \(\alpha\) as the information set and health and policy landscapes evolve. More generally, \(\alpha_i\) and \(TP_i\) may be negatively correlated for other reasons. SI Appendix, section 7 and Fig. S3 present cost–benefit simulations that allow \(\alpha_i\) and \(TP_i\) to be negatively correlated and show there exist parameters for which a policy could be less effective among the highest-risk individuals. In general, trialing a policy and evaluating outcomes would allow policymakers and scientists to uncover individual-level parameter distributions by estimating heterogeneous treatment effects. This could allow policy to improve dynamically over time and eventually predict prescription restriction efficacy for diverting adverse outcomes.

Fairness.

In addition to evaluating the overall cost–benefit tradeoff of a prescription restriction policy, our framework can help policymakers examine measures of “fairness” by quantifying the extent to which policy costs versus benefits accrue disproportionately to marginalized groups. The predictive model’s false discovery rate (FDR) is defined as the fraction of false positives among all individuals who are predicted to have an adverse outcome. Differences in FDR across subgroups can occur when the model predictions Ŷ are not independent of subgroup membership conditional on the true outcomes Y, which is a construct for evaluating fairness that is well cited in the literature (36, 37). Here, we focus on FDR because it represents a notion of unfairness arising from a disproportionate diversion cost accruing to individuals from marginalized groups.
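
As a minimal sketch of this fairness measure, the FDR can be computed separately for each subgroup among the individuals a model flags as high risk. The function name `fdr_by_group`, the group labels, and the toy data are hypothetical and are not drawn from the study's data.

```python
from collections import defaultdict


def fdr_by_group(predicted_high_risk, outcomes, groups):
    """False discovery rate per subgroup: false positives / all flagged.

    `predicted_high_risk` and `outcomes` are sequences of booleans (or 0/1)
    and `groups` holds a subgroup label for each individual. Subgroups with
    no flagged individuals are omitted because their FDR is undefined.
    """
    flagged = defaultdict(int)
    false_positives = defaultdict(int)
    for predicted, outcome, group in zip(predicted_high_risk, outcomes, groups):
        if predicted:
            flagged[group] += 1
            if not outcome:
                false_positives[group] += 1
    return {group: false_positives[group] / flagged[group] for group in flagged}


# Toy data: six flagged individuals and two unflagged ones.
predicted = [1, 1, 1, 1, 1, 1, 0, 0]
outcome = [1, 0, 0, 0, 1, 0, 1, 0]
group = ["A", "A", "A", "A", "B", "B", "A", "B"]
rates = fdr_by_group(predicted, outcome, group)
```

In the toy data, group A has three false positives among four flagged individuals and group B has one among two, so a larger share of group A's flagged members would bear a diversion cost without a corresponding benefit.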

Fig. 3 shows the FDR by risk decile and by minority status, incarceration history, and disability status. The previously incarcerated have a significantly lower FDR, as release from incarceration is a strong positive predictor of adverse outcomes. There is no significant difference by disability status, and this was not a selected predictor of adverse outcomes.

Fig. 3.

(A–C) The false discovery rates for minority status (A), incarceration history (B), and disability status (C). The false discovery rate is defined as the fraction of false positives among all individuals who are predicted to have an adverse outcome, which is the population that the hypothetical policy would affect. Error bars indicate the 95% confidence interval calculated from 100 bootstrap replicates.

Minority status is a negative predictor of adverse outcomes, all else equal. Members of minority groups (African-American, Hispanic) have a higher point estimate for FDR in the top-risk decile of our model. The difference between white and minority FDRs in the top-risk decile is 3.2 percentage points and is not statistically significant. A power calculation shows that for the top decile we are powered to detect an 8.2 percentage point difference given our sample size (SI Appendix, Table S7). For the lower-risk deciles, the FDR difference becomes significant as the fraction of minorities in the subsample increases. The break-even diversion costs for whites and minorities in the top decile are $107,100 and $92,700, respectively (SI Appendix, Fig. S4), which are above the 86th and 76th percentiles of the annual earnings distribution.

Thus, while the FDR is higher for minorities, restricting opioid prescriptions to those with high predicted risk may generate net benefits in minority and nonminority communities alike. It could be that diversion costs are higher for minorities than for nonminorities. In our data, minorities receiving an opioid prescription have roughly the same number of provider visits in the 30 d prior to an initial prescription, but live on average closer to providers, suggesting that diversion costs may not differ enough between minority and nonminority groups to negate the overall predicted benefits of prevention policies (SI Appendix, section 8 and Table S8). Our predictive modeling and cost–benefit approach allows policy makers to quantify and weigh benefits and costs within and across subpopulations when designing a data-driven preventative policy.

Discussion

Prevention and treatment policies can be complementary approaches to opioid use disorders. Treatment can help the many individuals already suffering from adverse outcomes, while prevention can stem the growth of new cases of opioid dependence, abuse, or poisoning.

The proven standard treatment for opioid use disorder is medication-assisted treatment (MAT) (38–40). However, it faces two significant hurdles. First, MAT is not widely available to those with opioid use disorders; only 36% of substance abuse treatment facilities offer one of three different kinds of medication treatment (41). Second, even when those suffering from opioid use disorders can be connected to treatment, the costs associated with treatment are high and recovery from an opioid use disorder is challenging. The probability of recovery after a year of MAT is estimated at 50% (42).

Prevention strategies can help prevent further cases of opioid use disorder. Current strategies are primarily designed around reducing the quantity or potency of opioid prescriptions to curb misuse and prevent poisoning among those with existing opioid use disorders. These strategies are especially complementary to a treatment approach. A recent study suggests that limiting opioid availability for those with an existing disorder may increase the use of illicit drugs such as heroin.

The most widespread approach to preventing misuse by those with a disorder has been the deployment of prescription drug monitoring programs (PDMPs). These electronic data systems present data on the prescription history of controlled drugs to providers and are now in use in almost every state (40). They have been shown to reduce prescription rates of opioids and increase provider comfort in prescribing opioids, as providers can be reassured that they are not enabling risky opioid-dependency–related behaviors such as doctor shopping or receiving multiple overlapping prescriptions (45, 46).

These strategies are reactive rather than proactive; they target individuals who have already begun opioid treatment and have likely developed dependency. Our models complement these policies by providing an opportunity to predict high-risk prescriptions among the larger population of patients based on their characteristics and health histories before an opioid prescription is given for the first time. The models can be applied to the broader population of Medicaid enrollees, alerting physicians to possible risk when an opioid prescription is being considered, along with, for example, risk indicators of existing dependency from prior opioid prescription patterns.

Our models and hypothetical policy aim to prevent dependency before it occurs. This is complementary to existing efforts and could make use of the infrastructure already in place, such as the PDMPs. For example, a PDMP could implement our modeling approach to show providers a risk categorization for all patients (e.g., a red, yellow, or green indicator for predicted risk). This could increase information available to providers, expand the population covered by the PDMP, and help providers consider the benefits and risks of initiating opioid therapy with a new patient.
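As a minimal sketch of such an indicator, the following assumes that each patient already has a predicted risk score and assigns a color by rank. The cutoffs (highest 20% red, next 30% yellow) and the function name `risk_colors` are hypothetical choices for illustration, not thresholds taken from the models in this study.

```python
from typing import List, Sequence


def risk_colors(scores: Sequence[float],
                red_share: float = 0.2,
                yellow_share: float = 0.3) -> List[str]:
    """Map predicted risk scores to red/yellow/green by rank.

    red_share and yellow_share are hypothetical cutoffs: the highest
    red_share of scores are red, the next yellow_share are yellow.
    """
    n = len(scores)
    # Indices sorted from highest to lowest predicted risk.
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    n_red = round(n * red_share)
    n_yellow = round(n * yellow_share)
    colors = ["green"] * n
    for rank, i in enumerate(order):
        if rank < n_red:
            colors[i] = "red"
        elif rank < n_red + n_yellow:
            colors[i] = "yellow"
    return colors
```

Only the resulting color would need to be displayed to the provider; the underlying predictors stay inside the system that computes the score.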

The information policy could be implemented without disclosing particular and potentially sensitive information about the individual that is not already known to the physician. By determining a threshold based on rough high-risk/low-risk categories, it may be possible to both protect privacy and communicate valuable information to support health care professionals in determining the best course of treatment for their patients. For example, the mean rate of prior incarceration in the top two risk deciles is 9.7%, so being in the highest-risk deciles does not mean that an individual is highly likely to have a prior criminal record.

Moreover, diversion costs may be small and effectiveness relatively high as the number of opioid prescriptions will be reduced, reducing the probability of unintended dependency. Once dependency occurs, MAT typically costs $6,552 to $14,112 annually (47) and is estimated to be effective 50% of the time (42). This means that for 1,000 individuals, it would cost $5.7 to $12.3 million over 3 y to bring 88% into remission (SI Appendix, Table S9). Prevention is lower cost than treatment and can reduce treatment costs going forward by decreasing dependency rates. Given that PDMP platforms have been deployed in most states, distribution channels exist for converting the government’s own data into actionable intelligence accessible by physicians.
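The 88% figure can be checked with a short calculation. This sketch assumes that everyone not yet in remission is treated again each year and that each year of MAT succeeds independently with probability 0.5. The cost lines multiply the number in remission by the annual cost, which reproduces the stated range but is only an assumed accounting; the actual calculation is in SI Appendix, Table S9, which is not reproduced here.

```python
def remission_after(years: int, success_rate: float = 0.5) -> float:
    """Share in remission after repeated yearly treatment of those not yet recovered."""
    return 1.0 - (1.0 - success_rate) ** years


cohort = 1_000
annual_cost_low, annual_cost_high = 6_552, 14_112  # dollars per person per year (47)

share = remission_after(3)       # 0.875, reported as 88%
in_remission = cohort * share    # 875 individuals

# Assumed accounting (not confirmed by the text): annual cost times the
# number of individuals who reach remission within 3 y.
cost_low = in_remission * annual_cost_low
cost_high = in_remission * annual_cost_high

print(f"{share:.1%} in remission")
print(f"${cost_low / 1e6:.1f} to ${cost_high / 1e6:.1f} million")
```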

A limitation of our models is that they are trained on data from individuals to whom a physician decided to give an opioid prescription. We do not observe the cases where a patient requested an opioid prescription or had a condition that was treatable by opioid therapy, but the physician decided not to give an opioid prescription. In this sense, our models face a “selective labels problem” (48, 49), in which the data that can be observed are determined by prior human decisions whose decision rules are not known and may respond to the policy once implemented. For example, if, given the publicity of the opioid crisis, some physicians decreased opioid prescriptions, having a risk indicator could lead them to increase overall prescribing if they now feel more confident to prescribe given a low-risk indicator. Any implementation of a prescription restriction policy based on a predictive model should be accompanied by a causal analysis of impact. For example, assume the information policy is rolled out through the PDMP to a treatment group of physicians or providers, but not to a control group. The causal impact could then be estimated for high- and low-risk patients, allowing inference on heterogeneous changes in prescribing behavior across types of physicians for patients with high-predicted versus low-predicted baseline risk to uncover how physician decision rules adapt to information. This could then support further improvements to the predictive model, for example predicting diversion success incorporating physician responses.
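A minimal sketch of such an evaluation, assuming a randomized rollout, compares prescribing rates between treatment and control providers separately for patients with high and low predicted risk. The record layout (`arm`, `risk`, `prescribed`) and the function name are hypothetical, and a real analysis would add standard errors and account for clustering by provider.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Visit = Tuple[str, str, bool]  # (arm, risk group, opioid prescribed?)


def effect_by_risk(visits: Iterable[Visit]) -> Dict[str, float]:
    """Difference in prescribing rate (treatment minus control) per risk group."""
    counts = defaultdict(lambda: [0, 0])  # (arm, risk) -> [prescribed, total]
    for arm, risk, prescribed in visits:
        cell = counts[(arm, risk)]
        cell[0] += int(prescribed)
        cell[1] += 1
    effects = {}
    for risk in {r for _, r in counts}:
        treat = counts[("treatment", risk)]
        control = counts[("control", risk)]
        if treat[1] and control[1]:  # skip risk groups missing an arm
            effects[risk] = treat[0] / treat[1] - control[0] / control[1]
    return effects
```

A negative estimate for the high-risk group together with a positive estimate for the low-risk group would correspond to the heterogeneous response described above, in which the indicator reduces prescribing for some patients while making providers more confident with others.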

Our definition of adverse outcomes is limited by the accuracy of diagnosis codes in the Medicaid records. Prior studies have found that opioid-related diagnoses can be underreported because of their potential stigma. Although it is unknown precisely what fraction of opioid use disorders go undiagnosed, Carrell et al. (50) found that diagnosis codes were missing for as many as one-quarter of patients for whom their providers were aware of opioid abuse. Similarly, a study by Barocas et al. (51) estimated that only 44% of individuals with opioid use disorder were identified as such in claims and administrative records. To address this limitation, we added an adverse outcome based on procedure codes for the treatment of opioid use disorder, which could indicate an adverse outcome even in the absence of a diagnosis.

Including treatment as an indicator of adverse outcomes is also a limitation. As noted in prior work, receiving treatment for an opioid use disorder is a positive outcome conditional on already having a disorder (51, 52). However, the goal of this study is to suggest opportunities for prevention by examining whether individuals at a high risk of developing an adverse outcome can be identified with confidence before they are given a prescription using administrative data. This complements important research being done on successfully treating opioid use disorders after they have occurred (53).

Rhode Island has a research data lake that enables predictive modeling using cross-agency data. While any state or county could develop a similar research data lake (14, 15), restricting our predictive model to use only Medicaid claims and enrollment data yields nearly the same accuracy as models using integrated, cross-agency data. This is because, in the case of opioid dependency, Medicaid claims data contain many variables correlated with key predictors found in non-Medicaid data. For example, Medicaid enrollment data contain information on prior incarceration through payer codes related to receipt of health services while incarcerated, indicating an incarceration in the base period. They also contain data on demographics, family structure, and income from the application process. SI Appendix, Figs. S5–S7 replicate Figs. 1–3 using only data from Medicaid in the predictive model, with minimal changes in the results.

That being said, all models achieve an AUC near 0.800, indicating they have strong predictive power but could still be improved. While the Rhode Island data lake is uniquely rich in the connected and anonymized administrative records it holds, it contains only medical claims records from Medicaid. Patients who receive a first prescription outside of Medicaid and develop a dependency diagnosed in Medicaid records, or vice versa, reduce the predictive accuracy of our model. Expanding the data to include, for example, state-wide electronic health records, and examining the impact on predictive fit, false positive rates, fairness, and cost–benefit analysis of diverting opioid prescriptions from those predicted to have high dependency risk, is an important topic for future research.
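For reference, AUC can be read as the probability that a randomly chosen patient with an adverse outcome receives a higher predicted risk than a randomly chosen patient without one. The sketch below computes it directly from that definition on a list of scores and 0/1 labels; the function name is illustrative, and the pairwise loop is meant for small examples rather than full claims data.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Under this reading, a value of 0.5 corresponds to ranking at random and 1.0 to a perfect ranking.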

Conclusion

The opioid epidemic is a complex public health challenge that requires policy solutions spanning prevention to treatment and recovery. Our results demonstrate the feasibility of an approach to prevention based on intervening with high-risk initial prescriptions through predictive modeling. Our data-driven, machine-learning approach to modeling adverse outcome risk provides insights into the benefits, costs, and fairness of policies limiting opioid prescriptions. Intervening at the earliest stage, before an individual receives an initial opioid prescription, has the potential to prevent future treatment costs and recovery challenges and, ultimately, the life-long consequences of opioid use disorders.

Supplementary Material

Supplementary File
pnas.1905355117.sapp.pdf (396.6KB, pdf)

Acknowledgments

We thank the Smith Richardson Foundation and the Laura and John Arnold Foundation for financial support. We thank Miraj Shah for contributions to the project; Tom Corderre, Brandon Marshall, Susan Athey, and participants at the National Bureau of Economic Research Conference on Machine Learning in Healthcare for helpful comments; and the Office of the Governor of Rhode Island and the Rhode Island Executive Office of Health and Human Services for supporting research to improve fact-based policymaking.

Footnotes

Competing interest statement: J.S.H. is a scholar on leave visiting Amazon Inc. during the 2018 to 2020 academic years, but is not working on projects that directly relate to the subject matter of this study in that role.

This article is a PNAS Direct Submission.

Data deposition: The analysis code for this article has been deposited in GitHub, https://github.com/ripl-org/predict-opioids.

*This includes both opioid and heroin poisoning. See SI Appendix, section 2C for details.

For example, a major health insurer’s effort to reduce extended-release oxycodone prescription by requiring prior authorization led to an increase in the rate of short-acting opioid prescriptions and no overall change in the total morphine milligram equivalents prescribed (43).

Abuse-deterrent reformulations of prescription opioids were developed to make it more difficult to crush or dissolve pills to release the drug more quickly. Unfortunately, recent evidence suggests that the introduction of abuse-deterrent prescription opioids into the market caused opioid abusers to substitute away from prescription opioids to heroin, with differential increases in fatal heroin poisonings (44).

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.1905355117/-/DCSupplemental.

References

  1. Kaplan S., C.D.C. reports a record jump in drug overdose deaths last year (2017). NY Times, 4 November 2017, Section A, p. 11.
  2. Vowles K. E., et al., Rates of opioid misuse, abuse, and addiction in chronic pain: A systematic review and data synthesis. Pain 156, 569–576 (2015).
  3. Porter J., Jick H., Addiction rare in patients treated with narcotics. N. Engl. J. Med. 302, 123 (1980).
  4. Shei A., et al., Sources of prescription opioids among diagnosed opioid abusers. Curr. Med. Res. Opin. 31, 779–784 (2015).
  5. Pearson A., Moman R., Moeschler S., Eldrige J., Hooten W. M., Provider confidence in opioid prescribing and chronic pain management: Results of the opioid therapy provider survey. J. Pain Res. 10, 1395–1400 (2017).
  6. White A. G., Birnbaum H. G., Schiller M., Tang J., Katz N. P., Analytic models to identify patients at risk for prescription opioid abuse. Am. J. Manag. Care 15, 897–906 (2009).
  7. Sullivan M. D., et al., Risks for possible and probable opioid misuse among recipients of chronic opioid therapy in commercial and Medicaid insurance plans: The TROUP study. Pain 150, 332–339 (2010).
  8. Palmer R. E., et al., The prevalence of problem opioid use in patients receiving chronic opioid therapy: Computer-assisted review of electronic health record clinical notes. Pain 156, 1208–1214 (2015).
  9. Yang Z., et al., Defining risk of prescription opioid overdose: Pharmacy shopping and overlapping prescriptions among long-term opioid users in Medicaid. J. Pain 16, 445–453 (2015).
  10. Brat G. A., et al., Postsurgical prescriptions for opioid naive patients and association with overdose and misuse: Retrospective cohort study. BMJ 360, j5790 (2018).
  11. Dufour R., et al., Understanding predictors of opioid abuse: Predictive model development and validation. Am. J. Pharm. Benefits 6, 208–216 (2014).
  12. Hylan T. R., et al., Automated prediction of risk for problem opioid use in a primary care setting. J. Pain 16, 380–387 (2015).
  13. Florence C. S., Zhou C., Luo F., Xu L., The economic burden of prescription opioid overdose, abuse, and dependence in the United States, 2013. Med. Care 54, 901–906 (2016).
  14. Hastings J. S., Fact-Based Policy: How Do State and Local Governments Accomplish It? (The Hamilton Project, Brookings Institution, Washington, DC, 2019) Policy Proposal 2019-01.
  15. Hastings J. S., Howison M., Lawless T., Ucles J., White P., Unlocking data to improve public policy. Commun. ACM 62, 48–53 (2019).
  16. Hastings J. S., Howison M., Inman S. E., Analysis code for: Predicting high-risk opioid prescriptions before they are given. https://github.com/ripl-org/predict-opioids. Accessed 3 January 2020.
  17. Doshi-Velez F., Kim B., Towards a rigorous science of interpretable machine learning (2017). arXiv:1702.08608 (27 February 2017).
  18. Bach F. R., “BOLASSO: Model consistent LASSO estimation through the bootstrap” in Proceedings of the 25th International Conference on Machine Learning (Association for Computing Machinery, New York, NY, 2008), pp. 33–40.
  19. Hochreiter S., Schmidhuber J., Long short-term memory. Neural Comput. 9, 1735–1780 (1997).
  20. Lee C., Sharma M., Kantorovich S., Brenton A., A predictive algorithm to detect opioid use disorder: What is the utility in a primary care setting? Health Serv. Res. Managerial Epidemiol. 5, 1–8 (2018).
  21. Agarin T., Trescot A., Agarin A., Lesanics D., Decastro C., Reducing opioid analgesic deaths in America: What health providers can do. Pain Phys. 18, E307–E322 (2015).
  22. Centers for Disease Control and Prevention, “Patient review & restriction programs: Lessons learned from state Medicaid programs” (Tech. Rep. CS240524, Centers for Disease Control and Prevention, Atlanta, GA, 2013).
  23. Gaskin D. J., Richard P., The economic costs of pain in the United States. J. Pain 13, 715–724 (2012).
  24. Califf R. M., Woodcock J., Ostroff S., A proactive response to prescription opioid abuse. N. Engl. J. Med. 374, 1480–1485 (2016).
  25. Chang A. K., Bijur P. E., Esses D., Barnaby D. P., Baer J., Effect of a single dose of oral opioid and nonopioid analgesics on acute extremity pain in the emergency department: A randomized clinical trial. J. Am. Med. Assoc. 318, 1661–1667 (2017).
  26. Shimoni Z., Varon D., Froom P., Minimal use of opioids for pain relief in an internal medicine department. South. Med. J. 111, 288–292 (2018).
  27. Elsesser K., Cegla T., Long-term treatment in chronic noncancer pain: Results of an observational study comparing opioid and nonopioid therapy. Scand. J. Pain 17, 87–98 (2017).
  28. Tseng Y. Y., Verhoef E. T., Value of time by time of day: A stated-preference study. Transp. Res. Part B 42, 607–618 (2008).
  29. Lam T. C., Small K. A., The value of time and reliability: Measurement from a value pricing experiment. Transp. Res. Part E 37, 231–251 (2001).
  30. Brownstone D., Ghosh A., Golob T. F., Kazimi C., Drivers’ willingness-to-pay to reduce travel time: Evidence from the San Diego I-15 congestion pricing project. Transp. Res. Part A 37, 373–387 (2003).
  31. U.S. Census Bureau, Table S2001 - Earnings in the past 12 months (in 2017 inflation-adjusted dollars) (2013-2017 American Community Survey 5-Year Estimates). https://data.census.gov/cedsci/table?q=s2001&table=S2001&tid=ACSST5Y2017.S2001. Accessed 3 January 2020.
  32. Chou R., et al., Clinical guidelines for the use of chronic opioid therapy in chronic noncancer pain. J. Pain 10, 113–130.e22 (2009).
  33. Becker G. S., Murphy K. M., A theory of rational addiction. J. Political Econ. 96, 675–700 (1988).
  34. Gruber J., Köszegi B., Is addiction “rational”? Theory and evidence. Q. J. Econ. 116, 1261–1303 (2001).
  35. State of Ohio, “Complaint, State of Ohio v. Purdue Pharma L.P.” (Case No. 17 CI 000261, Common Pleas Court of Ross County, Ohio, 2017).
  36. Hardt M., Price E., Srebro N., “Equality of opportunity in supervised learning” in Proceedings of the 30th International Conference on Neural Information Processing Systems, Lee D. D., von Luxburg U., Garnett R., Sugiyama M., Guyon I., Eds. (Curran Associates Inc, Red Hook, NY, 2016), pp. 3323–3331.
  37. Kleinberg J., Mullainathan S., Raghavan M., “Inherent trade-offs in the fair determination of risk scores” in Proceedings of the 8th Innovations in Theoretical Computer Science Conference (ITCS 2017), Papadimitriou C. H., Ed. (Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 2017), Vol. 67, pp. 43:1–43:23.
  38. Barry C. L., Fentanyl and the evolving opioid epidemic: What strategies should policy makers consider? Psychiatr. Serv. 69, 100–103 (2017).
  39. Mohlman M. K., Tanzman B., Finison K., Pinette M., Jones C., Impact of medication-assisted treatment for opioid addiction on Medicaid expenditures and health services utilization rates in Vermont. J. Subst. Abus. Treat. 67, 9–14 (2016).
  40. Volkow N. D., McLellan A. T., Opioid abuse in chronic pain — misconceptions and mitigation strategies. N. Engl. J. Med. 374, 1253–1263 (2016).
  41. Mojtabai R., Mauro C., Wall M. M., Barry C. L., Olfson M., Medication treatment for opioid use disorders in substance use treatment facilities. Health Aff. 38, 14–23 (2019).
  42. Weiss R. D., Rao V., The prescription opioid addiction treatment study: What have we learned. Drug Alcohol Depend. 173, S48–S54 (2017).
  43. Barnett M. L., et al., A health plan’s formulary led to reduced use of extended-release opioids but did not lower overall opioid use. Health Aff. 37, 1509–1516 (2018).
  44. Alpert A., Powell D., Pacula R. L., Supply-side drug policy in the presence of substitutes: Evidence from the introduction of abuse-deterrent opioids. Am. Econ. J. Econ. Policy 10, 1–35 (2018).
  45. Lin D. H., et al., Physician attitudes and experiences with Maryland’s prescription drug monitoring program (PDMP). Addiction 112, 311–319 (2017).
  46. Wen H., Schackman B. R., Aden B., Bao Y., States with prescription drug monitoring mandates saw a reduction in opioids prescribed to Medicaid enrollees. Health Aff. 36, 733–741 (2017).
  47. National Institute on Drug Abuse, How much does opioid treatment cost? https://www.drugabuse.gov/publications/research-reports/medications-to-treat-opioid-addiction/how-much-does-opioid-treatment-cost. Accessed 4 August 2019.
  48. Lakkaraju H., Kleinberg J., Leskovec J., Ludwig J., Mullainathan S., “The selective labels problem: Evaluating algorithmic predictions in the presence of unobservables” in Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Association for Computing Machinery, New York, NY, 2017), pp. 275–284.
  49. Kleinberg J., Lakkaraju H., Leskovec J., Ludwig J., Mullainathan S., Human decisions and machine predictions. Q. J. Econ. 133, 237–293 (2018).
  50. Carrell D. S., et al., Using natural language processing to identify problem usage of prescription opioids. Int. J. Med. Inform. 84, 1057–1064 (2015).
  51. Barocas J. A., et al., Estimated prevalence of opioid use disorder in Massachusetts, 2011–2015: A capture–recapture analysis. Am. J. Public Health 108, 1675–1681 (2018).
  52. Hadland S. E., et al., Receipt of timely addiction treatment and association of early medication treatment with retention in care among youths with opioid use disorder. JAMA Pediatr. 172, 1029–1037 (2018).
  53. Green T. C., et al., Postincarceration fatal overdoses after implementing medications for addiction treatment in a statewide correctional system. JAMA Psychiatry 75, 405–407 (2018).

Data Availability Statement

Data are available through individual data-sharing agreements with each of the following Rhode Island agencies and municipal police departments: RI Department of Corrections, RI Department of Labor and Training, RI Executive Office of Health and Human Services, RI State Police, Central Falls Police Department, Cranston Police Department, Cumberland Police Department, Middletown Police Department, Narragansett Police Department, Providence Police Department, Warwick Police Department, and Woonsocket Police Department. Email hhipnas2020@ripl.org for information on how to request data for replication from the respective state agencies. Analysis code is available from GitHub at https://github.com/ripl-org/predict-opioids.


Articles from Proceedings of the National Academy of Sciences of the United States of America are provided here courtesy of National Academy of Sciences
