Public Health Reports. 2021 Sep 20;137(5):944–954. doi: 10.1177/00333549211038323

A Scoping Review of Data Sources for the Conduct of Policy-Relevant Substance Use Research

Kimberley H Geissler 1, Elizabeth A Evans 1, Julie K Johnson 2, Jennifer M Whitehill 1
PMCID: PMC9379843  PMID: 34543133

Abstract

Objective

Existing administrative and survey data are critical for understanding the effects of exigent policies on population health outcomes related to opioid, cannabis, and other substance use disorders (SUDs). The objective of this study was to determine the state of the data available for evaluating SUD-related health outcomes.

Methods

We performed a scoping review of national and state government data sources to measure and evaluate the effects of state policy changes on substance use and SUD-related health outcomes and health care use. We used Massachusetts as a case study for availability of relevant state-level data as well as national datasets with state-level indicators available to measure outcomes. We compared key features of each dataset to assess their usefulness for research and policy evaluation. We conducted our review during November 2018–March 2019, and we updated data availability as of March 2019 for all data sources.

Results

We identified 11 survey datasets, 10 national administrative datasets, and 10 state administrative datasets as being suitable for policy-relevant research and practice purposes. These datasets varied substantially in their usefulness for evaluation and research. Despite substantial data limitations, including prohibitive regulatory and monetary costs to obtain the data and limited availability, these data can be mined to examine a diversity of policy-relevant questions.

Conclusions

Findings provide a comprehensive resource for using survey and administrative data to evaluate the health effects of SUD-related policies and interventions. The construction of state-level public health data warehouses or record linkage projects connecting individual-level information in state data sources is valuable for analyzing the effects of policy changes. Understanding strengths and limitations of available data sources is important for ongoing research and evaluation.

Keywords: substance use disorders, substance use, administrative data, survey data


Public health interest in understanding the effects of state laws, policies, and regulations on substance use, 1 substance use disorder (SUD),2,3 SUD treatment,4-7 and co-occurring conditions has grown in recent years.8-10 With recent policy interest in opioid use disorder and related overdoses, public health interest in a range of related health outcomes has increased.11,12 Existing administrative and survey data are critical for understanding the effects of policies on health care use and health outcomes related to opioid, cannabis, and SUDs.

State policy changes affect health behavior, health care use, and health outcomes related to SUDs and SUD treatment; most policy changes occur at the state level or have differential effects across states because of existing policy or differences in implementation. Well-known examples such as legalization of cannabis (“marijuana”),13,14 Medicaid expansion,15,16 parity laws,6,17 prescription drug monitoring programs,15,18 and alcohol taxes influence substance use behaviors and related SUD outcomes. 19 A key challenge of conducting research in this domain is that information on SUD incidence is not captured in existing data or is difficult to access, 20 highlighting the need for a comprehensive understanding of data available to monitor the effect of policies on SUDs and related health care systems and health outcomes.

Previous reviews of data sources that can be used for the conduct of policy-relevant SUD research are limited in scope and purpose. For example, some efforts focused on charting the usefulness for SUD research of surveys, medical records, and program-level administrative data. 21 Other efforts reviewed data access policies 22 and explored the use of administrative databases to assess SUD treatment use, outcomes, and associated costs23,24 and data related to substances of abuse, including opioid use and opioid use disorder25-28 and cannabis use. 1 A limitation of these approaches is that reviews of data sources did not focus on identifying sources that are representative of the general population at the state level (critical for analyzing the effects of state policies) for substance use and SUDs generally, including alcohol use and alcohol use disorder. Understanding the prevalence of SUD and its effect on health care use and health outcomes is important for public health and health policy; recent estimates suggest that 1 in 14 adults in the United States had an SUD in the last year. 29 The 3 most common types of SUDs were alcohol use disorder, cannabis use disorder, and opioid use disorder. Of people with any SUD, 73% had an alcohol use disorder, 22% had a cannabis use disorder, and 8.4% had an opioid use disorder. 29

In this scoping review, we aimed to identify the available data from state (Massachusetts-specific) or national government sources and surveys that can be used to assess the prevalence of SUD and changes in SUD-related health outcomes and health care use for analysis at the state level. Our analysis provides information about available data and allows researchers and policy makers to understand the tradeoffs inherent in the use of various data sources for understanding the prevalence and consequences of SUD.

Methods

We used Massachusetts as a case study for availability of relevant state-level data and the potential for linkages across state data sources. 30 We chose Massachusetts because the state has been a national leader in the availability of state health data through transparency initiatives and policy changes; state-level data available in other states may differ, but this case study provides an example of state-level data likely to be available in many states and advanced datasets possible only through legislative action. 31

We identified relevant national datasets that contain state indicators and Massachusetts-specific datasets by reviewing salient published literature, including previous data reviews and peer-reviewed articles analyzing state policy changes1,24-26,32; conducting interviews with key personnel from Massachusetts government agencies; searching state government data sources; and searching state-contributed data to national datasets. We conducted our review during November 2018–March 2019, and we updated data availability as of March 2019 for all data sources. The University of Massachusetts Human Research Protection Office determined this study was not human subjects research.

A data source was eligible for inclusion if it could be used to examine the effects of substance use and/or SUD on health outcomes and/or the health care system. We defined effects on health outcomes and the health care system broadly to include outcomes of interest for specific substances (eg, medications to treat opioid use disorder, mortality). We were primarily interested in state and federal government data sources rather than private sources because the number and breadth of private sources, combined with their often-varying availability policies for various types of research, were too complex to adequately describe here. We limited inclusion in our review to datasets that can produce estimates or comparisons at the state level, although we did not require that they be able to produce representative prevalence estimates of SUD and/or health or health care–related outcomes of SUD.

We excluded datasets for which the most recent data collection was >5 years ago at the beginning of our analysis (ie, the most recent year was 2013 or earlier). We excluded datasets that could not identify states and/or produce state estimates, including for Massachusetts. This exclusion criterion removed certain locally collected datasets, along with national datasets that may allow for comparisons of sets of states (eg, limited to geographic regions) but not an individual state. We also excluded datasets that could not identify the effects of substance use or SUD on the health care system and/or health outcomes; for example, we excluded datasets that include only information about alcohol use and/or alcohol use disorder, which might allow construction of prevalence estimates but do not include any information about treatment and/or alcohol-related outcomes.
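The three exclusion criteria can be read as a sequential screen. The following is a minimal sketch, not the procedure the authors used: it assumes the age cutoff drops sources whose most recent data are from 2013 or earlier (consistent with the exclusion of the 2012-2013 National Epidemiologic Survey on Alcohol and Related Conditions reported in the Results). The `DataSource` fields, the flag values, and the alcohol-only example are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: sources whose most recent data are from 2013 or earlier are excluded.
LAST_EXCLUDED_YEAR = 2013


@dataclass(frozen=True)
class DataSource:
    name: str
    latest_year: int       # most recent year of data collection
    state_estimates: bool  # can identify an individual state or produce state estimates
    has_outcomes: bool     # includes treatment, health care use, or health outcomes


def exclusion_reason(source: DataSource) -> Optional[str]:
    """Return the first criterion the source fails, or None if it is eligible."""
    if source.latest_year <= LAST_EXCLUDED_YEAR:
        return "most recent data too old"
    if not source.state_estimates:
        return "no state-level estimates"
    if not source.has_outcomes:
        return "no treatment or outcome information"
    return None


# Illustrative entries; only the reason each source is excluded comes from the text,
# the remaining field values are placeholders.
candidates = [
    DataSource("BRFSS", 2017, True, True),
    DataSource("NESARC", 2013, True, True),
    DataSource("Monitoring the Future", 2017, False, True),
    DataSource("Hypothetical alcohol-use-only survey", 2017, True, False),
]

for source in candidates:
    print(source.name, "->", exclusion_reason(source) or "included")

included = [s.name for s in candidates if exclusion_reason(s) is None]
```

Applying the checks in a fixed order records a single reason per excluded source, which keeps a screening log easy to audit.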

After the identification of relevant datasets, we collected comprehensive information on survey and administrative data, including source, cost, inclusion of SUD information, time periods available, sample size, and other key features of the data. We compared and contrasted availability of key measures and noted items that may affect the usefulness of the data for research. We provide a broad overview of the available data sources and refer interested readers to the documentation of each data source as needed for further information. Using Microsoft Excel, we tallied the number of datasets meeting each criterion from this descriptive information. We separated information about datasets into survey data and administrative data, and we stratified data by the level of data collection (ie, national vs state).
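The tally itself was done in Microsoft Excel; as a hedged illustration of the same stratified count, the sketch below uses Python instead. The survey names and their national or state level are taken from Table 1; the short labels and the variable names are our own.

```python
from collections import Counter

# Surveys listed in Table 1, tagged by level of data collection.
SURVEYS = [
    ("BRFSS", "national"),
    ("Medical Expenditure Panel Survey", "national"),
    ("National Ambulatory Medical Care Survey", "national"),
    ("National Hospital Ambulatory Medical Care Survey", "national"),
    ("National Survey on Drug Use and Health", "national"),
    ("National Health and Nutrition Examination Survey", "national"),
    ("National Health Interview Survey", "national"),
    ("Youth Risk Behavior Surveillance System", "national"),
    ("Massachusetts BRFSS", "state"),
    ("Massachusetts Marijuana Baseline Health Study", "state"),
    ("Massachusetts Youth Risk Behavior Survey", "state"),
]

# Count datasets within each stratum (national vs state).
counts = Counter(level for _, level in SURVEYS)
print(counts["national"], counts["state"], len(SURVEYS))
```

The same pattern extends to other descriptive columns (for example, access type) by counting a different field.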

Results

Overall, 31 data sources met our inclusion criteria; of these, 11 were surveys (8 national, 3 state) and 20 were administrative data sources (10 national, 10 state). We found a wide range of indicators of interest. In addition, although the details of each data source—including sample, availability, timing, and outcomes—varied substantially, each detail was important for the use of each data source.

Eleven surveys collected information about alcohol, drug, and/or cannabis use and/or SUDs, with additional information about health status, health behaviors, health outcomes, and/or health care use (Table 1).33-43 Of these, 8 were national surveys for which state indicators were available, and 3 were Massachusetts-specific data sources. The primary ways of assessing substance use and SUD were self-reported alcohol, drug, and cannabis use and/or misuse; self-reported alcohol and/or drug treatment; and International Classification of Diseases, Ninth Revision (ICD-9) or International Classification of Diseases, Tenth Revision (ICD-10) diagnosis codes related to SUDs. The surveys varied in the health and health system outcomes that are available, from self-reported health status and health care access or use to verified physician visits and medical use that allow for national estimates of health care use related to these diagnoses. Most of these surveys included information on both adolescents and adults, although sample size limitations were likely at the state level for some age and population groups. Several surveys are conducted annually, but multiple years of data are necessary to create reliable estimates at the state level. We excluded major national surveys, such as the Pregnancy Risk Assessment Monitoring System, in which states could add questions related to substance use (but Massachusetts did not) 44 ; the National Epidemiologic Survey on Alcohol and Related Conditions, which was last conducted in 2012-2013 45 ; and Monitoring the Future, which cannot be used to produce state-level estimates. 46

Table 1.

Survey data available at the state and national levels to monitor the effects of substance use disorder (SUD) on health systems, United States (data availability as of 2019)

| Data name [source] | Population included | Age range, y | No. of people | Years, frequency | Example outcomes measured | Substance use/SUD measures | Access and cost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| National data with state-specific indicators | | | | | | | |
| Behavioral Risk Factor Surveillance System (BRFSS) [CDC] 33 | US residents (civilian, noninstitutionalized) | ≥18 | 400 000 people per year | 1984-2017, annual | Self-reported health status, health care access and use, chronic conditions | Self-reported alcohol use; marijuana use collected in state-selected module (some states) | Public use, free |
| Medical Expenditure Panel Survey [AHRQ] 34 | US residents (civilian, noninstitutionalized) | All | 33 000 people per year | 1996-2017, annual | Health care costs and use | Clinical Classification System codes related to SUD | Public use (restricted use at AHRQ data center required for state indicators) |
| National Ambulatory Medical Care Survey [CDC] 35 | Outpatient physician visits | All | 15 000 visit records | 1973-2016, annual | Visit-specific health care use and related outcomes (eg, reason for visit, prescriptions) | ICD-9, ICD-10 codes related to SUD, co-occurring SUD, prescriptions for SUD treatment | Public use, free (state indicators not available for all states, restricted use at NCHS RDC for all states) |
| National Hospital Ambulatory Medical Care Survey [CDC] 36 | ED visits | All | 21 000 visit records | 1992-2016, annual | Visit-specific health care use and related outcomes | ICD-9/ICD-10 codes related to SUD, co-occurring SUD, prescriptions for SUD treatment | Public use, free (restricted use at NCHS RDC required for state indicators) |
| National Survey on Drug Use and Health [SAMHSA] 37 | US residents (civilian, noninstitutionalized) | ≥12 | 65 000 people | 1971-2017, annual | ED visits; SUD treatment or counseling | Self-reported substance use, SUD, SUD treatment | Public use, free (restricted use at NCHS RDC required for state indicators) |
| National Health and Nutrition Examination Survey [CDC] 38 | US residents (civilian, noninstitutionalized) | All | 5000 people | 1999-2016, annual (with 2-year panels) | Self-reported health outcomes, health care use, drug treatment, health indicators | Self-reported alcohol and drug use | Public use, free (limited use at NCHS RDC required for state indicators and drug use questions for children) |
| National Health Interview Survey [CDC] 39 | US residents (civilian, noninstitutionalized) | ≥18 | 35 000 households | 1963-2017, annual | Self-reported health outcomes, disability, health care use, drug treatment, health indicators | Self-reported alcohol and drug use disorder | Public use, free (limited use at NCHS RDC required for state indicators) |
| Youth Risk Behavior Surveillance System [CDC] 40 | High school students | Grades 9-12 | 15 000 students | 1991-2017, odd years | Self-reported health status, health behaviors | Self-reported alcohol, drug, and marijuana use | Public use, free |
| State-specific data (Massachusetts) | | | | | | | |
| Massachusetts BRFSS [CDC and MA DPH] 33,41 | Massachusetts residents (civilian, noninstitutionalized) | ≥18 | 7000 people | 1984-2017, annual | Self-reported health status, health care access and use, chronic conditions | Self-reported alcohol, drug, and marijuana use | Limited use, free (requires application process) |
| Massachusetts Marijuana Baseline Health Study [MA DPH] 42 | Massachusetts adults (civilian, noninstitutionalized) | ≥18 | 3000 people | 2017, one-time | ED and urgent care visits | Self-reported marijuana use | No information available |
| Massachusetts Youth Risk Behavior Survey [MA DPH and CDC] 40,43 | Massachusetts high school students | 13-18 | 3300 students | 2007-2017, odd years | Health behaviors, suicide attempts that required treatment | Self-reported alcohol, drug, and marijuana use | Limited use, free (requires application process) |

Abbreviations: AHRQ, Agency for Healthcare Research and Quality; CDC, Centers for Disease Control and Prevention; ED, emergency department; ICD-9, International Classification of Diseases, Ninth Revision; ICD-10, International Classification of Diseases, Tenth Revision; MA DPH, Massachusetts Department of Public Health; MDH, Massachusetts Department of Health; NCHS, National Center for Health Statistics; RDC, Research Data Center; SAMHSA, Substance Abuse and Mental Health Services Administration.

We identified 10 administrative data sources that were available nationally and contained relevant information (Table 2).47-56 These national data sources varied in their granularity, outcomes, and methods. Three data sources were related to prescription drugs, and 7 were related to health care use and other related outcomes. Some of these national data sources contain similar, if not duplicative, information aggregated to various levels of observation; we included them separately if access policies varied substantially (eg, individual-level claims data vs state-level drug prescribing).

Table 2.

National administrative data available to monitor the effects of substance use disorder (SUD) on the health system, United States (data availability as of 2019)

| Data name [source] | Population included | Age range | No. of people | Years available, frequency | Example outcomes measured | Substance use/SUD measures | Access and cost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Prescription drug data | | | | | | | |
| Medicare provider utilization and payment data: Part D prescriber public use file [CMS] 47 | Prescriber-level (National Provider Identifier) data for prescription drugs paid by Medicare Part D | Medicare eligible | All prescribers for Part D beneficiaries | 2012-2017, annual | Number of prescriptions, unique beneficiaries, total costs | Related prescriptions, including opioids and SUD treatment | Public use, free |
| Medicare Part D prescription drug claims data [CMS] 48 | People covered by Medicare fee-for-service and Medicare Advantage health insurance | Medicare eligible | 25 million | 2006-2019, annual (or more frequent) | Prescription drug Medicare claims | Related prescriptions, including opioids and SUD treatment | Limited use, substantial costs associated (requires application process) |
| State drug utilization data [CMS] 49 | People covered by Medicaid | Medicaid eligible | Millions | 1991-2019, annual | Drug use, costs | Related prescriptions, including opioids and SUD treatment | Public use, free |
| Health care use and other outcomes | | | | | | | |
| Medicaid Analytic eXtract [CMS] 50 | People covered by Medicaid in reporting states | All | Millions | 1999-2014, annual (or more frequent) | Health care use from health insurance claims | ICD-9/ICD-10 codes related to SUD, co-occurring SUD, prescriptions for SUD treatment | Limited use, substantial costs associated (requires application process) |
| Medicare claims data [CMS] 51 | People covered by Medicare fee-for-service and Medicare Advantage health insurance | Medicare eligible | 44 million (5%-20% files commonly available for research) | 1999-2017, annual (or more frequent) (Medicare fee-for-service); 2015, annual (or more frequent) (Medicare Advantage) | Health care use from health insurance claims | ICD-9/ICD-10 codes related to SUD, co-occurring SUD, prescriptions for SUD treatment | Limited use, substantial costs associated (requires application process) |
| National Poison Data System [American Association of Poison Control Centers] 52 | All poison-related calls managed by poison control centers | All | Millions | 2012-2019, near-real time | Poison exposure and information calls | Type of poison, including alcohol and drugs such as cannabis | Limited use, substantial costs associated for some applicants (state-level data require application) |
| National Vital Statistics System [CDC] 53 | All deaths nationally | All ages | Millions | 1929–present, annual | Death | Substance use–related cause of death | Public use, free (restricted use at NCHS Research Data Center for individual-level data) |
| State Emergency Department Databases [AHRQ] 54 | All ED discharges, comparable with other available states | All | 2.5 million discharges in Massachusetts | 2002-2016, annual | ED visits and associated charges and procedures | ICD-9/ICD-10 related to SUD, co-occurring SUD | Limited use, costs associated (requires application process) |
| State Inpatient Databases [AHRQ] 55 | All inpatient discharges, comparable with other states | All | 800 000 discharges in Massachusetts | 2002-2016, annual | Inpatient discharges and associated charges | ICD-9/ICD-10 related to SUD, co-occurring SUD | Limited use, costs associated (requires application process) |
| Treatment Episode Data Set [SAMHSA] 56 | Client-level data for substance abuse treatment admissions/discharges | ≥12 | 2 million admissions | 1992-2017, annual | SUD treatment | Primary substance for which individual is receiving treatment | Public use, free |

Abbreviations: AHRQ, Agency for Healthcare Research and Quality; CDC, Centers for Disease Control and Prevention; CMS, Centers for Medicare & Medicaid Services; ED, emergency department; ICD-9, International Classification of Diseases, Ninth Revision; ICD-10, International Classification of Diseases, Tenth Revision; NCHS, National Center for Health Statistics; SAMHSA, Substance Abuse and Mental Health Services Administration.

The 7 datasets related to health care use and other outcomes included health insurance claims data for various populations, death data, poison control calls, and treatment discharges from hospitals and substance use treatment centers. These datasets contain some national data with state indicators and some national data sources that are available for each state (eg, State Inpatient Database, State Emergency Department Databases). We excluded data such as the National Emergency Medical Services Information System, which required individual state permission to obtain state identifiers, because this process is likely to be onerous for a national sample and the data are otherwise unable to produce estimates for individual states.

We identified 10 administrative data sources available at the state level containing relevant information (Table 3).57-66 Only 1 of these data sources was related to prescription drugs, and the rest were related to health care use and other related outcomes. Some of the state-level data (eg, Massachusetts CaseMix data) are contributed to national datasets; these national datasets further de-identify individual records and make the available data comparable across states. Accessing these datasets at the state level can sometimes have benefits, such as the availability of more detailed information or more recent data. Massachusetts has an all-payer claims database, including Medicaid data upon request/application, which is a useful resource for understanding the effects of state policies on the full population, including populations with high rates of SUD. The other major contribution of Massachusetts state-specific data that may not be available in other states, but may be particularly useful for research, was the Massachusetts Public Health Data Warehouse. This database contains more than 20 datasets available from state government agencies linked at the individual level. It can be used to identify individuals and their SUD-related encounters and/or treatment across a large number of agencies, and it includes information about social factors such as homelessness and criminal justice encounters. Currently, this dataset is only available via special agreement with the Massachusetts Department of Public Health for analyses that relate to priority topic areas for the agency.

Table 3.

Massachusetts administrative data available to monitor the effects of substance use disorder (SUD) on health systems, United States (data availability as of 2019)

| Data name [source] | Population included | Age range | No. of people | Years available | Example outcomes measured | Substance use/SUD measures | Access and cost |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Prescription drug data | | | | | | | |
| Massachusetts Prescription Monitoring Program [MA DPH] 57 | Tracks statewide schedule II-V prescriptions | All | Thousands | 2013-2018 | People receiving multiple prescriptions | Related prescriptions, including opioids | Limited use, free (requires application process) |
| Health care use and other outcomes | | | | | | | |
| Massachusetts All Payer Claims Database [CHIA] 58 | All insured people covered by reporting insurers | All | Millions | 2013-2017, annual (or more frequent) | All health care use and associated paid amounts | ICD-9/ICD-10 codes related to SUD, co-occurring SUD, prescriptions for SUD treatment^a | Limited use, costs associated (requires application process) |
| Massachusetts Ambulance Trip Record Information System [Massachusetts Ambulance Services, MA DPH] 59 | All patients transported by ambulance | All | Millions | 2011-2017, annual (or more frequent) | Ambulance trips, overdoses | Reason for transport related to substance use | Limited use, free (requires application process) |
| Massachusetts Case Mix data [MA CHIA] 60 | All inpatient, ED, and outpatient observation discharges | All | Millions | 2000-2017, annual (or more frequent) | Visits, discharges, and associated charges | ICD-9/ICD-10 codes related to SUD, co-occurring SUD | Limited use, costs associated (requires application process) |
| Massachusetts Mental Health Information System [MA DMH] 61 | People admitted to MA DMH facilities and programs | All | Thousands annually | 2004-2018, annual (or more frequent) | Mental illness admissions | Secondary ICD-9/ICD-10 codes for SUD | Limited use, free (requires application process) |
| Massachusetts Public Health Data Warehouse [MA DPH] 62 | All people in Massachusetts | All | Millions annually | Unknown, annual (or more frequent) | Health care use, criminal justice, death, homelessness | ICD-9/ICD-10 codes related to SUD, co-occurring SUD, prescriptions for SUD treatment | Limited use, free (requires application process for specific purposes defined by MA DPH priorities) |
| Massachusetts Registry of Vital Records and Statistics [MA DPH] 63 | All deaths in the state of Massachusetts | All | 250 000 | 1999-2018, annual (or more frequent) | Death | Substance use–related cause of death | Limited use, free (requires application process) |
| Massachusetts and Rhode Island Regional Center for Poison Control and Prevention [Massachusetts/Rhode Island Regional Center for Poison Control and Prevention] 64 | All poison-related calls managed by the regional center | All | >46 000 calls annually | 2009-2018, annual (or more frequent) | Poison exposure and information calls | Type of poison, including alcohol and drugs such as cannabis | Limited use (requires application process) |
| Massachusetts State Trauma Registry [MA DPH] 65 | Patients with traumatic injuries receiving emergency services at designated state trauma centers | All | Thousands | 2008-2015, annual (or more frequent) | Trauma use, health outcomes | Secondary ICD-9/ICD-10 SUD codes; markers of intoxication | Limited use, free (requires application process) |
| PHIT data: BSAS substance addiction treatment data [MA DPH] 66 | People admitted into SUD treatment programs | ≥15 y | 109 000 | 2008-2017, annual (or more frequent) | SUD treatment | Primary substance for which individual is receiving treatment | Limited use, free (requires application process) |

Abbreviations: BSAS, Bureau of Substance Addiction Services; ED, emergency department; ICD-9, International Classification of Diseases, Ninth Revision; ICD-10, International Classification of Diseases, Tenth Revision; MA CHIA, Massachusetts Center for Health Information and Analysis; MA DMH, Massachusetts Department of Mental Health; MA DPH, Massachusetts Department of Public Health; PHIT, Population Health Information Tool.

Discussion

In this scoping review, we found a large number of national and state-specific data sources that can be used to measure the effects of state policy changes on substance use and SUD and the effects of SUD on the health care system and/or health outcomes. These survey and administrative data sources vary substantially in key features, including availability, inclusion of key items, and state-specific factors. We found numerous Massachusetts-specific datasets that are likely to be available in other states, but the availability of these datasets is not guaranteed, and the consistency and comparability of data collection remain issues.1,25,67

Our findings provide a comprehensive resource for using survey and administrative data to evaluate the effects of substance use and SUD-related policies and interventions on health and health systems. Understanding the strengths and limitations of these data is important for research related to policy changes. Ongoing monitoring and research on the effects of state policy on substance use are critical, and a substantial amount of secondary data are available, mostly from state government sources. Although these state-specific data sources can be difficult to find and compare across states, they are of high value for evaluating the impact of SUD on health care use, health behaviors, and health outcomes. However, even within states, changing availability of data sources and use policies can make it difficult to plan and fund analyses, particularly because multiyear longitudinal data and the ability to plan studies several years in advance are often necessary in research and policy evaluation. The state and national administrative datasets available at the individual level may be of particular use to researchers because they allow for detailed analysis, including among vulnerable population groups for whom an adequate sample size may not be available in population-based surveys.

We identified a large number of secondary data sources, although primary data collection is often still used to collect information on specific populations or topics or to evaluate the effects of a certain small-scale intervention. Of note, primary data collection can be expensive and often lacks information over time, which is important for understanding the effects of policy changes. Some of the existing data sources have notable limitations, and some datasets that were previously available to help triangulate data on substance use and SUD (eg, Arrestee Drug Abuse Monitoring system and the Drug Abuse Warning Network) are no longer being updated.37,68 A common limitation across data sources was the delay between data collection or events and data availability to researchers. We found a delay of 2 or more years for 27 of the 31 data sources reviewed. This lag substantially reduces the usefulness of data for regulators, evaluators, and researchers who are interested in understanding the effect of policy changes on outcomes in near-real time.

Obtaining secondary data related to substance use and SUD outcomes is complex, particularly given changes in regulatory policy and enforcement. 69 Legal issues can limit data sharing for SUD-related treatment services among clinical providers and, in 2013, the Substance Abuse and Mental Health Services Administration issued an advisory notice to the Centers for Medicare & Medicaid Services (CMS) that suppressed data on SUD claims for research use. 20 A legal change occurred in 2017 that allowed for the addition of these claims to CMS-provided data, 70 but some data providers (eg, all-payer claims databases) still suppress SUD claims in various ways. 58 In addition to changes in what is available in the data sources, changes in mental health and SUD treatment funding sources over time,71,72 including shifts from state funding to health insurance funding, may affect the interpretation of data over time. Researchers and policy makers also often want to understand the effects of policy changes on populations at high risk of SUD, such as people with serious mental illness or young adults. The inclusion of these populations with adequate sample sizes is of ongoing concern, particularly in survey data.

We found secondary datasets that were useful and commonly used for SUD research, but we also identified substantial limitations to many of these datasets. SUD-related data in the survey and administrative data we examined have been operationalized differently by regulatory agencies and data providers, which creates inconsistencies in data definitions and data availability across sources and over time. These inconsistencies can create difficulties in evaluating the effects of new policies on health. For example, ICD-9 and ICD-10 codes within administrative data are used to identify people with SUD and their treatment patterns for SUD and other co-occurring conditions. The use of these codes is common in SUD morbidity and treatment research, but the change from ICD-9 to ICD-10 codes in October 2015 73 may make it difficult to separate the effects of coding changes from other state policy changes implemented during the same period. ICD-10 codes have also been used in mortality data for a longer period than in treatment data, thereby complicating comparisons between SUD morbidity and mortality. 73 Finally, limited research has examined the sensitivity and specificity of SUD coding practices by clinicians to understand the use of diagnostic codes related to all SUDs, 74 including, for example, cannabis use disorder, which may be gauged to be of lower clinical relevance than opioid use disorder.
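To illustrate why the coding transition matters analytically, the sketch below flags SUD diagnoses in claims-like records by choosing the code system from the service date. It is a simplified example under stated assumptions, not a validated case definition: the transition date of October 1, 2015, and the code families (ICD-9-CM categories 291, 292, and 303-305; ICD-10-CM categories F10-F19) are our assumptions and are not listed in the article, and the function names are hypothetical.

```python
from datetime import date

# Assumption: US claims switched from ICD-9-CM to ICD-10-CM on October 1, 2015.
ICD10_START = date(2015, 10, 1)

# Assumption: illustrative SUD code families, not a validated case definition.
ICD9_SUD_PREFIXES = ("291", "292", "303", "304", "305")
ICD10_SUD_PREFIXES = tuple(f"F1{digit}" for digit in range(10))  # F10 through F19


def coding_system(service_date: date) -> str:
    """Return the code system expected for a given date of service."""
    return "ICD-10" if service_date >= ICD10_START else "ICD-9"


def is_sud_code(code: str, service_date: date) -> bool:
    """Check a diagnosis code against the SUD prefixes of the era-appropriate system."""
    normalized = code.replace(".", "").upper()
    if coding_system(service_date) == "ICD-10":
        return normalized.startswith(ICD10_SUD_PREFIXES)
    return normalized.startswith(ICD9_SUD_PREFIXES)


print(is_sud_code("304.00", date(2015, 9, 30)))  # ICD-9 era, opioid dependence family
print(is_sud_code("F11.20", date(2015, 10, 1)))  # ICD-10 era, opioid-related family
```

Because the two systems do not map one to one, an apparent change in SUD diagnoses around late 2015 can reflect the coding switch rather than a policy effect, which is the confounding the paragraph above describes.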

Another important factor is the growing recognition of the value of linking data from various sources at the individual level to track outcomes over time. 31 For example, the Massachusetts Public Health Data Warehouse, with linked data from many state government sources, has contributed substantially to a general understanding of the costs and outcomes of the opioid epidemic. 11 Anecdotally, other jurisdictions are linking data at various levels to monitor and address substance use and SUD-related indicators, including the Researched Abuse, Diversion and Addiction-Related Surveillance by the Denver Health and Hospital Authority 75 and the Allegheny County Data Warehouse developed by the Allegheny County Department of Human Services. 76 These data will likely contribute vital knowledge of health and health system effects of SUD, particularly among vulnerable populations, but their usefulness is limited by their geographic reach and by the narrow set of use cases that data providers approve. 31
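As a rough illustration of individual-level linkage, the sketch below joins records from separate sources on a salted hash of normalized identifiers. This is a simplified deterministic approach chosen for illustration; it is not the method used by the Massachusetts Public Health Data Warehouse or any other system named above, and the field names (`first`, `last`, `dob`, `event`) and function names are hypothetical.

```python
import hashlib
from collections import defaultdict


def link_key(first, last, dob, salt):
    """Build a salted hash from normalized identifiers (hypothetical fields)."""
    normalized = f"{first.strip().lower()}|{last.strip().lower()}|{dob}"
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def link_records(sources, salt):
    """Group events from each source under a shared per-person key.

    `sources` maps a source name to a list of record dictionaries.
    Returns {link key: {source name: [events]}}.
    """
    linked = defaultdict(dict)
    for source_name, records in sources.items():
        for rec in records:
            key = link_key(rec["first"], rec["last"], rec["dob"], salt)
            linked[key].setdefault(source_name, []).append(rec["event"])
    return dict(linked)


# Toy data: the same person appears in two sources with different formatting.
example_sources = {
    "treatment": [
        {"first": "Ana", "last": "Diaz", "dob": "1990-02-01", "event": "admission"},
    ],
    "ambulance": [
        {"first": " ana ", "last": "DIAZ", "dob": "1990-02-01", "event": "transport"},
        {"first": "Lee", "last": "Park", "dob": "1985-07-15", "event": "transport"},
    ],
}
example_linked = link_records(example_sources, salt="demo-salt")
```

Exact matching of this kind misses records with misspelled names or incorrect birth dates, so a real linkage project would likely need probabilistic matching and formal data-governance review.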

Several useful secondary datasets were available at the state level, but it is difficult to know the extent to which data from various states and/or regions can be compared. State-level comparisons are often used for causal inference, which highlights the importance of national datasets and the problems that may arise as some of these national datasets remove state indicators.46,77,78 In addition, if data from multiple states are needed and researchers must go through prolonged processes and pay each state for data, the difficulty and cost of understanding the effects of policy changes will increase. Currently available in 15 states, 79 all-payer claims databases are particularly useful in tracking individuals over time; the Massachusetts all-payer claims database has been especially successful in this respect, including in tracking individuals across health insurers. However, these all-payer claims databases are decreasing in usefulness as the number of self-insured employers (covering approximately 37% of the working-age population)80,81 that report data to the databases decreases. 82

Understanding the local and state context was important for completing this scoping review and is important for researchers seeking to understand the effect of policy changes in a state. Some local data (eg, local surveys, electronic medical records from a health system, local health information exchanges) may be useful in certain scenarios but are not generally systematically collected or available. These local data may be useful for better understanding local effects or context and/or validating state-level findings. State government data sources are particularly important in this respect but may be difficult to find. Knowing that these data sources exist is an important part of identifying the right data source for a particular research question, but monitoring current data availability and data inclusion for each source can be difficult. We examined state government contributions to national datasets as a starting point, and we worked with state government agencies to identify other potential resources. We expect many of these data sources, or data containing similar information, will be available in most states, particularly data that are contributed to national datasets.

Limitations

This research had several limitations. First, we limited the scope of state-related data to Massachusetts, which may have different data availability than other states. Because Massachusetts has several rich data sources (ie, the Massachusetts all-payer claims database and the Massachusetts Public Health Data Warehouse), we expect that the Massachusetts case study yields a more, rather than less, comprehensive set of data than may be available in other states. Second, because our study was a scoping review, we could not include all potential secondary data sources. We did, however, capture those datasets that are likely to be the most useful for researchers and policy makers for evaluating broad questions related to SUD.

Conclusions

Survey and administrative datasets vary substantially in their usefulness for evaluation and research. Although these data can be mined to examine a diversity of policy-relevant questions, the data have substantial limitations, including prohibitive regulatory and monetary costs to obtain the data and limited availability by state. Additional considerations include differences in systematic implementation across years and states, as well as fidelity of implementation of data collection mechanisms by state agencies. Linking individual-level records across a large number of state data sources (eg, substance use treatment, health insurance claims, ambulance trips, criminal justice encounters) to create a state-level public health data warehouse can advance our understanding of the health effects of policy, depending on data specifics and data availability.

Acknowledgments

The authors thank Faith English, Isabel Albinger, Kia Kaizer, and Samantha Doonan for research assistance.

Footnotes

Declaration of Conflicting Interests: The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding: The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the Massachusetts Cannabis Control Commission (ISACNB10700840UMSF19). The content is solely the responsibility of the authors and does not necessarily represent the views of the Massachusetts Cannabis Control Commission.

ORCID iD

Kimberley H. Geissler, PhD https://orcid.org/0000-0002-7425-1203

References


Articles from Public Health Reports are provided here courtesy of SAGE Publications
