ABSTRACT
Health insurance claims are a rich source of information for health services researchers and can provide evidence to understand issues related to access, efficiency, and effectiveness of care. While numerous studies have examined rehabilitation utilization using Medicare, Medicaid, and/or private insurance claims data, these studies typically lack detail on approaches used to identify rehabilitation services. The primary objectives of this perspective are: (1) to raise awareness of the need for and importance of methodological transparency and openness in rehabilitation-related health services research using claims data and (2) to provide a case example by sharing the details of a method for identifying community-based physical therapy and occupational therapy in Medicare claims. General decisions made in claims-based analyses are discussed and then illustrated with the approach used for identifying community-based therapy claims within the context of a secondary analysis of data from a large, multicenter pragmatic clinical trial. Specific decisions made and challenges encountered are discussed, and recommendations are made for future work in this area. Sharing methodological details, data when possible, and metadata on approaches for conducting rehabilitation-related health services research can enhance its validity, rigor, and—ultimately—overall value. Rehabilitation health services researchers are encouraged to make greater efforts to share information on their methodological approaches using claims data and other data relevant to health services research.
Keywords: Administrative Claims, Cohort Studies, Health Care, Health Services Research, Research Design
INTRODUCTION
Health insurance claims, often referred to as administrative claims, contain rich information that is a valuable resource for health services researchers.1–4 While the structure and content of claims data may vary to some degree by payor, they uniformly contain standard information on the utilization of health care services, tests, and procedures that are identified with various types of codes (ie, International Classification of Diseases, Tenth Revision [ICD-10] procedure, diagnosis-related group [DRG], revenue center, current procedural terminology [CPT], and health care common procedure coding system [HCPCS] codes).1 Claims data have been widely used to analyze health care utilization both cross-sectionally and longitudinally in populations of interest. Compared to self-report data, they may provide a more valid measure of health care use, particularly over extended time periods.5 Further, they provide a practical source of data on health care utilization when extended follow-up of study cohorts via primary data collection is impractical. In addition, claims data enable researchers to analyze variation in health care utilization by patient or geographic characteristics such as race and ethnicity or rurality and to examine disparities in health care.6,7 Such analyses are of interest to health care providers and policymakers aiming to enhance efficiency, effectiveness, access, and equity of care.1,3 Finally, these data can facilitate evaluation of relationships between health care utilization and patient outcomes, with or without augmentation from other sources such as electronic health records (EHRs).
While numerous studies have examined rehabilitation utilization using Medicare, Medicaid, and/or private insurance claims data,8–13 these studies typically lack detail on approaches used to identify rehabilitation services. Guidelines for standard reporting in this regard are also lacking. We believe that increasing methodological transparency and openness of rehabilitation-related health services research using claims data and data from other sources (eg, EHR) will enhance its validity, rigor, and overall value. In particular, providing additional details of claims-based methods and sharing metadata (and primary data when possible) facilitates more valid comparisons across studies, increases the efficiency of research, and accelerates innovation.
We define methodological transparency as “…the degree of detail and disclosure about the specific steps, decisions, and judgment calls made during a scientific study.”11 Methodological transparency exists along a continuum and is a function of the amount of information researchers share on the numerous choices, decisions, and judgment calls made during the conceptualization and conduct of a study.11 We define openness as “…the commitment to publicly and freely share the products and means of scientific investigation, including data, results, theories, models, hypotheses, methods, protocols, materials, and computer code used in processing and analyzing data and images.”12
The objectives of this perspective are to raise awareness of the need for and importance of methodological transparency and openness in rehabilitation-related health services research using administrative claims data and to provide a case example by sharing the details of our method for identifying community-based physical therapy and occupational therapy use in Medicare claims.
METHODS
A general framework for conducting claims-based analysis involves a series of steps, each with methodological decisions. These steps, illustrated in Figure 1, include determining the study design, identifying the cohort, selecting the data files and study years, operationalizing variables, and making analytic decisions. Within each of these steps, decisions must be made based on the specific questions or aims of the study, the available data, and the completeness of the data. Additionally, each of these steps needs to be informed by foundational knowledge of claims data structure, data variables, coding conventions, and clinical contexts.
Figure 1.
General Steps for Conducting Claims-Based Analyses.
Below we introduce our case example and describe each step in Figure 1, first in general and then in the context of our case example. While we have presented the steps in the order of our approach, we recognize that there may be variability in the sequencing of the steps, iteration between steps, or inclusion of additional steps depending on the study question.
Case Example Using Medicare Data
Our case example for identifying community-based physical therapy and occupational therapy utilization in Medicare claims is part of a larger research project examining the use and effectiveness of community-based physical and occupational therapy following stroke. We used data from the Comprehensive Post-Acute Stroke Services (COMPASS) study and linked those data with claims data.13,14 The COMPASS study was a pragmatic cluster-randomized controlled trial investigating the effectiveness of a transitional care model in patients with stroke who were discharged home from 40 acute care hospitals in North Carolina (NC).15,16 Primary outcomes were assessed 90 days post-discharge, and health care utilization outcomes were measured up to 1 year post-discharge. Our research study required that we determine the extent to which participants in our study received community-based physical or occupational therapy after discharge home following the acute hospitalization for stroke. Our study was determined to be exempt by the institutional review board.
Determine Study Design
Choice of study design is informed by the study question of interest. For claims data, these questions encompass disease surveillance, evaluating patterns in utilization of care, performing cost analysis, and conducting outcomes research, among others. Claims-based analyses geared toward evaluating care utilization or outcomes research often utilize retrospective cohort designs that require decisions regarding time intervals for the study. For example, a study examining the effectiveness of an intervention would need to define the time period to assess baseline characteristics of participants, the intervention exposure period, if applicable, and the follow-up period to assess outcomes. Identifying the exposure period is particularly relevant for studies of rehabilitation when care is often delivered over multiple visits. Other types of designs, such as cross-sectional or case–control studies, can be employed with claims data and have their own unique considerations for design.
Our Approach
Guided by our research aims, we designed our study with a 6-month look-back prior to baseline and a 12-month follow-up period (Figure 2). Participant comorbidities and medical history information were assessed from the baseline claim (ie, the index stroke hospitalization) and included health care utilization during the 6-month period prior to the index hospital admission. We identified the intervention period as the 3 months after hospital discharge as we sought to specifically examine the effectiveness of therapy use on 90-day and 1-year outcomes, and this is a critical period for optimal stroke recovery.17 We defined the follow-up period as the 12 months after hospital discharge to examine various measures of health care use (eg, hospital readmission, emergency department use).
Figure 2.
Our Study Design.
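The three study periods described above can be expressed as date windows anchored on the index admission and discharge dates. The following is a minimal sketch, not the study's actual code: the function name, the use of 183 days to approximate 6 months, and the choice to start post-discharge windows on the day after discharge are all assumptions made for illustration.

```python
from datetime import date, timedelta

def study_windows(admit: date, discharge: date) -> dict:
    """Return inclusive (start, end) date pairs for each study period."""
    lookback_days = 183      # assumption: 6 months approximated as 183 days
    intervention_days = 90   # 3 months after discharge, as 90 days
    followup_days = 365      # 12 months after discharge, as 365 days
    first_day_out = discharge + timedelta(days=1)  # assumption: day after discharge
    return {
        "lookback": (admit - timedelta(days=lookback_days),
                     admit - timedelta(days=1)),
        "intervention": (first_day_out,
                         discharge + timedelta(days=intervention_days)),
        "followup": (first_day_out,
                     discharge + timedelta(days=followup_days)),
    }

# Hypothetical participant admitted 1 March 2017 and discharged 6 March 2017
windows = study_windows(date(2017, 3, 1), date(2017, 3, 6))
```

Writing the windows down once, in a single function, makes the definition easy to report alongside results and easy to vary in sensitivity analyses.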
Cohort Identification
When using claims data to conduct retrospective analyses, cohort identification is often conducted based on ICD-10 diagnosis codes to identify a specific disease-based sample.18,19 When taking this approach, decisions must be made about the number of diagnosis fields to examine. Claims data often include multiple diagnosis fields (that may or may not be populated) for a single claim or event. Investigators may choose to identify a cohort based on the presence of a diagnosis in the first or primary diagnosis field or by examining any diagnosis field. A second decision that needs to be made is the set of ICD-10 codes that will be used to identify the particular phenotype; this can be guided by validation studies in the literature about the performance of different coding sets in the ascertainment of phenotypes.20 Finally, a decision must be made for whether claims in a single point in time or claims over a specified time period will be used to identify a case. This decision is often highly dependent on the disease of interest. For example, if one is examining outpatient claims to identify individuals with rheumatoid arthritis, looking for consistency in an arthritis diagnosis or procedure in more than 1 claim at different time points may be more valid than identifying the cohort based on a single diagnosis or procedure.21 Distinguishing between an incident and a chronic issue may also require examining claims in a look-back period to determine whether the individual is experiencing the problem for the first time.
For example, Marrache et al defined new onset low back pain based on a low back pain diagnosis preceded by a 3-month period with no low back pain diagnoses in the claims data.22 Cohorts may also be identified outside of the claims data, for example, when prospective studies, disease registries, or randomized trials link participants to their insurance claims data,13,23–26 and the study participants, rather than the claims data, may serve as the source of information with which the cohort is identified. In these designs, considerations of linkage methods are critical to minimizing selection bias because of failure to link participants to the claims data.14
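A look-back rule of this kind can be sketched in a few lines. The example below is illustrative only and is not taken from any cited study: the function name, the 90-day approximation of a 3-month window, and the `observed_from` parameter (the first date on which the person's claims are observable) are assumptions. The sketch treats a diagnosis as incident only when the full look-back window is observable and contains no earlier diagnosis.

```python
from datetime import date, timedelta

def first_incident_date(dx_dates, observed_from, washout_days=90):
    """Return the earliest diagnosis date with a clean, fully observed look-back."""
    ordered = sorted(set(dx_dates))
    for current in ordered:
        window_start = current - timedelta(days=washout_days)
        if window_start < observed_from:
            # Look-back is not fully observable, so incidence cannot be confirmed
            continue
        if not any(window_start <= d < current for d in ordered):
            return current
    return None

# Hypothetical diagnosis dates for one person with claims observable from 2016
dx = [date(2016, 2, 10), date(2016, 3, 1), date(2016, 9, 15)]
incident = first_incident_date(dx, observed_from=date(2016, 1, 1))
```

In this example the first two diagnoses are skipped because their look-back windows reach before the start of observation, and the third qualifies because no diagnosis appears in the preceding 90 days.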
Our Approach
We used several data variables captured as part of the COMPASS trial (hospital, hospital admission date, hospital discharge date, sex, race, and birthdate) to identify the claim for the index hospital admission for stroke using exact matching and a series of fuzzy matching algorithms. Fuzzy matches were reviewed by a pair of investigators with a third serving as adjudicator of discrepancies.13,14,23 Of the 11,193 participants in the COMPASS study, 62% were linked to claims (Suppl. Figure 1).
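The general shape of a two-stage linkage (exact match first, then a relaxed match routed to manual review) can be illustrated as follows. This is a hypothetical sketch and does not reproduce the fuzzy matching algorithms used in the study, which are not detailed here. The field names, the 1-day tolerance on dates, and the decision to drop race from the relaxed stage are all assumptions.

```python
from datetime import date

EXACT_KEYS = ("hospital", "admit", "discharge", "sex", "race", "birthdate")

def link(trial_rec, claims, date_tolerance_days=1):
    """Return ("exact", claim), ("review", candidates), or ("unlinked", None)."""
    exact = [c for c in claims if all(c[k] == trial_rec[k] for k in EXACT_KEYS)]
    if len(exact) == 1:
        return "exact", exact[0]

    def near(a, b):
        # Hypothetical relaxation: dates may differ by a small number of days
        return abs((a - b).days) <= date_tolerance_days

    candidates = [
        c for c in claims
        if c["hospital"] == trial_rec["hospital"]
        and c["birthdate"] == trial_rec["birthdate"]
        and c["sex"] == trial_rec["sex"]
        and near(c["admit"], trial_rec["admit"])
        and near(c["discharge"], trial_rec["discharge"])
    ]
    if candidates:
        return "review", candidates  # sent to investigators for manual review
    return "unlinked", None

# Hypothetical trial record
trial = {"hospital": "H01", "admit": date(2017, 3, 1), "discharge": date(2017, 3, 6),
         "sex": "F", "race": "White", "birthdate": date(1945, 5, 2)}
```

Returning the match status alongside the record keeps an audit trail of how each participant was linked, which is the kind of detail that can be reported in a linkage flow diagram.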
In this paper, we focus on identifying and classifying therapy claims using the Medicare fee-for-service (FFS) and Medicare Advantage claims data, which comprised most of our linked study population (81.8%). The data structure of the Medicaid and Blue Cross Blue Shield (BCBS) claims was very similar to that of the Medicare claims with the primary differences being the names of the files and data variables; thus, we used a similar linkage approach.
Select Data Files and Study Years
The choice of data files for an analysis depends on the type of data in each file and the study question. Insurance claims data are separated into a variety of files, some of which are dependent upon the setting where care was received. For example, files available from Medicare include inpatient files, skilled nursing facility files, hospice files, home health files, outpatient files, and carrier files. The specific years of data to examine may be guided by several factors including the study design, data availability, the population of interest, and study-specific details. Due to the rapidly changing health care market, using the most current claims data available is often the most appropriate approach. File maturation for claims data varies depending on the source, as does cost, which may also impact decisions. If the research question addresses a specific policy issue, looking at historical data (eg, prior to the policy) may be preferable in order to understand the impact of a policy on health care utilization. Finally, if examining a less common diagnosis, combining several years of data may be necessary to achieve a sufficient sample size.27
Our Approach
We examined Medicare data that corresponded to the years that participants were enrolled in the COMPASS study (2016-2020) to ensure coverage of the 6-month look-back period for the first patient and 12-month follow-up for the last patient. To identify home health and outpatient care, we purchased the following research identifiable files (RIFs) for both Medicare FFS and Medicare Advantage beneficiaries in NC: Home Health Agency,28,29 Outpatient,30,31 and Carrier files.32 The RIFs contain some identifiable information (eg, dates of service, ZIP code) on beneficiaries that increases their usefulness for health services research by allowing researchers to identify the time and length of health care received and by providing geographic data on beneficiaries that can be linked to other files, such as US Census data.11,33
Because billing for beneficiaries in Medicare Advantage plans is conducted by private insurance companies (eg, Humana, United Health Care), the data CMS receives are referred to as “encounter” data because they lack variables related to costs that are included in the FFS files.34 In the case of our analyses, which examined use and not cost, the variables extracted from the FFS and Medicare Advantage files were the same across the 2 data sources.
The home health files contain claims submitted by home health agency providers for reimbursement of home health services.28 The outpatient files contain claims submitted by institutional outpatient providers including hospital outpatient departments, rural health clinics, comprehensive outpatient rehabilitation facilities, and federally qualified health centers.30,31 The carrier files contain claims submitted by professional providers including physicians, nurse practitioners, and therapists (eg, physical therapy, occupational therapy).32,35 These claims are from providers who work in private, free-standing clinics and not outpatient institutional settings.
The home health, outpatient, and carrier claims are organized in several files.30,31 For each claim category (ie, home health, outpatient, carrier), we used the base file that contains claim-level information such as the beneficiary ID, claim ID, claim type, and claim from date and a more detailed file (referred to as a revenue center file or line file) that contains data on the specific procedures/services associated with a claim and the provider who delivered the procedure. The latter files were used to identify therapy claims (described below).
Operationalize Variables
Because claims data are not developed for research purposes, investigators typically make several decisions on how to operationalize variables to represent the constructs of interest. For example, information on therapy use can be operationalized based on a visit count, the duration of visits, episode of care, or specific procedures delivered (eg, CPT codes). There are also several approaches that can be used to capture illness severity and comorbidity burden based on diagnosis and procedure codes. For example, the Elixhauser36 and Charlson37 Indices are often used to identify comorbidities based on International Classification of Diseases diagnosis codes from hospitalization or outpatient data, and the Chronic Disease Score can be used to identify comorbidities by looking at medication prescriptions in pharmacy data.38 Likewise, outcomes and how they are defined can vary (eg, all-cause readmission, disease-specific readmission, 7-day vs 30-day readmission).13,39,40
Our Approach
Figure 3 provides an overview of our approach to identifying therapy visits.
Figure 3.
Overview of Approach to Identifying Therapy Visits. CF = carrier files; HH = home health; OP = outpatient.
Steps 1 and 2. We first identified home health, outpatient, and carrier file claims for participants who had linked to the Medicare FFS or Medicare Advantage claim for the index stroke admission. We then retained, for each participant, all claims with a claim date within that participant's follow-up window.
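Steps 1 and 2 amount to restricting claims to linked participants and then filtering on dates. A minimal sketch of that filter follows. The record layout and field names (`participant_id`, `claim_date`) are hypothetical, and counting the window from the day after discharge through day 365 is an assumption rather than a documented rule.

```python
from datetime import date

def claims_in_followup(claims, discharge_dates, followup_days=365):
    """Keep claims for linked participants that fall inside the follow-up window."""
    kept = []
    for claim in claims:
        discharge = discharge_dates.get(claim["participant_id"])
        if discharge is None:
            continue  # participant was not linked to an index admission claim
        days_after = (claim["claim_date"] - discharge).days
        if 1 <= days_after <= followup_days:  # assumption: day 1 is the day after discharge
            kept.append(claim)
    return kept

# Hypothetical data: one linked participant and four claims
discharges = {"P1": date(2017, 3, 6)}
sample_claims = [
    {"participant_id": "P1", "claim_date": date(2017, 3, 5)},   # before discharge
    {"participant_id": "P1", "claim_date": date(2017, 4, 1)},   # inside window
    {"participant_id": "P1", "claim_date": date(2018, 3, 7)},   # day 366, outside
    {"participant_id": "P2", "claim_date": date(2017, 4, 1)},   # not linked
]
retained = claims_in_followup(sample_claims, discharges)
```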
Step 3. We then searched the selected claims for codes indicative of care delivered by a physical or occupational therapist. As outlined in Table 1, there are specific revenue center codes, CPT/HCPCS codes, HCPCS modifier codes, and provider specialty codes that are specific to physical or occupational therapy. Revenue center codes are 4-digit codes that represent different products or services offered by the provider (eg, 0434: occupational therapy evaluation or re-evaluation).41 Revenue center codes clearly define the type of therapy (ie, physical vs occupational therapy) received by patients but do not provide information on the types of services delivered (eg, manual therapy, therapeutic exercise, gait training). Current procedural terminology and HCPCS codes also represent procedures, supplies, products, and services provided to patients.42,43 Current procedural terminology codes are 5-digit codes (eg, 97001: physical therapy evaluation), and HCPCS codes are 5-character alphanumeric codes with the first character being a letter followed by 4 numbers (eg, G0283: electrical stimulation other than wound). Health care common procedure coding system and CPT codes, which may or may not accompany revenue center codes, provide more detail on the types of services provided. For physical and occupational therapy, only the evaluation and re-evaluation codes are therapy-specific (eg, 97001: physical therapy evaluation; 97003: occupational therapy evaluation).44,45 However, each year Medicare publishes guidelines on CPT and HCPCS codes that physical and occupational therapists can use for billing.46
Table 1.
Therapy-Specific Codes Used in Claims Data
| Code Type | Code | Description |
|---|---|---|
| Revenue center | 0420 | Physical therapy-general classification |
| | 0421 | Physical therapy-visit charge |
| | 0422 | Physical therapy-hourly charge |
| | 0423 | Physical therapy-group rate |
| | 0424 | Physical therapy-evaluation or re-evaluation |
| | 0429 | Physical therapy-other |
| | 0430 | Occupational therapy-general classification |
| | 0431 | Occupational therapy-visit charge |
| | 0432 | Occupational therapy-hourly charge |
| | 0433 | Occupational therapy-group rate |
| | 0434 | Occupational therapy-evaluation or re-evaluation |
| CPT/HCPCS | 97001 | Physical therapy evaluation |
| | 97002 | Physical therapy re-evaluation |
| | 97003 | Occupational therapy evaluation |
| | 97004 | Occupational therapy re-evaluation |
| | 97161 | Physical therapy eval low complex 20 min |
| | 97162 | Physical therapy eval mod complex 30 min |
| | 97163 | Physical therapy eval high complex 45 min |
| | 97164 | Physical therapy re-eval establish plan of care |
| | 97165 | Occupational therapy eval low complex 30 min |
| | 97166 | Occupational therapy eval mod complex 45 min |
| | 97167 | Occupational therapy eval high complex 60 min |
| HCPCS modifier | GP | Physical therapist |
| | GO | Occupational therapist |
| Provider specialty | 65 | Physical therapist in private practice |
| | 67 | Occupational therapist in private practice |
Abbreviations: CPT = current procedural terminology; eval = evaluation; HCPCS = health care common procedure coding system.
Because our study covered the years from 2016 to 2020, we reviewed the CMS codes for those years and created a master list of therapy codes. We then assigned these codes to physical or occupational therapists based on the expertise of our study team. Our rationale for doing this was that each visit, in theory, needed to be attributed to a specific provider type (ie, a claim can only be submitted by 1 provider type). For some codes, it was difficult to assign the code to a single provider type (eg, therapeutic exercise). In those instances, we kept the dual assignment and reasoned that data in the actual claims would help us assign the code to the appropriate provider type. Supplementary Table 1 presents the CPT/HCPCS codes, their description, and our therapist designation.
Health care common procedure coding system modifier codes may accompany the CPT/HCPCS codes. Modifier codes provide additional information about the services/procedures provided that is not captured by the CPT/HCPCS codes alone. As depicted in Table 1, there are modifier codes that indicate care delivered by a physical or occupational therapist.47 Finally, claims data may also include a provider specialty code that is assigned based on a standardized list created by CMS.48 This list includes codes for physical and occupational therapy.
Table 2 indicates the types of codes that are available in the claims we examined. Both the home health and outpatient claims data contain information on revenue center codes, HCPCS/CPT codes, HCPCS modifier codes, and provider specialty in the revenue center file. The carrier file data contain information on HCPCS/CPT codes, HCPCS modifier codes, and provider specialty in the line file.
Table 2.
Location of Codes Used to Identify Therapist Visits in Medicare Home Health, Outpatient, and Carrier Files
| Claims Data Source | File Type | Data Variable Name | Full Variable Name |
|---|---|---|---|
| Home health & outpatient files | Revenue Center file | REV_CNTR | Revenue Center Code |
| | | HCPCS_CD | Health care Common Procedure Coding System Code |
| | | HCPCS_1ST_MDFR_CD | HCPCS Initial Modifier Code |
| | | HCPCS_2ND_MDFR_CD | HCPCS Second Modifier Code |
| | | HCPCS_3RD_MDFR_CD | HCPCS Third Modifier Code |
| | | HCPCS_4TH_MDFR_CD | HCPCS Fourth Modifier Code |
| | | RNDRNG_PHYSN_SPCLTY_CD | Rendering Physician Specialty Code |
| Carrier file | Line file | HCPCS_CD | Health care Common Procedure Coding System Code |
| | | HCPCS_1ST_MDFR_CD | HCPCS Initial Modifier Code |
| | | HCPCS_2ND_MDFR_CD | HCPCS Second Modifier Code |
| | | PRVDR_SPCLTY | Provider Specialty |
Abbreviation: HCPCS = health care common procedure coding system.
We searched the revenue center file and line file claims for all participants in the study and retained any claim that had a revenue center code, CPT/HCPCS code, HCPCS modifier code, or provider specialty code that was indicative of physical or occupational therapy. In addition to the variables outlined in Table 2, we retained the revenue center date (REV_CNTR_DT) from the revenue center files for the home health and outpatient data. For the carrier data, we retained the claim date (CLM_THRU_DT) in the line file.
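The row-level search described above can be sketched as a function that reports which kinds of therapy indicator appear on a revenue center or line row. This is a simplified illustration, not the study code: it uses only the codes listed in Table 1 (the full master list in Supplementary Table 1 is not reproduced), it assumes each row is a dictionary keyed by the variable names in Table 2, and the function and label names are hypothetical.

```python
# Codes taken from Table 1; the broader master list is intentionally omitted
REVENUE_CODES = {"0420", "0421", "0422", "0423", "0424", "0429",
                 "0430", "0431", "0432", "0433", "0434"}
EVALUATION_CODES = {"97001", "97002", "97003", "97004", "97161", "97162",
                    "97163", "97164", "97165", "97166", "97167"}
MODIFIER_CODES = {"GP", "GO"}
SPECIALTY_CODES = {"65", "67"}

# Variable names from Table 2; fields absent from a file are simply skipped
MODIFIER_FIELDS = ("HCPCS_1ST_MDFR_CD", "HCPCS_2ND_MDFR_CD",
                   "HCPCS_3RD_MDFR_CD", "HCPCS_4TH_MDFR_CD")
SPECIALTY_FIELDS = ("RNDRNG_PHYSN_SPCLTY_CD", "PRVDR_SPCLTY")

def therapy_indicators(row):
    """Return the set of indicator types found on one row (empty if none)."""
    found = set()
    if row.get("REV_CNTR") in REVENUE_CODES:
        found.add("revenue_center")
    if row.get("HCPCS_CD") in EVALUATION_CODES:
        found.add("evaluation_code")
    if any(row.get(f) in MODIFIER_CODES for f in MODIFIER_FIELDS):
        found.add("modifier")
    if any(row.get(f) in SPECIALTY_CODES for f in SPECIALTY_FIELDS):
        found.add("specialty")
    return found

# Hypothetical rows: a home health revenue center row and a carrier line row
hh_row = {"REV_CNTR": "0421", "HCPCS_CD": "97110", "HCPCS_1ST_MDFR_CD": "GP"}
cf_row = {"HCPCS_CD": "97165", "PRVDR_SPCLTY": "67"}
```

A row would be retained when the returned set is non-empty. Keeping the set itself, rather than a yes/no flag, makes it possible to tabulate how often each code type identified a therapy claim.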
Of note during this step of the process is that therapy claims in the outpatient file may be indicative of care delivered during observation stays and not care received in the community. This required us to look carefully at the hospital discharge date and start of follow-up care date for each participant to ensure that we included only care received after discharge.
Step 4. As depicted in Table 2, not all files included all variables indicative of therapy use. Furthermore, even if the variables were present in the file, data may have been missing. Therefore, we used a hierarchical approach to assign a provider type to each claim (Figure 4). We first assigned claims with therapy revenue center codes or CPT evaluation codes for physical therapy or occupational therapy as these codes are clearly indicative of therapist type. For the remaining unassigned claims, we then assigned those that had HCPCS modifier codes indicative of physical or occupational therapy, again, because of the certainty of these codes. Next, we examined the provider specialty code in the remaining unassigned claims and assigned provider type. We then referred to our master list of CPT/HCPCS codes and provider assignment (Suppl. Table 1) to assign the remaining claims. In instances where we had a code that was assigned to both physical and occupational therapy (eg, therapeutic exercise 97110), we retained the dual assignment and used information from the other claims associated with the participant to assign provider type (eg, if most of the other claims were assigned to an occupational therapist).
Figure 4.
Hierarchical Approach to Assigning Therapy Provider Type to Claims. CPT = current procedural terminology; HCPCS = health care common procedure coding system; OT = occupational therapist; PT = physical therapist.
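The hierarchy in Step 4 can be sketched as an ordered series of lookups that stops at the first level yielding a provider type. The code below is an illustrative approximation under stated assumptions: the lookup tables contain only the Table 1 codes, `MASTER_LIST` is a one-entry hypothetical excerpt standing in for Supplementary Table 1, revenue center codes are checked before evaluation codes within the first level, and ties in `resolve_dual` are left unresolved. None of these tie-breaking details are specified in the text.

```python
from collections import Counter

def _typed(pt_codes, ot_codes):
    """Build a code -> provider type lookup."""
    table = {code: "PT" for code in pt_codes}
    table.update({code: "OT" for code in ot_codes})
    return table

REVENUE_TYPE = _typed(("0420", "0421", "0422", "0423", "0424", "0429"),
                      ("0430", "0431", "0432", "0433", "0434"))
EVAL_TYPE = _typed(("97001", "97002", "97161", "97162", "97163", "97164"),
                   ("97003", "97004", "97165", "97166", "97167"))
MODIFIER_TYPE = {"GP": "PT", "GO": "OT"}
SPECIALTY_TYPE = {"65": "PT", "67": "OT"}
MASTER_LIST = {"97110": "PT/OT"}  # hypothetical excerpt; dual-assigned code

MODIFIER_FIELDS = ("HCPCS_1ST_MDFR_CD", "HCPCS_2ND_MDFR_CD",
                   "HCPCS_3RD_MDFR_CD", "HCPCS_4TH_MDFR_CD")
SPECIALTY_FIELDS = ("RNDRNG_PHYSN_SPCLTY_CD", "PRVDR_SPCLTY")

def assign_provider_type(row):
    """Walk the hierarchy and return "PT", "OT", "PT/OT", or None."""
    levels = (
        (("REV_CNTR",), REVENUE_TYPE),     # level 1: revenue center codes
        (("HCPCS_CD",), EVAL_TYPE),        # level 1: CPT evaluation codes
        (MODIFIER_FIELDS, MODIFIER_TYPE),  # level 2: HCPCS modifier codes
        (SPECIALTY_FIELDS, SPECIALTY_TYPE),  # level 3: provider specialty
        (("HCPCS_CD",), MASTER_LIST),      # level 4: master CPT/HCPCS list
    )
    for fields, table in levels:
        for field in fields:
            provider = table.get(row.get(field))
            if provider is not None:
                return provider
    return None

def resolve_dual(provider_types):
    """Replace "PT/OT" with the majority type among a participant's other claims."""
    counts = Counter(t for t in provider_types if t in ("PT", "OT"))
    if counts["PT"] == counts["OT"]:
        return list(provider_types)  # no majority: leave dual assignments as-is
    majority = "PT" if counts["PT"] > counts["OT"] else "OT"
    return [majority if t == "PT/OT" else t for t in provider_types]
```

For example, a row carrying only `HCPCS_CD` 97110 returns `"PT/OT"`, and `resolve_dual` would then convert it to `"OT"` for a participant whose other claims are mostly occupational therapy.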
Supplementary Table 2 provides information on participants who had therapy claims in the home health, outpatient, and carrier files and the distribution of codes used to identify therapy claims. The Medicare FFS and Medicare Advantage data were similar in regard to the presence of the various codes with the exception of the carrier file where the Medicare Advantage data were less complete in regard to codes used to identify therapy claims.
Step 5. After assigning provider type to each claim, we removed duplicates in each of the 3 file types (home health, outpatient, carrier) and only retained unique claims for each participant visit and provider. For example, a participant may have had 2 claims on the same day from the same provider, but the claims varied based on the procedure code. For the purposes of our study, these 2 claims were duplicative since we were not interested in the specific procedure codes associated with the claims.
Once duplicates were removed, we merged the outpatient and carrier file data. We then had 2 visit-level analytic files: 1 with home health visits and 1 with outpatient visits. Each of these visit-level files contained 1 record for each unique participant visit and provider type. The specific variables in the data set were Participant ID, provider type (physical or occupational therapist), and visit date. Supplementary Figures 2–4 provide details on the number of participants/claims for each file and the final number of therapy visits identified.
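Step 5 can be sketched as collapsing rows to one record per participant, provider type, and visit date within each file type, and then combining the outpatient and carrier results. The sketch below is illustrative: the field names are hypothetical, and it treats same-day rows for the same participant and provider type as one visit, which is an interpretation of the duplicate rule described in Step 5.

```python
from datetime import date

def unique_visits(rows):
    """Keep one record per participant, provider type, and visit date."""
    seen = set()
    visits = []
    for row in rows:
        key = (row["participant_id"], row["provider_type"], row["visit_date"])
        if key not in seen:
            seen.add(key)
            visits.append({"participant_id": key[0],
                           "provider_type": key[1],
                           "visit_date": key[2]})
    return visits

# Hypothetical outpatient rows: same day and provider, different procedure codes
op_rows = [
    {"participant_id": "P1", "provider_type": "PT",
     "visit_date": date(2017, 4, 1), "HCPCS_CD": "97110"},
    {"participant_id": "P1", "provider_type": "PT",
     "visit_date": date(2017, 4, 1), "HCPCS_CD": "97161"},
]
carrier_rows = [
    {"participant_id": "P1", "provider_type": "OT",
     "visit_date": date(2017, 4, 3), "HCPCS_CD": "97165"},
]

# De-duplicate within each file type, then merge outpatient and carrier visits
outpatient_visits = unique_visits(op_rows) + unique_visits(carrier_rows)
```

The resulting records carry only the three variables named in the visit-level files; procedure codes are dropped because they are not needed for counting visits.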
Make Analytic Decisions
The final step in the claims-based research process is to develop a statistical analysis plan (Figure 1). As with any study, analytic decisions are driven by several factors including the sample size, type and distribution of the data, study design, clustering of the data (eg, individual level versus hospital level), desired estimand (eg, risk ratio, odds ratio), missing data, and an understanding of potential biases. Because observational designs are often used in studies examining claims data, methods of controlling for confounding are also important and may involve complex approaches (eg, inverse probability weighting) that warrant detailed planning, explanation, and inclusion of pre-specified sensitivity analyses. While discussing the numerous analytic decisions we made and will make for our project is beyond the scope of this paper, we acknowledge the importance of providing the rationale for analytic decisions when publishing and disseminating study findings.
Role of the Funding Source
This project was supported by the National Institutes of Health through the Eunice Kennedy Shriver National Institute of Child Health and Human Development and the National Center for Medical Rehabilitation Research. The funder played no role in the development of this manuscript or the methods described.
DISCUSSION
This perspective aims to raise awareness of the need for and importance of methodological transparency and openness in rehabilitation-related health services research using administrative claims data and to provide an illustrative example of the methods that we used to identify therapy use in Medicare FFS and Medicare Advantage files. We believe that to promote more valid and methodologically rigorous claims-based health services research, investigators should document and share claims-based protocols. These details could be provided in supplementary materials as part of research reports or in stand-alone publications. Sharing such information takes time and attention to detail but can facilitate comparisons of findings across studies, provide guidance on approaches to handling nuanced issues of examining health care data in different contexts, and accelerate innovation by decreasing duplicative efforts. Claims data are complex, and there are often several study-specific decisions that need to be made when defining analytic variables. In the case example that we provided, we described many of our decisions related to identifying therapy claims and assigning providers to visits.
To our knowledge, this is the first published information in the peer-reviewed literature on claims-based methods relevant to rehabilitation-related health services research. We hope the information presented in this manuscript provides fellow health services researchers with an approach that can be used for examining home health and outpatient therapy use with Medicare and other types of claims. While the work reported in this perspective focused on Medicare data, we have used a similar approach to examine therapy use in Medicaid and BCBS claims. Our work also focused on physical and occupational therapy but could be easily extended to speech-language pathology. Our approach also provides an example of what and how to share information on claims methodology.
We encourage more detailed reporting of claims methodologies to advance claims-based health services research and to promote methodological transparency and reproducibility. Reporting should encompass decisions regarding the study design (eg, details on the period of the analysis and how that was determined), cohort identification, years examined and files used, study variables created, and the analytic approach (eg, approach to handling missing data). Reporting of alternative approaches and solutions to unexpected issues or challenges is also important. Finally, various methods of dissemination (eg, publications, conference presentations, workshops) should be used.
No standards are currently available that are specific for reporting claims-based analysis methods. We believe this is an area for development in the field of health services research. First steps might include convening a task force to develop expert consensus on the specific type of information to include in a reporting guideline. Such a task force could represent the health services community as a whole or be more focused on the rehabilitation community. The EQUATOR Network, which contains over 600 reporting guidelines for various study types, provides a toolkit for developing a reporting guideline.49 The STROBE guidelines for strengthening the reporting of observational studies in epidemiology generally capture the steps we describe above in the Methods section of the guideline checklist50 but lack the specificity of what should be reported in the context of claims-based analyses.
We also encourage more openness and sharing of metadata (ie, code, algorithms, operational definitions) and primary data, when possible. Various data repositories are available to researchers, and these are likely to increase as the National Institutes of Health and other funding agencies develop more rigorous data sharing policies.51As with reporting guidelines, data could be shared in a general data repository or one that is specific to rehabilitation-related health services research.52Additionally, more methodological work is needed to develop and share rehabilitation-specific computable phenotypes.53Finally, open-source repositories for code sharing are available such as GitHub54and Open Science Framework.55
Limitations
This perspective is not without limitations. While we have provided a general framework for and examples of decisions made in our own claims-based analyses, decisions are often driven by the specific research questions, and, for any given study, there may be additional relevant steps not included in our framework. We also recognize that the method that we used to identify therapy visits has limitations. First, as with all administrative data, the data may have had coding errors. Second, our approach of assigning providers to the different CPT-HCPCS codes has not been validated. We leveraged the clinical expertise of a multidisciplinary team of clinicians and researchers and used expert-based consensus techniques but recognize that this work is in its infancy, and there may be alternative approaches.
Despite these limitations, we believe sharing our framework and approach to identifying therapy visits in Medicare claims fills important gaps in the literature. We welcome suggestions for refining and expanding our framework and hope that future studies will validate and build on our methodologies for using claims data to study rehabilitation utilization. We hope this perspective will promote greater communication, transparency, openness, and standardization of methods used to examine health care utilization data.
CONCLUSION
We have presented a framework for the decisions made in claims-based analyses along with a detailed methodology for identifying community-based physical and occupational therapy use in Medicare claims. We have also discussed the need for more methodologic transparency and openness in health services research using claims data and have provided some suggestions for moving this work forward. We encourage researchers to take the time to report on and share their methodologies when examining claims data. Such efforts will advance the validity, reproducibility, and value of rehabilitation health services research.
CRediT—CONTRIBUTOR ROLES
Janet K. Freburger (Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Validation, Visualization, Writing—original draft, Writing—review & editing [equal], Supervision [lead]), Elizabeth R. Mormer (Methodology, Project administration [Equal], Resources, Visualization, Writing—original draft, Writing—review & editing [equal]), Kristin E. Ressel (Project administration, Visualization, Writing—review & editing [equal]), Anna M. Johnson (Methodology, Validation, Writing—review & editing [equal]), Amy M. Pastva (Methodology, Validation, Writing—review & editing [equal]), Cheryl D. Bushnell (Methodology, Validation, Writing—review & editing [equal]), Pamela W. Duncan (Methodology, Validation, Writing—review & editing [equal]), Sara B. Jones Berkeley (Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Supervision, Validation, Writing—original draft, Writing—review & editing [equal]).
Supplementary Material
Contributor Information
Janet K Freburger, Department of Physical Therapy, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, 15219-3130, United States.
Elizabeth R Mormer, Department of Physical Therapy, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, 15219-3130, United States.
Kristin Ressel, Department of Physical Therapy, School of Health and Rehabilitation Sciences, University of Pittsburgh, Pittsburgh, PA, 15219-3130, United States.
Anna M Johnson, Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, United States.
Amy M Pastva, Department of Orthopaedic Surgery, Doctor of Physical Therapy Division, School of Medicine, Duke University, Durham, NC, 27710, United States.
Cheryl D Bushnell, Department of Neurology, School of Medicine, Wake Forest University, Winston-Salem, NC, 27157, United States.
Pamela W Duncan, Department of Internal Medicine, School of Medicine, Wake Forest University, Winston-Salem, NC, 27157, United States.
Sara B Jones, Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, NC, 27516, United States.
FUNDING
The project described and all authors were supported by the National Institutes of Health (award no. 1R01HD101493) through the Eunice Kennedy Shriver National Institute of Child Health and Human Development and the National Center for Medical Rehabilitation Research.
ETHICS APPROVAL
Our study was determined to be exempt by the institutional review board.
DISCLOSURES
The authors completed the ICMJE Form for Disclosure of Potential Conflicts of Interest and reported no conflicts of interest.
P. Duncan and C. Bushnell report ownership interest in Care Directions, Inc. P. Duncan is a research advisor for BQ Technologies. All others declare no conflicts.
A. Pastva is a member of the PTJ editorial board.
DATA AVAILABILITY
The Medicare claims data used for the case example are not shareable under the rules of our executed Data Use Agreement with the Centers for Medicare & Medicaid Services.
REFERENCES
- 1. Bjarndóttir MV, Czerwinski D, Guan Y. The history and modern applications of insurance claims data in healthcare research. In: Yang H, Lee EK, eds. Healthcare Analytics: From Data to Knowledge to Healthcare Improvement. John Wiley & Sons; 2016. 10.1002/9781118919408.ch19
- 2. Ferver K, Burton B, Jesilow P. The use of claims data in healthcare research. Open Public Health J 2009;2(1):11–24. 10.2174/1874944500902010011
- 3. Konrad R, Zhang W, Bjarndottir M, Proano R. Key considerations when using health insurance claims data in advanced data analyses: an experience report. Health Syst (Basingstoke) 2019;9(4):317–325. 10.1080/20476965.2019.1581433
- 4. Freburger JK, Konrad TR. The use of federal and state databases to conduct health services research related to physical and occupational therapy. Arch Phys Med Rehabil 2002;83(6):837–845. 10.1053/apmr.2002.32661
- 5. Sheehan OC, Prvu-Bettger J, Huang J, et al. Is self or caregiver report comparable to Medicare claims indicators of healthcare utilization after stroke? Top Stroke Rehabil 2018;25(7):521–526. 10.1080/10749357.2018.1493251
- 6. Ou L, Chen J, Hillman K. Socio-demographic disparities in the utilisation of general practice services for Australian children - results from a nationally representative longitudinal study. PLoS One 2017;12(4):e0176563. 10.1371/journal.pone.0176563
- 7. Ahmad Z, Xing C, Khera, et al. Using healthcare claims data and machine learning to identify health disparities for individuals with familial hypercholesterolemia. J Clin Lipidol 2023;17(4):e16. 10.1016/j.metabol.2023.155496
- 8. Kumar A, Adhikari D, Karmarkar A, Freburger J, Gozalo P, Mor V, Resnik L. Variation in hospital-based rehabilitation services among patients with ischemic stroke in the United States. Phys Ther 2019;99(5):494–506. 10.1093/ptj/pzz014
- 9. Kumar A, Resnik L, Karmarkar A, Freburger J, Adhikari D, Mor V, Gozalo P. Use of hospital-based rehabilitation services and hospital readmission following ischemic stroke in the United States. Arch Phys Med Rehabil 2019;100(7):1218–1225. 10.1016/j.apmr.2018.12.028
- 10. Kumar A, Roy I, Falvey J, et al. Effect of variation in early rehabilitation on hospital readmission after hip fracture. Phys Ther 2023;103(3):1–10. 10.1093/ptj/pzac170
- 11. Aguinis H, Ramani RS, Alabduljader N. What you see is what you get? Enhancing methodological transparency in management research. Acad Manag Ann 2018;12(1):83–110. 10.5465/annals.2016.0011
- 12. Resnik D. Openness in scientific research: a historical and philosophical perspective. J Open Access Law 2023;11(1):1–10.
- 13. Bushnell CD, Kucharska-Newton AM, Jones SB, Psioda MA, Johnson AM, Daras LC, Halladay JR, Prvu Bettger J, Freburger JK, Gesell SB, Coleman SW, Sissine ME, Wen F, Hunt GP, Rosamond WD, Duncan PW. Hospital readmissions and mortality among fee-for-service Medicare patients with minor stroke or transient ischemic attack: findings from the COMPASS cluster-randomized pragmatic trial. J Am Heart Assoc 2021;10(23):e023394. 10.1161/JAHA.121.023394
- 14. Hammill BG, Curtis LH, Qualls LG, Hastings SN, Wang V, Maciejewski ML. Linkage of laboratory results to Medicare fee-for-service claims. Med Care 2015;53(11):974–979. 10.1097/MLR.0000000000000420
- 15. Bushnell CD, Duncan PW, Lycan SL, et al. A person-centered approach to poststroke care: the comprehensive post-acute stroke services model. J Am Geriatr Soc 2018;66(5):1025–1030.
- 16. Duncan PW, Bushnell CD, Rosamond WD, Jones Berkeley SB, Gesell SB, D’Agostino RB Jr, Ambrosius WT, Barton-Percival B, Bettger JP, Coleman SW, Cummings DM, Freburger JK, Halladay J, Johnson AM, Kucharska-Newton AM, Lundy-Lamm G, Lutz BJ, Mettam LH, Pastva AM, Sissine ME, Vetter B. The comprehensive post-acute stroke services (COMPASS) study: design and methods for a cluster-randomized pragmatic trial. BMC Neurol 2017;17(1):133. 10.1186/s12883-017-0907-1
- 17. Dromerick AW, Geed S, Barth J, Brady K, Giannetti ML, Mitchell A, Edwardson MA, Tan MT, Zhou Y, Newport EL, Edwards DF. Critical period after stroke study (CPASS): a phase II clinical trial testing an optimal time for motor recovery after stroke in humans. Proc Natl Acad Sci USA 2021;118(39):1–10. 10.1073/pnas.2026676118
- 18. Columbo JA, Daya N, Colantonio LD, et al. Derivation and validation of ICD-10 codes for identifying incident stroke. JAMA Neurol 2024;81(8):875–881. 10.1001/jamaneurol.2024.2044
- 19. Kuang A, Xu C, Southern DA, Sandhu N, Quan H. Validated administrative data based ICD-10 algorithms for chronic conditions: a systematic review. J Epidemiol Popul Health 2024;72(4):202744. 10.1016/j.jeph.2024.202744
- 20. Verchinina L, Ferguson L, Flynn A, Wichorek M, Markel D. Computable phenotypes: standardized ways to classify people using electronic health record data. Perspect Health Inf Manag 2018;10:1–8.
- 21. Chung CP, Rohan P, Krishnaswami S, McPheeters ML. A systematic review of validated methods for identifying patients with rheumatoid arthritis using administrative or claims data. Vaccine 2013;31(Suppl 10):K41–61. 10.1016/j.vaccine.2013.03.075
- 22. Marrache M, Prasad N, Margalit A, Nayar SK, Best MJ, Fritz JM, Skolasky RL. Initial presentation for acute low back pain: is early physical therapy associated with healthcare utilization and spending? A retrospective review of a National Database. BMC Health Serv Res 2022;22(1):851. 10.1186/s12913-022-08255-0
- 23. Freburger JK, Pastva AM, Coleman SW, Peter KM, Kucharska-Newton AM, Johnson AM, Psioda MA, Duncan PW, Bushnell CD, Rosamond WD, Jones SB. Skilled nursing and inpatient rehabilitation facility use by Medicare fee-for-service beneficiaries discharged home after a stroke: findings from the COMPASS trial. Arch Phys Med Rehabil 2022;103(5):882–890.e2. 10.1016/j.apmr.2021.10.015
- 24. Juster FT, Suzman R. An overview of the health and retirement study. J Hum Resour 1995;30(Special Issue):S7–S56. 10.2307/146277
- 25. Freedman VA, Kasper JD. Cohort profile: the National Health and Aging Trends Study (NHATS). Int J Epidemiol 2019;48(4):1044–1045g. 10.1093/ije/dyz109
- 26. Doll KM, Rademaker A, Sosa JA. Practical guide to surgical data sets: surveillance, epidemiology, and end results (SEER) database. JAMA Surg 2018;153(6):588–589. 10.1001/jamasurg.2018.0501
- 27. Wang X, Ji X. Sample size estimation in clinical research: from randomized controlled trials to observational studies. Chest 2020;158(1S):S12–S20. 10.1016/j.chest.2020.03.010
- 28. ResDAC. Home Health Agency (Fee-for-Service). Research Data Assistance Center. Accessed February 28, 2024. https://resdac.org/cms-data/files/hha-ffs
- 29. ResDAC. Home Health Agency (Encounter). Research Data Assistance Center. Accessed February 28, 2024. https://resdac.org/cms-data/files/hha-encounter
- 30. ResDAC. Outpatient (Encounter). Research Data Assistance Center. Accessed February 28, 2024. https://resdac.org/cms-data/files/op-encounter
- 31. ResDAC. Outpatient (Fee-for-Service). Research Data Assistance Center. Accessed February 28, 2024. https://resdac.org/cms-data/files/op-ffs
- 32. ResDAC. Carrier (Encounter). Research Data Assistance Center. Accessed February 28, 2024. https://resdac.org/cms-data/files/carrier-encounter
- 33. CMS. CMS Research Identifiable File Data Use Agreement Policy Guide. Centers for Medicare and Medicaid Services. https://www.cms.gov/files/document/research-identifiable-file-data-use-agreement-policies.pdf
- 34. ResDAC. Find the CMS Data File you Need. Research Data Assistance Center. Accessed May 20, 2024. https://resdac.org/cms-data?tid%5B6056%5D=6056&tid%5B4931%5D=4931
- 35. ResDAC. Carrier (Fee-for-Service). Research Data Assistance Center. Accessed February 28, 2024. https://resdac.org/cms-data/files/carrier-ffs
- 36. Elixhauser A, Steiner C, Harris DR, Coffey RM. Comorbidity measures for use with administrative data. Med Care 1998;36(1):8–27.
- 37. Charlson ME, Carrozzino D, Guidi J, Patierno C. Charlson comorbidity index: a critical review of clinimetric properties. Psychother Psychosom 2022;91(1):8–35. 10.1159/000521288
- 38. Von Korff M, Wagner EH, Saunders K. A chronic disease score from automated pharmacy data. J Clin Epidemiol 1992;45(2):197–203. 10.1016/0895-4356(92)90016-G
- 39. Freburger JK, Li D, Fraher EP. Community use of physical and occupational therapy after stroke and risk of hospital readmission. Arch Phys Med Rehabil 2018;99(1):26–34.e5. 10.1016/j.apmr.2017.07.011
- 40. Hornbrook MC, Hurtado AV, Johnson RE. Health care episodes: definition, measurement and use. Med Care Rev 1985;42(2):163–218. 10.1177/107755878504200202
- 41. ResDAC. Revenue Center Code (FFS). Research Data Assistance Center. 2024. https://resdac.org/cms-data/variables/revenue-center-code-ffs
- 42. ResDAC. Diagnosis and Procedure Coding Resources. Research Data Assistance Center. July 14, 2017. https://resdac.org/articles/diagnosis-and-procedure-coding-resources
- 43. ResDAC. Revenue Center HCFA Common Procedure Coding System. Research Data Assistance Center. Accessed May 28, 2024. https://resdac.org/cms-data/variables/revenue-center-hcfa-common-procedure-coding-system
- 44. APTA: Coding and Billing. American Physical Therapy Association. Accessed February 28, 2024. https://www.apta.org/your-practice/payment/coding-billing
- 45. Practice Essentials: Coding and Billing Resources. American Occupational Therapy Association. Accessed February 24, 2024. https://www.aota.org/practice/practice-essentials/coding
- 46. CMS. Annual Therapy Update. Centers for Medicare & Medicaid Services. CMS.gov. Accessed February 28, 2024. https://www.cms.gov/medicare/coding-billing/therapy-services/annual-therapy-update
- 47. ResDAC. HCPCS Initial Modifier Code (FFS). Research Data Assistance Center. https://resdac.org/cms-data/variables/hcpcs-initial-modifier-code-ffs
- 48. ResDAC. Data Documentation: Revenue Center Codes. Research Data Assistance Center. https://resdac.org/sites/datadocumentation.resdac.org/files/Claim%20or%20Revenue%20Center%20Rendering%20Physician%20Specialty%20Code%20Code%20Book%20%28FFS%29.txt
- 49. How to Develop a Reporting Guideline. Equator Network. Accessed November 6, 2024. https://www.equator-network.org/toolkits/developing-a-reporting-guideline/
- 50. Strengthening the Reporting of Observational Studies in Epidemiology. STROBE. Accessed November 6, 2024. https://www.strobe-statement.org/checklists/
- 51. Final NIH Policy for Data Management and Sharing. Vol. NOT-OD-21-013. October 29, 2020. https://grants.nih.gov/grants/guide/notice-files/NOT-OD-21-013.html
- 52. Capo-Lugo CE, Kho AN, O'Dwyer LC, Rosenman MB. Data sharing and data registries in physical medicine and rehabilitation. PM R 2017;9(5S):S59–S74. 10.1016/j.pmrj.2017.04.003
- 53. Richesson RL, Smerek MM, Blake Cameron C. A framework to support the sharing and reuse of computable phenotype definitions across health care delivery and clinical research applications. EGEMS (Wash DC) 2016;4(3):1232. 10.13063/2327-9214.1232
- 54. GitHub. Accessed November 6, 2024. https://github.com/collections
- 55. Foster EF, Deardorff A. Open Science Framework (OSF). J Med Libr Assoc 2017;105(2):203–206. 10.5195/jmla.2017