Abstract
Background
Commercial insurance claims data are a stable and consistent source of information on Lyme disease diagnoses in the United States and can contribute to our understanding of overall disease burden and the tracking of epidemiological trends. Algorithms consisting of diagnosis codes and antimicrobial treatment information have been used to identify Lyme disease diagnoses in claims data, but there might be opportunity to improve their accuracy.
Methods
We developed three modified versions of our existing claims-based Lyme disease algorithm; each reflected refined criteria regarding antimicrobials prescribed and/or maximum days between diagnosis code and qualifying prescription claim. We applied each to a large national commercial claims database to identify Lyme disease diagnoses during 2016–2019. We then compared characteristics of Lyme disease diagnoses identified by each of the modified algorithms to those identified by our original algorithm to assess differences from expected trends in demographics, seasonality, and geography.
Results
Observed differences in characteristics of patients with diagnoses identified by the three modified algorithms and our original algorithm were minimal, and differences in age and sex, in particular, were small enough that they could have been due to chance. However, one modified algorithm resulted in proportionally more diagnoses in men, during peak summer months, and in high-incidence jurisdictions, more closely reflecting epidemiological trends documented through public health surveillance. This algorithm limited treatment to only first-line recommended antimicrobials and shortened the timeframe between a Lyme disease diagnosis code and qualifying prescription claim.
Conclusions
As compared to our original algorithm, a modified algorithm that limits the antimicrobials prescribed and shortens the timeframe between a diagnosis code and a qualifying prescription claim might more accurately identify Lyme disease diagnoses when utilizing insurance claims data for supplementary, routine identification and monitoring of Lyme disease diagnoses.
Supplementary Information
The online version contains supplementary material available at 10.1186/s12879-024-10195-5.
Keywords: Lyme disease, Surveillance, Epidemiology
Background
Lyme disease, caused by Borrelia burgdorferi sensu lato spirochetes and transmitted to humans by Ixodes spp. ticks, is the most commonly reported vector-borne disease in the United States, with cases geographically focused in the Northeast, mid-Atlantic, and upper Midwest regions [1, 2]. Lyme disease data are voluntarily reported to the Centers for Disease Control and Prevention (CDC) by state and local public health agencies via the National Notifiable Diseases Surveillance System (NNDSS) [3]. During 2008–2021, 30,000–40,000 cases of Lyme disease were typically reported each year; trends consistently indicated a peak in the summer months, a slight predominance in men, and a bimodal age distribution, with the highest rates among younger children and older adults [1, 4].
Historically, substantial underreporting of Lyme disease cases has occurred, most commonly in highly endemic areas where the case burden is greatest and significant human resources are required by health departments to investigate cases [5–7]. Additionally, while the case definition used for public health surveillance for Lyme disease is specific and meant to identify high-risk groups and track disease trends and geographic distribution over time, it does not capture all Lyme disease cases in the United States [8]. Alternative data sources are needed to improve understanding of the true burden of Lyme disease and to allow for supplementary monitoring of epidemiological trends [9].
Insurance claims databases have been used as an alternative data source for identifying Lyme disease diagnoses as they contain standardized diagnosis codes, information on prescription medication claims and, in some cases, laboratory test orders [9–23]. Our group previously evaluated the utility of a national commercial claims database for routine identification of Lyme disease diagnoses and found that during 2010–2018, the estimated annual incidence was 6–8 times as high as that observed for cases reported through public health surveillance [12]. This study used an algorithm initially developed to maximize sensitivity for identifying Lyme disease diagnoses in claims data, first described in Nelson et al. [11]; it defined an outpatient Lyme disease diagnosis as the first outpatient encounter in a calendar year with an International Classification of Diseases (ICD) diagnosis code for Lyme disease and a prescription claim for ≥ 7 days of a qualifying antimicrobial drug within ± 30 days of the encounter date.
Schwartz et al. [12] found that while age and sex distributions for identified Lyme disease diagnoses were similar to those demonstrated in public health surveillance, more diagnoses occurred outside of typical peak summer months, among females, and in low-incidence jurisdictions compared to cases reported through surveillance, indicating suboptimal specificity of this algorithm. Another study assessed the validity of the algorithm against medical chart review and found high (93.8%) positive predictive value (PPV) for identifying confirmed, probable, or suspect cases of Lyme disease but a somewhat lower PPV (66.4%) for identifying confirmed and probable cases [24]. This study was unable to calculate the algorithm’s specificity.
We saw two opportunities to potentially improve the accuracy of this established claims-based algorithm for identifying Lyme disease diagnoses. First, the list of qualifying antimicrobial drugs for Lyme disease was broad; in addition to first-line recommended therapies, it included second-line therapies and some drugs that are no longer recommended for the treatment of Lyme disease [25]. Second, it established a broad allowable time window of ± 30 days for a prescription claim around an ICD code for Lyme disease, which could lead to capture of prescription claims unrelated to the Lyme disease ICD code.
We sought to develop and evaluate three modified versions of our existing treatment-based Lyme disease algorithm [11, 12], intended to improve accuracy by restricting to first-line recommended antimicrobials and reducing the allowable time window between an ICD code and a prescription claim. We compared characteristics of identified diagnoses to those identified by the original algorithm and to epidemiological trends consistently seen with public health surveillance. Our ultimate goal was to optimize our claims-based algorithm for routine tracking of Lyme disease diagnoses using this data source.
Methods
Data source
We used data from the Merative MarketScan Commercial Claims and Encounters Databases [26], which contain deidentified insurance claims records for > 25 million privately insured individuals aged < 65 years annually for inpatient, outpatient, and drug prescription services. The MarketScan database is demographically similar to the U.S. population aged < 65 years and has been found to be a stable source of data for routine analyses of epidemiologic trends in Lyme disease diagnoses [12]. For this analysis, we restricted the MarketScan population to a cohort of outpatient enrollees with insurance coverage for the entire calendar year each year during 2016–2019.
Identification of Lyme disease diagnoses
We developed three modified versions of our previously developed algorithm (hereafter referred to as the original algorithm (OA)) [11, 12], which includes ICD codes for Lyme disease and prescription claims for appropriate antimicrobials consistent with established treatment recommendations during 2016–2019 (hereafter referred to as qualifying prescription claims) (Table 1) [27]. We did not include laboratory data in our algorithm as it is available for only a small subset of enrollees in the MarketScan database and would greatly limit our sample size.
Table 1.
Criteria for algorithms used to identify Lyme disease diagnoses in MarketScan database, 2016–2019
Algorithm name | Criteria |
---|---|
Original Algorithm (OA)a | ICD-10-CMb code A69.2x AND a prescription claim for ≥ 7 days treatment with any appropriate antimicrobial drugc within ± 30 days. |
Modified Algorithm (MA) 1 | ICD-10-CM code A69.2x AND a prescription claim for ≥ 7 days treatment with a first-line antimicrobial drugb within ± 30 days. |
Modified Algorithm (MA) 2 | ICD-10-CM code A69.2x AND a prescription claim for ≥ 7 days treatment with any appropriate antimicrobial drugb within ± 14 days. |
Modified Algorithm (MA) 3 | ICD-10-CM code A69.2x AND a prescription claim for ≥ 7 days treatment with a first-line antimicrobial drugb within ± 14 days. |
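As an illustration only, the four sets of criteria in Table 1 can be encoded as data and applied to a single candidate prescription claim. The sketch below is written in Python rather than the R and SAS used for the analysis. The class, function, and field names are hypothetical, and the drug lists are limited to drugs named elsewhere in this article (the four first-line drugs shown in Fig. 3, plus erythromycin); they are not the complete list of qualifying antimicrobials.

```python
from dataclasses import dataclass
from datetime import date

# First-line drugs as named in the Fig. 3 legend.
FIRST_LINE = frozenset(
    {"amoxicillin", "cefuroxime axetil", "ceftriaxone sodium", "doxycycline"}
)
# Incomplete placeholder: erythromycin is the only non-first-line
# qualifying drug named in the text; the full list is not reproduced here.
OTHER_QUALIFYING = frozenset({"erythromycin"})


@dataclass(frozen=True)
class Algorithm:
    first_line_only: bool     # restrict to first-line drugs?
    window_days: int          # max |days| between ICD code and prescription claim
    min_days_supply: int = 7  # minimum days of treatment on the claim


ALGORITHMS = {
    "OA": Algorithm(first_line_only=False, window_days=30),
    "MA 1": Algorithm(first_line_only=True, window_days=30),
    "MA 2": Algorithm(first_line_only=False, window_days=14),
    "MA 3": Algorithm(first_line_only=True, window_days=14),
}


def claim_qualifies(alg: Algorithm, encounter: date, rx_date: date,
                    drug: str, days_supply: int) -> bool:
    """Return True if one prescription claim meets the algorithm's criteria."""
    allowed = FIRST_LINE if alg.first_line_only else FIRST_LINE | OTHER_QUALIFYING
    offset = (rx_date - encounter).days  # negative: claim precedes the encounter
    return (
        drug.lower() in allowed
        and days_supply >= alg.min_days_supply
        and abs(offset) <= alg.window_days
    )
```

Under these assumptions, a 14-day doxycycline claim filled 19 days after the encounter would satisfy the OA and MA 1 but not MA 2 or MA 3, whereas an erythromycin claim filled 3 days after the encounter would satisfy only the OA and MA 2.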
We applied each of the three modified algorithms (MAs) and the OA to the 2016–2019 MarketScan cohort. For each algorithm, we retained a patient’s first Lyme disease diagnosis in a calendar year (i.e., first encounter where an ICD code for Lyme disease was assigned with a qualifying prescription claim within the specified timeframe); patients could have more than one Lyme disease diagnosis in the four-year study period but only one diagnosis per calendar year. For patients who had multiple qualifying prescription claims within the specified timeframe around the encounter when an ICD code was assigned, we retained only the prescription claim that occurred closest in time to (before or after) the encounter date. When more than one qualifying prescription claim occurred within the same number of days from the ICD code, we prioritized by line of therapy in accordance with the established treatment recommendations, retaining only the drug of the highest line of therapy [27]. When more than one prescription claim for drugs of the same line of therapy occurred within the same number of days from the ICD code, we retained the drug that came first in alphabetical order based on its generic name to simplify data management and analysis.
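The rules for choosing a single prescription claim per diagnosis can be expressed as one sort key. The following Python sketch is illustrative and is not the code used in the analysis; the function name and the tuple layout are hypothetical. It assumes that "highest line of therapy" means first-line drugs are preferred, and the rank of 2 for erythromycin is a placeholder meaning only "not first-line".

```python
from datetime import date

# Assumed line-of-therapy ranks (1 = first-line, preferred on ties).
LINE_OF_THERAPY = {
    "amoxicillin": 1,
    "cefuroxime axetil": 1,
    "ceftriaxone sodium": 1,
    "doxycycline": 1,
    "erythromycin": 2,  # placeholder rank for a non-first-line drug
}


def select_claim(encounter, claims):
    """Pick one claim from (rx_date, generic_name) tuples already inside the window.

    Order of precedence:
      1. fewest days from the encounter, before or after;
      2. highest line of therapy (lowest rank number);
      3. generic name in alphabetical order.
    """
    def sort_key(claim):
        rx_date, drug = claim
        days_apart = abs((rx_date - encounter).days)
        # Drugs missing from the mapping sort after all ranked drugs.
        return (days_apart, LINE_OF_THERAPY.get(drug, 99), drug)

    return min(claims, key=sort_key)
```

Because the generic name is only the last element of the key, alphabetical order affects the result only when both the time difference and the line of therapy are tied.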
Comparison of Lyme disease diagnoses identified by modified algorithms
We compared the number of Lyme disease diagnoses identified by each of the MAs and the relative proportions of their characteristics to those identified by the OA and to trends demonstrated through public health surveillance. Characteristics compared were age, sex, seasonality, and geographic risk level. Geographic risk level was assigned based on annual average incidence of a patient’s jurisdiction of residence between 2016 and 2019; high-incidence jurisdictions included those with ≥ 10 confirmed cases of Lyme disease per 100,000 population during the reporting period, and all other jurisdictions were classified as either unknown incidence or low incidence (Appendix A). Patients with unknown jurisdiction of residence but who were assigned a geographic region of “Northeast” were assigned to the high-incidence category because all states or jurisdictions included in this region were considered high incidence during the study period. Patients who were assigned a state of residence of “New England Division, unknown state” or “Middle Atlantic Division, unknown state” were also included in the high-incidence category (Appendix A). To assess seasonality, month of Lyme disease diagnosis was defined by the date of the encounter when the ICD code was assigned.
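A minimal Python sketch of the two derived variables described above follows. The function names are hypothetical, and treating a missing incidence value as "unknown" is an assumption; the jurisdiction-level assignments in Appendix A, including the rules for patients with only a region or division recorded, are not reproduced.

```python
from datetime import date

HIGH_INCIDENCE_THRESHOLD = 10.0  # confirmed cases per 100,000 population
SUMMER_MONTHS = {5, 6, 7, 8}     # May through August


def incidence_per_100k(confirmed_cases, population):
    """Annual incidence expressed per 100,000 population."""
    return confirmed_cases * 100_000 / population


def risk_level(avg_annual_incidence):
    """Classify a jurisdiction from its average annual incidence, 2016-2019.

    None stands in for a jurisdiction whose incidence is not known
    (an assumption made for this sketch).
    """
    if avg_annual_incidence is None:
        return "unknown"
    if avg_annual_incidence >= HIGH_INCIDENCE_THRESHOLD:
        return "high"
    return "low"


def is_summer_diagnosis(encounter):
    """True if the ICD code was assigned in May, June, July, or August."""
    return encounter.month in SUMMER_MONTHS
```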
Additionally, we calculated the mean and standard deviation (SD), median, and interquartile range (IQR) of the number of days between assignment of an ICD code for Lyme disease and a prescription claim for a qualifying antimicrobial and compared these by algorithm and drug. All analyses were conducted in R version 4.0.2 and SAS version 9.4 [28, 29].
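For readers who want to reproduce these summary statistics on their own data, a sketch using only the Python standard library is shown below. It assumes offsets are signed day counts (prescription claim date minus encounter date, so negative values mean the claim came first), and the function name and example values are hypothetical. Quartile conventions differ among R, SAS, and Python, so IQR values may not match across tools.

```python
import statistics


def summarize_offsets(offsets):
    """Mean, SD, median, and IQR of signed day offsets.

    offset = prescription claim date - encounter date, in days.
    """
    q1, _, q3 = statistics.quantiles(offsets, n=4, method="inclusive")
    return {
        "mean": statistics.mean(offsets),
        "sd": statistics.stdev(offsets),  # sample standard deviation
        "median": statistics.median(offsets),
        "iqr": q3 - q1,
    }


# Hypothetical offsets: most claims fall on the encounter date.
example = summarize_offsets([-3, 0, 0, 0, 0, 0, 0, 0, 0, 7])
```

When most claims fall on the encounter date, as in this made-up example, the median and IQR are both zero even though the mean is not, because a few early or late claims shift the mean without moving the quartiles.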
Results
During 2016–2019, we identified a total of 55,661 Lyme disease diagnoses using the OA; 50,052 Lyme disease diagnoses using MA 1, which restricted qualifying prescription claims to only first-line recommended antimicrobial drugs; 50,617 Lyme disease diagnoses using MA 2, which restricted the timeframe between an ICD code and qualifying prescription claim to ± 14 days; and 45,878 Lyme disease diagnoses using MA 3, which combined MA 1 and MA 2, restricting qualifying prescription claims to only first-line recommended antimicrobial drugs and the timeframe between an ICD code and prescription claim to ± 14 days. For all four algorithms, 2017 had the highest number of Lyme disease diagnoses and 2019 had the lowest (Table 2).
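The relative effect of each modification on the number of diagnoses follows from simple arithmetic on the totals reported above. The short Python sketch below is illustrative only; the variable and function names are hypothetical.

```python
# Total diagnoses identified by each algorithm, 2016-2019 (from the text above).
TOTALS = {"OA": 55_661, "MA 1": 50_052, "MA 2": 50_617, "MA 3": 45_878}


def percent_reduction(algorithm):
    """Percentage fewer diagnoses than the OA, rounded to one decimal place."""
    oa = TOTALS["OA"]
    return round(100 * (oa - TOTALS[algorithm]) / oa, 1)


reductions = {name: percent_reduction(name) for name in TOTALS if name != "OA"}
```

By this calculation, MA 1 and MA 2 each identified roughly 9–10% fewer diagnoses than the OA, and MA 3 identified about 18% fewer.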
Table 2.
Characteristics among Lyme disease diagnoses identified by algorithms in MarketScan database, 2016–2019
Characteristic | OAa, b (N = 55,661) No. | OA (%) | MAc 1d (N = 50,052) No. | MA 1 (%) | MA 2e (N = 50,617) No. | MA 2 (%) | MA 3f (N = 45,878) No. | MA 3 (%) |
---|---|---|---|---|---|---|---|---|
Year | ||||||||
2016 | 14,243 | 25.6 | 12,789 | 25.6 | 12,989 | 25.7 | 11,746 | 25.6 |
2017 | 15,890 | 28.5 | 14,372 | 28.7 | 14,495 | 28.6 | 13,201 | 28.8 |
2018 | 13,275 | 23.8 | 11,900 | 23.8 | 12,082 | 23.9 | 10,932 | 23.8 |
2019 | 12,253 | 22.0 | 10,991 | 22.0 | 11,051 | 21.8 | 9,999 | 21.8 |
Sex | ||||||||
Male | 27,157 | 48.8 | 25,146 | 50.2 | 25,030 | 49.5 | 23,335 | 50.9 |
Female | 28,504 | 51.2 | 24,906 | 49.8 | 25,587 | 50.5 | 22,543 | 49.1 |
Age Group (years) | ||||||||
0–17 | 11,386 | 20.5 | 10,368 | 20.7 | 10,682 | 21.1 | 9,784 | 21.3 |
18–34 | 9,693 | 17.4 | 8,645 | 17.3 | 8,779 | 17.3 | 7,877 | 17.2 |
35–44 | 8,327 | 14.9 | 7,388 | 14.8 | 7,428 | 14.7 | 6,643 | 14.5 |
45–54 | 12,402 | 22.3 | 10,961 | 21.9 | 11,200 | 22.1 | 10,005 | 21.8 |
55–64 | 13,853 | 24.9 | 12,690 | 25.3 | 12,528 | 24.8 | 11,569 | 25.2 |
Geographic Risk Levelg | ||||||||
High incidence | 40,917 | 73.5 | 37,440 | 74.8 | 37,420 | 73.9 | 34,502 | 75.2 |
Low incidence | 10,410 | 18.7 | 8,679 | 17.3 | 9,282 | 18.4 | 7,779 | 17.0 |
Unknown incidence | 4,334 | 7.8 | 3,933 | 7.9 | 3,915 | 7.7 | 3,597 | 7.8 |
Summer Diagnosish | ||||||||
Yes | 31,519 | 56.6 | 29,516 | 59.0 | 29,343 | 58.0 | 27,636 | 60.2 |
No | 24,142 | 43.4 | 20,536 | 41.0 | 21,274 | 42.0 | 18,242 | 39.8 |
a Original Algorithm
b ICD-10-CM code A69.2x AND a prescription claim for ≥ 7 days treatment with any appropriate antimicrobial drug within ± 30 days
c Modified Algorithm
d ICD-10-CM code A69.2x AND a prescription claim for ≥ 7 days treatment with a first-line antimicrobial drug within ± 30 days
e ICD-10-CM code A69.2x AND a prescription claim for ≥ 7 days treatment with any appropriate antimicrobial drug within ± 14 days
f ICD-10-CM code A69.2x AND a prescription claim for ≥ 7 days treatment with a first-line antimicrobial drug within ± 14 days
g Based on patient’s state of residence
h Diagnosis code was assigned in the months of May, June, July, or August
Comparison of characteristics of Lyme disease diagnoses by algorithm
Overall, characteristics of Lyme disease diagnoses did not differ greatly between the three modified algorithms and the OA, and thus it is possible that observed differences were due to chance. We did observe a slight male predominance among diagnoses identified by MA 1 and MA 3, and a slight female predominance among diagnoses identified by MA 2 and the OA (Table 2). MA 3 had the highest proportion of diagnoses in the 5–9-year age group (7.4%) compared to the OA, which had the lowest (7.0%), and MA 1 had the highest proportion of diagnoses in the 55–59-year age group (13.9%) compared to MA 2, which had the lowest (13.6%) (Fig. 1).
Fig. 1.
Age distribution among Lyme disease diagnoses identified by algorithms in MarketScan database, 2016–2019. Legend: OA = Original Algorithm; MA 1 = Modified Algorithm 1; MA 2 = Modified Algorithm 2; MA 3 = Modified Algorithm 3
All three MAs had a higher proportion of diagnoses in high-incidence jurisdictions compared to the OA, with MA 3 having the highest, followed by MA 1 and MA 2 (Table 2). All three MAs also had a higher proportion of diagnoses during the months of May-August compared to the OA, with MA 3 being the highest, followed by MA 1 and MA 2 (Fig. 2). The majority of diagnoses during the months of May-August were among men, with 52.0% identified by the OA, 52.5% identified by MA 2, 52.9% identified by MA 1, and 53.4% identified by MA 3.
Fig. 2.
Seasonality of Lyme diagnoses identified by algorithms in MarketScan database, 2016–2019. Legend: OA = Original Algorithm; MA 1 = Modified Algorithm 1; MA 2 = Modified Algorithm 2; MA 3 = Modified Algorithm 3. Month of diagnosis refers to the month that the ICD code was assigned
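The ordering of the algorithms described above can be checked directly against the counts in Table 2. The Python sketch below is illustrative only; the dictionary layout and function names are hypothetical, and the counts are copied from Table 2.

```python
# Counts from Table 2: total, male, high-incidence jurisdiction, summer diagnosis.
TABLE2 = {
    "OA":   {"n": 55_661, "male": 27_157, "high_incidence": 40_917, "summer": 31_519},
    "MA 1": {"n": 50_052, "male": 25_146, "high_incidence": 37_440, "summer": 29_516},
    "MA 2": {"n": 50_617, "male": 25_030, "high_incidence": 37_420, "summer": 29_343},
    "MA 3": {"n": 45_878, "male": 23_335, "high_incidence": 34_502, "summer": 27_636},
}


def percent(algorithm, characteristic):
    """Share of an algorithm's diagnoses with the characteristic, as a percentage."""
    row = TABLE2[algorithm]
    return round(100 * row[characteristic] / row["n"], 1)


def rank_algorithms(characteristic):
    """Algorithms ordered from highest to lowest proportion."""
    return sorted(
        TABLE2,
        key=lambda name: TABLE2[name][characteristic] / TABLE2[name]["n"],
        reverse=True,
    )
```

For the male, high-incidence, and summer characteristics alike, this ordering places MA 3 first, followed by MA 1, MA 2, and the OA.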
Mean number of days between a Lyme disease diagnosis code and a qualifying prescription claim
The overall mean number of days from an ICD code encounter to the closest qualifying prescription claim was less than one day for all four algorithms (MA 1: -0.27 ± 8.53 (SD); MA 2: 0.14 ± 4.41 (SD); MA 3: 0.06 ± 4.16 (SD); OA: -0.05 ± 9.01 (SD)). The median and IQR were both zero for all four algorithms.
When looking at the qualifying prescription claim that occurred closest in time to a patient’s ICD code encounter, doxycycline was the most commonly prescribed drug across all four algorithms (range 34,747–38,194 prescriptions) and had the lowest mean number of days from ICD code assignment to prescription claim among all qualifying drugs (Fig. 3 and Appendix B). Among MA 1 and MA 3, which limited qualifying drugs to first-line therapies only, ceftriaxone sodium was the least common drug to occur closest in time to a patient’s ICD code encounter (183 prescriptions for both MA 1 and MA 3), and among MA 2 and the OA, which did not limit qualifying drugs to first-line therapies only, erythromycin was the least commonly prescribed drug to occur closest in time to a patient’s ICD code encounter (52 and 60 prescriptions, respectively). Overall, ceftriaxone sodium was the drug with the highest mean number of days from ICD code assignment to prescription claim when it was the drug prescribed closest in time to a patient’s ICD code encounter (Fig. 3 and Appendix B).
Fig. 3.
Days from diagnosis code to prescription claim among Lyme disease diagnoses in MarketScan database, 2016–2019. Legend: This plot displays data for only first-line drugs for all four algorithms, including MA 2 and OA, which, by definition, do not restrict qualifying drugs to first-line only. MA 1 = Modified Algorithm 1; MA 2 = Modified Algorithm 2; MA 3 = Modified Algorithm 3; OA = Original Algorithm. OA: Amoxicillin (Mean ± SD: 0.2 ± 9.0, Median: 0, IQR: 0); Cefuroxime axetil (Mean ± SD: 1.7 ± 10.2, Median: 0, IQR: 5); Ceftriaxone sodium (Mean ± SD: 6.5 ± 10.8, Median: 5, IQR: 12.8); Doxycycline (Mean ± SD: -0.7 ± 7.7, Median: 0, IQR: 0). MA 1: Amoxicillin (Mean ± SD: 0.3 ± 9.5, Median: 0, IQR: 0); Cefuroxime axetil (Mean ± SD: 2.2 ± 10.9, Median: 0, IQR: 6); Ceftriaxone sodium (Mean ± SD: 7.3 ± 11.8, Median: 5, IQR: 15.5); Doxycycline (Mean ± SD: -0.6 ± 8.0, Median: 0, IQR: 0). MA 2: Amoxicillin (Mean ± SD: 0.1 ± 4.0, Median: 0, IQR: 0); Cefuroxime axetil (Mean ± SD: 1.1 ± 5.1, Median: 0, IQR: 2); Ceftriaxone sodium (Mean ± SD: 4.1 ± 5.8, Median: 3, IQR: 9); Doxycycline (Mean ± SD: -0.1 ± 3.9, Median: 0, IQR: 0). MA 3: Amoxicillin (Mean ± SD: 0.2 ± 4.2, Median: 0, IQR: 0); Cefuroxime axetil (Mean ± SD: 1.2 ± 5.3, Median: 0, IQR: 3); Ceftriaxone sodium (Mean ± SD: 4.0 ± 6.1, Median: 3, IQR: 9); Doxycycline (Mean ± SD: -0.1 ± 4.0, Median: 0, IQR: 0)
Discussion
In this study, we sought to improve our confidence in the accuracy of a claims-based algorithm for Lyme disease by applying and comparing diagnoses resulting from three different modifications to our existing algorithm: (1) restriction to prescription claims for first-line recommended antimicrobials for Lyme disease only (MA 1), (2) reduction of the amount of time between a diagnosis code for Lyme disease and a qualifying prescription claim from ± 30 days to ± 14 days (MA 2), and (3) a combination of both of these modifications (MA 3). While observed differences in characteristics of diagnoses identified by each algorithm, particularly age and sex, were small enough that they could have been artifacts of random data variability, MA 3 resulted in the highest proportions of diagnoses among men, in high-incidence jurisdictions, and during summer months compared to the other algorithms, rendering it the most consistent with trends observed in public health surveillance for Lyme disease.
The slight female predominance in diagnoses identified by MA 2 and the OA and less pronounced peak in diagnoses among those 5–9 years of age is consistent with results previously demonstrated by the OA in Schwartz et al. [12]; other studies that utilized claims data to identify Lyme disease diagnoses also observed a slight female predominance [9, 10]. However, one study that looked at Medicaid claims data found that males were significantly more likely to be diagnosed with Lyme disease during the months of June-August and less likely to be diagnosed during the months of December-March compared to females [19]. This aligns with our finding that the majority of diagnoses in the months of May-August were among men, with MA 3 and MA 1 having the highest proportions, though differences were small.
Notably, the overall mean number of days from Lyme disease ICD code assignment to a prescription claim was negative for MA 1 and the OA, and when we limited to prescription claims for doxycycline only, it was negative across all four algorithms. A negative value indicates that the prescription claim occurred before an ICD code for Lyme disease was assigned and could be reflective of coding practices, such as cases where a code might have been applied retrospectively. It could also reflect presumptive treatment of Lyme disease by a clinician before an in-person visit could occur, such as in the case of telehealth visits, which we did not specifically explore.
Diagnostic coding practices are known to vary [30, 31], and ICD codes that occur in proximity to a prescription claim for an appropriate antimicrobial could be unrelated to it. For example, an ICD code may linger in a patient’s record from a previous diagnosis or represent clinician suspicion or differential diagnosis. Additionally, some of the second-line antimicrobials included in the OA and MA 2 algorithms are infrequently used to treat Lyme disease and no longer recommended [25]. The steps we took to restrict to first-line recommended antimicrobials and limit the timeframe between occurrence of an ICD code and a qualifying prescription claim reduce the potential that our algorithm is identifying these scenarios as incident Lyme disease diagnoses, increasing our overall confidence in the validity of diagnoses identified.
While Lyme disease diagnoses identified by MA 3 displayed trends most consistent with public health surveillance data, trends in diagnoses identified by all four algorithms generally reflected those seen in national surveillance data, and the close temporal distribution of prescription claims for first-line recommended antimicrobials around the day an ICD code for Lyme disease was assigned lends confidence to the use of these algorithms for supplemental, routine tracking of Lyme disease diagnoses in insurance claims data.
Cocoros et al. [24] calculated the PPV of the same claims-based algorithm (the OA) for which we propose modifications here. While PPV is a measure particularly useful in clinical settings, we took steps intended to increase the overall accuracy of a claims-based algorithm, recognizing that while Lyme disease diagnoses identified in claims data represent only a fraction of true Lyme disease cases, they are a measurable and trackable fraction. We echo the sentiments of other authors that further efforts are needed to directly assess performance of our proposed modified algorithms and other algorithms developed for Lyme disease diagnosis identification and tracking in claims data [32].
Limitations
Results generated using claims-based case definitions may not be representative of the general population; while MarketScan is a large commercial insurance claims database and has been determined to be geographically representative and demographically similar to the U.S. population [12], it does not contain information for persons who are uninsured, ≥ 65 years of age, or military personnel, impacting generalizability. However, the trends identified in this study are similar to those seen in other studies that utilized insurance claims-based systems, which lends confidence to our findings. Additionally, claims-based algorithms rely on diagnosis codes, which are often under-captured in administrative claims data and subject to error from the clinicians and billing specialists who assign them. This could lead to misclassification and the potential for both overreporting and underreporting of Lyme disease diagnoses.
Because of variations in coding practices, it is also possible that some patients identified as having Lyme disease diagnoses by our algorithms received treatment for another condition unrelated to the ICD code or were treated empirically; thus, diagnoses, as defined here, do not necessarily equate to true infection with B. burgdorferi. Additionally, to simplify our database query, our algorithms required a minimum of seven days of treatment with a qualifying antimicrobial, but established guidelines at the time recommended a minimum duration longer than seven days for some qualifying antimicrobials, increasing the potential of capturing diagnoses that were not actually Lyme disease. We attempted to limit this through our modifications to restrict to first-line recommended antimicrobials only and by shortening the timeframe between an ICD code for Lyme disease and a prescription claim. The finding that treatment was prescribed, on average, less than one day from when an ICD code for Lyme disease was assigned for all four algorithms further increased our confidence that we are accurately capturing treatments prescribed for Lyme disease.
A small proportion of patients had prescription claims for more than one first-line recommended antimicrobial on the same day (OA: 0.5%, MA 1: 0.6%, MA 2: 0.5%, MA 3: 0.6%). Because we retained only the name of the generic drug that came first in alphabetical order in these cases, the antimicrobial that we categorized these patients as having received to treat a Lyme disease diagnosis may be misclassified or skewed towards generic drug names that come earlier in the alphabet. However, doxycycline, the most commonly recommended first-line drug for treatment of Lyme disease [27], was by far the most commonly prescribed drug in prescription claims across all four algorithms, despite coming last in alphabetical order among the first-line drugs.
Lastly, by dichotomizing geographical heterogeneity of Lyme disease risk into high and low incidence, we did not separately describe the frequency or characteristics of Lyme disease diagnoses among patients living in jurisdictions with emerging Lyme disease incidence, particularly those that border high-incidence jurisdictions. Thus, the frequency of Lyme disease diagnoses in low-incidence states might be inflated due to inclusion of diagnoses from areas with emerging risk.
Conclusions
Modification of our claims-based algorithm for Lyme disease to limit to only first-line recommended antimicrobials and shorten the timeframe between a Lyme disease diagnosis code and a prescription claim to ± 14 days might improve accuracy of identification of Lyme disease diagnoses. This modified algorithm may be useful for other investigators in identifying and monitoring Lyme disease diagnoses in both commercial claims and health-system-based administrative databases; standardized use of such an algorithm could facilitate comparisons across studies.
Acknowledgements
We thank Grace Marx and Amy Beeson for providing valuable input on refinement of algorithm criteria.
Abbreviations
- CDC: Centers for Disease Control and Prevention
- ICD: International Classification of Diseases
- IQR: Interquartile range
- MA: Modified algorithm
- NNDSS: National Notifiable Diseases Surveillance System
- OA: Original algorithm
- PPV: Positive predictive value
- SD: Standard deviation
Author contributions
KJK and AFH were responsible for conceptualization and study design; CCN and AE were responsible for data analysis; CCN was responsible for drafting the manuscript; KJK, AFH, SAH, and ARE were responsible for reviewing and revising the manuscript. All authors reviewed and approved the final manuscript.
Funding
This study was supported by the Centers for Disease Control and Prevention.
Data availability
The data that support the findings of this study are available from Merative but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of Merative.
Declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Footnotes
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
- 1.Schwartz AM, Hinckley AF, Mead PS, Hook SA, Kugeler KJ. Surveillance for Lyme Disease - United States, 2008–2015. MMWR Surveill Summ. 2017;66(22):1–12. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2.Steere AC, Strle F, Wormser GP, Hu LT, Branda JA, Hovius JW, et al. Lyme borreliosis. Nat Rev Dis Primers. 2016;2:16090. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3.Centers for Disease Control and Prevention. National Notifiable Diseases Surveillance System (NNDSS). 2022. https://www.cdc.gov/nndss/index.html. Accessed 16 Oct 2023.
- 4.Centers for Disease Control and Prevention. Lyme Disease Surveillance Data. 2024. https://www.cdc.gov/lyme/data-research/facts-stats/surveillance-data-1.html. Accessed 16 Aug 2024.
- 5.Mead PS. Epidemiology of Lyme disease. Infect Dis Clin North Am. 2015;29(2):187–210. [DOI] [PubMed] [Google Scholar]
- 6.Schiffman EK, McLaughlin C, Ray JAE, Kemperman MM, Hinckley AF, Friedlander HG, Neitzel DF. Underreporting of Lyme and other Tick-Borne diseases in residents of a High-Incidence County, Minnesota, 2009. Zoonoses Public Health. 2018;65(2):230–7. [DOI] [PubMed] [Google Scholar]
- 7.White J, Noonan-Toly C, Lukacik G, Thomas N, Hinckley A, Hook S, Backenson PB. Lyme Disease Surveillance in New York State: an Assessment of Case Underreporting. Zoonoses Public Health. 2018;65(2):238–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 8.Council of State and Territorial Epidemiologists. Modification of Lyme Disease Case Definition. 2022. https://cdn.ymaws.com/www.cste.org/resource/resmgr/ps/ps2021/21-ID-05_Lyme_Disease.pdf. Accessed 16 Aug 2024.
- 9.Tseng YJ, Cami A, Goldmann DA, DeMaria A Jr., Mandl KD. Vector Borne Zoonotic Dis. 2015;15(10):591–6. Using Nation-Wide Health Insurance Claims Data to Augment Lyme Disease Surveillance. [DOI] [PubMed]
- 10.Adrion ER, Aucott J, Lemke KW, Weiner JP. Health care costs, utilization and patterns of care following Lyme disease. PLoS ONE. 2015;10(2):e0116767.
- 11.Nelson CA, Saha S, Kugeler KJ, Delorey MJ, Shankar MB, Hinckley AF, Mead PS. Incidence of clinician-diagnosed Lyme Disease, United States, 2005–2010. Emerg Infect Dis. 2015;21(9):1625–31.
- 12.Schwartz AM, Kugeler KJ, Nelson CA, Marx GE, Hinckley AF. Use of Commercial Claims Data for evaluating trends in Lyme Disease diagnoses, United States, 2010–2018. Emerg Infect Dis. 2021;27(2):499–507.
- 13.Kugeler KJ, Schwartz AM, Delorey MJ, Mead PS, Hinckley AF. Estimating the frequency of Lyme Disease diagnoses, United States, 2010–2018. Emerg Infect Dis. 2021;27(2):616–9.
- 14.Montejano LB. Assessing the incidence and treatment of post-treatment Lyme disease syndrome in an administrative claims database. Value Health. 2014;17(3):A267.
- 15.Beach CM, Hart SA, Nowalk A, Feingold B, Kurland K, Arora G. Increasing Burden of Lyme Carditis in United States children’s hospitals. Pediatr Cardiol. 2020;41(2):258–64.
- 16.Clayton JL, Jones SG, Dunn JR, Schaffner W, Jones TF. Enhancing Lyme Disease Surveillance by using administrative Claims Data, Tennessee, USA. Emerg Infect Dis. 2015;21(9):1632–4.
- 17.Jones SG, Coulter S, Conner W. Using administrative medical claims data to supplement state disease registry systems for reporting zoonotic infections. J Am Med Inf Assoc. 2013;20(1):193–8.
- 18.Goodlet KJ, Fairman KA. Adverse events Associated with antibiotics and Intravenous therapies for Post-lyme Disease Syndrome in a commercially insured sample. Clin Infect Dis. 2018;67(10):1568–74.
- 19.Rebman AW, Wang L, Yang T, Marsteller JA, Murphy SME, Uriyo M, et al. Incidence of Lyme Disease diagnosis in a Maryland Medicaid Population, 2004–2011. Am J Epidemiol. 2018;187(10):2202–9.
- 20.Rebman AW, Yang T, Wang L, Marsteller JA, Murphy SME, Uriyo M, Aucott JN. Outpatient visits before and after Lyme disease diagnosis in a Maryland employer-based health plan. BMC Health Serv Res. 2023;23(1):919.
- 21.Schwartz AM, Shankar MB, Kugeler KJ, Max RJ, Hinckley AF, Meltzer MI, Nelson CA. Epidemiology and cost of Lyme disease-related hospitalizations among patients with employer-sponsored health insurance-United States, 2005–2014. Zoonoses Public Health. 2020;67(4):407–15.
- 22.Tseng YJ, DeMaria A Jr., Goldmann DA, Mandl KD. Claims-based diagnostic patterns of patients evaluated for Lyme Disease and given extended antibiotic therapy. Vector Borne Zoonotic Dis. 2017;17(2):116–22.
- 23.Jones SG, Conner W, Song B, Gordon D, Jayakaran A. Comparing spatio-temporal clusters of arthropod-borne infections using administrative medical claims and state reported surveillance data. Spat Spatiotemporal Epidemiol. 2012;3(3):205–13.
- 24.Cocoros NM, Kluberg SA, Willis SJ, Forrow S, Gessner BD, Nutt CT, et al. Validation of Claims-based algorithm for Lyme Disease, Massachusetts, USA. Emerg Infect Dis. 2023;29(9):1772–9.
- 25.Lantos PM, Rumbaugh J, Bockenstedt LK, Falck-Ytter YT, Aguero-Rosenfeld ME, Auwaerter PG, et al. Clinical practice guidelines by the Infectious Diseases Society of America (IDSA), American Academy of Neurology (AAN), and American College of Rheumatology (ACR): 2020 guidelines for the Prevention, diagnosis, and treatment of Lyme Disease. Arthritis Rheumatol. 2021;73(1):12–20.
- 26.Merative. Merative Marketscan Research Databases. 2023. https://www.merative.com/documents/brief/marketscan-explainer-general. Accessed 16 Oct 2023.
- 27.Wormser GP, Dattwyler RJ, Shapiro ED, Halperin JJ, Steere AC, Klempner MS, et al. The clinical assessment, treatment, and prevention of lyme disease, human granulocytic anaplasmosis, and babesiosis: clinical practice guidelines by the Infectious Diseases Society of America. Clin Infect Dis. 2006;43(9):1089–134.
- 28.R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2018.
- 29.SAS Institute. SAS OnDemand for Academics. 9.4 ed. Cary, NC. 2013.
- 30.Sickbert-Bennett EE, Weber DJ, Poole C, MacDonald PD, Maillard JM. Utility of International classification of diseases, Ninth Revision, clinical modification codes for communicable disease surveillance. Am J Epidemiol. 2010;172(11):1299–305.
- 31.Thomas N, Rutz HJ, Hook SA, Hinckley AF, Lukacik G, Backenson BP, et al. Assessing diagnostic coding practices among a sample of healthcare facilities in Lyme disease endemic areas: Maryland and New York - a brief report. Zoonoses Public Health. 2018;65(2):275–8.
- 32.Nam YH, Willis SJ, Mendelsohn AB, Forrow S, Gessner BD, Stark JH, et al. Healthcare claims-based Lyme disease case-finding algorithms in the United States: a systematic literature review. PLoS ONE. 2022;17(10):e0276299.
Data Availability Statement
The data that support the findings of this study are available from Merative but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are, however, available from the authors upon reasonable request and with permission of Merative.