Abstract
Background:
Despite extensive research regarding the etiology of Huntington’s disease, relatively little is known about the epidemiology of this rare disorder, particularly in the United States where there are no national-scale estimates of the disease.
Objectives:
To provide national-scale estimates of Huntington’s disease in a U.S. population and to test whether disease rates are increasing and whether frequency varies by race, ethnicity, or other factors.
Methods:
Using an insurance database of over 67 million enrollees, we retrospectively identified a cohort of 3,707 individuals diagnosed with Huntington’s disease between 2003 and 2016. We estimated annual incidence and annual diagnostic frequency, and tested for trends over time and differences in diagnostic frequency by sociodemographic characteristics.
Results:
During the observation period, the age-adjusted cumulative incidence rate was 1.22 per 100,000 person-years (95% confidence interval: 1.06, 1.15), and age-adjusted diagnostic frequency was 6.52 per 100,000 persons (95% confidence interval: 6.42, 6.62); both rates remained relatively stable over the 14-year period. We identified several previously unreported differences in Huntington’s disease frequency by self-reported gender, income and race/ethnicity. However, racial/ethnic differences were of lower magnitude than previously reported in other country-level studies.
Conclusions:
In these large-scale estimates of U.S. Huntington’s disease epidemiology, we found stable disease frequency rates that varied by several sociodemographic factors. These findings suggest that disease patterns may be more driven by social or environmental factors than has previously been appreciated. Results further demonstrate the potential utility of administrative Big Data in rare disease epidemiology when other data sources are unavailable.
INTRODUCTION
Huntington’s disease (HD) is a rare but devastating neurodegenerative disease marked by the progressive development of psychiatric symptoms, cognitive impairment, and involuntary jerky movements called chorea. An inherited autosomal dominant disorder, HD is caused by a cytosine-adenine-guanine (CAG) trinucleotide repeat expansion at the 5’ end of the huntingtin gene (1). CAG expansion length is the strongest risk factor for developing HD, which generally occurs in individuals with 36 or more repeats. Greater numbers of repeats are associated with earlier age of onset and may potentially influence clinical progression (2). Evidence also suggests that a number of single nucleotide polymorphisms contribute to HD phenotypic expression (3, 4), though the specific mechanisms of CAG expansion remain unclear.
This complex genetic landscape contributes to an uncertain epidemiology in which HD estimates appear to vary by both time and place. Among many unknowns are whether incidence and prevalence are increasing, as has recently been suggested (5), and the degree to which geographically patterned genetic differences contribute to overall disease risk and age of onset. Epidemiologic studies conducted in countries with majority Caucasian populations have consistently suggested up to a 10-fold higher HD prevalence as compared to studies conducted in majority Asian countries (6–10). These group differences have motivated significant research to identify the corresponding genetic factors that may be at play. Recent evidence has also suggested a potential trend of increasing HD rates, particularly in studies conducted in populations of majority Caucasian descent, that does not appear to be fully explained by variations in diagnosis methods or case-ascertainment rates (8, 11–14).
However, interpreting this evidence is challenging because few studies have been conducted in a single, yet racially/ethnically diverse population observed over time. Rather, evidence has been drawn primarily from meta-analyses where estimates are compared across studies conducted in different regions of the world with differing demographic population structures, case-ascertainment methods, and varying levels of access to health care and genetic testing. Among the few single population studies, comparisons are typically dichotomous, for instance comparing individuals of European ancestry to all other backgrounds (13, 15). This issue is compounded by the fact that, as a Mendelian disorder, Huntington’s disease is extremely rare; estimating standard epidemiologic features using population samples is therefore infeasible. Instead, HD cases are typically identified using administrative billing codes (8, 10–12), with data primarily drawn from primary care databases or registries compiled across countries (12, 16, 17). The lack of such integrated databases in the U.S. has limited research efforts, with only two previous U.S. studies conducted, both of which were limited to extremely small geographic areas, in Minnesota (18) and Maryland (15).
In this analysis, we leveraged 14 years of administrative data on a dynamic population of approximately 67 million individuals to provide the first national-level U.S. estimates of HD. Our primary objective was to describe the diagnostic frequency, incidence and sociodemographic features of the disease. We further sought to evaluate whether the racial/ethnic patterns in disease presentation that have been previously described in the literature hold in a racially and ethnically diverse population, and to characterize other important sources of variation.
METHODS
Study population
We retrospectively assessed de-identified, individual-level data from the Optum Clinformatics Datamart (OptumInsight, Eden Prairie, MN), an administrative claims database of privately insured enrollees. For each member, the dataset contains demographic and health plan information as well as documentation of all inpatient and outpatient encounters. Because it is not a population-based sample, the cohort as a whole is not representative of the overall U.S. population, and is limited to persons with private insurance. However, this dynamic cohort covers nearly 10% of the U.S. population with members residing in all 50 states. Given the generally low prevalence of HD, approaches utilizing nationally representative samples would have orders-of-magnitude smaller sample sizes—providing low power to detect the group differences of interest here.
Measures
We accessed data for the 2003–2016 period, identifying patients diagnosed with HD according to International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), code 333.4, for 2003 through September 30, 2015, and Tenth Revision (ICD-10-CM), code G10, from October 1, 2015 onwards. For each claim line item, we established whether the ICD code for HD was included as one of the five primary diagnosis codes recorded. Incidence rates were estimated based on cases where we observed at least one prior claim absent an HD diagnosis, followed by at least one claim where an HD diagnosis was indicated. In sensitivity analyses we restricted the incident claims to those where the first non-HD claim preceded the HD claim by a 3-month or 6-month window, and we additionally restricted the sample to individuals with at least 6 months of continuous insurance enrollment prior to the first HD claim. Because data were abstracted from a commercial insurance dataset rather than a representatively sampled patient population, we refer to the number of persons with HD in the time period in terms of the diagnostic frequency, rather than the prevalence, of HD. Sociodemographic variables were retrieved from Optum membership files. Because each plan member reported these sociodemographic characteristics at a single point in time, we did not have the ability to track longitudinal measures of income or education. The ISMMS IRB (HS #17–01260) and the Stanford University IRB (e-Protocol #43176) approved this protocol; participant consent was waived because the analysis used preexisting data.
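The incident-case rule described above can be sketched in a few lines. This is a minimal illustration only: the claim layout (a list of date and diagnosis-code pairs per member), the function name `first_incident_hd_date`, and the `min_gap_days` parameter are hypothetical and do not reflect the Optum schema or the authors' actual code, and the sketch does not implement the separate continuous-enrollment restriction.

```python
from datetime import date

# ICD-9-CM and ICD-10-CM codes for Huntington's disease named in the text.
HD_CODES = {"333.4", "G10"}

def first_incident_hd_date(claims, min_gap_days=0):
    """Return the first HD claim date if it qualifies as incident, else None.

    claims: iterable of (service_date, diagnosis_codes) pairs for one member
            (hypothetical layout; only the first five codes are checked).
    min_gap_days: required gap between the earliest HD-free claim and the
            first HD claim (0 for the primary definition; roughly 90 or 180
            for the 3- and 6-month sensitivity analyses).
    """
    ordered = sorted(claims, key=lambda claim: claim[0])
    hd_dates = [d for d, codes in ordered if HD_CODES & set(codes[:5])]
    if not hd_dates:
        return None  # member never received an HD diagnosis
    first_hd = hd_dates[0]
    # Every claim dated before the first HD claim is, by construction, HD-free.
    prior_dates = [d for d, _ in ordered if d < first_hd]
    if not prior_dates:
        return None  # no earlier HD-free claim, so not counted as incident
    if (first_hd - prior_dates[0]).days < min_gap_days:
        return None  # earliest HD-free claim is too close to the HD claim
    return first_hd

example_claims = [
    (date(2010, 1, 5), ["401.9"]),   # earlier claim without an HD code
    (date(2010, 6, 1), ["333.4"]),   # first claim carrying the HD code
]
print(first_incident_hd_date(example_claims))
print(first_incident_hd_date(example_claims, min_gap_days=180))
```

Raising `min_gap_days` can only remove cases from the incident set, which is consistent with the slightly lower incidence reported under the stricter sensitivity definitions.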
Statistical analysis
We estimated crude annual diagnostic frequency rates by dividing the number of HD cases by the population of members in the corresponding year observed for at least one month. Trends over time were tested for significance using asymptotic Cochran-Armitage tests. Annual incidence rates were estimated as the number of new HD diagnoses divided by the corresponding number of person-years for the entire sample (19). Ninety-five percent confidence intervals for crude incidence rates were calculated assuming a Poisson distribution using an exact method (20). Age-adjusted estimates were adjusted by the direct method to the 2000 U.S. Census population (21). Sociodemographic characteristics of the study sample were assessed with the cases pooled over the entire 2003–2016 period to provide stable estimates. Differences in diagnostic frequency by sociodemographic characteristics were assessed using chi-squared tests with continuity correction; statistical tests were 2-sided. Data were analyzed using SAS, version 9.4 (Statistical Analysis System, Cary, NC, USA) and R, version 3.3.2 (R Core Development Team, Vienna, Austria).
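To make the crude-rate calculation concrete, the following sketch computes a rate per 100,000 with an exact Poisson interval. It assumes the Garwood construction (solving the Poisson tail probabilities for the limits), which is one common "exact method" and may differ from the one cited in (20); the function names are illustrative, and the root-finding uses plain bisection so that only the standard library is needed.

```python
import math

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam), lam > 0, summed in log space."""
    if k < 0:
        return 0.0
    log_lam = math.log(lam)
    log_terms = [i * log_lam - lam - math.lgamma(i + 1) for i in range(k + 1)]
    peak = max(log_terms)
    return min(1.0, math.exp(peak) * sum(math.exp(t - peak) for t in log_terms))

def _bisect(f, lo, hi, iters=60):
    """Locate a sign change of f between lo and hi by repeated halving."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_poisson_ci(k, alpha=0.05):
    """Exact (Garwood-style) confidence limits for a Poisson mean, given count k."""
    bracket = k + 10 * math.sqrt(k + 1) + 10  # comfortably above the upper limit
    # Lower limit solves P(X >= k | lam) = alpha / 2 (defined as 0 when k == 0).
    lower = 0.0 if k == 0 else _bisect(
        lambda lam: (1 - poisson_cdf(k - 1, lam)) - alpha / 2, 1e-12, bracket)
    # Upper limit solves P(X <= k | lam) = alpha / 2.
    upper = _bisect(lambda lam: poisson_cdf(k, lam) - alpha / 2, 1e-12, bracket)
    return lower, upper

def rate_per_100k(cases, denominator, alpha=0.05):
    """Crude rate per 100,000 units of the denominator, with exact Poisson limits."""
    lower, upper = exact_poisson_ci(cases, alpha)
    scale = 100_000 / denominator
    return cases * scale, lower * scale, upper * scale

# Pooled incident cases and person-years as reported in the Total row of Table 2.
rate, lo, hi = rate_per_100k(2777, 174_214_435)
print(f"{rate:.2f} per 100,000 person-years (95% CI: {lo:.2f}, {hi:.2f})")
```

With the pooled totals from Table 2 (2,777 cases over 174,214,435 person-years), the crude rate works out to about 1.59 per 100,000 person-years, in line with the reported estimate.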
RESULTS
The full analytic sample consisted of 67,582,529 individuals identified between 2003 and 2016. Of these, 3,707 had a claim that included a diagnosis of HD, of whom 2,928 had an incident HD diagnosis claim during the observation period. Incident claims occurred primarily in neurological (23.57%, n=690), general hospital (11.95%, n=350), and internal medicine settings (10.18%, n=298), as well as in laboratories (7.31%, n=214) and other settings (46.99%, n=1376). On average, the follow-up time for the entire sample was 2.77 years (standard deviation (SD) = 0.008). Among those with an HD diagnosis, average follow-up time was 4.40 years (SD = 0.009).
Yearly diagnostic frequency and incidence rates are presented in Tables 1 and 2. The 14-year crude diagnostic frequency rate was 5.49 per 100,000 persons (95% CI: 5.31, 5.66). The age-adjusted rate was 6.52 per 100,000 persons (95% CI: 6.42, 6.62). Across the entire 2003–2016 period there was no significant increase in diagnostic frequency after age-standardization. We did note some variation, however; frequency peaked at 7.77 per 100,000 persons (95% CI: 7.37, 8.17) in 2010. In terms of incidence, the crude cumulative rate was 1.59 per 100,000 person-years (95% CI: 1.53, 1.65). After age-standardization, the cumulative incidence was 1.22 per 100,000 person-years (95% CI: 1.06, 1.15) and remained stable across the period. Results were similar when restricting the sample to cases where there was at least a 3-month window between the first non-HD claim and the incident HD claim, but annual incidence rates were slightly, and non-significantly, lower when using the 6-month criterion. When we restricted the sample to those with 6 months of continuous enrollment prior to the HD claim, we observed a similarly non-significant decrease in annual and cumulative incidence (data not shown).
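For readers less familiar with the direct method behind the age-adjusted figures, the sketch below shows the calculation in general form. The age strata, counts, and standard-population sizes in the example are invented for illustration and are not the study data or the actual 2000 U.S. standard weights; the normal-approximation interval is also an assumption, since the text does not state how intervals for adjusted rates were derived.

```python
def directly_standardized_rate(cases, denominators, std_pop, per=100_000):
    """Directly age-standardized rate with an approximate 95% interval.

    cases, denominators: events and persons (or person-years) per age stratum.
    std_pop: standard-population size for the same strata, in the same order.
    """
    if not (len(cases) == len(denominators) == len(std_pop)):
        raise ValueError("each input needs one entry per age stratum")
    total_std = sum(std_pop)
    weights = [s / total_std for s in std_pop]
    stratum_rates = [c / n for c, n in zip(cases, denominators)]
    # Weighted average of stratum-specific rates.
    adjusted = sum(w * r for w, r in zip(weights, stratum_rates))
    # Variance treating each stratum count as Poisson (an assumption).
    variance = sum(w * w * c / (n * n)
                   for w, c, n in zip(weights, cases, denominators))
    half_width = 1.96 * variance ** 0.5
    return (adjusted * per,
            (adjusted - half_width) * per,
            (adjusted + half_width) * per)

# Hypothetical three-stratum example (young, middle-aged, older).
hyp_cases = [5, 40, 60]
hyp_members = [400_000, 500_000, 300_000]
hyp_standard = [450, 350, 200]  # made-up standard population, in thousands
adj, adj_lo, adj_hi = directly_standardized_rate(hyp_cases, hyp_members, hyp_standard)
print(f"{adj:.2f} per 100,000 (95% CI: {adj_lo:.2f}, {adj_hi:.2f})")
```

When the study population has the same age structure as the standard, the adjusted rate equals the crude rate; the two diverge when cases concentrate in age groups that are over- or under-represented relative to the standard.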
Table 1.
Annual Diagnostic Frequencyᵃ Estimates of Huntington’s Disease in a Privately Insured, U.S. Population, 2003–2016.
| Year | No. of Cases | Annual Membership | Annual Diagnostic Frequency per 100,000 persons | 95% CIᵇ | Annual Age-Adjustedᶜ Diagnostic Frequency per 100,000 persons | 95% CI |
|---|---|---|---|---|---|---|
| 2003 | 756 | 16,253,626 | 4.65 | 4.31, 4.98 | 4.80 | 4.46, 5.15 |
| 2004 | 816 | 16,681,672 | 4.89 | 4.56, 5.23 | 5.03 | 4.68, 5.38 |
| 2005 | 995 | 17,566,408 | 5.66 | 5.31, 6.02 | 5.62 | 5.27, 5.97 |
| 2006 | 1,192 | 17,944,977 | 6.64 | 6.27, 7.02 | 6.37 | 6.01, 6.73 |
| 2007 | 1,298 | 17,911,965 | 7.25 | 6.85, 7.64 | 6.96 | 6.58, 7.35 |
| 2008 | 1,355 | 17,000,099 | 7.97 | 7.55, 8.40 | 7.49 | 7.09, 7.89 |
| 2009 | 1,365 | 16,095,874 | 8.48 | 8.03, 8.93 | 7.53 | 7.13, 7.93 |
| 2010 | 1,423 | 15,508,269 | 9.18 | 8.70, 9.65 | 7.77 | 7.37, 8.17 |
| 2011 | 1,447 | 15,509,136 | 9.33 | 8.85, 9.81 | 7.72 | 7.32, 8.12 |
| 2012 | 1,475 | 15,645,263 | 9.43 | 8.95, 9.91 | 7.61 | 7.22, 8.00 |
| 2013 | 1,507 | 16,118,125 | 9.35 | 8.88, 9.82 | 7.44 | 7.06, 7.82 |
| 2014 | 1,424 | 15,761,915 | 9.03 | 8.57, 9.50 | 7.09 | 6.72, 7.46 |
| 2015 | 1,390 | 16,950,207 | 8.20 | 7.77, 8.63 | 6.29 | 5.96, 6.62 |
| 2016 | 1,213 | 18,900,367 | 6.42 | 6.06, 6.78 | 4.73 | 4.46, 5.00 |
| Total | 3,707 | 67,582,529 | 5.49 | 5.31, 5.66 | 6.52 | 6.42, 6.62 |
ᵃ Diagnostic frequency rates are per 100,000 persons.
ᵇ CI, confidence interval.
ᶜ Age adjusted to the 2000 US population.
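The trend test named in the statistical analysis can be illustrated with the crude annual counts in Table 1. This sketch implements the standard asymptotic Cochran-Armitage statistic with equally spaced year scores, which is an assumption about the scoring used. Because it runs on crude rather than age-standardized counts, and because annual memberships overlap from year to year, its output should not be expected to reproduce the age-standardized conclusion reported in the text.

```python
import math

def cochran_armitage(cases, totals, scores=None):
    """Asymptotic Cochran-Armitage test for a linear trend in proportions.

    Returns (z, two_sided_p). Scores default to 0, 1, 2, ... per group.
    """
    if scores is None:
        scores = list(range(len(cases)))
    n_total = sum(totals)
    p_bar = sum(cases) / n_total  # pooled proportion under the null
    # Score-weighted sum of observed minus expected cases.
    t_stat = sum(t * (x - n * p_bar) for t, x, n in zip(scores, cases, totals))
    s1 = sum(n * t for t, n in zip(scores, totals))
    s2 = sum(n * t * t for t, n in zip(scores, totals))
    variance = p_bar * (1 - p_bar) * (s2 - s1 * s1 / n_total)
    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return z, p_value

# Crude annual cases and memberships for 2003-2016, copied from Table 1.
table1_cases = [756, 816, 995, 1192, 1298, 1355, 1365,
                1423, 1447, 1475, 1507, 1424, 1390, 1213]
table1_members = [16253626, 16681672, 17566408, 17944977, 17911965,
                  17000099, 16095874, 15508269, 15509136, 15645263,
                  16118125, 15761915, 16950207, 18900367]
z, p = cochran_armitage(table1_cases, table1_members)
print(f"z = {z:.2f}, two-sided p = {p:.3g}")
```

A positive z indicates that the proportion of members with an HD claim tends to rise with calendar year; with denominators this large, even modest crude trends reach conventional significance, which is one reason the age-standardized series is the more informative comparison.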
Table 2.
Annual Incidenceᵃ Estimates of Huntington’s Disease in a Privately Insured, U.S. Population, 2004–2016.
| Year | No. of Cases | Total Person Years | Annual Incidence per 100,000 Person Years | 95% CIᵇ | Age-Adjustedᶜ Annual Incidence per 100,000 Person Years | 95% CI |
|---|---|---|---|---|---|---|
| 2004 | 117 | 12,926,069 | 0.91 | 0.44, 1.07 | 0.97 | 0.81, 1.13 |
| 2005 | 136 | 13,516,341 | 1.01 | 0.84, 1.18 | 0.66 | 0.55, 0.77 |
| 2006 | 197 | 13,921,818 | 1.42 | 1.22, 1.61 | 0.86 | 0.74, 0.98 |
| 2007 | 184 | 14,011,451 | 1.31 | 1.12, 1.50 | 0.79 | 0.68, 0.90 |
| 2008 | 203 | 13,517,434 | 1.50 | 1.30, 1.71 | 1.13 | 0.97, 1.29 |
| 2009 | 234 | 13,066,931 | 1.79 | 1.56, 2.02 | 1.23 | 1.07, 1.39 |
| 2010 | 181 | 12,666,205 | 1.43 | 1.22, 1.64 | 0.87 | 0.74, 1.00 |
| 2011 | 243 | 12,811,864 | 1.92 | 1.66, 2.14 | 1.15 | 1.00, 1.30 |
| 2012 | 231 | 13,015,802 | 1.77 | 1.55, 2.00 | 1.24 | 1.08, 1.40 |
| 2013 | 245 | 13,276,662 | 1.85 | 1.61, 2.08 | 1.26 | 1.10, 1.42 |
| 2014 | 245 | 12,721,108 | 1.93 | 1.68, 2.17 | 1.31 | 1.15, 1.47 |
| 2015 | 294 | 13,564,544 | 2.17 | 1.92, 2.42 | 1.68 | 1.49, 1.87 |
| 2016 | 267 | 15,198,207 | 1.76 | 1.55, 2.42 | 1.18 | 1.04, 1.32 |
| Total | 2,777 | 174,214,435 | 1.59 | 1.53, 1.65 | 1.22 | 1.06, 1.15 |
ᵃ Incidence rates are per 100,000 person-years.
ᵇ CI, confidence interval.
ᶜ Age adjusted to the 2000 US population.
The distribution of individuals diagnosed with HD by sociodemographic characteristics is presented in Table 3. Among participants in the HD sample, 9.4% were missing all sociodemographic data (n = 348). Overall, the age of participants ranged broadly from under 10 to over 80 years. The mean age of incident cases was 53.81 years (SD = 17.46), and the mean age at which cases were analyzed was 62.56 years (SD = 16.34). Among those diagnosed, 1.18% (n = 44) were identified as 20 years old or less, indicating juvenile onset. Age-adjusted diagnostic frequency was higher among women than men (7.05 per 100,000, 95% CI: 6.91, 7.19, versus 6.10 per 100,000, 95% CI: 5.96, 6.24; P < 0.01). Diagnostic frequency was highest among individuals who identified as White (7.76 per 100,000 persons, 95% CI: 7.63, 7.89), and lowest among those who identified as Asian (3.58 per 100,000 persons, 95% CI: 3.19, 3.97) or Hispanic (4.34 per 100,000 persons, 95% CI: 4.09, 4.59; P < 0.01). Among individuals who identified as Black or African American, age-adjusted diagnostic frequency was 7.38 per 100,000 persons (95% CI: 7.03, 7.73), and was not significantly different from that among those who identified as White.
Table 3.
Demographic-Specific Diagnostic Frequencyᵃ Estimates of Huntington’s Disease in a Privately Insured, U.S. Population, 2003–2016.
| Characteristic | No. of Cases | Annual Membership | Diagnostic Frequency per 100,000 persons | 95% CIᵇ | Age-Adjustedᶜ Diagnostic Frequency per 100,000 persons | 95% CI |
|---|---|---|---|---|---|---|
| Gender | | | | | | |
| Female | 2,094 | 34,099,686 | 6.14 | 5.88, 6.40 | 7.05 | 6.91, 7.19 |
| Male | 1,611 | 33,466,536 | 4.81 | 4.58, 5.05 | 6.10 | 5.96, 6.24 |
| Missing | 2 | 16,307 | | | | |
| Race/Ethnicity | | | | | | |
| Asian | 56 | 2,692,791 | 2.08 | 1.53, 2.62 | 3.58 | 3.19, 3.97 |
| Black | 312 | 5,524,546 | 5.65 | 5.02, 6.27 | 7.38 | 7.03, 7.73 |
| Hispanic | 210 | 7,173,906 | 2.93 | 2.53, 3.32 | 4.34 | 4.09, 4.59 |
| White | 2,648 | 39,341,133 | 6.73 | 6.47, 6.99 | 7.76 | 7.63, 7.89 |
| Missing | 481 | 2,694,411 | | | | |
| Annual Household Income | | | | | | |
| <49,000 | 740 | 8,681,881 | 8.52 | 7.91, 9.14 | 8.32 | 8.00, 8.64 |
| 50,000–99,000 | 831 | 11,255,052 | 7.39 | 6.88, 7.89 | 7.59 | 7.41, 7.77 |
| 100,000 or more | 570 | 11,911,302 | 4.79 | 4.39, 5.18 | 6.54 | 6.37, 6.71 |
| Missing | 1,566 | 25,578,552 | | | | |
| Educational Attainment | | | | | | |
| High School or less | 1,066 | 18,175,085 | 5.86 | 5.51, 6.22 | 7.08 | 6.89, 7.27 |
| Some College | 1,846 | 28,977,425 | 6.37 | 6.08, 6.66 | 7.16 | 6.96, 7.36 |
| College or professional | 429 | 9,809,263 | 4.37 | 3.95, 4.79 | 5.98 | 5.74, 6.22 |
| Missing | 366 | 465,014 | | | | |
ᵃ Diagnostic frequency rates are per 100,000 persons.
ᵇ CI, confidence interval.
ᶜ Age adjusted to the 2000 US population.
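The group comparisons rest on chi-squared tests with continuity correction. As a rough illustration, the sketch below applies a Yates-corrected 2x2 test to the crude female and male counts in Table 3. It is not the authors' code, and it compares crude rather than age-adjusted frequencies, so it only approximates the reported comparison; the function name is illustrative.

```python
import math

def yates_chi_square_2x2(cases_a, n_a, cases_b, n_b):
    """Yates-corrected chi-squared test comparing two proportions.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    a, b = cases_a, n_a - cases_a   # group A: cases, non-cases
    c, d = cases_b, n_b - cases_b   # group B: cases, non-cases
    n = a + b + c + d
    # Continuity correction shrinks |ad - bc| by n/2, floored at zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    chi2 = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Crude counts from Table 3: female versus male members.
chi2, p = yates_chi_square_2x2(2094, 34_099_686, 1611, 33_466_536)
print(f"chi-squared = {chi2:.1f}, p = {p:.3g}")
```

With these crude counts the test statistic is large and the p-value falls well below 0.01, which agrees in direction with the age-adjusted gender difference described in the Results.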
We identified an inverse relationship between income and HD diagnosis; as annual household income decreased, frequency of HD increased monotonically (P < 0.01). In terms of educational attainment, age-adjusted HD frequency was lowest among those with a college or professional degree (5.98 per 100,000 persons, 95% CI: 5.74, 6.22), but we did not observe the same graded relationship as with income. A substantial proportion of the sample was missing income data (42.24%) and, to a lesser extent, race/ethnicity (12.98%) or educational data (9.87%).
DISCUSSION
We used a commercial insurance database of over 67 million health plan members identified between 2003 and 2016 to provide the first large-scale Huntington’s disease estimates for the U.S. We estimated a cumulative age-adjusted HD frequency rate of 6.52 per 100,000 persons and a cumulative incidence rate of 1.22 per 100,000 person-years. We noted fluctuating trends in overall frequency and incidence across the 14-year observation period; however, these were relatively stable, and non-significant, after age-standardization. We further identified, in a setting where all individuals were insured, several important differences in diagnostic frequency by sociodemographic factors, including self-reported gender, race/ethnicity and income.
While research conducted in other countries has documented substantial increases in the prevalence of HD over time (6), evidence for a concomitant increase in incidence is more questionable. The majority of HD studies have been limited by extremely small sample sizes, with only four previous incidence studies including more than 100 affected individuals (6). A number of these prior studies were also conducted before the genetic test for HD became available in the early 1990s. In the largest prior study identifying new diagnoses, Wexler et al. (12) analyzed 393 adult-onset HD patients using a United Kingdom primary care database. They found that between 1990 and 2010, incidence remained constant at 7.2 (95% CI 6.5, 7.9) per million person-years, despite previously documenting a two-fold increase in prevalence during the same period (8). In comparison, we noted both lower incidence overall and no meaningful increase in HD diagnostic frequency among this U.S. cohort between 2003 and 2016 after age standardization. It should be noted, however, that cases included in the present study were identified during a period when genetic testing was widely available.
In addition to examining HD trends overall, we identified several important differences in diagnostic frequency by sociodemographic characteristics that deserve greater exploration. Consistent with previous literature, we found that HD was diagnosed in this sample most frequently among individuals identifying as White, and least frequently among those identifying as Asian. However, in this U.S.-based sample of Asian Americans, the crude HD diagnostic frequency rate (2.08 per 100,000) was over twice as large as those previously reported in studies conducted in majority Asian countries (Taiwan 0.08–0.42 per 100,000 (10); Japan 0.65–0.72 per 100,000 (22); Hong Kong 0.25 per 100,000 (23)). Age adjustment further magnified differences (3.58 per 100,000) between the current study and previous reports in the literature; however, it should be noted that these estimates are based on only 56 participants. Among Hispanic or Latino Americans, crude and age-adjusted diagnostic frequency per 100,000 persons was 2.93 (95% CI: 2.53, 3.32) and 4.34 (95% CI: 4.09, 4.59), respectively. To our knowledge, no previous studies have reported HD rates among Hispanic or Latino Americans, though one epidemiologic study conducted in Venezuela showed a low prevalence of 0.35 per 100,000 (24). Though previous work has highlighted the high frequency of HD among individuals of Caucasian descent specifically, we did not identify any difference in diagnostic frequency between individuals in our sample who identified as White compared to those identifying as African American or Black. Because it is not possible to differentiate HD from HD-like illness (HD2) using diagnostic codes, we may have included individuals with HD2 in the present analyses. Given that the majority of individuals with HD2 have African ancestry, this could have resulted in an overestimation of the diagnostic frequency rate among African American or Black patients (25).
As HD2 appears to be exceedingly rare, it is unclear to what extent this potential source of misclassification could have biased the observed results. Beyond the differences identified by race/ethnicity, we further noted slightly higher HD diagnostic frequency among women as compared to men (7.05 per 100,000, 95% CI: 6.91, 7.19 versus 6.10 per 100,000, 95% CI: 5.96, 6.24). Prior studies that have examined gender differences in CAG expansion rates have yielded mixed results; one study suggested some evidence for a relationship between gender and HD progression while the other identified no significant relationship (26, 27).
Finally, we found an inverse relationship between income and HD diagnosis. Individuals with a self-reported annual income of $49,000 or less had an increased frequency of HD diagnosis compared to those who reported an annual income of $100,000 or more (age-adjusted cumulative diagnostic frequency rate: 8.32 per 100,000 versus 6.54 per 100,000 persons). This association was graded consistently at all levels of the income distribution. The same graded relationship did not hold when comparing individuals by educational attainment, though diagnosis was lowest among individuals who reported holding a college or professional/graduate degree compared to other educational attainment categories. These results have not previously been reported in the HD literature, though the relationship between lower socioeconomic status and increased rates of several other cognitive disorders, including Parkinson’s disease (28), Alzheimer’s (29) and dementia (30, 31), is established.
There are several potential reasons why HD might be more common among lower income individuals. First, it is possible that individuals who develop HD drift down the social gradient as psychiatric and cognitive symptoms manifest. This explanation, termed social selection (32), is well-documented for certain psychiatric disorders with known heritable components such as schizophrenia (33, 34). Second, social or environmental factors may be important modifiers of disease experience, even for a monogenic disorder like HD (35). Wexler et al. have noted that as much as 60% of variation in CAG expansion may be due to non-genetic, environmental factors (26) and some animal models have suggested the potential for environmental modifiers for age of HD onset (36, 37). Third, because individuals with HD often have a parent with HD, intergenerational transmission of income could also play a role. To the extent that patients with HD leave the workforce earlier in life due to illness, their children may be raised in lower income households and thus have a tendency to have lower incomes as adults (38). Finally, it is possible that higher income individuals may be less likely to seek treatment or disclose HD status to avoid potential employment discrimination, or other forms of discrimination. Further research is needed to confirm these findings in other settings, particularly because the population was restricted to insured persons and a large proportion of our sample was missing income data (42.24%, n=1,566).
Some important limitations of this study should be highlighted when considering the findings. First, because we relied on evidence of a clinical diagnosis of HD documented in administrative data, cases may have been missed or misclassified, and it is difficult to determine the direction of this potential bias. On the one hand, given that clinical onset of the disease occurs gradually, often manifesting with early psychiatric or behavioral changes that can be misdiagnosed, we could have under-ascertained cases with pre-manifest disease (40). Relatedly, individuals with HD may not want to disclose their status until they become symptomatic and require treatment, to avoid stigma or being labeled with a preexisting medical condition (41); these issues may have also contributed to under-ascertainment. On the other hand, HD diagnosis typically involves a definitive genetic test; because we did not have access to other information to validate the diagnosis, such as electronic health records, mutation carriers who do not necessarily display motor symptoms could have mistakenly been included as cases. The inclusion of pre-symptomatic individuals as cases could lead to over-ascertainment; however, we expect this form of misclassification to be relatively minimal (39).
Relatedly, we defined newly diagnosed cases as the first documentation of an HD diagnostic code during the observation period. However, because of the dynamic nature of this administrative claims cohort, and the fact that past diagnoses may be less accurately recorded in administrative data as compared to EHR data, it is likely that a portion of cases were misclassified as having a new, rather than existing, diagnosis. For example, in the primary analyses, we did not require that incident cases be observed for a set look-back period prior to diagnosis, and this approach could have contributed to misclassification. Misclassifying existing cases as incident could explain the somewhat higher than expected mean age of onset in the sample, which was 53.81 years (SD = 17.46). Prior incidence studies have generally found that age of onset varies between 37 and 52 years using a variety of different ascertainment methods (12). Following prior work (12), we attempted to reduce the likelihood of this type of error by requiring that cases have at least one recorded contact with the health care system prior to diagnosis with HD to be classified as newly diagnosed. We further conducted several sensitivity analyses employing alternative criteria to distinguish incident from prevalent cases, such as requiring a window between the non-incident and incident claim, and requiring a 6-month fixed period of insurance enrollment prior to diagnosis. However, these restrictions may have also led us to under-ascertain incident cases among individuals who are enrolled in private insurance but who use health care services at low rates, or among those who have frequent changes in their insurance coverage. Given that the average observation period was less than three years for the overall sample, using longer look-back restrictions to define incidence has the potential to underestimate incidence; short look-back windows, however, can fail to adequately distinguish prevalent claims.
The potential for residual misclassification in either direction remains an important limitation of this analysis and other studies that use administrative claims to capture epidemiologic trends.
Finally, because we relied on administrative data, these results are not representative of the entire U.S. population of HD patients. Moreover, as all of the individuals included in this study were privately insured, these results may more closely estimate the epidemiologic characteristics of HD in a privately insured population, and may differ from studies conducted with population-based samples. For example, compared to the general U.S. population and privately insured U.S. population, participants in the Optum cohort identified as: Asian (4% vs. 2010 Census: 5%, privately insured: 6%), Black (9% vs. 2010 Census: 13%, privately insured: 10%), Hispanic (11% vs. 2010 Census: 16%, privately insured: 11%) (42). However, while a representative sample would be ideal, the nature of this rare genetic disorder precludes common population sampling approaches. To offset these data limitations, several international HD registries have recently been established; however, voluntary registry data suffer from similar validity concerns, including under-ascertainment of rural populations with limited access to specialty neurological services, and of those with less severe or earlier onset symptoms (43, 44). By repurposing this large commercial insurance data set, we are able to estimate the epidemiologic characteristics of HD in the U.S. in a manner that is similar to previous work conducted in countries with large scale clinical databases, yet has to date been impossible in the U.S. In the case of HD, and perhaps several other rare diseases, the validity challenges associated with this type of secondary, Big Data, claims analysis should be weighed against the limited availability of rare disease epidemiologic estimates (45, 46).
To our knowledge, this is the largest study of Huntington’s disease epidemiology to date, and the first to estimate the epidemiologic characteristics of HD in a U.S. sample with national scope. It is also the first to directly examine differences in HD diagnostic frequency by multiple sociodemographic characteristics within a multiethnic cohort of individuals. Enhancing the descriptive epidemiology of HD can contribute to our understanding of rare genetic disorders and neurodegenerative disease etiology broadly, and may eventually inform patient care. Several of the findings we report here deserve greater exploration, including confirming the observed stability in incidence and diagnostic frequency trends and characterizing the potential relationship between socioeconomic position and HD progression. Further research is similarly needed to examine the extent to which health care treatment patterns and access to care, disclosure decisions, or other factors, in addition to genetic explanations, may be driving patterns of Huntington’s disease epidemiology in both the U.S. and other settings.
Acknowledgments
FINANCIAL DISCLOSURES
Emilie Bruzelius is supported by the National Science Foundation (1464297); is employed at the Icahn School of Medicine at Mount Sinai; and is a student at Columbia University. Joseph Scarpa is supported by the National Institute of Mental Health of the National Institutes of Health (F30MH106293); he was a post-doctoral fellow at the Icahn School of Medicine at Mount Sinai. Yiyi Zhao is a student at Columbia University; she was a student intern at the Icahn School of Medicine at Mount Sinai. Sanjay Basu is supported by the National Institute on Minority Health and Health Disparities (DP2MD010478); he is employed by Stanford University School of Medicine. James H. Faghmous is supported by the National Science Foundation (1464297); he was employed by the Icahn School of Medicine at Mount Sinai and was a visiting scholar at Stanford University School of Medicine. Aaron Baum is supported by the National Institute on Minority Health and Health Disparities (1U01HG009610); is employed by the Icahn School of Medicine at Mount Sinai; and is a visiting scholar at the Stanford University School of Medicine.
Funding: This work was supported by grants DP2MD010478 and 1U01HG009610 from the National Institute On Minority Health and Health Disparities, grant F30MH106293 from the National Institute of Mental Health, grant 1464297 from the National Science Foundation, and grant 2028 from the Rare Disease Foundation.
Footnotes
Conflict of Interest: No conflicts to disclose
