Author manuscript; available in PMC: 2022 Jun 24.
Published in final edited form as: J Clin Lipidol. 2016 Aug 6;10(5):1230–1239. doi: 10.1016/j.jacl.2016.08.001

Rapid identification of familial hypercholesterolemia from electronic health records: The SEARCH study

Maya S Safarova 1, Hongfang Liu 1, Iftikhar J Kullo 1,*
PMCID: PMC9229555  NIHMSID: NIHMS1815915  PMID: 27678441

Abstract

BACKGROUND:

Little is known about prevalence, awareness, and control of familial hypercholesterolemia (FH) in the United States.

OBJECTIVE:

To address these knowledge gaps, we developed an ePhenotyping algorithm for rapid identification of FH in electronic health records (EHRs) and deployed it in the Screening Employees And Residents in the Community for Hypercholesterolemia (SEARCH) study.

METHODS:

We queried a database of 131,000 individuals seen between 1993 and 2014 in primary care practice to identify 5992 (mean age 52 ± 13 years, 42% men) patients with low-density lipoprotein cholesterol (LDL-C) ≥190 mg/dL, triglycerides <400 mg/dL and without secondary causes of hyperlipidemia.

RESULTS:

Our EHR-based algorithm ascertained the Dutch Lipid Clinic Network criteria for FH using structured data sets and natural language processing for family history and presence of FH stigmata on physical examination. Blinded expert review revealed positive and negative predictive values for the SEARCH algorithm at 94% and 97%, respectively. The algorithm identified 32 definite and 391 probable cases with an overall FH prevalence of 0.32% (1:310). Only 55% of the FH cases had a diagnosis code relevant to FH. Mean LDL-C at the time of FH ascertainment was 237 mg/dL; at follow-up, 70% (298 of 423) of patients were on lipid-lowering treatment with 80% achieving an LDL-C ≤100 mg/dL. Of treated FH patients with premature CHD, only 22% (48 of 221) achieved an LDL-C ≤70 mg/dL.

CONCLUSIONS:

In a primary care setting, we found the prevalence of FH to be 1:310 with low awareness and control. Further studies are needed to assess whether automated detection of FH in EHR improves patient outcomes.

Keywords: Familial hypercholesterolemia, Screening, Prevalence, Awareness, Control, Hypercholesterolemia, Electronic phenotyping, eEpidemiology, Electronic health records, Informatics

Introduction

An important paradigm of precision medicine is to screen individuals for disease before overt clinical manifestations, particularly when treatment is available and beneficial. An example is familial hypercholesterolemia (FH) where major adverse events such as sudden cardiac death, myocardial infarction, and stroke can be prevented by timely initiation of lipid-lowering therapy. A relatively common genetic disorder, FH is associated with dramatically increased lifetime risk of premature atherosclerotic cardiovascular disease (ASCVD) due to elevated plasma low-density lipoprotein cholesterol (LDL-C) levels.1,2 FH has been labeled a Tier 1 public health genomics condition3 and is one of the few genetic diseases that meets the World Health Organization criteria for population-based large-scale screening programs aimed at early disease detection and timely treatment.4 Increased attention has focused on FH recently as advances in genomics are providing important insights into the genetic architecture of lipid disorders,5,6 and novel classes of lipid-lowering drugs are opening up new avenues of therapy in these high-risk patients.7–10

However, substantial gaps in our knowledge of prevalence, awareness, and control of FH remain. Few studies have specifically addressed the prevalence of heterozygous FH. Almost half a century ago, the prevalence of heterozygous FH was estimated at 1:500 among relatives of survivors of myocardial infarction.11 Excluding specific populations with a “founder effect,” the reported estimates of prevalence of FH vary widely.12 In the Danish population, prevalence was reported to be 1:137 (0.7%),13 whereas in a study from neighboring Finland, the prevalence was 1:600 (0.2%).14 Based on genetic screening, the prevalence of FH in the Netherlands’ population was 1:200.15 In the US National Health and Nutrition Examination Survey (NHANES), the prevalence of FH diagnosed using clinical criteria was estimated to be 1:250 (0.4%).16 This significant difference in reported prevalence rates motivates study of the prevalence of FH in a community-based setting in the United States. Furthermore, little is known about the extent to which FH is underdiagnosed and undertreated in the United States. Indeed, some have projected that <10% of prevalent cases are diagnosed and treated.17,18

To address these knowledge gaps, we undertook the Mayo Screening Employees And Residents in the Community for Hypercholesterolemia (SEARCH) study in the Mayo Employee and Community Health (ECH) system that delivers primary care to residents of Olmsted County and southeastern Minnesota. A unified electronic data trust that includes comprehensive clinical records of ECH patients enabled the research described in this report. We developed an electronic phenotyping algorithm to mine electronic health records (EHR) to identify patients who met the Dutch Lipid Clinic Network (DLCN) criteria for FH with the long-term goal of addressing knowledge gaps in prevalence, awareness, and control of FH, and the prevention of premature ASCVD.

Material and methods

Study population and settings

This cross-sectional study was approved by the Institutional Review Board of Mayo Clinic, Rochester, Minnesota. Individuals in the Mayo ECH system who had given permission for their medical records to be used for research and had clinical data available in the EHR were considered eligible for the study. Lipid levels were extracted from structured laboratory databases from June 21, 1993, to December 31, 2014. The index date was defined as the date of the earliest LDL-C level ≥190 mg/dL. Race was categorized as “white,” “black or African American,” “Asian,” “other,” and “choose not to disclose.” Initially, EHRs of a random sample of 115 patients with severe hypercholesterolemia were manually reviewed to inform development of the electronic phenotyping algorithm for FH. Because a variety of medical conditions (hypothyroidism, cholestatic liver diseases, nephrotic syndrome, severe renal failure, and pregnancy)19 can increase LDL-C levels, patients who had such conditions within 1 year before the index date were excluded. We identified these conditions using Logical Observation Identifiers Names and Codes (LOINC) for renal, thyroid, and liver function laboratory parameters as well as the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes for pregnancy.

Defining FH case status

To identify cases of FH, we used a modified numerical score system of the DLCN criteria.13 From the original set of clinical criteria,20 we excluded two criteria: a first-degree relative with tendon xanthomas or corneal arcus, and children aged <18 years with severe hypercholesterolemia. This decision was based on initial manual review of EHR of patients with LDL-C ≥190 mg/dL, which revealed that these variables were not recorded by providers. Variables that were incorporated in the SEARCH ePhenotyping algorithm (version 1.0) are listed in Table 1. FH case status was assigned as “definite” if the score was >8 points or “probable” when the score was 6–8, with the remaining individuals identified as “possible” FH (scored 3–5 points) serving as controls. The number of definite and probable FH cases was combined to estimate the prevalence of FH.

Table 1.

Ascertaining modified Dutch Lipid Clinic Network criteria for heterozygous FH: use of structured data sets and natural language processing

Criteria | Method | Points
Family history: first-degree relative with hypercholesterolemia or premature ASCVD | NLP of PPI and clinical notes | 1
Personal history of premature coronary heart disease | Two ICD-9-CM and/or CPT-4 codes | 2
Personal history of premature cerebrovascular or peripheral arterial disease | Two ICD-9-CM and/or CPT-4 codes | 1
Presence of tendon xanthomas | NLP of clinical notes | 6
Presence of corneal arcus before age 45 years | NLP of clinical notes | 4
LDL-C, >325 mg/dL | Laboratory data sets | 8
LDL-C, 251–325 mg/dL | Laboratory data sets | 5
LDL-C, 191–250 mg/dL | Laboratory data sets | 3

ASCVD, atherosclerotic cardiovascular disease, including coronary heart disease (CHD), cerebral (CVD) or peripheral arterial disease (PAD); ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; CPT, current procedural terminology codes; LDL-C, low-density lipoprotein cholesterol; NLP, Natural Language Processing; PPI, patient provided information.

Premature ASCVD was defined as the presence of two or more relevant diagnosis codes in the EHR before age 55 years in men and 65 years in women. Assigned codes were evaluated at discharge from each encounter during the prevalence period. CHD was defined as angina pectoris, myocardial infarction, coronary atherosclerosis/chronic ischemic heart disease, percutaneous coronary revascularization, or coronary bypass surgery; cerebrovascular disease included ischemic stroke, transient ischemic attack, carotid artery disease, and carotid revascularization; and peripheral arterial disease included lower extremity arterial disease and surgical and percutaneous vascular interventions. Qualifying LDL-C was scored at the index date. Because all patients had LDL-C ≥190 mg/dL, all had a minimum score of 3 points according to the DLCN criteria. All other parameters were extracted within the prevalence period: June 21, 1993, to December 31, 2014.
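The scoring in Table 1 and the case definitions above can be sketched in a few lines of Python. This is an illustration only, not the SEARCH implementation: the `Patient` class and function names are hypothetical, an LDL-C of exactly 190 mg/dL is assumed to score 3 points (the text states that all patients had a minimum of 3), and counting only the highest item within the personal-history and physical-examination groups is an assumption borrowed from the usual DLCN convention rather than something stated here.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """Hypothetical container for the Table 1 inputs of one patient."""
    ldl_c: float                          # LDL-C at the index date, mg/dL
    family_history: bool = False          # first-degree relative, hypercholesterolemia or premature ASCVD
    premature_chd: bool = False
    premature_cvd_or_pad: bool = False
    tendon_xanthomas: bool = False
    corneal_arcus_before_45: bool = False


def ldl_points(ldl_c: float) -> int:
    """Points for the LDL-C bands of Table 1 (190 mg/dL assumed to score 3)."""
    if ldl_c > 325:
        return 8
    if ldl_c > 250:
        return 5
    if ldl_c >= 190:
        return 3
    return 0


def dlcn_score(p: Patient) -> int:
    """Sum of points; within a group only the highest item counts (assumption)."""
    family = 1 if p.family_history else 0
    history = 2 if p.premature_chd else (1 if p.premature_cvd_or_pad else 0)
    exam = 6 if p.tendon_xanthomas else (4 if p.corneal_arcus_before_45 else 0)
    return family + history + exam + ldl_points(p.ldl_c)


def classify(score: int) -> str:
    """Definite >8, probable 6-8, possible 3-5, as in the case definition."""
    if score > 8:
        return "definite"
    if score >= 6:
        return "probable"
    if score >= 3:
        return "possible"
    return "unlikely"
```

For example, a patient with an index LDL-C of 237 mg/dL, premature CHD, and a positive family history scores 3 + 2 + 1 = 6 points and is classified as probable FH.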

Ascertaining FH criteria from the EHR

We mined both structured data sets and unstructured clinical text to ascertain FH criteria from the EHR (Table 1 and Fig. 1). Structured EHR data included (1) diagnosis codes such as the Hospital Adaptation of the International Classification of Diseases and the ICD-9-CM21 codes as well as relevant current procedural terminology (CPT-4)22 codes for each ASCVD subtype. We required occurrence of at least two codes within a year to define presence of an ASCVD subtype. Key rules and concepts from previously validated electronic phenotyping algorithms23 available on the Phenotype KnowledgeBase Website (https://phekb.org/) of the National Human Genome Research Institute’s Electronic Medical Records and Genomics (https://emerge.mc.vanderbilt.edu/) Network24 were used. We created and validated a list of ASCVD-related ICD-9-CM diagnosis and procedure codes; (2) laboratory data were identified with LOINC. Results of sequencing of LDLR and genotyping of APOB for R3500Q and R3500W, if available, were retrieved from the genetic testing laboratory database; and (3) medication use was ascertained by using drug annotations with their RxNorm codes.
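The "at least two codes within a year" rule can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, a year is taken as 365 days, and codes recorded on the same date are assumed to count as a single occurrence; the text does not specify either detail.

```python
from datetime import date


def has_two_codes_within_year(code_dates, window_days=365):
    """True if two codes for one ASCVD subtype fall within the window.

    Same-day duplicates are collapsed to one occurrence (assumption).
    """
    ordered = sorted(set(code_dates))
    # If any two dates lie within the window, some adjacent pair does too.
    return any(
        (later - earlier).days <= window_days
        for earlier, later in zip(ordered, ordered[1:])
    )


chd_codes = [date(2001, 1, 10), date(2001, 6, 1), date(2004, 3, 2)]
print(has_two_codes_within_year(chd_codes))
```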

Figure 1.

Data elements required for the Mayo SEARCH Algorithm. ASCVD, atherosclerotic cardiovascular disease; EHR, electronic health record; NLP, natural language processing; PPI, patient provided information; LOINC, Logical Observation Identifiers Names and Codes; ICD-9-CM, International Classification of Diseases, Ninth Revision, Clinical Modification; CPT, current procedural terminology codes.

To process unstructured clinical text, we exported the annotated information to a format that could be used for downstream NLP, including sentence tokenization, sectionalizing, concept identification, and negation.25–29 After assembling a list of clinical terms relevant to hypercholesterolemia and ASCVD, the NLP system30,31 was adjusted and deployed to ascertain family history of hypercholesterolemia and ASCVD from the “Patient Provided Information” form. The MedTagger25 program was used to detect positive mentions of “xanthomas” and “corneal arcus,” defined using regular expressions, in clinical notes. We aided this rule-based concept-extraction engine with adapted versions of SecTag sectionizer and a ConText27,32 annotator for a status modifier, that is, positive, probable, or negative.
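To make the idea of regular-expression concept detection with a negation check concrete, the toy sketch below uses only the Python standard library. It is not MedTagger, SecTag, or ConText: the patterns, the negation cue list, and the sentence splitter are all simplified assumptions, and the "probable" status modifier is not modeled.

```python
import re

# Hypothetical patterns; the study's actual expressions are not given in the text.
CONCEPTS = {
    "xanthoma": re.compile(r"\bxanthomas?\b|\bxanthomata\b", re.IGNORECASE),
    "corneal_arcus": re.compile(
        r"\bcorneal arcus\b|\barcus (?:cornealis|senilis)\b", re.IGNORECASE
    ),
}
# Simplified negation cues, checked only in the text preceding a mention.
NEGATION = re.compile(
    r"\b(?:no|not|without|denies|negative for|absence of)\b", re.IGNORECASE
)


def find_positive_mentions(note: str) -> set:
    """Return concepts mentioned in a note without a preceding negation cue."""
    found = set()
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        for name, pattern in CONCEPTS.items():
            match = pattern.search(sentence)
            if match and not NEGATION.search(sentence[: match.start()]):
                found.add(name)
    return found


note = "Tendon xanthomas noted over both Achilles tendons. No corneal arcus."
print(find_positive_mentions(note))
```

A production pipeline would also need section awareness (for example, separating family history from the physical examination), which is the role the sectionizer plays in the approach described above.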

Awareness and control of FH

As a surrogate for awareness of FH, we estimated the number of patients with definite and probable FH case status who had the ICD-9 code 272.0 that is used broadly for hypercholesterolemia (there is no specific ICD-9-CM code for FH). We assessed “control” as the proportion of FH patients who had an LDL-C level ≤100 mg/dL and the proportion of FH patients with premature CHD achieving an LDL-C level ≤70 mg/dL on treatment based on the last available plasma LDL-C level in the EHR (date closest to December 2014).

Assessing the accuracy of FH phenotyping algorithm

Demographic and clinical characteristics were presented as number (percentage) and mean and standard deviation (SD), as appropriate. For the electronic phenotyping algorithm, sensitivity, specificity, positive, and negative predictive values (PPV and NPV) were computed based on the gold standard of manual EHR review. Using random numbers generated with R-package,33 20 individuals from each step of the phenotyping algorithm were selected. Validation sets comprised (1) a random sample of 260 patients (20 random individuals iteratively selected from each step of the algorithm) and (2) 20 individuals identified as FH by the electronic phenotyping algorithm. To evaluate predictive value of the algorithm, 20 randomly selected cases and 20 randomly selected controls were reviewed, in a random order, by an expert reviewer (MSS) who was blinded to the case or control status, using standardized data collection forms. Quantitative traits such as laboratory measurements were accepted as recorded in EHR except for a focused review of outliers. The PPV for being a case was determined as number of cases identified by the phenotyping algorithm and confirmed by review divided by the total number of cases identified by the algorithm. The NPV for being a control was defined as number of true negatives in the numerator with total number of controls classified by the algorithm in the denominator. The iterative process was terminated when the positive and negative predictive values were each >85%.

Results

Patient characteristics

Figure 2 illustrates how we used an EHR-based approach to identify FH cases. The prevalence of severe hypercholesterolemia (any LDL-C level ≥190 mg/dL) in the ECH cohort (n = 131,000 patients with measured lipid profile and research authorization) was 5% (n = 6547). Clinical characteristics of these patients are summarized in Table 2. Mean age at the time of qualifying LDL-C measurement was 52 years, and 45% were men. The average time interval between the first LDL-C ≥190 mg/dL (index date) and last lipid panel in the EHR was 11.4 ± 8.7 years. In the EHR, information on lipid-lowering drug prescription was available from September 1995, leaving 1549 (24%) individuals with missing data on statin use before the index date. Of the remaining 4998 participants, 555 (11%) patients were on lipid-lowering treatment.

Figure 2.

The Mayo SEARCH study flow chart and stepwise illustration of the approach for identifying FH from electronic health records. Individuals were classified as definite (>8 points), probable (6–8), and possible (3–5) FH.

Table 2.

Characteristics of patients in the ECH system with severe hypercholesterolemia (n = 6547)

Variable ECH patients
Age at the first LDL-C ≥190 mg/dL, y 52.0 ± 13.1
Age at the last LDL-C measurement, y 64.7 ± 14.7
Men, n (%) 2931 (45)
Ethnicity/race, n
 Whites 6177
 Asian 121
 African American 80
 American Indian/Alaskan Native 10
 Unknown or not reported 159
Family history
 Premature ASCVD, n (%) 1101 (17%)
 ASCVD, n (%) 5050 (77%)
 Hypercholesterolemia, n (%) 3754 (57%)
Personal history of premature ASCVD
 Coronary heart disease, n (%) 643 (9.8%)
  Age at overt CHD in males, y 48.3 ± 5.2
  Age at overt CHD in females, y 54.8 ± 7.1
 Cerebrovascular disease, n (%) 148 (2.3%)
  Age at overt CVD in males, y 47.4 ± 7.2
  Age at overt CVD in females, y 53.6 ± 9.9
 Peripheral arterial disease, n (%) 50 (0.8%)
  Age at overt PAD in males, y 44.5 ± 14.1
  Age at overt PAD in females, y 53.2 ± 10.1
Physical examination
 Xanthomas, n (%) 10 (0.2%)
 Corneal arcus, n (%) 4 (0.06%)
Secondary causes 555
 Hypothyroidism 82
 Liver and/or biliary disease 394
 Chronic kidney disease 28
 Nephrotic syndrome 38
 Pregnancy 13
Familial hypercholesterolemia 5992
 Definite 32
 Probable 391
 Possible 5569

ECH, Employee and Community Health; ASCVD, atherosclerotic cardiovascular disease, including coronary heart disease (CHD), cerebral (CVD) or peripheral arterial disease (PAD); LDL-C, low-density lipoprotein cholesterol.

Data are expressed as means ± SD or a fraction (percentages).

Race was self-reported. A family history of premature coronary artery disease was defined as the presence of CHD in a first-degree male relative aged 55 years or younger or in a first-degree female relative 65 years or younger. According to the Dutch Lipid Clinic Network criteria, individuals were classified as definite (>8 points), probable (6–8), and possible (3–5) FH. For secondary causes of hypercholesterolemia, a condition was deemed active when abnormal laboratory values were present within 1 year before the index date, defined as thyroid-stimulating hormone ≥10 mIU/L, alkaline phosphatase ≥200 IU/L, serum albumin <2.5 g/dL, creatinine >2.6 mg/dL, protein in a 24-h urine collection >3 g, or estimated glomerular filtration rate (eGFR) <30 mL/min/BSA. ICD-9 codes were used to ascertain pregnancy. The electronic phenotyping algorithm and pertinent data dictionary are available on the https://phekb.org/ website.
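The laboratory cutoffs for secondary causes listed above could be applied as in the sketch below. The function and argument names are hypothetical, missing values are simply skipped, and the check that a value falls within 1 year before the index date is assumed to have been done upstream; pregnancy, which was ascertained from ICD-9 codes, is not covered.

```python
def abnormal_labs(tsh=None, alkaline_phosphatase=None, albumin=None,
                  creatinine=None, urine_protein_24h=None, egfr=None):
    """Return the names of labs that cross the cutoffs given in the text.

    A value of None means the lab was not measured and is ignored.
    """
    rules = [
        ("tsh", tsh, lambda v: v >= 10),                                    # mIU/L
        ("alkaline_phosphatase", alkaline_phosphatase, lambda v: v >= 200),  # IU/L
        ("albumin", albumin, lambda v: v < 2.5),                            # g/dL
        ("creatinine", creatinine, lambda v: v > 2.6),                      # mg/dL
        ("urine_protein_24h", urine_protein_24h, lambda v: v > 3),          # g per 24 h
        ("egfr", egfr, lambda v: v < 30),                                   # mL/min/BSA
    ]
    return [name for name, value, is_abnormal in rules
            if value is not None and is_abnormal(value)]


# A patient would be excluded if any flag is raised.
print(abnormal_labs(tsh=12.0, albumin=4.0))
```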

Self-reported family history of hypercholesterolemia and premature ASCVD were identified in 60% and 20% of patients, respectively. Features of FH on physical examination such as xanthomas and corneal arcus were captured from clinical text in 0.2% and 0.06%, respectively. Personal history of premature CHD was recognized in 10% of participants; of these, 33 patients had a mention of early coronary and carotid vascular beds disease, 21 patients had coronary and peripheral artery disease (PAD), and 9 patients had involvement of all three vascular beds. Isolated early cerebrovascular disease (CVD) and PAD were described in 148 and 35 patients, respectively. Mean age at manifestation of CHD, CVD, and PAD was 51.7 ± 7.1, 52.0 ± 9.7, and 51.0 ± 9.3 years, respectively. We found 9% (555 of 6547) of patients with severe hypercholesterolemia had an underlying secondary cause (metabolic disorder or pregnancy) for hypercholesterolemia. The most common secondary cause was cholestatic liver disease accounting for 71% of all cases, followed by hypothyroidism (15%). More than one possible underlying secondary cause of hypercholesterolemia was identified in 5% of patients.

Application of the algorithm for ascertaining FH

Of 6547 ECH patients with LDL-C ≥190 mg/dL, we applied clinical diagnostic criteria for FH in 5992 (92%) patients after excluding those with severe hypertriglyceridemia and secondary causes of hypercholesterolemia. As genetic testing for FH was rarely performed, we focused our algorithm on mining lipid and nonlipid clinical criteria for FH. Definite/probable FH (≥6 points) was ascertained in 7% (423 of 5992) of participants, which translates into a prevalence estimate of 1:310 (0.32%) in the entire ECH cohort. The proportion of individuals classified as definite FH (DLCN criteria >8 points) was 1:4100 (0.02%), probable FH (6–8 points) was 1:335 (0.30%), and possible FH (3–5 points) was 1:24 (4.3%) (Table 2). Table 3 details clinical characteristics of patients with definite/probable FH.
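The prevalence figures above follow directly from the case counts and the ECH denominator of 131,000. A short check, with a hypothetical helper name, is sketched here.

```python
def prevalence(cases: int, population: int):
    """Return prevalence as (percent rounded to 2 decimals, N in '1:N')."""
    return round(100 * cases / population, 2), round(population / cases)


ECH_COHORT = 131_000  # patients with a measured lipid profile and research authorization

for label, cases in [("definite/probable", 423), ("definite", 32),
                     ("probable", 391), ("possible", 5569)]:
    percent, one_in = prevalence(cases, ECH_COHORT)
    print(f"{label}: {percent}% (1:{one_in})")
```

The definite FH ratio computes to 1:4094, which the text reports rounded as 1:4100.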

Table 3.

Clinical characteristics of patients with definite/probable FH based on the DLCN criteria in the ECH cohort

Variable Definite/Probable FH
Number 423
Age, y 45.6 ± 11.6
Female, n (%) 229 (55%)
BMI, kg/m2 31.4 ± 7.1
Obesity, n (%) 217 (51%)
Smoking 307 (76%)
Family history of premature ASCVD, n (%) 162 (38%)
Tendon xanthomas, n (%) 7 (2%)
Corneal arcus, n (%) 4 (1%)
Diabetes mellitus, n (%) 149 (35%)
Hypertension, n (%) 310 (73%)
Personal history of premature ASCVD
 Coronary heart disease 269 (63.6%)
 Cerebrovascular disease 26 (6.6%)
 Peripheral arterial disease 16 (3.8%)
Mean DLCN criteria score 6.4 ± 1.2
Chronic kidney disease, n (%) 3 (0.7%)
Total cholesterol, mg/dL 323.7 ± 54.6
LDL-C, mg/dL 236.9 ± 51.1
HDL-C, mg/dL 48.9 ± 12.2
Triglycerides, mg/dL 189.1 ± 78.8
Lipid-lowering medications, n (%) 46 (11%)
Genetic testing (LDLR sequencing, APOB Arg3500Gln and Arg3500Trp genotypes) 3

ASCVD, atherosclerotic cardiovascular disease, including CHD, CVD or PAD; HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol.

Data are expressed as means ± SD or a fraction (percentages).

Race was self-reported. All parameters were ascertained from EHRs using electronic phenotyping algorithms validated by the eMERGE investigators and available on the phekb.org website. Lipid parameters were obtained at the index date (date of the qualifying LDL-C levels). Smoking status and lipid-lowering medications were extracted at the date closest to the index date within 1 year period prior. Logical Observation Identifiers Names and Codes (LOINC) pertinent to genetic testing were derived throughout the EHR timeline from 2008 to 2014.

Of 423 patients with definite (n = 32; mean DLCN score 10.2 ± 1.7) or probable (n = 391; mean DLCN score 6.1 ± 0.4) FH, only 55% (n = 233) had ICD-9-CM diagnosis code 272.0 of “pure hypercholesterolemia.” Among these patients (mean age of 46 years, 55% female), 73% had hypertension and 35% had diabetes. Only 3 patients had an FH panel genetic testing order. At the time of LDL-C ascertainment, 64% of FH patients had a history of premature CHD; of these, 82% (221 of 269) were on lipid-lowering drugs, but only 22% (48 of 221) of treated individuals achieved an LDL-C ≤70 mg/dL. Overall, in patients with severe hypercholesterolemia, we observed a trend toward a decrease in LDL-C levels within the study period (Table 4).

Table 4.

Baseline and mean changes in lipid and lipoprotein levels over period prevalence in ECH patients with severe hypercholesterolemia (n = 6547)

Variable ECH patients
Total cholesterol, mg/dL 293.4 ± 27.6
 Change, Δ% −30.1 ± 16.7
LDL-C, mg/dL 207.4 ± 21.8
 Change, Δ% −42.1 ± 21.4
HDL-C, mg/dL 52.3 ± 13.5
 Change, Δ% 6.5 ± 24.7
Triglycerides, mg/dL 168.4 ± 69.3
 Change, Δ% −3.8 ± 47.4

LDL-C, low-density lipoprotein cholesterol; HDL-C, high-density lipoprotein cholesterol.

Data are expressed as means ± SD.

Change in lipoprotein levels was calculated as (X2 − X1)/X1 × 100, where X1 is the level at the index date and X2 is the level at the last date.
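As a worked illustration of this formula, the sketch below computes the percent change for a single patient and then the mean and SD across patients. The function name and the three example LDL-C pairs are made up for illustration; averaging per-patient changes is an assumption about how the summary values in Table 4 were formed.

```python
from statistics import mean, stdev


def percent_change(x1: float, x2: float) -> float:
    """(X2 - X1) / X1 * 100, with X1 at the index date and X2 at the last date."""
    return (x2 - x1) / x1 * 100


# Hypothetical (index, last) LDL-C pairs in mg/dL, not study data.
pairs = [(200.0, 100.0), (250.0, 150.0), (190.0, 171.0)]
changes = [percent_change(x1, x2) for x1, x2 in pairs]
print(changes, mean(changes), stdev(changes))
```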

Accuracy of the Mayo SEARCH algorithm

A single expert reviewer (MSS) blinded to case-control status performed manual EHR review of 13 random sample sets of 20 patients for each categorical data element (total 260 individual EHRs). Table 5 summarizes elements of the Mayo SEARCH algorithm for rapid identification of FH from the EHR. After iterative validation of performance of each step of the electronic phenotyping algorithm, in a randomly chosen sample of 105 algorithm-derived cases with definite/probable FH (≥6 points) and 99 controls (3–5 points) expert review revealed sensitivity and specificity at 97% and 94%, positive and negative predictive values at 94% and 97%, respectively. The automated identification of definite/probable FH cases when compared with the gold standard mis-scored 9.4% of individuals (19 of 202), that is, the final DLCN score was overestimated in 13 (6.4%) patients, and in 6 (3%), there was an underestimation. Five (2.5%) patients were reclassified from definite/probable FH to possible FH, and only one patient was incorrectly grouped into possible FH while having a probable FH diagnosis. The main reason for misclassification was incorrect ascertainment of family history. In the remaining cases, misclassification was due to erroneous attribution of personal history of CVD and PAD.

Table 5.

Diagnostic accuracy of the Mayo SEARCH algorithm for ascertaining FH criteria (based on manual chart review)

Variable Sensitivity Specificity PPV NPV
Xanthomas (v3.0) 89 82 80 90
Corneal arcus ≤45 y (v2.0) 100 100 100 100
Family history of ASCVD (v1.0) 100 71 60 100
Family history of hypercholesterolemia (v1.0) 83 69 56 90
Premature CHD (v2.0) 100 83 80 100
Premature CVD (v2.0) 100 77 70 100
Premature PAD (v2.0) 100 77 70 100

ASCVD, atherosclerotic cardiovascular disease, including coronary heart disease (CHD), cerebral (CVD), or peripheral arterial disease (PAD); PPV, positive predictive value; NPV, negative predictive value.

Each sample set consisted of 20 random patients (10 cases and 10 controls).
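The four accuracy measures in Table 5 can be computed from a 2 × 2 comparison of the algorithm against manual review, as sketched below. The function name is hypothetical. The example counts for xanthomas (8 true positives, 2 false positives, 9 true negatives, 1 false negative) are not reported in the study; they are one set of counts, inferred here from 10 algorithm-positive and 10 algorithm-negative patients, that reproduces the tabulated percentages.

```python
def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Sensitivity, specificity, PPV, and NPV in percent.

    tp/fp: algorithm-positive patients confirmed/refuted by manual review.
    tn/fn: algorithm-negative patients confirmed/refuted by manual review.
    """
    return {
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
        "ppv": 100 * tp / (tp + fp),
        "npv": 100 * tn / (tn + fn),
    }


# Inferred counts for the xanthomas row (assumption, see lead-in).
xanthomas = diagnostic_metrics(tp=8, fp=2, tn=9, fn=1)
print({name: round(value) for name, value in xanthomas.items()})
```

With only 20 patients per sample set, each misclassified record shifts PPV or NPV by 10 percentage points, so the per-element estimates in Table 5 carry wide uncertainty.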

Discussion

To the best of our knowledge, the present study is the first to report the prevalence, detection, and control of FH in a large US primary care system using an automated ePhenotyping approach. In pursuit of a rapid and efficient method for ascertaining FH cases, we developed an EHR-based phenotyping algorithm that included NLP and had reasonable accuracy in identifying FH case status. We demonstrated the prevalence of clinical FH to be higher than the commonly reported projected estimate of 1:500, consistent with epidemiologic data from European countries.13,15,34 Our investigation revealed significant under-recognition of clinical FH, with only about half of patients having a diagnosis code related to primary hypercholesterolemia and fewer than half achieving an LDL-C ≤100 mg/dL on treatment, or ≤70 mg/dL when CHD was present.

No community-based studies of the prevalence of FH based on genetic screening in the United States have been reported. It is estimated that 1 in 250–500 individuals in the general population has FH12,17,18,35; however, this estimate of FH burden may not reflect prevalence of clinically defined FH in the United States. In the NHANES Survey of a 2001–2012 data set (36,949 adult participants), the prevalence of FH based on the DLCN criteria was 1:250 (0.4%),16 with similar rates in women and men, no difference between blacks and whites (0.47%, 1:211 vs 0.40%, 1:249, respectively) but a lower prevalence in Mexican Americans (0.24%, 1:414). However, in this analysis, genetic testing, family history of hypercholesterolemia, personal history of PAD, and FH stigmata on physical examination were not considered. In a cohort of patients seen in a large primary care practice setting, we demonstrated that FH was present in 1 in 310 individuals, suggesting that ~1.0 million individuals in the United States may be affected. We used a validated set of the Dutch diagnostic criteria for heterozygous FH recommended for use in clinical practice by the National Lipid Association guidelines.36 We estimated that 1 of 14 (7%) patients with severe hypercholesterolemia had a clinical diagnosis of definite/probable FH based on the DLCN criteria, which in turn are predictive of FH-mutation carrier status.34 In a recent report, sequencing of LDLR, APOB, and PCSK9 in individuals with LDL-C ≥190 mg/dL revealed that 1 of 50 carried a pathogenic mutation. However, in those with unknown pre-treatment LDL-C levels, values were approximated using one standardized correction coefficient, and secondary causes of hypercholesterolemia were not ascertained, which may have affected the final estimate of mutation burden in this population.

Current evidence suggests that >80% of patients with FH in Western countries are undetected,37 and the proportion of undetected cases in the United States may be greater than 95%.17 Our results also indicate that FH may be significantly underdiagnosed and undertreated in the United States. Only half of FH patients had an FH-related diagnosis code, suggesting underawareness among care providers.38,39 These findings are consistent with data from a national FH registry (n = 1605) in the United States40,41 revealing low awareness of FH in a community, with only 43% of 1295 adult FH patients on high-intensity statin therapy, and of these, an LDL-C <100 mg/dL was achieved in ~25%, and a ≥50% reduction in the LDL-C levels was seen in 41%.42 In our cohort, an LDL-C ≤100 mg/dL on treatment was achieved in 47% (197 of 423) of subjects with FH. The relatively high prevalence of FH coupled with low levels of awareness and control motivate implementation of novel preventive strategies, such as the eEpidemiology approach described in this report.

An Internet-based survey conducted among US cardiologists highlights the importance of disseminating this knowledge, because 70% of respondents showed strong interest in learning more about this condition with only 10% being confident in FH management.43 A similar proportion of patients (70%, n = 571) participating in this study was not familiar with FH, highlighting serious knowledge gaps among both caregivers and patients in perceiving and interpreting the risk of premature ASCVD. In a primary care setting in Australia, only 38% of general practitioners correctly recognized clinical features of FH.44 Lack of knowledge about the importance of family screening for FH among health care providers coupled with the suboptimal control of LDL-C levels in FH suggests the need for clinical decision support within the EHR to facilitate cascade screening and achievement of target LDL goals.

As part of the electronic MEdical Records and Genomics (eMERGE)24 consortium, we have previously described performance and portability of EHR-based phenotyping algorithms23 that include billing and diagnosis codes, laboratory measurements, medication data, and NLP.25,45 The eAlgorithm for FH that we developed could be deployed in health care institutions across the country as the elements are independent of EHR structure and the “pseudocode”46 provides a map for data extraction and defines all algorithm-derived variables and rules to combine them. We acknowledge, however, that the reported accuracy of our eAlgorithm may be an overestimate due to “overfitting”, and validation in additional settings will be needed.

EHR data may be useful to assess both the epidemiology and management of severe hypercholesterolemia in primary care practices. The work described here provides the basis for several potential future directions to improve awareness and control of FH. EHR-based phenotyping provides a platform that could facilitate automated surveillance for FH, enable genetic cascade screening, increase FH awareness among health care providers, and promote the practice of precision medicine for FH patients and their families. An eEpidemiology approach for rapid FH identification in EHRs may also allow studies of the natural history of FH and investigation of real-world treatment patterns and outcomes. However, there are contradictory data on the impact of health information technology interventions on patient-centered health metrics,47–49 and further research is needed to assess whether EHR-based surveillance and clinical decision support systems improve patient outcomes.50

Strengths of the present study include a large cohort from a contemporary primary care practice setting and the use of both structured and unstructured EHR data to identify FH cases. Several study limitations need to be mentioned. First, using a targeted screening approach focused on an LDL-C level of ≥190 mg/dL rather than age-, sex-, and race-specific cutoffs may not be ideal. Although this LDL-C cutoff was successfully adopted as the prime benchmark by the British Simon Broome Register criteria,1 nearly 20% of FH-mutation carriers have an LDL-C level below this diagnostic cutoff.51 There is also an overlap in the distributions of untreated LDL-C levels in mutation-positive and mutation-negative relatives of heterozygous FH patients, which means that diagnostic criteria based on LDL-C levels may not be fully reliable. Genetic testing was not available in the vast majority of the study cohort and would be helpful in establishing the accuracy of the SEARCH algorithm and assessing the contribution of genetic variants to phenotypic differences among FH patients. Second, because EHR data were collected during routine medical care and not for research purposes, they are subject to misclassification, which could introduce bias in the estimation of the true prevalence of FH in the cohort. As part of the eAlgorithm, we excluded individuals with severe hypertriglyceridemia because of potentially inaccurate LDL-C estimates based on the Friedewald equation and accounted for laboratory abnormalities suggestive of secondary causes of hypercholesterolemia. However, we did not account for poor diet and medication use (e.g., steroids) as potential causes of secondary hypercholesterolemia. Third, use of ICD-9 code 272.0 likely overestimated the extent of awareness of FH because this code is not specific for FH. The recent approval of two ICD-10 codes for FH by the Centers for Medicare & Medicaid Services (effective on October 1, 2016), that is, E78.01 for FH and Z83.42 for family history of FH, is expected to increase awareness of FH and optimize care delivery to FH patients.
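The triglyceride exclusion mentioned above follows from how the Friedewald equation estimates LDL-C (total cholesterol minus HDL-C minus triglycerides/5, all in mg/dL). The sketch below is illustrative only and is not the SEARCH eAlgorithm itself; the function names are hypothetical, and the 400 mg/dL and 190 mg/dL cutoffs are taken from the study's stated inclusion criteria.

```python
from typing import Optional

TG_LIMIT_MG_DL = 400    # study excluded triglycerides >= 400 mg/dL
LDL_CUTOFF_MG_DL = 190  # LDL-C screening threshold used in the study


def friedewald_ldl(total_chol: float, hdl: float, triglycerides: float) -> Optional[float]:
    """Estimate LDL-C in mg/dL; return None when triglycerides are too high
    for the Friedewald estimate to be considered reliable."""
    if triglycerides >= TG_LIMIT_MG_DL:
        return None
    return total_chol - hdl - triglycerides / 5.0


def meets_screening_threshold(total_chol: float, hdl: float, triglycerides: float) -> bool:
    """True when the estimated LDL-C is available and at or above the cutoff."""
    ldl = friedewald_ldl(total_chol, hdl, triglycerides)
    return ldl is not None and ldl >= LDL_CUTOFF_MG_DL
```

For example, a lipid panel with total cholesterol 300, HDL-C 50, and triglycerides 150 mg/dL gives an estimated LDL-C of 220 mg/dL and passes the threshold, whereas the same panel with triglycerides of 450 mg/dL yields no estimate and is excluded.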

Although we used a validated set of clinical criteria for FH, we found that some variables were more prone to error when ascertained retrospectively. Because information on lipid-lowering drug use was available in the EHR only from 2005 onward and only 11% were on a statin at the time of LDL-C ascertainment, we chose a single threshold for the LDL-C levels without adjustment for lipid-lowering therapy. It is important to note that imputation of LDL-C levels using normalized correction factors has inherent pitfalls of both overestimating and underestimating true levels in treated individuals.16 A major challenge that we encountered was variability in the recording of family history of ASCVD in the EHR. Our algorithm could have performed better if family history of ASCVD and hypercholesterolemia were routinely recorded during a clinical visit.52 Another challenge was that terms such as “tendon xanthomas” and “corneal arcus,” which are highly ranked in the DLCN scoring system, were mainly reported in the clinical notes of particular clinical specialists, such as endocrinologists, cardiologists, and ophthalmologists. Increasing FH awareness among primary care providers may improve documentation of these findings and thereby increase the precision of the algorithm. In addition, further research is needed to evaluate and improve the NLP algorithms with detailed annotation of the corpus of clinical notes and knowledge engineering. The relatively high median age in our cohort suggests that much work is needed to detect patients with definite/probable FH at an early age. An EHR-based phenotyping algorithm for rapid and automated FH case identification such as the one described here is a step in this direction, laying the groundwork for a clinical decision support system to assist health care providers in managing patients and families with FH.
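To illustrate why missing documentation of tendon xanthomas or corneal arcus matters so much, the following is a minimal sketch of DLCN-style scoring. It assumes the commonly published DLCN point values and category cutoffs (only the highest item counts within each group); it is not the authors' implementation, the class and field names are hypothetical, and the point table should be checked against the study's Methods before any reuse.

```python
from dataclasses import dataclass


@dataclass
class Findings:
    ldl_c: float                                   # untreated LDL-C, mg/dL
    relative_premature_cvd_or_high_ldl: bool = False  # first-degree relative
    relative_xanthoma_or_arcus: bool = False          # first-degree relative
    child_high_ldl: bool = False                      # child <18 y, LDL-C >95th pct
    premature_chd: bool = False
    premature_cerebral_or_peripheral: bool = False
    tendon_xanthoma: bool = False
    corneal_arcus_before_45: bool = False
    causative_mutation: bool = False


def ldl_points(ldl_c: float) -> int:
    """Points for the LDL-C band (mg/dL), assumed standard DLCN bands."""
    if ldl_c >= 330:
        return 8
    if ldl_c >= 250:
        return 5
    if ldl_c >= 190:
        return 3
    if ldl_c >= 155:
        return 1
    return 0


def dlcn_score(f: Findings) -> int:
    """Sum the highest-scoring item from each group."""
    family = max(2 if (f.relative_xanthoma_or_arcus or f.child_high_ldl) else 0,
                 1 if f.relative_premature_cvd_or_high_ldl else 0)
    history = max(2 if f.premature_chd else 0,
                  1 if f.premature_cerebral_or_peripheral else 0)
    exam = max(6 if f.tendon_xanthoma else 0,
               4 if f.corneal_arcus_before_45 else 0)
    genetics = 8 if f.causative_mutation else 0
    return family + history + exam + ldl_points(f.ldl_c) + genetics


def dlcn_category(score: int) -> str:
    if score > 8:
        return "definite"
    if score >= 6:
        return "probable"
    if score >= 3:
        return "possible"
    return "unlikely"
```

Under these assumed values, a patient with an LDL-C of 237 mg/dL and no other recorded findings scores 3 ("possible"), whereas the same patient with a documented tendon xanthoma scores 9 ("definite"). A single unrecorded physical finding can therefore move a case across two categories.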

Conclusion

An EHR-based phenotyping algorithm had high accuracy in ascertaining FH cases among individuals with severe hypercholesterolemia. By implementing this algorithm in a large primary care practice cohort, we were able to estimate the proportion of patients with LDL-C ≥190 mg/dL who met the DLCN criteria for FH. We found the prevalence of FH to be higher than commonly presumed. Only half of the FH patients had a “pure hypercholesterolemia” diagnosis code or optimal LDL-C levels on follow-up. Our results indicate that FH is significantly underdiagnosed and undertreated in the United States. Further refinement of the planned algorithm will allow rapid and efficient identification of FH cases in the EHR, facilitating awareness, surveillance, and detection of FH, and thereby improving treatment and outcomes in this high-risk category of patients.
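The headline figures can be reproduced from the counts reported in the abstract. The short check below uses only those published numbers; the variable names are illustrative.

```python
# Counts reported in the abstract
cohort = 131_000              # individuals in the primary care database
definite, probable = 32, 391  # DLCN definite and probable FH cases
fh_cases = definite + probable

prevalence = fh_cases / cohort       # fraction of the cohort with FH
one_in = round(cohort / fh_cases)    # "1 in N" form of the prevalence
treated_share = 298 / fh_cases       # on lipid-lowering treatment at follow-up
ldl_70_share = 48 / 221              # treated, premature CHD, LDL-C <= 70 mg/dL

print(f"FH cases: {fh_cases}")
print(f"Prevalence: {prevalence:.2%} (1:{one_in})")
print(f"On treatment: {treated_share:.0%}")
print(f"LDL-C <= 70 mg/dL among treated with premature CHD: {ldl_70_share:.0%}")
```

These values correspond to the reported prevalence of 0.32% (1:310), 70% on treatment, and 22% reaching an LDL-C ≤70 mg/dL.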

Acknowledgment

We are indebted to the staff and patients of the Mayo Employee and Community Health program. We thank Saeed Mehrabi, Majid Rastegar Mojarad, and Carin Y. Smith (Mayo Clinic, Rochester, MN) for assistance with deploying the ePhenotyping algorithm.

Dr. Safarova is supported by American Heart Association Postdoctoral Fellowship Award 16POST27280004. Dr. Kullo is funded by the National Human Genome Research Institute’s electronic Medical Records and Genomics Network through grants HG04599 and HG006379 to Mayo Clinic. The National Human Genome Research Institute and American Heart Association had no role in the design and conduct of the work; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Footnotes

Financial disclosure

None.

References

  • 1.Mortality in treated heterozygous familial hypercholesterolaemia: implications for clinical management. Scientific Steering Committee on behalf of the Simon Broome Register Group. Atherosclerosis. 1999; 142:105–112. [PubMed] [Google Scholar]
  • 2.Austin MA, Hutter CM, Zimmern RL, Humphries SE. Familial hypercholesterolemia and coronary heart disease: a HuGE association review. Am J Epidemiol. 2004;160:421–429. [DOI] [PubMed] [Google Scholar]
  • 3.Green RF, Dotson WD, Bowen S, et al. Genomics in public health: perspective from the Office of Public Health Genomics at the Centers for Disease Control and Prevention (CDC). Healthcare (Basel). 2015; 3(3):830–837. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 4.Watts GF, Lewis B, Sullivan DR. Familial hypercholesterolemia: a missed opportunity in preventive medicine. Nat Clin Pract Cardiovasc Med. 2007;4:404–405. [DOI] [PubMed] [Google Scholar]
  • 5.Teslovich TM, Musunuru K, Smith AV, et al. Biological, clinical and population relevance of 95 loci for blood lipids. Nature. 2010;466: 707–713. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 6.Futema M, Plagnol V, Li K, et al. Whole exome sequencing of familial hypercholesterolaemia patients negative for LDLR/APOB/PCSK9 mutations. J Med Genet. 2014;51:537–544. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 7.Raal FJ, Stein EA, Dufour R, et al. PCSK9 inhibition with evolocumab (AMG 145) in heterozygous familial hypercholesterolaemia (RUTHERFORD-2): a randomised, double-blind, placebo-controlled trial. Lancet. 2015;385:331–340. [DOI] [PubMed] [Google Scholar]
  • 8.Kastelein JJ, Robinson JG, Farnier M, et al. Efficacy and safety of alirocumab in patients with heterozygous familial hypercholesterolemia not adequately controlled with current lipid-lowering therapy: design and rationale of the ODYSSEY FH studies. Cardiovasc Drugs Ther. 2014;28:281–289. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 9.Cuchel M, Meagher EA, du Toit Theron H, et al. Efficacy and safety of a microsomal triglyceride transfer protein inhibitor in patients with homozygous familial hypercholesterolaemia: a single-arm, open-label, phase 3 study. Lancet. 2013;381:40–46. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 10.Santos RD, Duell PB, East C, et al. Long-term efficacy and safety of mipomersen in patients with familial hypercholesterolaemia: 2-year interim results of an open-label extension. Eur Heart J. 2015;36:566–575. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 11.Goldstein JL, Schrott HG, Hazzard WR, Bierman EL, Motulsky AG. Hyperlipidemia in coronary heart disease. II. Genetic analysis of lipid levels in 176 families and delineation of a new inherited disorder, combined hyperlipidemia. J Clin Invest. 1973;52:1544–1568. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 12.Safarova MS, Kullo IJ. My Approach to the Patient With Familial Hypercholesterolemia. Mayo Clin Proc. 2016;91:770–786. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 13.Benn M, Watts GF, Tybjaerg-Hansen A, Nordestgaard BG. Familial hypercholesterolemia in the Danish general population: prevalence, coronary artery disease, and cholesterol-lowering medication. J Clin Endocrinol Metab. 2012;97:3956–3964. [DOI] [PubMed] [Google Scholar]
  • 14.Lahtinen AM, Havulinna AS, Jula A, Salomaa V, Kontula K. Prevalence and clinical correlates of familial hypercholesterolemia founder mutations in the general population. Atherosclerosis. 2015;238:64–69. [DOI] [PubMed] [Google Scholar]
  • 15.Sjouke B, Kusters DM, Kindt I, et al. Homozygous autosomal dominant hypercholesterolaemia in the Netherlands: prevalence, genotype-phenotype relationship, and clinical outcome. Eur Heart J. 2015;36:560–565. [DOI] [PubMed] [Google Scholar]
  • 16.de Ferranti SD, Rodday AM, Mendelson MM, Wong JB, Leslie LK, Sheldrick RC. Prevalence of Familial Hypercholesterolemia in the 1999 to 2012 United States National Health and Nutrition Examination Surveys (NHANES). Circulation. 2016;133:1067–1072. [DOI] [PubMed] [Google Scholar]
  • 17.Nordestgaard BG, Chapman MJ, Humphries SE, et al. Familial hypercholesterolaemia is underdiagnosed and undertreated in the general population: guidance for clinicians to prevent coronary heart disease: consensus statement of the European Atherosclerosis Society. Eur Heart J. 2013;34:3478–3490a. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 18.Neil HA, Hammond T, Huxley R, Matthews DR, Humphries SE. Extent of underdiagnosis of familial hypercholesterolaemia in routine practice: prospective registry study. BMJ. 2000;321:148. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 19.Vodnala D, Rubenfire M, Brook RD. Secondary causes of dyslipidemia. Am J Cardiol. 2012;110:823–825. [DOI] [PubMed] [Google Scholar]
  • 20.World Health Organization. Familial hypercholesterolemia: report of a second WHO Consultation. Geneva, Switzerland: World Health Organization; 1999 WHO publication No. WHO/HGN/FH/CONS/99.2. [Google Scholar]
  • 21.U.S. Dept. of Health and Human Services, Centers for Disease Control and Prevention, Centers for Medicare and Medicaid Services. International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM). 2008. Available at: http://www.cdc.gov/nchs/icd/icd9cm.htm. Accessed September 2015.
  • 22.American Medical Association. CPT Physicians’ current procedural terminology. Chicago: American Medical Association. 1997–2008. [Google Scholar]
  • 23.Kullo IJ, Fan J, Pathak J, Savova GK, Ali Z, Chute CG. Leveraging informatics for genetic studies: use of the electronic medical record to enable a genome-wide association study of peripheral arterial disease. J Am Med Inform Assoc. 2010;17:568–574. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 24.Gottesman O, Kuivaniemi H, Tromp G, et al. The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future. Genet Med. 2013;15:761–771. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 25.Liu H, Bielinski SJ, Sohn S, et al. An Information Extraction Framework for Cohort Identification Using Electronic Health Records. AMIA Jt Summits Transl Sci Proc. 2013;2013:149–153. [PMC free article] [PubMed] [Google Scholar]
  • 26.Ferrucci D, Lally A. UIMA: an architectural approach to unstructured information processing in the corporate research environment. Nat Lang Eng. 2004;10:327–348. [Google Scholar]
  • 27.Harkema H, Dowling JN, Thornblade T, Chapman WW. ConText: an algorithm for determining negation, experiencer, and temporal status from clinical reports. J Biomed Inform. 2009;42:839–851. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 28.Savova GK, Masanz JJ, Ogren PV, et al. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. J Am Med Inform Assoc. 2010;17: 507–513. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 29.Wu ST, Sohn S, Ravikumar KE, et al. Automated chart review for asthma cohort identification using natural language processing: an exploratory study. Ann Allergy Asthma Immunol. 2013;111:364–369. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 30.Mehrabi S, Krishnan A, Roch AM, et al. Identification of Patients with Family History of Pancreatic Cancer - Investigation of an NLP System Portability. Stud Health Technol Inform. 2015;216:604–608. [PMC free article] [PubMed] [Google Scholar]
  • 31.Mehrabi S, Krishnan A, Roch A, et al. Identification of Patients with Family History of Pancreatic Cancer - Investigation of an NLP system Portability. São Paulo, Brazil: MEDINFO; 2015. [PMC free article] [PubMed] [Google Scholar]
  • 32.Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. Evaluation of negation phrases in narrative clinical reports. Proc AMIA Symp. 2001;105–109. [PMC free article] [PubMed] [Google Scholar]
  • 33.R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2014. Available at: http://www.R-project.org/. [Google Scholar]
  • 34.Benn M, Watts GF, Tybjaerg-Hansen A, Nordestgaard BG. Mutations causative of familial hypercholesterolaemia: screening of 98 098 individuals from the Copenhagen General Population Study estimated a prevalence of 1 in 217. Eur Heart J. 2016;37:1384–1394. [DOI] [PubMed] [Google Scholar]
  • 35.Hopkins PN, Toth PP, Ballantyne CM, Rader DJ. Familial hypercholesterolemias: prevalence, genetics, diagnosis and screening recommendations from the National Lipid Association Expert Panel on Familial Hypercholesterolemia. J Clin Lipidol. 2011;5:S9–S17. [DOI] [PubMed] [Google Scholar]
  • 36.Goldberg AC, Hopkins PN, Toth PP, et al. Familial hypercholesterolemia: screening, diagnosis and management of pediatric and adult patients: clinical guidance from the National Lipid Association Expert Panel on Familial Hypercholesterolemia. J Clin Lipidol. 2011;5: S1–S8. [DOI] [PubMed] [Google Scholar]
  • 37.Civeira F. Guidelines for the diagnosis and management of heterozygous familial hypercholesterolemia. Atherosclerosis. 2004;173:55–68. [DOI] [PubMed] [Google Scholar]
  • 38.Huijgen R, Vissers MN, Defesche JC, Lansberg PJ, Kastelein JJ, Hutten BA. Familial hypercholesterolemia: current treatment and advances in management. Expert Rev Cardiovasc Ther. 2008;6:567–581. [DOI] [PubMed] [Google Scholar]
  • 39.Versmissen J, Oosterveer DM, Yazdanpanah M, et al. Efficacy of statins in familial hypercholesterolaemia: a long term cohort study. BMJ. 2008;337:a2423. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 40.O’Brien EC, Roe MT, Fraulo ES, et al. Rationale and design of the familial hypercholesterolemia foundation CAscade SCreening for Awareness and DEtection of Familial Hypercholesterolemia registry. Am Heart J. 2014;167:342–349.e17. [DOI] [PubMed] [Google Scholar]
  • 41.O’Brien EC, DeGoma E, Moriarty P, et al. Initial results from the CASCADE-FH registry: cascade screening for awareness and detection of familial hypercholesterolemia. J Am Coll Cardiol. 2015;65: A1372. [Google Scholar]
  • 42.deGoma EM, Ahmad ZS, O’Brien E, et al. LDL-C Levels and Treatment Patterns Among Adults With Heterozygous Familial Hypercholesterolemia in the United States: Data From the CASCADE-FH Registry. Circulation. 2015;132:A12169. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 43.American College of Cardiology. Familial Hypercholesterolemia: Cardiologist and Patient Perspectives. Available at: http://www.acc.org/membership/member-benefits-and-resources/acc-member-publications/cardiosurve/newsletter/archive/2012/07/familial%20hypercholesterolemia%20cardiologist%20and%20patient%20perspectives. Accessed November 2015.
  • 44.Bell DA, Garton-Smith J, Vickery A, et al. Familial hypercholesterolaemia in primary care: knowledge and practices among general practitioners in Western Australia. Heart Lung Circ. 2014;23:309–313. [DOI] [PubMed] [Google Scholar]
  • 45.Fan J, Arruda-Olson AM, Leibson CL, et al. Billing code algorithms to identify cases of peripheral artery disease from administrative data. J Am Med Inform Assoc. 2013;20:e349–e354. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 46.Newton KM, Peissig PL, Kho AN, et al. Validation of electronic medical record-based phenotyping algorithms: results and lessons learned from the eMERGE network. J Am Med Inform Assoc. 2013;20: e147–e154. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 47.Aspry KE, Furman R, Karalis DG, et al. Effect of health information technology interventions on lipid management in clinical practice: a systematic review of randomized controlled trials. J Clin Lipidol. 2013;7:546–560. [DOI] [PubMed] [Google Scholar]
  • 48.Sidebottom AC, Johnson PJ, VanWormer JJ, Sillah A, Winden TJ, Boucher JL. Exploring electronic health records as a population health surveillance tool of cardiovascular disease risk factors. Popul Health Manag. 2015;18:79–85. [DOI] [PubMed] [Google Scholar]
  • 49.Moja L, Kwag KH, Lytras T, et al. Effectiveness of computerized decision support systems linked to electronic health records: a systematic review and meta-analysis. Am J Public Health. 2014;104:e12–e22. [DOI] [PMC free article] [PubMed] [Google Scholar]
  • 50.Thompson G, O’Horo JC, Pickering BW, Herasevich V. Impact of the Electronic Medical Record on Mortality, Length of Stay, and Cost in the Hospital and ICU: A Systematic Review and Metaanalysis. Crit Care Med. 2015;43:1276–1282. [DOI] [PubMed] [Google Scholar]
  • 51.Damgaard D, Larsen ML, Nissen PH, et al. The relationship of molecular genetic to clinical diagnosis of familial hypercholesterolemia in a Danish population. Atherosclerosis. 2005;180:155–160. [DOI] [PubMed] [Google Scholar]
  • 52.Yoon PW, Scheuner MT, Jorgensen C, Khoury MJ. Developing Family Healthware, a family history screening tool to prevent common chronic diseases. Prev Chronic Dis. 2009;6:A33. [PMC free article] [PubMed] [Google Scholar]
