Author manuscript; available in PMC: 2021 Jan 1.
Published in final edited form as: Eur J Heart Fail. 2019 Nov 20;22(1):159–161. doi: 10.1002/ejhf.1658

Therapeutic futility and phenotypic heterogeneity in HFpEF: what is the role of bionic learning?

David Kao 1, Suneet Purohit 2, Pardeep Jhund 3
PMCID: PMC7301725  NIHMSID: NIHMS1556653  PMID: 31749260

With the recent completion of PARAGON-HF,1 the angiotensin-neprilysin inhibitor sacubitril-valsartan has been added to the growing list of medications without convincing benefit in heart failure with preserved ejection fraction (HFpEF). The cardiology community is once again faced with the question of how to move forward in identifying therapies for HFpEF. One approach has centered on clinical and structural phenotype heterogeneity that can have differing effects on patient outcomes.2,3 Instead of targeting therapies at the HFpEF population as a whole, certain phenotypic subgroups might derive greater medical benefit than others for a given treatment. It is in this realm where machine learning (ML) algorithms may aid in our understanding of heterogeneous diseases. ML algorithms can find patterns in a hypothesis-free manner using multi-dimensional data to identify complex phenotypes based on aggregate features and interactions rather than individual characteristics in isolation. Unsupervised ML has been used previously to explore other heterogeneous disease states such as atrial fibrillation, asthma, and ADHD.4–6 HFpEF is an ideal domain in which to apply ML in light of the failure to identify an effective therapy across all patients with HFpEF and the general acceptance of significant phenotypic heterogeneity among those patients. The underlying assumption is that incompletely described complex phenotypes each have unique pathophysiology, prognosis, and treatment response. Quantitative elucidation of these phenotypes is critical to test this general hypothesis.

It is in this context that Segar et al. report their phenotype analysis in subjects from the Americas cohort of TOPCAT. Using 61 clinical, laboratory, and echocardiographic variables, they identified and characterized three distinct phenotypes that had significantly different risks for adverse clinical events. Subjects from the RELAX trial were used to validate the authors’ algorithm: the top 20 discriminatory variables in TOPCAT were used to classify each RELAX subject as one of the three phenotypes, producing patient groups with clinical profiles and relative outcomes similar to those in the TOPCAT cohort. The authors’ analysis supports prior investigations showing HFpEF to be a spectrum of overlapping entities where diverse patient characteristics intersect in varied combinations to identify clusters of similar individuals with implications for prognosis. These clinical phenotypes also appear to recapitulate certain clinical observations and correlate with risk of adverse cardiovascular outcomes. For example, while there is increasing evidence that the obesity phenotype is associated with increased risk in HFpEF,7,8 the cardiometabolic phenotypes derived from ML algorithms demonstrate an increased risk of adverse clinical outcomes independent of individual metabolic abnormalities such as abdominal obesity and diabetes alone.
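
The validation workflow described above (rank variables by how well they separate the derived phenotypes, then use the top-ranked ones to assign subjects from a second cohort) can be illustrated with a minimal sketch. This is not the authors' algorithm: the variable names, the toy values, the variance-ratio ranking, and the nearest-centroid assignment are all assumptions chosen for brevity, standing in for the model-based methods used in the published analyses.

```python
from statistics import mean, pstdev

# Hypothetical derivation cohort. Labels are assumed to come from a prior
# clustering step; all values are toy numbers, not trial data.
derivation = [
    {"label": "A", "bmi": 24.0, "egfr": 80.0, "sodium": 140.0},
    {"label": "A", "bmi": 25.0, "egfr": 85.0, "sodium": 139.0},
    {"label": "B", "bmi": 38.0, "egfr": 70.0, "sodium": 139.0},
    {"label": "B", "bmi": 40.0, "egfr": 75.0, "sodium": 141.0},
    {"label": "C", "bmi": 28.0, "egfr": 35.0, "sodium": 140.0},
    {"label": "C", "bmi": 27.0, "egfr": 30.0, "sodium": 138.0},
]


def discrimination_score(rows, var):
    """Between-phenotype sum of squares divided by within-phenotype sum of squares."""
    groups = {}
    for r in rows:
        groups.setdefault(r["label"], []).append(r[var])
    grand = mean(r[var] for r in rows)
    between = sum(len(v) * (mean(v) - grand) ** 2 for v in groups.values())
    within = sum(sum((x - mean(v)) ** 2 for x in v) for v in groups.values())
    return between / within if within > 0 else float("inf")


def top_variables(rows, variables, k):
    """Return the k variables with the highest discrimination score."""
    ranked = sorted(variables, key=lambda v: discrimination_score(rows, v), reverse=True)
    return ranked[:k]


def fit_centroids(rows, variables):
    """Store the derivation-cohort scaling and one standardized centroid per phenotype."""
    scale = {v: (mean(r[v] for r in rows), pstdev([r[v] for r in rows])) for v in variables}
    centroids = {}
    for label in sorted({r["label"] for r in rows}):
        members = [r for r in rows if r["label"] == label]
        centroids[label] = {
            v: mean((r[v] - scale[v][0]) / scale[v][1] for r in members) for v in variables
        }
    return scale, centroids


def classify(subject, scale, centroids):
    """Assign a new subject to the nearest centroid, reusing the derivation scaling."""
    def squared_distance(centroid):
        return sum(((subject[v] - m) / s - centroid[v]) ** 2 for v, (m, s) in scale.items())
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


selected = top_variables(derivation, ["bmi", "egfr", "sodium"], 2)
scale, centroids = fit_centroids(derivation, selected)
new_subject = {"bmi": 39.0, "egfr": 72.0, "sodium": 140.0}  # hypothetical validation subject
assigned = classify(new_subject, scale, centroids)
```

Note that classifying the new subject requires both the fitted centroids and the scaling parameters from the derivation cohort; the model alone is not enough to reproduce the assignment.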

There are now multiple reports of complex HFpEF phenotypes identified from independent datasets using different unsupervised ML methods. These analyses have identified between three and six phenotypes depending on the method and dataset used. These and other novel HFpEF phenotypes derived in a similar manner do appear to have variable prognoses and possibly treatment response. In each case, descriptive statistics of these phenotypes reveal some patterns that are seen in clinical practice, such as the obese, cardiometabolic phenotype. The appearance of these latent patterns in multiple datasets using multiple unsupervised ML approaches, in combination with clinical intuition, suggests that these may be genuine complex phenotypes with clinical importance.

In ML analyses it is common to include as many different data elements as possible, given that the relationships and interactions between them may be complex and obscure. Using this approach, it is possible to find truly novel classification strategies to address the problem at hand, in this case identification of important HFpEF phenotypes. However, most ML methods do not incorporate underlying assumptions as to any organizing principles or rules governing any underlying latent patterns. In the case of clinical conditions, there are limits of mechanistic feasibility related to pathophysiology and drug mechanism wherein certain associations between features either have no known physiologic basis or are not clinically useful in therapeutic development (e.g. country of enrollment and serum chloride). A collection of features may therefore have utility for understanding epidemiology (e.g. social determinants) but be highly unlikely to yield a potential drug target. This raises the question of whether there must be an informed or expert-based feature selection step to limit exploration to factors that have biologic or clinical plausibility, rather than including all features that meet certain quantitative criteria such as completeness and collinearity. We propose that features included in ML analyses should themselves be classified into domains like ‘etiology,’ ‘heart failure severity,’ and ‘cardiac structure/function’ (Table 1). This approach allows for the possibility that HFpEF driven by variable underlying pathophysiology (e.g. hypertrophic CM vs. cardiometabolic vs. senile CM) may in each case pass through similar stages of HF progression, rather than associating a single degree of functional HF phenotype (e.g. diastolic dysfunction with high LA pressure) with a single set of comorbidities.

Table 1. Suggested dimensions for unsupervised machine learning and cluster analysis to identify biologically plausible and clinically relevant HFpEF phenotypes

Dimension | Components
Etiology | Demographics, comorbidities, medical/surgical/family history, environment/exposure
Severity | Vital signs, end organ function, symptom burden, exercise capacity
Structure | Echocardiography, cardiac magnetic resonance imaging
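
As a rough illustration of how the dimensions in the table might be applied before clustering, the sketch below assigns candidate variables to a domain and sets aside anything unassigned for expert review. The variable names and the mapping itself are hypothetical; only the three dimension names come from the table.

```python
# Hypothetical mapping of dataset columns to the three proposed dimensions.
# The column names are illustrative assumptions, not a validated variable list.
FEATURE_DOMAINS = {
    "etiology": ["age", "sex", "diabetes", "hypertension", "family_history"],
    "severity": ["systolic_bp", "egfr", "nyha_class", "six_minute_walk_m"],
    "structure": ["lv_mass_index", "la_volume_index", "e_over_e_prime"],
}


def partition_features(columns):
    """Split columns into per-domain lists plus a list needing expert review."""
    lookup = {
        feature: domain
        for domain, features in FEATURE_DOMAINS.items()
        for feature in features
    }
    by_domain = {domain: [] for domain in FEATURE_DOMAINS}
    unassigned = []
    for column in columns:
        domain = lookup.get(column)
        if domain is None:
            # No domain means no stated biologic rationale yet: flag, do not cluster on it.
            unassigned.append(column)
        else:
            by_domain[domain].append(column)
    return by_domain, unassigned


by_domain, needs_review = partition_features(
    ["age", "egfr", "la_volume_index", "country_of_enrollment", "serum_chloride"]
)
```

Keeping the domains separate also leaves open the option of clustering within each dimension independently, so that etiology and stage of disease are not forced into a single label.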

It is theoretically possible to select features for ML analysis in an automated or computational fashion using semantically rich, holistically integrated knowledge bases,11 a practice sometimes called ‘cyborg learning.’12 Such approaches are dependent on the maturity and granularity of structured knowledge bases, which are limited in representation of complex pathophysiology. In clinical conditions such as HFpEF, biologically informed feature selection is presently accomplished more efficiently by domain experts such as clinicians who manage HFpEF. In doing so, the underlying mechanism for any latent patterns identified may be more apparent, thereby facilitating hypothesis generation for potential therapeutic targets. This strategy in effect mimics conventional clinical syndrome discovery (e.g. metabolic syndrome, systemic lupus erythematosus, Crohn’s disease), which usually evolves slowly over many years but can occur at an accelerated pace enabled by ML and computational methods applied to large sets of high-quality data. This we think of as ‘bionic learning.’

There is much debate regarding the importance of using ‘explainable’ ML, where the relationship between features and classification can be described, versus unexplainable ML, where that relationship remains opaque. Phenotypes generated using unexplainable ML may be summarized using descriptive statistics, but the specific interactions that define those phenotypes cannot be determined. Conversely, with explainable methods, the exact characteristics most important for phenotype identification are known, possibly reflecting the underlying pathophysiology and suggesting potential therapeutic targets. It is unclear whether explainability of an ML method is critical for cluster-based phenotyping to identify novel, clinically plausible phenotypes and treatments, but at present it does appear crucial for acceptance by the clinical community. Explainability is currently required in the European Union for any algorithm used for decision-making in clinical practice9 and may also be required by the FDA for clinical decision support software.10

There are challenges in refining ML-derived phenotype descriptions into consistent forms to use for further study. In other words, what do we do with the results of these ML analyses? Visual inspection identifies common phenotypes between the results of at least three studies, including that of Segar et al., producing a general and informal sense of phenotype selection by consensus, which is somewhat analogous to ensemble ML methods. However, this qualitative approach does not produce a consistent, common classification strategy to use in clinical study design or in practice. In addition to the biologic plausibility issues discussed above, this practical consideration of clinical application may inform the requirements for selecting an ML method and/or defining archetypal HFpEF phenotypes. In the three primary analyses discussed here (Shah et al., Segar et al., Kao et al.), ML-derived phenotypes were identified in a training set and associations validated in an independent test dataset (Northwestern HFpEF program; TOPCAT/RELAX; I-PRESERVE/CHARM).13,14 These three examples demonstrate that phenotype cluster definitions may be applied to independent datasets, although they are presented with variable degrees of replication. Two analyses, including that presented by Segar et al., use the ‘predict’ functions unique to the R packages used (mclust, VarSelLCM) to apply the relevant model to the independent dataset.13 To use these classification models, an interested reader would have to obtain the R frame containing the predictive models as well as full details of the data pre-processing needed to apply them to the independent datasets. The third provides, as part of the supplemental material, specific coefficients, formulas, and tools to perform independent classification directly from the publication.14 In all cases, a relatively sophisticated computational resource is necessary to perform phenotype classification. Alternatively, simplified classification methods such as classification and regression trees could be used to derive clinical criteria that are easier to use in routine care. Whatever the approach used, it is becoming increasingly important to develop means of a) identifying phenotypes in a way that accounts for multiple ML analyses and b) applying those phenotype definitions to patients, in order to realize the potential of novel prognostic and therapeutic models.
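
To make the idea of a simplified, tree-based criterion concrete, the sketch below finds the single threshold that best reproduces a set of cluster labels, which is the first split a classification tree would make. It is a minimal illustration under stated assumptions: the subjects, variable names, and phenotype labels are invented, and the resulting cut-off has no clinical meaning.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(rows, variables, target="phenotype"):
    """Return (weighted impurity, variable, threshold) for the best single split."""
    best = None
    n = len(rows)
    for var in variables:
        values = sorted({r[var] for r in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [r[target] for r in rows if r[var] <= threshold]
            right = [r[target] for r in rows if r[var] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or impurity < best[0]:
                best = (impurity, var, threshold)
    return best


# Hypothetical subjects with phenotype labels from an earlier clustering step.
subjects = [
    {"bmi": 36.0, "age": 62, "phenotype": "cardiometabolic"},
    {"bmi": 39.0, "age": 70, "phenotype": "cardiometabolic"},
    {"bmi": 34.0, "age": 66, "phenotype": "cardiometabolic"},
    {"bmi": 26.0, "age": 64, "phenotype": "other"},
    {"bmi": 29.0, "age": 75, "phenotype": "other"},
    {"bmi": 24.0, "age": 68, "phenotype": "other"},
]

impurity, variable, threshold = best_split(subjects, ["bmi", "age"])
```

A rule of this form ("variable above or below a threshold") can be applied at the bedside without software, at the cost of approximating rather than reproducing the original cluster assignments.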

There are currently no medical therapies that have convincingly reduced morbidity or mortality in HFpEF. With the use of ML algorithms, we are now able to study increasingly complex sets of data to understand the clinical heterogeneity that exists. The knowledge generated can be used to design future trials with patients of similar clinical profile to better understand their underlying physiology and create a personalized approach to treatment.

REFERENCES

1. Solomon SD, McMurray JJ V, Anand IS, Ge J, Lam CSP, Maggioni AP, Martinez F, Packer M, Pfeffer MA, Pieske B, Redfield MM, Rouleau JL, Veldhuisen DJ van, Zannad F, Zile MR, Desai AS, Claggett B, Jhund PS, Boytsov SA, Comin-Colet J, Cleland J, Düngen H-D, Goncalvesova E, Katova T, Kerr Saraiva JF, Lelonek M, Merkely B, Senni M, Shah SJ, Zhou J, et al. Angiotensin-Neprilysin Inhibition in Heart Failure with Preserved Ejection Fraction. N Engl J Med 2019;1–11.
2. Zile MR, Gottdiener JS, Hetzel SJ, McMurray JJ, Komajda M, McKelvie R, Baicu CF, Massie BM, Carson PE, Investigators I-P. Prevalence and significance of alterations in cardiac structure and function in patients with heart failure and a preserved ejection fraction. Circulation United States; 2011;124:2491–2501.
3. Dunlay SM, Roger VL, Redfield MM. Epidemiology of heart failure with preserved ejection fraction. Nat Rev Cardiol Nature Publishing Group; 2017;14:591–602.
4. Kirchhof P, Breithardt G, Aliot E, Khatib S Al, Apostolakis S, Auricchio A, Bailleul C, Bax J, Benninger G, Blomstrom-Lundqvist C, Boersma L, Boriani G, Brandes A, Brown H, Brueckmann M, Calkins H, Casadei B, Clemens A, Crijns H, Derwand R, Dobrev D, Ezekowitz M, Fetsch T, Gerth A, Gillis A, Gulizia M, Hack G, Haegeli L, Hatem S, Georg Häusler K, et al. Personalized management of atrial fibrillation: Proceedings from the fourth Atrial Fibrillation competence NETwork/European Heart Rhythm Association consensus conference. Europace 2013;15:1540–1556.
5. Elia J, Arcos-Burgos M, Bolton KL, Ambrosini PJ, Berrettini W, Muenke M. ADHD latent class clusters: DSM-IV subtypes and comorbidity. Psychiatry Res Elsevier Ltd.; 2009;170:192–198.
6. Siroux V, Basagan X, Boudier A, Pin I, Garcia-Aymerich J, Vesin A, Slama R, Jarvis D, Anto JM, Kauffmann F, Sunyer J. Identifying adult asthma phenotypes using a clustering approach. Eur Respir J 2011;38:310–317.
7. Tsujimoto T, Kajio H. Abdominal Obesity Is Associated With an Increased Risk of All-Cause Mortality in Patients With HFpEF. J Am Coll Cardiol Elsevier; 2017;70:2739–2749.
8. Obokata M, Reddy YNV, Pislaru SV, Melenovsky V, Borlaug BA. Evidence Supporting the Existence of a Distinct Obese Phenotype of Heart Failure with Preserved Ejection Fraction. Circulation 2017;136:6–19.
9. London AJ. Artificial Intelligence and Black-Box Medical Decisions: Accuracy versus Explainability. Hastings Cent Rep 2019;49:15–21.
10. US Food & Drug Administration. Clinical Decision Support Software - Draft Guidance for Industry and Reviewers and Food and Drug Administration Staff. US Department of Health and Human Services; 2019. https://www.fda.gov/media/109618/download
11. Livingston KM, Bada M, Baumgartner WA, Hunter LE. KaBOB: ontology-based semantic integration of biomedical databases. BMC Bioinformatics England; 2015;16:126.
12. Zeng D, Wu Z. From Artificial Intelligence to Cyborg Intelligence. IEEE Intel Sys 2014;29:2–4.
13. Shah SJ, Katz DH, Selvaraj S, Burke MA, Yancy CW, Gheorghiade M, Bonow RO, Huang C-CC, Deo RC. Phenomapping for novel classification of heart failure with preserved ejection fraction. Circulation United States; 2015;131:269–279.
14. Kao DP, Lewsey JD, Anand IS, Massie BM, Zile MR, Carson PE, McKelvie RS, Komajda M, McMurray JJ, Lindenfeld JA. Characterization of subgroups of heart failure patients with preserved ejection fraction with possible implications for prognosis and treatment response. Eur J Heart Fail 2015;17:925–935.
