International Journal of Cardiology Congenital Heart Disease. 2024 Jul 2;17:100524. doi: 10.1016/j.ijcchd.2024.100524

Phenotypic clustering of repaired Tetralogy of Fallot using unsupervised machine learning

Xander Jacquemyn, Bhargava K Chinni, Ashish N Doshi, Shelby Kutty, Cedric Manlhiot
PMCID: PMC11658329  PMID: 39711763

Abstract

Objective

Repaired Tetralogy of Fallot (rTOF), a complex congenital heart disease, exhibits substantial clinical heterogeneity. Accurate prediction of disease progression and tailored patient management remain elusive. We aimed to categorize rTOF patients into distinct phenotypes based on clinical variables and variables obtained from cardiac magnetic resonance (CMR) imaging.

Methods

A retrospective observational cohort study of rTOF patients with at least two CMR assessments between 2005 and 2022 was performed. From patient records, clinical variables, CMR measurements, and electrocardiogram data were collected and processed. Baseline and follow-up variables between subsequent CMR studies were used to assess both inter- and intrapatient disease heterogeneity. Subsequently, unsupervised machine learning was performed, involving dimensionality reduction using principal component analysis and K-means clustering to identify different phenotypic clusters.

Results

In total, 155 patients (54.2 % male, median age 14.9 years) were included and followed for a median duration of 9.9 years. A total of 459 CMR studies were included in the analysis for the identification of phenotypic clusters. We identified four distinct rTOF phenotypes: (1) stable/slow deteriorating, (2) deteriorating, structural remodeling, (3) deteriorated, indicated for pulmonary valve replacement, and (4) younger patients with coexisting anomalies. These phenotypes exhibited differential clinical profiles (p < 0.01), cardiac remodeling patterns (p < 0.01), and intervention rates (p < 0.01).

Conclusions

Unsupervised machine learning analysis revealed four discrete phenotypes within the rTOF population, elucidating the substantial disease heterogeneity at both the population and the patient level. Our study underscores the potential of unsupervised machine learning as a valuable tool for characterizing complex congenital heart disease and potentially tailoring interventions.

Keywords: CMR, Risk stratification, Tetralogy of Fallot, Machine learning, Phenotypic clustering

Graphical abstract

Image 1

Highlights

  • Conventional approaches to monitoring patients with rTOF focus on imaging and use clinical risk prediction scores.

  • Machine learning analyzes patterns within a population, presenting a novel pathway for refining risk stratification and patient phenotyping.

  • Four distinct phenogroups with variations in clinical profiles, remodeling patterns, and interventions were identified.

  • Insights from machine learning should be integrated into future research, as they may offer a more targeted approach.

1. Introduction

Tetralogy of Fallot (TOF) is a complex form of congenital heart disease (CHD) and the most common cyanotic heart disease [1]. With an estimated incidence of 5 per 10,000 live births, TOF remains an important clinical concern [2]. Considerable progress has been achieved in the repair and management of patients with TOF, driven by improved understanding of cardiac physiology and surgical techniques. Observational studies have reported promising 25-year survival rates of approximately 95 % following successful repair (rTOF), necessitating a change in focus from perioperative management towards lifelong monitoring and care [3]. Cardiac magnetic resonance (CMR) imaging remains the reference standard imaging modality for the evaluation of volumetric measurements, biventricular function, and quantification of valvular regurgitation [4,5]. Serial CMR studies are recommended in clinical practice because of the growing awareness of the importance of timely identification of ventricular deterioration and the reliance on quantification of ventricular parameters and pulmonary regurgitation (PR) for recommending pulmonary valve replacement (PVR) [4,5]. Multiple observational studies have demonstrated an association of ventricular dysfunction, PR, and myocardial fibrosis with adverse outcomes [6–9]. To address this observation, clinical risk prediction scores including hemodynamic, structural, and electrophysiological risk factors have been developed [10,11]. However, it has also been demonstrated that traditional ventricular function parameters themselves are not sensitive in the long-term prediction of deterioration of ventricular performance in these patients [6,9,12]. In contrast to these traditional methods, unsupervised machine learning approaches offer a novel avenue to enhance risk stratification and patient phenotyping by analyzing patterns and structures within the given population [13].
As such, we hypothesized that clinical variables and the progression of cardiac mechanics can be used to identify distinct phenotypes within the cohort of patients with rTOF, potentially enabling a more granular understanding of the heterogeneity in disease progression over time [14].

2. Methods

2.1. Study population

This single-center study retrospectively assessed all patients diagnosed with rTOF, presenting to our institution between 2005 and 2022. All patients with a history of rTOF were included if they had undergone at least 2 CMR assessments at our institution. Patients with incomplete surgical history or with missing follow-up were excluded. Additionally, patients with poor CMR image quality were excluded from analysis. The study was approved by the Johns Hopkins Medicine institutional review board with waiver of informed consent due to the retrospective nature of the study.

2.2. Clinical variables

A detailed surgical history, a description of specific patient characteristics, and clinical variables at the time of CMR were obtained from medical records. These clinical variables included age, sex, body surface area (BSA), weight, height, and any relevant medications (diuretics, digoxin, beta-blockers, calcium-channel blockers, angiotensin converting enzyme inhibitors, and angiotensin receptor blockers).

2.3. CMR measurements included

The first CMR assessment performed at our center was considered the patient's initial timepoint. Subsequent CMR studies were considered follow-up timepoints of interest. CMR studies were performed on a 1.5-T Siemens scanner (Siemens Medical Solutions, Pennsylvania, USA) using a standardized imaging protocol. Volumetric and functional data from the left and right ventricles were obtained, and the pulmonary regurgitant fraction (PRF) was quantified from velocity flow mapping using standard protocols. Z-scores were obtained by comparison to reference ranges for CMR in adults and children [15].

2.4. Data processing

Missing data were handled using a simple imputer for categorical variables and an iterative imputer with a decision tree regressor for continuous variables. Time intervals between CMR studies were considered the primary data points; for each interval, the baseline CMR parameters (CMRn) were used. Then, to quantify the annualized rate of change for volumetric and functional parameters during the CMR interval (ΔCMRn+1-CMRn), the following formula was used:

$$\text{Annualized Ratio}_{\text{Variable}} = \frac{\left(\dfrac{\text{Variable}_{n+1}}{\text{Variable}_{n}}\right)}{\text{Years between } n \text{ and } n+1}$$
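
As a minimal sketch of this calculation, assuming the formula is read as the ratio of consecutive measurements divided by the interval length in years, the Python functions below compute the annualized ratio for a single variable. The function names, the 365.25-day year, and the example values are illustrative assumptions and are not taken from the study code.

```python
from datetime import date


def years_between(first_scan: date, second_scan: date) -> float:
    """Interval length in years, assuming a 365.25-day year (an assumption)."""
    return (second_scan - first_scan).days / 365.25


def annualized_ratio(value_n: float, value_n1: float, years: float) -> float:
    """Ratio of the follow-up value to the baseline value, divided by the interval in years."""
    if years <= 0:
        raise ValueError("the follow-up study must come after the baseline study")
    if value_n == 0:
        raise ValueError("the baseline value must be non-zero")
    return (value_n1 / value_n) / years


# Hypothetical example: a variable rising from 100 to 120 over a 2-year interval.
example = annualized_ratio(100.0, 120.0, 2.0)
```

Under this reading, an unchanged variable yields a value of 1 divided by the interval length, so the result depends on the interval length even without any change in the variable.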

In addition, constant (time-independent) patient characteristics such as sex, ethnicity, presence of any chromosomal abnormalities, baseline anatomy and surgical history were used. Lastly, reinterventions were only included as data points if the reintervention occurred between respective CMR studies.
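
The imputation strategy described at the start of this section could be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the study's implementation: the most-frequent strategy for categorical variables, the tree depth, the iteration limit, and the synthetic data are all hypothetical choices.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (activates IterativeImputer)
from sklearn.impute import IterativeImputer, SimpleImputer
from sklearn.tree import DecisionTreeRegressor


def impute_features(categorical, continuous, seed=0):
    """Impute categorical columns with a simple imputer and continuous columns iteratively."""
    # Assumption: missing categorical values are replaced by the most frequent category.
    cat_imputer = SimpleImputer(strategy="most_frequent")
    # Each continuous feature is modelled from the others with a decision tree regressor.
    cont_imputer = IterativeImputer(
        estimator=DecisionTreeRegressor(max_depth=3, random_state=seed),
        max_iter=10,
        random_state=seed,
    )
    return cat_imputer.fit_transform(categorical), cont_imputer.fit_transform(continuous)


# Synthetic stand-in data (not patient data): one binary column, three continuous columns.
rng = np.random.default_rng(0)
categorical = np.array([[1.0], [1.0], [0.0], [np.nan]])
continuous = rng.normal(size=(30, 3))
continuous[[2, 7, 11], [0, 1, 2]] = np.nan  # knock out three values
cat_filled, cont_filled = impute_features(categorical, continuous)
```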

2.5. Unsupervised machine learning analysis

Nine multicollinear features were dropped using a correlation threshold of r = 0.8 to avoid overfitting. After centering and scaling the features using a standard scaler, principal component analysis (PCA) was applied as a linear transformation [16]. Subsequently, sixteen principal components (PCs) that accounted for 81.4 % of the variance in the data were selected. These selected PCs were then used as input for unsupervised cluster analysis. After dimensionality reduction using PCA, we employed the K-means clustering algorithm to segment the patient time intervals into distinct groups based on their similarity [17]. We implemented an iterative process to determine the optimal configuration for K-means clustering, using the silhouette score as the evaluation metric [18]. From this analysis, the most suitable number of clusters for this dataset was four. This configuration yielded a silhouette score of 0.19, using the k-means++ initialization method and a maximum of 100 iterations assigning patient intervals to the nearest cluster centroid. The proportions of time intervals assigned to clusters ‘1’, ‘2’, ‘3’ and ‘4’ were 64.4 %, 10.8 %, 4.4 %, and 20.3 %, respectively.
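
The sequence of steps above (collinearity filter, scaling, PCA, K-means with silhouette-based selection of the number of clusters) can be sketched with scikit-learn. This is a hedged illustration on synthetic data, not the study pipeline: the rule for which feature of a correlated pair is dropped, the candidate range of cluster numbers, `n_init`, and choosing the highest silhouette score are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def drop_collinear(X, threshold=0.8):
    """Indices of columns kept after dropping the later column of each pair with |r| > threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        if all(corr[j, i] <= threshold for i in keep):
            keep.append(j)
    return keep


def cluster_intervals(X, variance=0.814, k_range=range(2, 7), seed=0):
    """Filter, scale, reduce with PCA, then pick the K-means solution with the best silhouette."""
    keep = drop_collinear(X)
    scaled = StandardScaler().fit_transform(X[:, keep])
    # A float n_components keeps the fewest PCs whose cumulative variance exceeds that share.
    components = PCA(n_components=variance, svd_solver="full").fit_transform(scaled)
    scores, labelings = {}, {}
    for k in k_range:
        model = KMeans(n_clusters=k, init="k-means++", max_iter=100, n_init=10, random_state=seed)
        labelings[k] = model.fit_predict(components)
        scores[k] = silhouette_score(components, labelings[k])
    best_k = max(scores, key=scores.get)
    return keep, components, labelings[best_k], scores


# Synthetic stand-in data: four groups in four features plus one near-duplicate feature.
rng = np.random.default_rng(0)
centers = np.array([[0, 0, 0, 0], [10, 10, 0, 0], [0, 10, 10, 0], [10, 0, 0, 10]], dtype=float)
X = np.vstack([rng.normal(c, 1.0, size=(100, 4)) for c in centers])
X = np.column_stack([X, 2.0 * X[:, 0] + rng.normal(0, 0.1, size=len(X))])
keep, components, labels, scores = cluster_intervals(X)
```

On well-separated synthetic groups the silhouette score is typically much higher than the 0.19 reported here; a low score on real data suggests overlapping clusters.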

2.6. Statistical analysis

Normality of the distribution of continuous variables was tested using the Shapiro–Wilk test. Continuous variables are expressed as mean ± standard deviation or median with interquartile range (IQR), as appropriate. Categorical variables are expressed as counts and relative frequency (%). Between-cluster differences were compared using analysis of variance (ANOVA) for normally distributed variables and the Kruskal–Wallis test for non-normally distributed variables. Post-hoc pairwise comparisons were adjusted for multiple testing using the Tukey correction method or the Benjamini–Hochberg method, as appropriate. A 2-tailed p-value <0.05 was considered statistically significant. All analyses were completed with R Statistical Software (version 4.1.1, R Foundation for Statistical Computing, Vienna, Austria) and Python (version 3.11.3, Python Software Foundation).
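
To illustrate the Benjamini–Hochberg adjustment mentioned above, the following pure-Python sketch implements the standard step-up procedure for a list of pairwise p-values. It is a generic implementation, not the code used in the study, and the p-values in the example are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values capped at 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical pairwise p-values from four post-hoc comparisons.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
```

A comparison would then be called significant when its adjusted p-value falls below the chosen threshold, here 0.05.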

3. Results

3.1. Study population

One hundred fifty-five patients were eligible for inclusion in the study after exclusion of 13 due to incomplete history, missing follow-up, or poor-quality imaging (Fig. 1). These 155 patients (median age at baseline CMR of 14.9 years [interquartile range 10.9–20.4], range 2.5 months–62.5 years, 84 males) were followed for between 1.5 and 19.1 years, with a median of 9.9 years (interquartile range 6.4–13.8). Clinical characteristics of the study cohort are summarized in Table 1, along with the percentage of missingness for each variable. Fifty-one patients (23.9 %) underwent PVR prior to the first CMR. During the study period, 40 patients (25.8 %) underwent PVR (Fig. 2), of which 30 (75.0 %) were transcatheter procedures. Repeat PVR was common, with 28 patients (17.5 %) requiring at least 1 reintervention. Additional procedures were frequent, with 12 patients (7.7 %) undergoing implantation of a permanent pacemaker. Regarding clinical outcomes, 4 patients (2.6 %) required hospitalization for heart failure and 3 patients died, 1 of whom died of a cardiac cause (cardiac arrhythmia). During longitudinal follow-up, patients underwent 459 CMR studies in total (mean of 2.9 ± 1.3 studies per patient, range 2–8). These resulted in 295 CMR intervals (ΔCMRn+1-CMRn) available for analysis, with a mean of 3.7 ± 2.6 years (median 3.2 years [interquartile range 2.0–4.5]) between CMR studies.

Fig. 1.

Patient flowchart and study population.

LEGEND: Study flowchart according to the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) guidelines.

Table 1.

Clinical characteristics of rTOF cohort.

| Characteristic | Missing, n (%) | rTOF (N = 155) |
|---|---|---|
| Demographics | | |
| Age, years | 0 (0 %) | 14.9 [10.9–20.4] |
| Male | 0 (0 %) | 84 (54.2) |
| White | 7 (5 %) | 105 (67.7) |
| Trisomy 21 | 0 (0 %) | 10 (6.5) |
| DiGeorge syndrome | 0 (0 %) | 16 (10.3) |
| Scoliosis | 0 (0 %) | 12 (7.7) |
| Asthma | 0 (0 %) | 17 (11.0) |
| Anatomy | | |
| Absent PV syndrome | 24 (16 %) | 13 (8.4) |
| Bicuspid PV | 24 (16 %) | 11 (7.1) |
| Aberrant subclavian arteries | 8 (5 %) | 15 (10.0) |
| Persistent LSVC | 6 (4 %) | 10 (6.5) |
| Right aortic arch | 0 (0 %) | 45 (29.0) |
| Anomalous origins of coronary arteries | 6 (4 %) | 12 (7.7) |
| MAPCA | 2 (1 %) | 26 (16.8) |
| Surgical Data | | |
| Palliative surgery | 0 (0 %) | 50 (32.3) |
| Age at palliative surgery, days | 0 (0 %) | 37.0 [5.0–95.0] |
| Age at primary repair, years | 0 (0 %) | 0.6 [0.3–1.4] |
| Surgical procedure | 34 (22 %) | |
| Transannular patch | | 91 (58.7) |
| RV-PA conduit | | 24 (15.5) |
| Valve-sparing repair | | 6 (3.9) |

ABBREVIATIONS: LSVC: left superior vena cava, MAPCA: major aortopulmonary collateral arteries, PV: pulmonary valve, RV-PA: right ventricle to pulmonary artery.

Fig. 2.

Freedom from pulmonary valve replacement.

LEGEND: Kaplan–Meier curves demonstrating freedom from PVR in the overall population (red) following the first baseline CMR study, and freedom from secondary PVR (dark grey) following index PVR in a subset of patients. ABBREVIATIONS: CMR: cardiac magnetic resonance, PVR: pulmonary valve replacement.

3.2. Exploration of the low-dimensional output space

The PCA retained 16 PCs that accounted for 81.4 % of the variance. Exploration of the first 3 PCs (explaining 34.9 % of the variance) revealed that (1) PC1 seems to represent a combination of various cardiac functional and mechanical metrics and age at the time of CMR, (2) PC2 consists of features that capture variations related to different cardiac and anatomical metrics, as well as some ECG parameters, and (3) PC3 consists of variations related to specific valve morphology, as well as surgical interventions and some anthropometric measurements. Subsequently, unsupervised k-means clustering identified four distinct clusters, which corresponded to the quadrants identified by PCA (Fig. 3).

Fig. 3.

Exploration of the low dimensional output space.

LEGEND: Exploration of clusters according to the first 3 principal components in each direction. ABBREVIATIONS: PC: principal component.

3.3. Unsupervised machine learning revealed four distinct patterns

The cluster analysis identified four clusters with differential phenotypes (Fig. 4). As shown in Table 2, the four clusters were phenotypically distinct in anthropometric measurements and in cardiac, anatomical, and electrocardiographic metrics. Furthermore, as demonstrated in Table 3, the annualized rates of change in volumetric and functional parameters were phenotypically distinct across clusters.

Cluster 1 (“stable/slow deteriorating” rTOF, N = 190) had the smallest indexed ventricular volumes and PRF at baseline. While left ventricular (LV) volumes tended to be normal, right ventricular (RV) volumes were already elevated. Furthermore, Cluster 1 had a high prevalence of increased BMI, weight, and height compared to the other clusters. During follow-up, this cluster demonstrated an increased annualized ratio of RV volumetric parameters, resulting in an increasing RV/LV ratio.

Cluster 2 (“deteriorating, structural remodeling” rTOF, N = 32) had a higher prevalence of males (75.0 %) and LV dilatation (40.6 % [defined as LVEDVi Z value > 2]). These patients also demonstrated increased LVMi and an increased cardiac index. Regarding the RV, volumes were already above average, but most did not reach an RV/LV ratio >2 (21.9 %) or RV end-diastolic volume index ≥160 ml/m2 (37.5 %). During follow-up, Cluster 2 demonstrated decreasing ventricular volumes, with the most apparent decrease in LV volume.

Cluster 3 (“deteriorated” rTOF indicated for PVR, N = 13) had the highest PRF and the highest prevalence of RV dilatation, with the largest RV volumes. Furthermore, most already had indications for PVR at the index CMR: RV/LV ratio >2 (69.2 %) or RV end-diastolic volume index ≥160 ml/m2 (61.5 %). During follow-up, 76.9 % underwent PVR with subsequent annualized reductions in PRF and RV volumes.

Lastly, Cluster 4 (“younger” rTOF with coexisting anomalies, N = 60) was significantly younger than all other clusters, had a higher prevalence of bicuspid PV and persistent left superior vena cava, and demonstrated lower anthropometric measurements such as decreased BMI, weight, and height. In addition, despite their young age, these patients already demonstrated increased RV volume overload (RV/LV ratio >2 [28.3 %] or RV end-diastolic volume index ≥160 ml/m2 [15.0 %]) and a high PRF. During follow-up, Cluster 4 demonstrated increases in annualized rates of ventricular volumes, with the largest increases observed in the RV, and increases in annualized rates of PRF.

Fig. 4.

Summary of the distinct phenogroups.

LEGEND: Clinical characteristics and features of CMR imaging patterns and annualized rates of change in the 4 clusters. ABBREVIATIONS: BMI: body mass index, CMR: cardiac magnetic resonance, LV: left ventricle, PRF: pulmonary regurgitant fraction, PVR: pulmonary valve replacement, RV: right ventricle.

Table 2.

Clinical characteristics of rTOF cohort according to identified phenogroups.

| Characteristic | Cluster 1 (N = 190) | Cluster 2 (N = 32) | Cluster 3 (N = 13) | Cluster 4 (N = 60) | p-value |
|---|---|---|---|---|---|
| Demographics | | | | | |
| Male | 91 (47.9) | 24 (75.0) | 7 (53.8) | 34 (56.7) | 0.04* |
| White | 132 (69.5) | 23 (71.9) | 12 (92.3) | 42 (70.0) | 0.39 |
| Trisomy 21 | 14 (7.4) | 0 (0.0) | 0 (0.0) | 4 (6.7) | 0.46 |
| DiGeorge syndrome | 23 (12.1) | 4 (12.5) | 2 (15.4) | 11 (18.3) | 0.62 |
| Scoliosis | 10 (5.3) | 2 (6.3) | 0 (0.0) | 9 (15.0) | 0.08 |
| Asthma | 21 (11.1) | 2 (6.3) | 0 (0.0) | 7 (11.7) | 0.66 |
| Anatomy | | | | | |
| Absent PV syndrome | 14 (7.4) | 7 (21.9) | 1 (7.7) | 6 (10.0) | 0.09 |
| Bicuspid PV | 6 (3.2) | 1 (3.1) | 0 (0.0) | 8 (13.3) | 0.02 |
| Aberrant subclavian arteries | 16 (8.4) | 5 (15.6) | 5 (38.5) | 7 (11.7) | 0.06 |
| Persistent LSVC | 8 (4.2) | 1 (3.1) | 3 (23.1) | 7 (11.7) | 0.02 |
| Right aortic arch | 55 (28.9) | 12 (37.5) | 5 (38.5) | 25 (41.7) | 0.25 |
| Anomalous origins of coronary arteries | 14 (7.4) | 1 (3.1) | 2 (15.4) | 2 (3.3) | 0.32 |
| MAPCA | 33 (17.4) | 4 (12.5) | 2 (15.4) | 11 (18.3) | 0.87 |
| Baseline CMR | | | | | |
| Age, years | 20.2 [16.0–26.0] | 14.3 [10.9–19.0] | 20.5 [16.8–26.1] | 9.50 [6.5–13.5] | <0.01*‡¶§# |
| BSA, m2 | 1.8 [1.6–2.0] | 1.4 [1.0–1.6] | 1.6 [1.6–1.8] | 1.0 [0.8–1.2] | <0.01*‡¶§# |
| BMI z-score | 1.0 [0.3–1.7] | −0.3 [−1.0–0.7] | 0.3 [−0.1–1.4] | −0.5 [−1.7–0.5] | <0.01* |
| Weight-for-age z-score | 0.9 [0.1–1.6] | −0.5 [−1.5–0.3] | 0.2 [−0.3–0.9] | −1.3 [−2.1–0.5] | <0.01*†‡¶§# |
| Height-for-age z-score | −0.1 [−0.9–0.7] | −0.7 [−1.5–0.1] | −0.7 [−0.9–0.5] | −1.4 [−2.9–0.4] | <0.01*‡# |
| LVEDVi, ml/m2 | 70.6 [61.7–81.5] | 96.9 [90.6–124.0] | 68.1 [55.9–90.9] | 70.8 [63.2–84.8] | <0.01*¶§ |
| LVESVi, ml/m2 | 29.3 [23.1–34.5] | 40.0 [35.5–50.7] | 30.7 [21.1–35.6] | 29.5 [23.1–34.5] | <0.01*¶§ |
| LVSVi, ml/m2 | 42.6 [36.0–47.5] | 59.2 [53.7–71.9] | 50.5 [38.9–56.7] | 42.0 [37.6–49.8] | <0.01*¶§ |
| LVEF, % | 59.5 [56.0–64.1] | 59.5 [54.9–65.6] | 61.5 [53.9–67.6] | 59.1 [54.0–64.0] | 0.76 |
| LVMi, g/m2 | 49.1 [43.5–56.0] | 72.7 [61.9–75.7] | 58.0 [55.6–60.4] | 44.7 [37.6–48.5] | 0.01*§ |
| LVCI, l/min/m2 | 2.9 [2.5–3.3] | 4.3 [3.9–4.9] | 3.7 [2.6–4.1] | 3.6 [2.7–4.4] | <0.01*‡¶§ |
| RVEDVi, ml/m2 | 115.0 [97.7–131.0] | 155.0 [149.0–198.0] | 169.0 [157.0–212.0] | 117.0 [85.4–146.0] | <0.01*†§# |
| RVESVi, ml/m2 | 60.5 [49.1–70.6] | 86.9 [76.7–102.0] | 92.2 [83.9–117.0] | 54.2 [42.1–72.8] | <0.01*†§# |
| RVSVi, ml/m2 | 53.5 [42.7–64.5] | 70.5 [58.9–85.3] | 76.8 [67.3–94.2] | 54.1 [43.3–77.2] | <0.01*†§# |
| RVEF, % | 47.5 (8.1) | 45.1 (7.3) | 46.6 (5.9) | 50.2 (8.8) | 0.02§ |
| RVCI, l/min/m2 | 3.8 [3.1–4.5] | 5.0 [4.3–6.5] | 6.8 [5.0–7.8] | 4.7 [3.5–5.5] | <0.01*†‡# |
| PRF, % | 32.5 [19.0–45.0] | 44.1 [23.9–53.6] | 60.0 [45.5–61.0] | 38.0 [26.0–48.3] | <0.01*†§# |
| RV/LV ratio | 1.6 [1.4–1.9] | 1.6 [1.4–1.9] | 2.7 [1.9–3.1] | 1.5 [1.2–2.1] | <0.01†¶# |
| AAoi, cm/m2 | 1.7 [1.5–1.9] | 1.8 [1.7–2.2] | 2.0 [1.5–2.1] | 2.3 [2.0–2.7] | <0.01*‡§# |
| MPAi, cm/m2 | 1.5 [1.2–1.7] | 2.0 [1.6–2.5] | 1.8 [1.6–2.1] | 2.15 [1.8–2.5] | <0.01*†‡ |
| RPAi, cm/m2 | 0.85 [0.71–1.07] | 1.13 [0.88–1.31] | 1.0 [0.9–1.2] | 1.2 [1.1–1.6] | <0.01*†‡ |
| LPAi, cm/m2 | 0.9 [0.7–1.0] | 1.4 [1.1–1.5] | 1.0 [0.9–1.3] | 1.3 [1.1–1.6] | <0.01*†‡# |
| Electrocardiography | | | | | |
| Heart rate, bpm | 74.0 [66.0–82.2] | 81.5 [71.0–98.0] | 75.5 [64.5–77.2] | 86.5 [74.0–105] | <0.01*‡# |
| QRS, ms | 149 [118–162] | 150 [132–159] | 158 [152–166] | 128 [103–138] | <0.01‡§# |
| QTc, ms | 467 [447–489] | 489 [468–511] | 486 [474–515] | 453 [432–483] | <0.01*†§# |

LEGEND: * Significant p-value between Cluster 1 and Cluster 2, † Significant p-value between Cluster 1 and Cluster 3, ‡ Significant p-value between Cluster 1 and Cluster 4, ¶ Significant p-value between Cluster 2 and Cluster 3, § Significant p-value between Cluster 2 and Cluster 4, # Significant p-value between Cluster 3 and Cluster 4.

ABBREVIATIONS: AAoi: ascending aorta index, BMI: body mass index, BSA: body surface area, CI: cardiac index, CMR: cardiac magnetic resonance, EDVi: end-diastolic volume index, EF: ejection fraction, ESVi: end-systolic volume index, LPAi: left pulmonary artery index, LSVC: left superior vena cava, LV: left ventricle, LVMi: LV mass index, MAPCA: major aortopulmonary collateral arteries, MPAi: main pulmonary artery index, PRF: pulmonary regurgitant fraction, PV: pulmonary valve, RPAi: right pulmonary artery index, RV: right ventricle, RV/LV: ratio of RV to LV EDV, SVi: stroke volume index.

Table 3.

Rates of change in volumetric and functional parameters and reinterventions in the rTOF cohort according to identified phenogroups.

| Characteristic | Cluster 1 (N = 190) | Cluster 2 (N = 32) | Cluster 3 (N = 13) | Cluster 4 (N = 60) | p-value |
|---|---|---|---|---|---|
| CMR, % per year | | | | | |
| LVEDVi, ml/m2 | 1.1 [−1.3–2.9] | −6.9 [−13.6–3.3] | 3.1 [0.8–15.0] | 1.5 [−1.7–4.3] | <0.01*¶§ |
| LVESVi, ml/m2 | 0.7 [−0.8–2.2] | −2.2 [−4.0–0.2] | 3.5 [0.7–5.8] | 0.7 [−0.4–2.5] | <0.01*†¶§ |
| LVSVi, ml/m2 | 0.2 [−1.2–1.9] | −5.2 [−7.9–2.4] | 0.5 [−0.9–8.8] | 0.8 [−0.7–2.5] | <0.01*¶§ |
| LVEF, % | −0.5 [−1.7–0.7] | −0.4 [−2.1–0.5] | −1.8 [−3.0–0.3] | −0.5 [−1.3–0.4] | 0.61 |
| LVMi, g/m2 | −0.2 [−1.6–1.4] | −13.6 [−18.0–1.1] | 6.1 [2.8–9.4] | 1.5 [0.1–4.6] | 0.04 |
| LVCI, l/min/m2 | 0.0 [−0.2–0.1] | −0.5 [−0.7–0.3] | 0.2 [−0.1–0.4] | 0.0 [−0.2–0.2] | <0.01*¶§ |
| RVEDVi, ml/m2 | 0.8 [−3.3–5.0] | −2.3 [−11.1–5.0] | −34.6 [−41.2–22.6] | 2.0 [−3.1–9.7] | <0.01†¶§# |
| RVESVi, ml/m2 | 0.8 [−2.0–4.2] | −2.7 [−8.6–3.5] | −14.2 [−22.6–9.8] | 1.8 [−0.9–7.0] | <0.01†¶§# |
| RVSVi, ml/m2 | 0.1 [−2.1–1.8] | −2.9 [−6.1–1.9] | −19.5 [−22.4–12.0] | 0.4 [−2.9–3.3] | <0.01†¶# |
| RVEF, % | −0.5 [−1.9–0.5] | −0.3 [−1.8–1.4] | −1.9 [−4.7–1.8] | −0.9 [−2.2–0.1] | 0.02†¶ |
| RVCI, l/min/m2 | −0.1 [−0.2–0.1] | −0.2 [−0.8–0.0] | −1.4 [−2.0–1.3] | −0.1 [−0.4–0.2] | <0.01*†¶# |
| PRF, % | 0.4 [−1.2–2.3] | 0.3 [−2.0–3.3] | −33.2 [−44.0–10.3] | 0.6 [−1.7–2.0] | <0.01†¶# |
| RV/LV ratio | 0.9 [−0.4–2.0] | 0.0 [−1.9–1.1] | −2.5 [−13.7–0.7] | 1.0 [−0.7–1.9] | <0.01*†# |
| Reinterventions during interval | | | | | |
| Pulmonary valve replacement | 18 (9.5) | 3 (9.4) | 10 (76.9) | 13 (21.7) | <0.01†‡¶# |
| Pulmonary valvuloplasty | 3 (1.6) | 0 (0.0) | 0 (0.0) | 1 (1.7) | 1.00 |
| Balloon angioplasty | 6 (3.2) | 0 (0.0) | 0 (0.0) | 3 (5.0) | 0.67 |
| Pulmonary artery stenting | 9 (4.7) | 0 (0.0) | 0 (0.0) | 5 (8.3) | 0.35 |

LEGEND: * Significant p-value between Cluster 1 and Cluster 2, † Significant p-value between Cluster 1 and Cluster 3, ‡ Significant p-value between Cluster 1 and Cluster 4, ¶ Significant p-value between Cluster 2 and Cluster 3, § Significant p-value between Cluster 2 and Cluster 4, # Significant p-value between Cluster 3 and Cluster 4. Abbreviations as in Table 2.

3.4. Cluster transitions during follow-up

To examine the transition of individual patients between assigned clusters during follow-up, patients with ≥3 CMR assessments (75 patients [48.4 %], resulting in a total of 139 CMR intervals) were evaluated and the cluster transitions were visualized (Fig. 5). Sixteen scenarios were evaluated (4 × 4; e.g., a patient assigned to Cluster 1 during the interval between CMR1 and CMR2 can be assigned to any of the 4 clusters during the interval between CMR2 and CMR3, and likewise for each of the other originally assigned clusters). Most patients who were initially assigned to Cluster 1 (“stable/slow deteriorating” rTOF) remained in Cluster 1 (89.5 %). Interestingly, most patients who were assigned to Cluster 3, the “deteriorated” rTOF phenotype indicated for PVR, were reassigned to Cluster 1, the “stable/slow deteriorating” rTOF phenotype, during the subsequent CMR interval (72.7 %), indicating “normalization”.

Fig. 5.

Cluster transitions during follow-up.

LEGEND: The Sankey plot depicts cluster transitions from the index CMR to the subsequent CMR (CMRn to CMRn+1). Thicker lines indicate a larger percentage of cluster transitions.
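
The tabulation behind such a transition plot can be sketched in a few lines of Python: for each patient, consecutive interval labels are paired, counted, and converted to row percentages over the 4 × 4 grid of possible transitions. The patient identifiers and label sequences below are hypothetical, and the function is an illustration rather than the code used to produce Fig. 5.

```python
from collections import Counter


def transition_percentages(sequences, n_clusters=4):
    """Percentage of transitions from each origin cluster to each destination cluster."""
    counts = Counter()
    for labels in sequences.values():
        # Pair each interval's cluster with the cluster of the following interval.
        for current, following in zip(labels, labels[1:]):
            counts[(current, following)] += 1
    clusters = range(1, n_clusters + 1)
    table = {}
    for origin in clusters:
        total = sum(counts[(origin, dest)] for dest in clusters)
        table[origin] = {
            dest: (100.0 * counts[(origin, dest)] / total if total else 0.0)
            for dest in clusters
        }
    return table


# Hypothetical cluster labels per consecutive CMR interval for four patients.
sequences = {
    "patient_a": [1, 1, 1],
    "patient_b": [3, 1],
    "patient_c": [4, 4, 1],
    "patient_d": [1, 2],
}
table = transition_percentages(sequences)
```

Each row of `table` sums to 100 when at least one transition starts in that cluster, which corresponds to the outgoing flows of one node in the plot.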

4. Discussion

There has been a growing utilization of cluster analyses in clinical research [13,19]. The primary objective is to uncover the inherent structure hidden within data, leading to the identification of novel phenogroups associated with a disease or clinical syndrome [13,19]. Insights obtained from this unsupervised cluster phenotyping analysis can be introduced into predictive models and could potentially improve detection of adverse events by incorporating cluster membership as an additional predictor. Leveraging unsupervised machine learning, this study identified four unique rTOF phenotypes with differential clinical profiles, cardiac remodeling, and intervention rates (Graphical Abstract).

Our findings are important, as previous studies have demonstrated that predicting the decline in RV function and the optimal timing of PVR is challenging due to the complex and diverse anatomical variations coupled with the intrinsic limitations of the assessment of ventricular performance [4,5,12]. The identification of these distinct phenogroups could help to reduce heterogeneity and produce better prediction models, since conventional regression analyses have been unsuccessful. Prior studies have used supervised learning algorithms such as support vector machines (SVM) to classify deterioration in patients with rTOF. For example, one study categorized deterioration as major, minor, or none based on combined changes in indexed RV end-diastolic volume, as well as both RV and LV ejection fraction during follow-up [20]. The predictive models performed well, with AUC scores ranging from 0.70 to 0.87 [20]. Thus, the authors demonstrated that machine-learning techniques uncovered predictive abilities of variables that were previously unrecognized using traditional regression methods. Yet, a common limitation of prior studies is the exclusion of patients with surgical- or catheter-based interventions between CMR scans, which is precisely the population these studies would need to capture to offer clinical prediction benefits. Therefore, although supervised methods such as SVM can be tremendously powerful for risk stratification, our emphasis was on highlighting distinct phenogroups of rTOF, which may be driven by fundamentally different underlying pathophysiological mechanisms and thus differ in the occurrence of dysfunction, remodeling, and adverse outcomes.

Various clinical observations from previous observational studies could also be identified in separate phenogroups. For instance, Cluster 2 demonstrated subclinical LV abnormalities (increased LVEDVi, LV end-systolic volume index, and LV mass index, potentially indicating eccentric remodeling), which have been demonstrated earlier in subgroups of patients with rTOF [21,22]. These factors may contribute to the occurrence of later left‐sided heart failure and mortality [23,24]. Cluster 4 consisted of younger patients who demonstrated more rapid progression of RV abnormalities and PRF, potentially signaling more complex disease. We also noted that these patients, despite being younger, frequently had large RV volumes and underwent PVR, which confirms the observation of earlier RV dysfunction and PVR in syndromic/complex patients [25,26]. Lastly, Cluster 3 captured those who underwent successful PVR with subsequent normalization of RV size and PRF (of 44 patients undergoing PVR, the 8 with the highest annualized RV reductions were all assigned to Cluster 3). It is interesting that this phenogroup successfully identified RV normalization, which has been proposed to be unlikely once the RV end-systolic volume index is ≥80 ml/m2 and the RV end-diastolic volume index is ≥180 ml/m2 [27,28]. These findings might be helpful in guiding treatment decisions towards optimal timing for PVR and identifying those who benefit most, although a survival benefit with volume reduction still needs to be demonstrated [29]. A further observation was that systolic function of both the RV and LV, as assessed by ejection fraction, was similar between clusters despite fundamentally different observed pathophysiological mechanisms, potentially indicating the limited sensitivity of ejection fraction [30]. Another interesting observation was that the presence of a transannular patch was not decisive in the segmentation of clusters, as its prevalence did not significantly differ between the groups (p = 0.10). Nevertheless, studies have demonstrated increasing deterioration in those with a transannular patch [9,20].

The application of unsupervised machine learning to clustering rTOF patients, although not yet part of routine practice, holds great promise for clinical care. Analysis of longitudinal data using unsupervised techniques can offer valuable insights into disease progression and response to medical therapy or PVR, and can identify subgroups early on. This could enable targeted interventions, such as personalized medication regimens, intensified monitoring, or timely surgical procedures, paving the way for precision medicine approaches in the management of rTOF.

4.1. Limitations

While acknowledging the strengths of the study, it is essential to address several limitations. First, the retrospective, single-center design led to a relatively small sample size, introducing potential biases when it comes to external validation of the findings. Second, the presence of missing values required imputation; while this mitigates the issue of missingness, there remains a small possibility of bias, which could result in improper assignment of CMR intervals to their respective clusters. Third, due to the retrospective design, selection bias in cluster assignment may be present. For example, the decision for PVR was made on an individual basis following the most current guidelines, although interventions were nonetheless distributed across all phenotypes. Furthermore, the limited number of adverse events prevented us from identifying a meaningful relationship between cluster membership and outcomes, and this must be addressed in future studies.

5. Conclusions

By employing unsupervised machine learning algorithms, we have successfully identified four distinct phenogroups within the rTOF population. These phenotypes exhibit variations in clinical profiles, cardiac remodeling patterns, and rates of intervention. These results highlight the potential of unsupervised machine learning algorithms in effectively grouping rTOF patients and their clinical progression based on shared features.

Funding statement

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CRediT authorship contribution statement

Xander Jacquemyn: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Validation, Visualization, Writing – original draft, Writing – review & editing. Bhargava K. Chinni: Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Writing – review & editing. Ashish N. Doshi: Conceptualization, Investigation, Supervision, Validation, Writing – review & editing. Shelby Kutty: Conceptualization, Investigation, Methodology, Supervision, Visualization, Writing – review & editing. Cedric Manlhiot: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Supervision, Validation, Visualization, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

XJ was supported by a grant from the Belgian American Educational Foundation.


Articles from International Journal of Cardiology Congenital Heart Disease are provided here courtesy of Elsevier
