Conversational AI for remote monitoring in heart failure: a prospective controlled pilot study

Aleix Olivella; Ana B Méndez Fernández; Emmanuel Giménez García; Alonso Ortega Torregimeno; Eduard Ródenas-Alesina; Raúl Aguilar López; Tania Maza Pelaez; Toni Soriano-Colomé; Augusto Sao Avilés; Aitor Uribarri; Teresa Soriano Sánchez; Carmen Pérez Bocanegra; Eva Domingo Baldrich; Maria José Martinez-Zapata; Maria Rubio-Valera; Ignacio Ferreira-González

2026 Feb 17;7(3):ztag032. doi: 10.1093/ehjdh/ztag032

Aleix Olivella ¹,²,✉, Ana B Méndez Fernández ³,⁴, Emmanuel Giménez García ⁵,⁶, Alonso Ortega Torregimeno ⁷,⁸, Eduard Ródenas-Alesina ⁹,¹⁰,¹¹, Raúl Aguilar López ¹², Tania Maza Pelaez ¹³, Toni Soriano-Colomé ¹⁴,¹⁵,¹⁶, Augusto Sao Avilés ¹⁷, Aitor Uribarri ¹⁸,¹⁹,²⁰, Teresa Soriano Sánchez ²¹, Carmen Pérez Bocanegra ²², Eva Domingo Baldrich ²³, Maria José Martinez-Zapata ²⁴,²⁵, Maria Rubio-Valera ²⁶,²⁷, Ignacio Ferreira-González ²⁸,²⁹,³⁰

¹ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

² Medicine Department, Universitat Autònoma de Barcelona, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

³ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

⁴ Medicine Department, Universitat Autònoma de Barcelona, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

⁵ Information Systems, Hospital Universitari Vall D'Hebron, Passeig de la Vall d’Hebron 119-129, Barcelona 08035, Spain

⁶ Health Services Research Group, Vall D’Hebron Research Institute (VHIR) Vall D’Hebron Hospital Campus, Barcelona 08035, Spain

⁷ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

⁸ Medicine Department, Universitat Autònoma de Barcelona, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

⁹ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

¹⁰ Medicine Department, Universitat Autònoma de Barcelona, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

¹¹ Centro de Investigación Biomédica en Red de Enfermedades Cardiovasculares (CIBER-CV), Monforte de Lemos, 3-5 Madrid 28029, Spain

¹² Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

¹³ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

¹⁴ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

¹⁵ Medicine Department, Universitat Autònoma de Barcelona, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

¹⁶ Centro de Investigación Biomédica en Red de Enfermedades Cardiovasculares (CIBER-CV), Monforte de Lemos, 3-5 Madrid 28029, Spain

¹⁷ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

¹⁸ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

¹⁹ Medicine Department, Universitat Autònoma de Barcelona, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

²⁰ Centro de Investigación Biomédica en Red de Enfermedades Cardiovasculares (CIBER-CV), Monforte de Lemos, 3-5 Madrid 28029, Spain

²¹ Internal Medicine Department, Vall D’Hebron University Hospital, Passeig de la Vall d’Hebron 119-129. 08035, Barcelona, Spain

²² Internal Medicine Department, Vall D’Hebron University Hospital, Passeig de la Vall d’Hebron 119-129. 08035, Barcelona, Spain

²³ Internal Medicine Department, Vall D’Hebron University Hospital, Passeig de la Vall d’Hebron 119-129. 08035, Barcelona, Spain

²⁴ IR Sant Pau, Iberoamerican Cochrane Centre-Public Health and Clinical Epidemiology Service, Hospital la Santa Creu I Sant Pau, Barcelona 08041, Spain

²⁵ Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBER-ESP), Monforte de Lemos, 3-5 Madrid 28029, Spain

²⁶ Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBER-ESP), Monforte de Lemos, 3-5 Madrid 28029, Spain

²⁷ Health Technology Assessment in Primary Care and Mental Health (PRISMA) Research Group, Parc Sanitari Sant Joan de Déu, Institut de Recerca Sant Joan de Déu, St Boi de Llobregat, Catalonia 08830, Spain

²⁸ Cardiology Department, Vall D’Hebron University Hospital, Vall D’Hebron Institut de Recerca, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

²⁹ Medicine Department, Universitat Autònoma de Barcelona, Passeig de la Vall d’Hebron 119-129, 08035 Barcelona, Spain

³⁰ Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBER-ESP), Monforte de Lemos, 3-5 Madrid 28029, Spain

✉ Corresponding author. Tel: 932746134, Email: aleix.olivella@vallhebron.cat; Tweet: NLP-guided telemonitoring in #HeartFailure was feasible, well received, and low workload for staff. Exploratory data suggest signals of improved outcomes. A scalable, low-barrier strategy to enhance remote HF care. #DigitalHealth #AIHealth

Conflict of interest: None declared.

Roles

Aleix Olivella: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Supervision, Validation, Writing - original draft, Writing - review & editing

Ana B Méndez Fernández: Conceptualization, Methodology

Emmanuel Giménez García: Data curation, Formal analysis, Methodology, Validation

Alonso Ortega Torregimeno: Data curation

Eduard Ródenas-Alesina: Data curation, Formal analysis, Methodology, Validation, Writing - original draft, Writing - review & editing

Raúl Aguilar López: Investigation

Tania Maza Pelaez: Investigation

Toni Soriano-Colomé: Investigation, Supervision

Augusto Sao Avilés: Data curation, Formal analysis

Aitor Uribarri: Validation, Writing - original draft

Teresa Soriano Sánchez: Investigation

Carmen Pérez Bocanegra: Investigation

Eva Domingo Baldrich: Investigation

Maria José Martinez-Zapata: Conceptualization, Methodology

Maria Rubio-Valera: Formal analysis, Methodology

Ignacio Ferreira-González: Conceptualization, Funding acquisition, Methodology, Project administration, Supervision, Validation, Writing - original draft

PMCID: PMC12975182 PMID: 41815105

Abstract

Aims

Heart failure (HF) requires scalable strategies to detect decompensation early and reduce hospitalizations. Existing telemonitoring tools are often invasive, complex, or poorly integrated into routine care. We aimed to evaluate the feasibility and clinical impact of a conversational artificial intelligence system (CAIS) for automated telephone follow-up in patients with HF.

Methods and results

We conducted a prospective, non-randomized, controlled feasibility study at a tertiary hospital. Eighty-six outpatients received weekly AI-assisted follow-up via natural language processing calls, collecting symptoms and vital signs. Alerts were reviewed by HF nurses, who determined responses per standard practice. Forty patients received usual care. The primary objective was feasibility and acceptability; exploratory endpoints included all-cause death, cardiovascular death, HF hospitalization, and diuretic intensification at 12 months, analysed with Cox and competing-risks regression. Of 4272 scheduled calls, 3919 were completed (91.7%). CAIS generated 1962 alerts, prompting 648 actions, mainly nurse calls (86.6%) and medication changes (7.9%). Nurse workload was 2.4 min/patient/week. At 12 months, the KCCQ-12 score improved more in the CAIS group than in controls (between-group difference +7.13 points; 95% CI 1.19–13.07; P = 0.019), while EQ-5D-5L and PHQ-4 showed no significant change. Satisfaction was high (mean 8.72 ± 1.81). The intervention group had fewer all-cause death or HF hospitalization events (HR 0.39; 95% CI 0.16–0.96; P = 0.041) and lower cardiovascular mortality (1.2% vs. 10.0%; P = 0.035), with a non-significant trend towards fewer HF hospitalizations (sHR 0.43; 95% CI 0.16–1.13; P = 0.087).

Conclusion

CAIS follow-up was feasible, well-received, and low-resource, with exploratory signals of improved outcomes. This pragmatic, scalable approach may enhance HF care and warrants validation in randomized trials.

Keywords: Telemonitoring, Natural language processing, Remote care, Digital health

Graphical Abstract

A conversational artificial intelligence system (CAIS) provided weekly automated phone follow-up for patients with heart failure (HF). Through natural language interactions, the CAIS collected symptoms and vital signs, generated alerts, and prompted nurse-led actions. In this pilot study, the approach was feasible, well-received, and required minimal staff workload. Patients reported improved quality of life (KCCQ-12 OSS), and exploratory analyses suggested hypothesis-generating signals of reduced mortality and HF hospitalizations compared with usual care. CAIS – conversational artificial intelligence system; HF – heart failure; NLP – natural language processing; NPS – Net Promoter Score; KCCQ-12 OSS – Kansas City Cardiomyopathy Questionnaire-12 Overall Summary Score.

Introduction

Heart failure (HF) is the leading cause of hospitalization in individuals over 65 years of age and affects approximately 1%–2% of the general population,¹ with a profound impact on quality of life.² Given the need for close monitoring to improve patient outcomes, telemedicine has been recommended by clinical guidelines to reduce HF hospitalizations and cardiovascular (CV) mortality.³

However, the evidence supporting telemedicine is heterogeneous, encompassing diverse technologies, intensities of follow-up, invasiveness and organizational models,⁴⁻⁶ which limits comparability and generalizability.⁷,⁸ Some of the most successful strategies⁹ require substantial human resources and may lack scalability. In parallel, the so-called digital divide may prevent digitally illiterate patients from benefiting from some of these technologies, which generally require the use of apps.¹⁰,¹¹

Artificial intelligence (AI) offers a potential solution by enabling automation of complex tasks.¹² In particular, natural language processing (NLP) allows AI systems to interact with patients via conventional communication channels, such as telephone calls, without requiring specific digital skills.

We report the first study comparing conventional care with AI-assisted, automated telephone follow-up using an NLP-enabled virtual assistant in patients with HF.

Methods

Study design

This was a prospective, non-randomized, controlled feasibility study comparing usual care with a conversational artificial intelligence system (CAIS) for automated telephone follow-up in patients with HF. It was approved by the local Ethics Committee [PR(AG)04/2023] and conducted in accordance with the Declaration of Helsinki.

Patient population

Patients with a confirmed diagnosis of HF according to the European Society of Cardiology criteria³ were prospectively recruited from the outpatient HF unit of our tertiary centre, regardless of disease duration, aetiology, or history of decompensations. Eligible participants were required to speak Spanish, be over 18 years old, have access to a landline or mobile phone, and have an expected survival of at least one year from non-CV causes. Patients were included if either they or a cohabiting caregiver were capable of completing a telephone conversation.

From March to July 2023, patients attending our HF clinic for standard care—primarily for education, pharmacologic titration, or decongestive management—were invited to participate in a pilot programme incorporating an AI-based NLP assistant. After providing informed consent, participants received brief training on the system’s purpose, call frequency, and use. A contemporaneous prospective control cohort meeting the same inclusion criteria was subsequently recruited and followed under standard care alone, without AI intervention. The intervention cohort was enrolled first; the control cohort was recruited consecutively afterward using the same inclusion and exclusion criteria. Given the feasibility focus of the study, a 2:1 (CAIS:control) allocation was adopted to gather more detailed operational data on the intervention, while maintaining an appropriate comparator group for exploratory analyses.

AI system architecture

LOLA® (Tucuvi, Spain) is a Class IIb Software as a Medical Device certified under the EU Medical Device Regulation (EU 2017/745). It functions as an automated, telephone-based virtual assistant designed to conduct structured, multi-turn clinical conversations using natural language.

The system is composed of three main components:

Automatic speech recognition optimized for telephone audio, capable of handling accent variability.
Natural-language understanding modules—based on a hybrid architecture combining deterministic transformer models and large language models—which convert patient speech into structured clinical variables within a predefined label set.
A deterministic clinical protocol engine, which governs the conversational flow and triggers alerts when predefined symptom or vital-sign thresholds are exceeded.

Although AI is used for speech processing and language understanding, alert generation is entirely rules-based and protocol-driven, not predictive. All alerts are forwarded to the HF clinical team, who retain full responsibility for triage and decision making.
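The separation between language understanding and rules-based alerting can be illustrated with a minimal sketch. It assumes the NLU step has already produced a structured report as a dictionary. The field names, rule names, and every threshold below are hypothetical and chosen only for illustration; the study's actual thresholds are given in its Supplementary Table S1 and are not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    name: str
    severity: str  # "moderate" or "severe"
    triggered: Callable[[dict], bool]


# Hypothetical thresholds for illustration only; they are NOT the study's values.
RULES = [
    Rule("weight_gain", "moderate", lambda r: r.get("weight_change_kg", 0.0) >= 2.0),
    Rule("low_systolic_bp", "severe", lambda r: r.get("systolic_bp", 120) < 90),
    Rule("orthopnea", "moderate", lambda r: r.get("orthopnea", False)),
]


def evaluate_call(report: dict) -> list[tuple[str, str]]:
    """Return (rule name, severity) for every rule the structured report triggers."""
    return [(rule.name, rule.severity) for rule in RULES if rule.triggered(report)]


# A report with a 2.5 kg weight gain and otherwise unremarkable values.
alerts = evaluate_call({"weight_change_kg": 2.5, "systolic_bp": 118, "orthopnea": False})
print(alerts)
```

Because each rule is a fixed comparison, the same report always yields the same alerts, which is the sense in which alerting is deterministic rather than predictive. The returned list would be forwarded to the clinical team; the sketch takes no clinical action itself.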

The system incorporates a supervisory layer to keep all conversations within validated clinical scripts, minimizing deviations and preventing unsupported outputs. It does not make autonomous clinical decisions; instead, it automates information gathering, structures patient-reported data, and standardizes remote triage to support clinician review.

LOLA is deployed as a cloud-based platform with encryption, role-based access control, and audit logging, operating under an ISO 13485 quality-management system and ISO 27001–certified information-security framework.

Interventions

All patients continued to receive standard multidisciplinary care at our HF unit (usual care), including in-person and remote visits at a frequency determined by clinical status and preestablished schedules. Patient education and treatment titration followed internal protocols. All patients had access by phone to the HF nursing team during working hours for clinical queries.

In the intervention group, participants received weekly automated phone calls from the CAIS. Calls were scheduled according to patient preference, although morning hours were encouraged to facilitate timely responses to potential alerts. If the patient did not answer the initial call, a second attempt was made after 10 min, a third after 30 min, and a fourth after 60 min if there was still no response. If all attempts failed, an alert was generated indicating unsuccessful contact.
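The retry policy described above can be sketched as follows. One assumption is made loudly: each delay is counted from the previous unanswered attempt, since the text does not state whether the 10, 30, and 60 min delays are measured from the previous attempt or from the initial call. The `answered` callable is a hypothetical stand-in for the telephony layer.

```python
from datetime import datetime, timedelta

# Waits before the 2nd, 3rd, and 4th attempts, as given in the text.
# Assumption: each wait starts at the previous unanswered attempt.
RETRY_WAITS_MIN = (10, 30, 60)


def attempt_times(first_call: datetime) -> list[datetime]:
    """Return the scheduled times of all four attempts."""
    times = [first_call]
    for wait in RETRY_WAITS_MIN:
        times.append(times[-1] + timedelta(minutes=wait))
    return times


def run_call(first_call: datetime, answered) -> tuple[str, int]:
    """Try each attempt in order; `answered` is a hypothetical callable
    (datetime -> bool) that says whether the patient picks up."""
    attempts = attempt_times(first_call)
    for n, when in enumerate(attempts, start=1):
        if answered(when):
            return ("completed", n)
    # All attempts failed: the platform raises an unsuccessful-contact alert.
    return ("unsuccessful_contact_alert", len(attempts))


start = datetime(2023, 3, 1, 9, 0)
print([t.strftime("%H:%M") for t in attempt_times(start)])
```

Under this reading, a call first placed at 09:00 would be retried at 09:10, 09:40, and 10:40 before the unsuccessful-contact alert is raised.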

During each call, the CAIS collected structured data on symptoms, vital signs, and open-ended responses regarding clinical status. Data were uploaded to a secure digital platform, where predefined thresholds triggered alerts for potential clinical deterioration (see Supplementary material online, Table S1). Alerts were reviewed daily by a specialized HF nurse, who contacted patients when necessary to assess clinical changes, adjust treatment, or arrange in-person evaluations without a prespecified protocol, using their routine clinical judgment to guide management.

Follow-up

Baseline assessments included clinical and demographic variables, along with quality of life and psychological well-being questionnaires: KCCQ-12, EQ-5D-5L, PHQ-4, and a locally developed experience survey (see Supplementary material online, Table S2). Satisfaction with the CAIS was assessed with a locally developed satisfaction survey (see Supplementary material online, Table S3). These were repeated at 6 and 12 months during study visits. Guideline-directed medical therapy (GDMT) score was calculated as previously described,¹³ consistent with approaches used to describe multidrug adherence in recent HF registries.¹⁴

Endpoints

The primary objective was to assess the feasibility of the AI-assisted remote monitoring strategy, evaluated by call completion rate, alert frequency and type, nurse-led clinical actions, nurse workload (minutes/patient/week), patient adherence, satisfaction (mean recommendation score, Net Promoter Score, qualitative feedback), and patient-reported quality-of-life/experience measures.

Exploratory clinical endpoints included time to first HF hospitalization or all-cause death, all-cause death, CV mortality, HF hospitalization, decongestion interventions (oral diuretic uptitration or intravenous diuretic administration), and change in NT-proBNP levels.

Statistical analysis

All analyses were performed according to the intention-to-treat principle. Continuous variables were summarized as mean ± standard deviation (SD) if normally distributed, or median and interquartile range (IQR) otherwise. Categorical variables were expressed as frequencies and percentages. Between-group comparisons were conducted using ANCOVA for continuous outcomes and logistic regression for binary outcomes, adjusting for relevant baseline covariates. Time-to-event outcomes were analysed using Cox proportional hazards regression. For the composite of all-cause death or first HF hospitalization, the time origin was the date of enrolment. Time-to-event was defined as the number of days from enrolment to the first occurrence of either component. Patients without an event were censored at the earliest of: completion of 12-month follow-up, loss to follow-up, or withdrawal of consent. All deaths were retained as events to avoid informative censoring. An unadjusted Cox model was first fitted, followed by adjusted models including baseline variables that differed between groups. Hazard ratios (HRs) with 95% confidence intervals (CIs) are reported. Changes in patient-reported outcomes at 6 and 12 months were analysed using linear mixed-effects models for repeated measures with a random intercept at the patient level.
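The time-to-event definition for the composite endpoint can be made concrete with a small sketch. The study's analyses were run in Stata; this Python fragment only illustrates the censoring rule. The class and field names are hypothetical, "12-month follow-up" is assumed to mean 365 days, and events dated after the censoring date are assumed not to count.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

FOLLOW_UP_DAYS = 365  # assumption: 12 months taken as 365 days


@dataclass
class Patient:
    enrolled: date
    first_hf_hospitalization: Optional[date] = None
    death: Optional[date] = None
    lost_to_follow_up: Optional[date] = None
    consent_withdrawn: Optional[date] = None


def composite_time_to_event(p: Patient) -> tuple[int, bool]:
    """Return (days from enrolment, event flag) for all-cause death or first HF hospitalization."""
    end_of_follow_up = p.enrolled + timedelta(days=FOLLOW_UP_DAYS)
    # Censor at the earliest of: end of follow-up, loss to follow-up, withdrawal.
    candidates = [end_of_follow_up, p.lost_to_follow_up, p.consent_withdrawn]
    censor_date = min(d for d in candidates if d is not None)
    # The event is whichever component occurs first.
    event_dates = [d for d in (p.first_hf_hospitalization, p.death) if d is not None]
    if event_dates and min(event_dates) <= censor_date:
        return ((min(event_dates) - p.enrolled).days, True)
    return ((censor_date - p.enrolled).days, False)


print(composite_time_to_event(Patient(date(2023, 3, 1), first_hf_hospitalization=date(2023, 6, 1))))
```

The resulting (time, event) pairs are the inputs a Cox model expects, with the event flag distinguishing observed events from censored observations.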

Statistical significance was defined as two-tailed P < 0.05. Because this was a feasibility pilot study, no formal sample size calculation was performed. The sample size was determined pragmatically based on the expected recruitment capacity of the HF clinic during the study period and the need to obtain reliable estimates of feasibility metrics (call completion, adherence, nurse workload, satisfaction). As the study was not powered for clinical endpoints, analyses of clinical outcomes were considered exploratory. All analyses were performed using Stata version 16.1 (StataCorp LLC, College Station, TX, USA).

Results

Baseline characteristics

A total of 126 patients were enrolled: 86 in the intervention group and 40 in the control group. Demographic and clinical characteristics were generally well balanced between groups, including age, sex, comorbidities, and left ventricular ejection fraction (LVEF) (see Table 1). All HF phenotypes were represented, with a predominance of heart failure with reduced ejection fraction (HFrEF) reflecting the patient population typically managed in our HF unit. Prescription rates of GDMT—including beta-blockers, renin-angiotensin system inhibitors, mineralocorticoid receptor antagonists, and sodium–glucose cotransporter 2 inhibitors—were similarly high in both groups, with no significant differences observed. However, the intervention group had evidence of more advanced disease. Functional status was significantly worse, with fewer patients in NYHA (New York Heart Association) class I and more in classes II–III (P = 0.001). Baseline furosemide dose was also higher in the CAIS group (mean difference 18.1 mg; 95% CI 2.7–33.5; P = 0.02), suggesting greater congestion or diuretic need. NT-proBNP levels, recent HF hospitalizations, and baseline quality of life (KCCQ-12, EQ-5D-5L), psychological distress (PHQ-4), and patient-reported experience were similar between groups.

Table 1.

Baseline characteristics

	Conventional care	CAIS	P-value
	n = 40	n = 86
Age, median (IQR)	68.6 (61.3–74.3)	71.0 (61.8–78.2)	0.51
Sex (female), no. (%)	9 (23%)	25 (29%)	0.52
Hypertension, no. (%)	22 (55%)	58 (67%)	0.23
Dyslipidemia, no. (%)	22 (55%)	52 (60%)	0.57
Diabetes, no. (%)	14 (36%)	28 (33%)	0.84
COPD, no. (%)	9 (23%)	13 (15%)	0.32
Tobacco use, no. (%)			0.16
Never	12 (30%)	41 (48%)
Former smoker	21 (53%)	32 (37%)
Active smoker	7 (18%)	13 (15%)
LVEF category, no. (%)			0.71
HFpEF	9 (23%)	24 (28%)
HFmrEF	9 (23%)	15 (17%)
HFrEF	22 (55%)	47 (55%)
CeVD, no. (%)	5 (13%)	13 (15%)	1.00
CKD, no. (%)	11 (28%)	34 (40%)	0.23
Ischemic cardiomyopathy, no. (%)	23 (57%)	41 (48%)	0.34
HF duration since diagnosis, median (IQR)—years	2.11 (1.50–5.94)	2.58 (1.69–6.29)	0.20
LVEF (%), median (IQR)	39.5 (32.5–48)	40 (34–50)	0.82
NT-proBNP (pg/mL), median (IQR)	919 (314–2323)	1159.5 (429–2718)	0.34
HF hospitalization in the previous 12 months, no. (%)	9 (23%)	18 (21%)	0.82
KCCQ-12 (OSS), mean (SD)	69.17 (16.26)	69.85 (17.39)	0.83
PHQ-4, mean (SD)	6.3 (2.27)	6.88 (3.75)	0.28
EQ-5D-5L, mean (SD)	0.78 (0.18)	0.72 (0.30)	0.17
PREMs, mean (SD)	8.81 (1.57)	8.82 (0.9)	0.97
NYHA functional class, no. (%)			0.001
I	14 (35%)	10 (12%)
II	24 (60%)	63 (73%)
III	2 (5%)	11 (13%)
IV	0 (0%)	2 (2%)
ACEI, no. (%)	8 (20%)	15 (17%)	0.81
ARA2, no. (%)	3 (8%)	10 (12%)	0.75
ARNI, no. (%)	25 (63%)	53 (62%)	1.00
BB, no. (%)	35 (88%)	70 (81%)	0.45
MRA, no. (%)	35 (88%)	67 (78%)	0.23
SGLT2i, no. (%)	36 (90%)	73 (85%)	0.58
Loop diuretics, no. (%)	16 (40%)	46 (53%)	0.18
Distal diuretics, no. (%)	3 (8%)	6 (7%)	1.00
Furosemide dose (mg), mean (SD)	21.0 (32.6)	39.1 (42.1)	0.02
Vericiguat, no. (%)	2 (5%)	3 (3%)	0.65
GDMT score, median (IQR)	7 (2)	7 (2)	0.57

Data are presented as n (%), mean (standard deviation), or median (interquartile range) unless otherwise stated. Baseline characteristics are presented for all patients included.

CAIS, conversational artificial intelligence system; IQR, interquartile range; COPD, chronic obstructive pulmonary disease; LVEF, left ventricular ejection fraction; HFpEF, heart failure with preserved ejection fraction; HFmrEF, heart failure with mildly reduced ejection fraction; HFrEF, heart failure with reduced ejection fraction; CeVD, cerebrovascular disease; CKD, chronic kidney disease; NT-proBNP, n-terminal pro B-type natriuretic peptide; HF, heart failure; KCCQ-12 (OSS), Kansas City Cardiomyopathy Questionnaire–12 (Overall Summary Score); PHQ-4, Patient Health Questionnaire-4; EQ-5D-5L, EuroQol 5-Dimension 5-Level; PREMs, patient-reported experience measures; NYHA, New York Heart Association; ACEI, angiotensin-converting enzyme inhibitor; ARA2, Angiotensin II Receptor Antagonist; ARNI, Angiotensin Receptor–Neprilysin Inhibitor; BB, beta-blocker; MRA, Mineralocorticoid Receptor Antagonist; SGLT2i, Sodium–Glucose Cotransporter 2 Inhibitor.

AI-driven follow-up

During the 12-month follow-up, the AI assistant made 6762 call attempts, successfully completing 3919 calls (91.7% of 4272 scheduled calls). Each structured, symptom-focused call had a median duration of 3 min 31 s, amounting to 221.6 h of total direct patient interaction time. This metric reflects conversation time only; the overall time that a human professional would require to initiate calls, wait for connection, repeat attempts in case of no answer, and document responses would be substantially greater. Because we did not have access to a validated method for estimating the equivalent human workload, we report only observed data.

The CAIS triggered 1962 alerts (see Supplementary material online, Figure S1), most frequently related to blood pressure (21.0%), weight changes (18.6%), fatigue (12.2%), and orthopnea (11.6%). A severe alert occurred in 13% of calls and a moderate alert in 17%. Alerts prompted 648 remote interventions, primarily patient phone assessments (561; 86.6%) and 51 remote medication adjustments (7.9%) (see Supplementary material online, Figure S2). Reviewing alerts required a median of 205 min per week (2.4 min/patient/week).
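The feasibility percentages reported above follow directly from the raw counts. The short check below recomputes them; the variable names are illustrative, and the per-patient workload assumes the denominator is all 86 patients in the intervention group.

```python
# Counts taken from the text above.
scheduled_calls = 4272
completed_calls = 3919
actions = 648
nurse_phone_assessments = 561
medication_adjustments = 51
review_minutes_per_week = 205
patients = 86  # assumption: workload is divided over all intervention patients


def pct(part: int, whole: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


completion_rate = pct(completed_calls, scheduled_calls)
phone_assessment_share = pct(nurse_phone_assessments, actions)
medication_share = pct(medication_adjustments, actions)
workload_min_per_patient_week = round(review_minutes_per_week / patients, 1)

print(completion_rate, phone_assessment_share, medication_share, workload_min_per_patient_week)
```

The recomputed values are 91.7%, 86.6%, 7.9%, and 2.4 min/patient/week, matching the figures in the text.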

Seven patients discontinued AI-assisted follow-up: one received a left ventricular assist device, three were withdrawn due to poor adherence, two requested discontinuation, and one was unable to participate due to severe hearing loss.

Intensity of follow-up

Overall, the total number of conventional visits did not differ significantly between groups (16 in conventional care [IQR 7–25] vs. 17 in the CAIS group [IQR 15–18], P = 0.80), nor did total in-person visits (5.5 [IQR 4–12] vs. 6 [IQR 3–11], P = 0.83) or total telephone visits (4 [IQR 3–5] vs. 5 [IQR 4–8], P = 0.12) (see Supplementary material online, Table S4). HF nurse visits, whether in-person or remote, were also similar. However, HF physician in-person visits were significantly more frequent in the intervention group (2 in conventional care [IQR 1–3] vs. 3 in the CAIS group [IQR 2–5], P < 0.001). Automated CAIS calls were not considered ‘staff telephonic visits’, as no clinician participated unless an alert was triggered.

Quality of life, patient experience, and satisfaction

Quality of life, assessed at baseline, 6, and 12 months, showed no significant changes in EQ-5D-5L or PHQ-4. Patient experience scores also did not differ significantly (see Table 2). However, the intervention group demonstrated a significantly greater adjusted improvement in the KCCQ-12 Overall Summary Score from baseline to 12 months compared with controls (+7.13 points; 95% CI 1.19–13.07; P = 0.019; see Figure 1) in the mixed-effects model.

Table 2.

Quality of life and patient experience with conventional care or CAIS at 12 months

Measure	Between-group difference in change (95% CI)	P-value
KCCQ-12 OSS change from baseline to 12 months^a	+7.13 (1.19 to 13.07)	0.019
PHQ-4 change from baseline to 12 months^a	−0.87 (−2.36 to 0.616)	0.25
EQ-5D-5L change from baseline to 12 months^a	+0.052 (−0.054 to 0.149)	0.36
PREMs change from baseline to 12 months^a	−0.02 (−0.662 to 0.631)	0.96

^aChanges represent between-group differences in change from baseline derived from a linear mixed-effects model including fixed effects for time, group, and their interaction, with a random intercept for each patient. Positive values indicate greater improvement in the CAIS group.

KCCQ-12 (OSS), Kansas City Cardiomyopathy Questionnaire 12 (Overall Summary Score); PHQ-4, Patient Health Questionnaire-4; EQ-5D-5L, EuroQol 5-Dimension 5-Level; PREMs, Patient-Reported Experience Measures.
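The model described in the table footnote can be written out explicitly. The formulation below is one plausible reading, assuming time is treated as a categorical factor with baseline as the reference level; the source does not state the exact parameterization, and all symbols are introduced here for illustration.

```latex
Y_{ij} = \beta_0 + \beta_1 G_i
       + \sum_{k \in \{6,\,12\}} \gamma_k \, T_{jk}
       + \sum_{k \in \{6,\,12\}} \delta_k \, G_i \, T_{jk}
       + b_i + \varepsilon_{ij},
\qquad b_i \sim \mathcal{N}(0, \sigma_b^2), \quad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $Y_{ij}$ is the score of patient $i$ at visit $j$, $G_i = 1$ for the CAIS group and $0$ for conventional care, $T_{jk} = 1$ when visit $j$ takes place at month $k$, and $b_i$ is the patient-level random intercept. Under this parameterization, the interaction coefficient $\delta_{12}$ is the between-group difference in change from baseline to 12 months, which is the quantity reported in Table 2.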

Patient satisfaction with the CAIS was high. The mean recommendation score (0–10 scale) was 8.72 ± 1.81, and the Net Promoter Score (proportion of promoters [9–10] minus detractors [0–6]) was 67.9, indicating a strong likelihood of recommending the service. Commonly reported benefits included a perceived increase in follow-up, improved communication with the hospital, and greater peace of mind, while suggested areas for improvement included more frequent direct contact from the medical team.
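The Net Promoter Score definition given above (percentage of promoters scoring 9–10 minus percentage of detractors scoring 0–6) can be sketched in a few lines. The example scores are invented for illustration and are not study data.

```python
def net_promoter_score(scores: list[int]) -> float:
    """NPS on a 0-10 recommendation scale: % promoters (9-10) minus % detractors (0-6)."""
    if not scores:
        raise ValueError("at least one score is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Invented example: 6 promoters, 2 passives (7-8), 2 detractors.
example_scores = [10, 9, 9, 8, 7, 6, 10, 9, 3, 10]
print(net_promoter_score(example_scores))
```

Passive respondents (scores of 7 or 8) count in the denominator but in neither group, so the score ranges from −100 to +100.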

Exploratory endpoints

In an unadjusted intention-to-treat analysis using Cox regression, the composite of all-cause death or HF hospitalization was significantly lower in the intervention group (HR 0.39; 95% CI 0.16–0.96; P = 0.041; see Figure 2). The association remained statistically significant after adjustment for baseline furosemide dose (HR 0.31; 95% CI 0.12–0.78; P = 0.013) and when adjusting for NYHA functional class (HR 0.36; 95% CI 0.14–0.92; P = 0.032), the two variables that differed between groups at baseline. HF hospitalization showed a non-significant trend towards reduction (sHR 0.43; 95% CI 0.16–1.13; P = 0.087). CV death was less frequent in the intervention group (1.2% vs. 10.0%; P = 0.035), with a subdistribution hazard ratio of 0.11 (95% CI 0.01–1.02; P = 0.052), indicating a strong signal but not reaching statistical significance. All-cause mortality was also lower in the intervention group, but the difference was not statistically significant (HR 0.23; 95% CI 0.04–1.24; P = 0.076) (see Table 3).

Table 3.

Outcomes with conventional care or CAIS at 12 months

	Conventional care	CAIS	P-value
	n = 40	n = 86
All-cause death or HF hospitalization	10 (25.0%)	9 (10.5%)	0.034
HF hospitalization, no. (%)	8 (20.0%)	7 (8.1%)	0.076
CV death, no. (%)	4 (10.0%)	1 (1.2%)	0.035
All-cause death, no. (%)	4 (10.0%)	2 (2.3%)	0.080
WHF event, no. (%)	13 (32.5%)	23 (26.7%)	0.53
Ambulatory IV diuretics or oral diuretic uptitrations	4 (10.0%)	21 (24.4%)	0.045
IV diuretics decongestion, no. (%)	3 (7.5%)	12 (14.0%)	0.39
Patients with oral diuretic rise, no. (%)	3 (7.5%)	19 (22.1%)	0.048
Furosemide dose at 12 months (mg), mean (SD)	18.38 (30.32)	40.24 (50.68)	0.016
NT-proBNP change (pg/mL), median (IQR)	−114 (−406 to −4)	−131 (−691 to −98)	0.94
Diuretic dose change (mg), mean (SD)	−3.24 (17.33)	1.90 (28.56)	0.31
GDMT score 12 months, median (IQR)	7 (2)	7 (2)	0.63
GDMT change, mean (SD)	+0.075 (0.19)	+0.128 (0.14)	0.39

Data are presented as n (%), mean (standard deviation), or median (interquartile range), unless otherwise stated. Results are reported for patients with available data at baseline and 12 months. Percentages refer to patients with ≥1 event during follow-up. P-values were obtained from chi-square or Fisher’s exact tests, as appropriate. Hazard ratios (HRs) and subdistribution hazard ratios (sHRs) from time-to-event analyses are reported in the text. P-values <0.05 were considered statistically significant.

CAIS, conversational artificial intelligence system; HF, heart failure; CV, cardiovascular; WHF, worsening heart failure (HF hospitalization, IV diuretic, or HF emergency visit); IV, intravenous; NT-proBNP, n-terminal pro B-type natriuretic peptide; IQR, interquartile range.
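As a worked example of the tests named in the table notes, the sketch below computes a two-sided Fisher's exact test for the CV death row of Table 3 (4 of 40 with conventional care vs. 1 of 86 with CAIS). The table notes do not say which test was applied to each row, so the use of Fisher's exact test here is an assumption, and the function name is illustrative.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more probable than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lowest, highest + 1)
               if prob(x) <= p_observed * (1 + 1e-9))


# Rows: conventional care, CAIS. Columns: CV death, no CV death.
p_cv_death = fisher_exact_two_sided(4, 36, 1, 85)
print(round(p_cv_death, 3))
```

The result rounds to 0.035, in agreement with the P-value reported for CV death in Table 3.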

The intervention group experienced more ambulatory intravenous diuretic administrations or oral diuretic uptitrations (24.4% vs. 10.0%; P = 0.045). In competing-risks analysis, this corresponded to a non-significant trend towards higher cumulative incidence (sHR 2.64; 95% CI 0.77–9.03; P = 0.122; Figure 3), which may reflect more proactive outpatient decongestion. Over 12 months, the intervention group maintained higher furosemide doses, with a steeper upward trajectory not observed in controls. At 12-month follow-up, no significant differences were observed between groups in the prescription rates of GDMT (see Supplementary material online, Table S5). Mean GDMT score¹³ at baseline was 6.53 ± 0.2 in the intervention group and 6.83 ± 0.24 in controls (P = 0.57), without significant change at 12 months to 6.66 ± 0.20 and 6.90 ± 0.27, respectively (P = 0.63). Intraindividual change over time did not differ significantly between groups (+0.128 ± 0.14 vs. +0.075 ± 0.19; P = 0.39). NT-proBNP levels did not change significantly in either group.

Adverse events

No adverse events were directly attributable to the AI follow-up system. While unnecessary patient contacts or treatment changes based on false-positive alerts may be considered a potential downside of the intervention, all clinical decisions were supervised by a specialized HF nurse. The precise number of false alerts could not be determined, and some were related to patient miscommunication (e.g. inaccurate reporting of weight), rather than system error.

Discussion

To the best of our knowledge, this is the first clinical study to prospectively evaluate the feasibility and potential benefit of a conversational AI-based telemonitoring strategy for HF in routine care. Weekly follow-up using a CAIS was well received, achieved high adherence, and could be integrated into existing workflows with a quantifiable and manageable nurse workload. In this real-world HF population with broad inclusion criteria, the system enabled structured, longitudinal symptom monitoring and regular clinical review without increasing routine visit frequency. Clinical outcomes were assessed as exploratory endpoints only; while fewer deaths or heart failure hospitalizations were observed in the intervention group, these findings should be interpreted with caution given the non-randomized design, baseline imbalances, and limited sample size. Overall, the study primarily supports the feasibility, acceptability, and potential scalability of a conversational AI–based approach, while generating hypotheses for future randomized evaluation.

Over the past two decades, more than 40 randomized trials and several meta-analyses,^4,5,15 including a Cochrane review,¹⁶ have assessed telemonitoring in HF, generally showing modest and heterogeneous effects on clinical outcomes. While some programmes reported reductions in hospitalizations or mortality,^9,17 particularly those combining multiple physiological parameters with structured clinical feedback, others, such as OSICAT¹⁸ and BEAT-HF,¹⁹ reported neutral results, often attributed to heterogeneous populations, suboptimal adherence, or poor integration into care pathways. Consequently, major cardiovascular societies provide only moderate recommendations for telemonitoring (Class IIb),³ and uptake remains limited. Importantly, growing evidence suggests that sustained engagement and proactive monitoring may be as critical as the specific technology used; however, many patients disengage from long-term digital programmes, thereby limiting their effectiveness.^10,16,20

Available telemonitoring tools range from structured telephone support^6,21 and wearables^22,23 to implantable devices such as CardioMEMS™ (Abbott, US).⁸ While invasive monitoring can reduce hospitalizations in selected patients, its scalability is restricted by cost, procedural risks, and infrastructure needs.²⁰ Many non-invasive solutions depend on patient-initiated interaction with apps or devices, which can be challenging for older adults or those with low digital literacy.¹¹ These barriers highlight the need for inclusive, low-complexity models that integrate into routine care without requiring behavioural or technological adaptation from patients.²⁴

Conversational AI can facilitate scalable, user-friendly remote monitoring by enabling natural-language interactions and automated structured symptom collection. In the present study, the system’s role was limited to capturing patient-reported information and converting it into structured clinical variables, while all thresholds and alerts were generated using predefined, rule-based logic. Accordingly, treatment decisions remained entirely clinician-led. The ‘intelligence’ of the system therefore resided primarily in scalable data capture and standardization, rather than in autonomous clinical decision-making.
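The division of labour described above (structured variables in, rule-based alerts out, clinician decides) can be illustrated with a minimal sketch. The field names, the alert labels, and the 2.0 kg weight-gain threshold are hypothetical and chosen only for illustration; they are not the rules used by the CAIS in this study.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class WeeklyReport:
    """Structured variables extracted from one call (hypothetical fields)."""
    weight_kg: Optional[float]
    previous_weight_kg: Optional[float]
    dyspnoea_worse: bool
    oedema_worse: bool

# Hypothetical threshold for illustration only
WEIGHT_GAIN_ALERT_KG = 2.0

def evaluate_rules(report: WeeklyReport) -> List[str]:
    """Return alert labels for nurse review; no treatment decision is made here."""
    alerts = []
    # Weight rule applies only when both measurements were reported
    if report.weight_kg is not None and report.previous_weight_kg is not None:
        if report.weight_kg - report.previous_weight_kg >= WEIGHT_GAIN_ALERT_KG:
            alerts.append("weight_gain")
    if report.dyspnoea_worse:
        alerts.append("dyspnoea")
    if report.oedema_worse:
        alerts.append("oedema")
    return alerts

print(evaluate_rules(WeeklyReport(82.5, 80.0, False, True)))
```

Because the rules are deterministic, an inaccurately reported weight propagates directly into an alert, which is one way patient miscommunication can produce alerts that are not system errors.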

Beyond the scope of the present study, AI-based technologies may contribute to HF care by improving data acquisition, reducing patient burden, and enabling more continuous assessment between clinic visits. NLP-driven conversational AI, in particular, may help overcome usability barriers by interacting through familiar channels such as standard telephone calls, without requiring smartphones, apps, or digital literacy.²⁵ However, more advanced applications (such as AI-driven optimization of pharmacologic therapy or digital twin models^26–28) were not evaluated and should not be inferred from the present findings.

The present study represents a controlled, prospective evaluation of a real-world CAIS for longitudinal HF monitoring. Unlike many prior interventions, our system required no devices or apps, operated through conventional calls, and integrated into existing clinical workflows. Only 7 of 86 patients discontinued follow-up for diverse reasons, an acceptable dropout rate for year-long monitoring, and those who remained demonstrated excellent adherence.

Although the system generated a substantial number of alerts, the observed nurse time required for alert review averaged 2.4 min per patient per week, and the intervention was not associated with an increase in routine in-person or clinician-initiated telephone visits. These findings support the operational feasibility of the approach within the studied context. However, most alerts did not lead to a documented clinical action, indicating limited specificity and a potentially high false-positive rate. While this may be acceptable in an early feasibility study prioritizing sensitivity and safety, optimization of alert thresholds and triage logic will be essential before considering broader implementation.
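The reported review time can be turned into a rough weekly workload estimate by simple multiplication. The sketch below assumes that review time scales linearly with the number of monitored patients, which this study does not establish; the function name and the panel size of 100 are illustrative.

```python
# Average nurse review time reported in the text (minutes per patient per week)
MINUTES_PER_PATIENT_WEEK = 2.4

def weekly_review_hours(n_patients, minutes_per_patient_week=MINUTES_PER_PATIENT_WEEK):
    """Estimated nurse hours per week for alert review, assuming linear scaling."""
    return n_patients * minutes_per_patient_week / 60.0

# Hypothetical panel of 100 monitored patients
print(weekly_review_hours(100))
```

Any non-linear effect of scale, such as alert fatigue or triage queues, would make this a lower bound rather than a forecast.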

Although alerts not followed by a documented clinical action can reasonably be considered false positives from an operational perspective, their clinical interpretation was more nuanced in this feasibility study. Alerts without recorded actions represented a heterogeneous group that included alerts triggered by patient miscommunication (e.g. inaccurate self-reported weight), alerts that led to verbal counselling or reinforcement of self-care recommendations that were not captured as predefined ‘clinical actions’, and alerts that were clinically meaningful only when interpreted longitudinally (for example, repeated mild deviations preceding a later therapeutic adjustment). Because nurses did not prospectively classify alert validity and many low-intensity interactions were not systematically documented, it was not possible to reliably distinguish true false positives from clinically relevant but undocumented follow-up. This limitation underscores the need for more granular alert classification and documentation strategies in future studies.
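One possible shape for the more granular classification called for above is sketched below. The categories are taken from the heterogeneous groups described in this paragraph, but the category names and the tallying function are hypothetical and were not used in the study.

```python
from collections import Counter
from enum import Enum

class AlertOutcome(Enum):
    """Hypothetical nurse-assigned outcome for each reviewed alert."""
    DOCUMENTED_ACTION = "documented clinical action"
    COUNSELLING_ONLY = "verbal counselling or self-care reinforcement"
    PATIENT_MISCOMMUNICATION = "patient miscommunication"
    LONGITUDINAL_SIGNAL = "meaningful only as part of a trend"
    NO_CLINICAL_RELEVANCE = "no clinical relevance"

def summarize(outcomes):
    """Return the share of each outcome among reviewed alerts."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {outcome: n / total for outcome, n in counts.items()}

# Made-up example data, not study results
example = [
    AlertOutcome.DOCUMENTED_ACTION,
    AlertOutcome.COUNSELLING_ONLY,
    AlertOutcome.PATIENT_MISCOMMUNICATION,
    AlertOutcome.NO_CLINICAL_RELEVANCE,
]
print(summarize(example)[AlertOutcome.DOCUMENTED_ACTION])
```

With prospective labels of this kind, only the last category would count as a true false positive, separating it from relevant but previously undocumented follow-up.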

Since the study was not powered to detect clinical benefit, all clinical outcomes were assessed as exploratory and should be interpreted with caution. Exploratory analyses showed lower rates of the composite of all-cause death or heart failure hospitalization in the intervention group, together with a non-significant trend towards lower cardiovascular mortality. These findings do not establish causality and may reflect residual confounding or unmeasured differences between groups.

Importantly, the intervention group had a more advanced clinical profile at baseline, with worse NYHA functional class and higher loop-diuretic doses, which would be expected to bias results against the intervention. Although adjusted Cox models accounting for NYHA class and baseline diuretic dose yielded similar associations, residual confounding cannot be excluded given the non-randomized design and limited sample size.

While causality cannot be inferred, the observed exploratory signals are compatible with earlier identification of congestion and more proactive outpatient management. In this context, the higher rate of ambulatory intravenous or oral diuretic adjustments observed in the intervention group (24.4% vs. 10.0%) is consistent with closer clinical surveillance rather than definitive evidence of a treatment effect. The absolute number of ambulatory clinical actions was modest, and their potential impact cannot be quantified in this study; however, it cannot be excluded that earlier low-intensity interventions may have limited the duration of mild or subclinical congestion, which has been associated with downstream decompensation in prior studies.

Notably, guideline-directed medical therapy was already highly optimized in both groups at baseline and remained similar at 12 months, with implementation rates exceeding those reported in contemporary registries,^14,29 making differences in foundational pharmacologic therapy an unlikely explanation for the observed signals. Patients receiving weekly CAIS follow-up frequently reported a greater sense of clinical oversight, which may have contributed to improved self-monitoring or adherence behaviours, although these factors were not systematically measured. The higher furosemide dose observed in the intervention arm at 12 months likely reflects, at least in part, their higher baseline congestion burden; whether earlier identification of fluid retention through CAIS monitoring contributed to more consistent attainment of individualized euvolemic dosing cannot be determined from the present data. Because all diuretic adjustments were recorded prospectively using the same sources in both groups, differential underreporting is unlikely. Nonetheless, given the observational nature of the study, these findings should be considered hypothesis-generating and require confirmation in adequately powered, randomized trials.

This observation should be interpreted in the context of prior telemonitoring studies in heart failure, including large randomized trials such as TIM-HF2⁹ and OSICAT,¹⁸ which evaluated more technologically complex interventions with heterogeneous effects on clinical outcomes. Rather than suggesting superiority in efficacy, the present findings highlight differences in implementation models and patient interaction strategies across telemonitoring approaches. A key distinguishing feature of conversational AI–based follow-up is its simplicity, frequency, and low barrier to use, which may facilitate sustained engagement and timely clinical review compared with more passive or device-dependent systems. In our study, adherence to scheduled follow-up was high over 12 months, exceeding that reported in several previous remote monitoring programmes.¹⁹ While direct comparisons across studies are not possible, these results support conversational AI as a feasible and patient-centred implementation model that prioritizes engagement and continuity of monitoring, which are recognized challenges in long-term HF care.

In terms of patient-reported outcomes, the KCCQ-12 overall summary score improved significantly in the intervention group, whereas EQ-5D-5L and PHQ-4 showed no change. This divergence likely reflects the greater sensitivity of disease-specific instruments like KCCQ-12 for detecting changes in HF symptom burden and functional status, compared with generic health status measures or mental health screening tools. Generic tools such as EQ-5D-5L, while widely used, may lack the granularity to capture subtle yet clinically meaningful improvements in HF-related quality of life, particularly in populations already receiving structured follow-up.³⁰ Similarly, PHQ-4 focuses on anxiety and depression, which might not be influenced by telemonitoring interventions unless accompanied by specific psychological support components.

From a service delivery perspective, the low time demand on nurses and the absence of extra routine visits suggest that CAIS-driven follow-up can be implemented within existing clinical workflows without overburdening clinical teams in the studied setting. However, scaling up could increase alert volume, potentially leading to workload challenges or alert fatigue if not well integrated with electronic health records and triage protocols. Future studies should evaluate the impact on staffing, workflow efficiency, and clinician engagement.

Study limitations

First, the non-randomized design introduces potential selection bias and confounding, including baseline imbalances in disease severity, despite broadly similar baseline characteristics between groups. Second, the relatively small sample size increases the risk of type II error and may predispose to spurious findings or overestimation of effect sizes, and results from exploratory analyses must be interpreted with caution. Third, the study was conducted at a single tertiary care centre with an experienced HF unit, which may limit generalizability to other healthcare settings. Fourth, patient experience and satisfaction were assessed with locally developed questionnaires that have not been formally validated; given the lack of widely validated instruments for this purpose, this may also limit generalizability. Fifth, no demographic data on ethnicity were collected, as this variable was not defined in the electronic case report forms, precluding any analysis of differential access or effectiveness across ethnic groups. Sixth, while alerts from the AI system prompted predefined clinical actions, the rate of false-positive alerts or patient miscommunication could not be precisely quantified. Lastly, the observed feasibility and engagement reflect the performance of a specific CAIS (‘Lola’) embedded within a particular clinical workflow; results may not be replicable with other AI systems that differ in design, language capabilities, or integration into care processes. Importantly, the study was designed primarily to assess feasibility and acceptability and was not powered to evaluate clinical efficacy. All these limitations must be considered within the context of a pilot feasibility study primarily intended to generate hypotheses. Therefore, the clinical findings reported here require validation in larger, multicentre, randomized trials.

Conclusions

In this prospective, non-randomized, controlled feasibility study, AI-assisted follow-up using a conversational virtual agent was feasible, well-received, and associated with exploratory signals towards fewer deaths or HF hospitalizations, without differences in GDMT. Nurse workload remained low despite frequent alerts, and the system required no apps or devices. Overall, the study supports the feasibility and acceptability of conversational AI as a low-barrier strategy for remote HF follow-up, warranting confirmation in larger randomized trials.

Declaration of generative AI and AI-assisted technologies in the manuscript preparation process

During the preparation of this work the authors used ChatGPT-5 (OpenAI, San Francisco, CA, USA) in order to improve English phrasing and clarity. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the published article.

Acknowledgements

Nahum Capdevila, Elena Gómez, and Raquel Arias contributed to the deployment of the CAIS, alert review, and alert management. This project received a grant from the Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBER-ESP), 2024.

Supplementary material

Supplementary material is available at European Heart Journal – Digital Health.

Author contributions

Aleix Olivella (Conceptualization [lead]; Data curation [lead]; Formal analysis [lead]; Funding acquisition [lead]; Investigation [lead]; Methodology [lead]; Project administration [lead]; Resources [lead]; Supervision [lead]; Validation [lead]; Writing – original draft [lead]; Writing – review & editing [lead]), Ana B Méndez Fernández (Conceptualization [supporting]; Methodology [supporting]), Emmanuel Giménez García (Data curation [supporting]; Formal analysis [supporting]; Methodology [supporting]; Validation [supporting]), Alonso Ortega Torregimeno (Data curation [supporting]), Eduard Ródenas-Alesina (Data curation [supporting]; Formal analysis [supporting]; Methodology [supporting]; Validation [supporting]; Writing – original draft [supporting]; Writing – review & editing [supporting]), Raúl Aguilar López (Investigation [supporting]), Tania Maza Pelaez (Investigation [supporting]), Toni Soriano-Colomé (Investigation [supporting]; Supervision [supporting]), Augusto Sao Avilés (Data curation [supporting]; Formal analysis [supporting]), Aitor Uribarri (Validation [supporting]; Writing – original draft [supporting]), Teresa Soriano Sánchez (Investigation [supporting]), Carmen Pérez Bocanegra (Investigation [supporting]), Eva Domingo Baldrich (Investigation [supporting]), Maria José Martinez-Zapata (Conceptualization [supporting]; Methodology [supporting]), Maria Rubio-Valera (Formal analysis [supporting]; Methodology [supporting]), and Ignacio Ferreira-González (Conceptualization [supporting]; Funding acquisition [equal]; Methodology [supporting]; Project administration [supporting]; Supervision [supporting]; Validation [supporting]; Writing – original draft [supporting])

Funding

The use of the NLP AI assistant was funded by AstraZeneca.

This project received a grant from the Centro de Investigación Biomédica en Red de Epidemiología y Salud Pública (CIBER-ESP), 2024.

Data availability

The data underlying this article cannot be shared publicly due to privacy of individuals who participated in the study. The data will be shared on reasonable request to the corresponding author.

References

1. Groenewegen A, Rutten FH, Mosterd A, Hoes AW. Epidemiology of heart failure. Eur J Heart Fail 2020;22:1342–1356.
2. Shah KP, Khan SS, Baldridge AS, Grady KL, Cella D, Goyal P, et al. Health status in heart failure and cancer. JACC Heart Fail 2023;12:1166–1178.
3. Authors/Task Force Members; McDonagh TA, Metra M, Adamo M, Gardner RS, Baumbach A, Böhm M, et al. 2021 ESC guidelines for the diagnosis and treatment of acute and chronic heart failure. Eur Heart J 2021;42:3599–3726.
4. Scholte NTB, Gürgöze MT, Aydin D, Theuns DAMJ, Manintveld OC, Ronner E, et al. Telemonitoring for heart failure: a meta-analysis. Eur Heart J 2023;00:1–16.
5. Rebolledo Del Toro M, Herrera Leaño NM, Barahona-Correa JE, Muñoz Velandia OM, Fernández Ávila DG, García Peña ÁA. Effectiveness of mobile telemonitoring applications in heart failure patients: systematic review of literature and meta-analysis. Heart Fail Rev 2023;28:431–452.
6. Cleland JGF, Louis AA, Rigby AS, Janssens U, Balk AHMM. Noninvasive home telemonitoring for patients with heart failure at high risk of recurrent admission and death: The Trans-European Network-Home-Care Management System (TEN-HMS) study. J Am Coll Cardiol 2005;45:1654–1664.
7. Heidenreich PA, Bozkurt B, Aguilar D, Allen LA, Byun JJ, Colvin MM, et al. 2022 AHA/ACC/HFSA guideline for the management of heart failure: a report of the American College of Cardiology/American Heart Association Joint Committee on Clinical Practice Guidelines. Circulation 2022;145:e895–e1032.
8. Abraham WT, Adamson PB, Bourge RC, Aaron MF, Costanzo MR, Stevenson LW, et al. Wireless pulmonary artery haemodynamic monitoring in chronic heart failure: a randomised controlled trial. Lancet 2011;377:658–666.
9. Koehler F, Koehler K, Deckwart O, Prescher S, Wegscheider K, Kirwan BA, et al. Efficacy of telemedical interventional management in patients with heart failure (TIM-HF2): a randomised, controlled, parallel-group, unmasked trial. Lancet 2018;392:1047–1057.
10. Stevenson LW, Ross HJ, Rathman LD, Boehmer JP. Remote monitoring for heart failure management at home. J Am Coll Cardiol 2023;81:2272–2291.
11. Mohebali D, Kittleson MM. Remote monitoring in heart failure: current and emerging technologies in the context of the pandemic. Heart 2021;107:366–372.
12. Khera R, Oikonomou EK, Nadkarni GN, Morley JR, Wiens J, Butte AJ, et al. Transforming cardiovascular care with artificial intelligence: from discovery to practice: JACC state-of-the-art review. J Am Coll Cardiol 2024;84:97–114.
13. Matsukawa R, Okahara A, Tokutome M, Itonaga J, Koga E, Hara A, et al. A scoring evaluation for the practical introduction of guideline-directed medical therapy in heart failure patients. ESC Heart Fail 2023;10:3352–3363.
14. Savarese G, Kishi T, Vardeny O, Adamsson Eryd S, Bodegård J, Lund LH, et al. Heart failure drug treatment – inertia, titration and discontinuation: a multinational observational study (EVOLUTION HF). JACC Heart Fail 2022;11:1–14.
15. Kaihara T, Scherrenberg M, Intan-Goey V, Falter M, Kindermans H, Frederix I, et al. Efficacy of digital health interventions on depression and anxiety in patients with cardiac disease: a systematic review and meta-analysis. Eur Heart J Digit Health 2022;3:445–454.
16. Inglis SC, Clark RA, Dierckx R, Prieto-Merino D, Cleland JGF. Structured telephone support or non-invasive telemonitoring for patients with heart failure. Cochrane Database Syst Rev 2015;10:CD007228.
17. Yun S, Comín-Colet J, Calero-Molina E, Hidalgo E, José-Bazán N, Cobo Marcos M, et al. Evaluation of mobile health technology combining telemonitoring and teleintervention versus usual care in vulnerable-phase heart failure management (HERMeS): a multicentre, randomised controlled trial. Lancet Digit Health 2025;7.
18. Galinier M, Roubille F, Berdague P, Brierre G, Cantie P, Dary P, et al. Telemonitoring versus standard care in heart failure: a randomised multicentre trial. Eur J Heart Fail 2020;22:985–994.
19. Ong MK, Romano PS, Edgington S, Aronow HU, Auerbach AD, Black JT, et al. Effectiveness of remote patient monitoring after discharge of hospitalized patients with heart failure: the Better Effectiveness After Transition-Heart Failure (BEAT-HF) randomized clinical trial. JAMA Intern Med 2016;176:310–318.
20. Masterson Creber R, Dodson JA, Bidwell J, Breathett K, Lyles C, Harmon Still C, et al. Telehealth and health equity in older adults with heart failure: a scientific statement from the American Heart Association. Circ Cardiovasc Qual Outcomes 2023;16:e000123.
21. Krzesiński P, Jankowska EA, Siebert J, Galas A, Piotrowicz K, Stańczyk A, et al. Effects of an outpatient intervention comprising nurse-led non-invasive assessments, telemedicine support and remote cardiologists’ decisions in patients with heart failure (AMULET study): a randomised controlled trial. Eur J Heart Fail 2022;24:565–577.
22. Hindricks G, Taborsky M, Glikson M, Heinrich U, Schumacher B, Katz A, et al. Implant-based multiparameter telemonitoring of patients with heart failure (IN-TIME): a randomised controlled trial. Lancet 2014;384:583–590.
23. Boehmer JP, Cremer S, Abo-Auda W, Stokes D, Hadi A, McCann P, et al. Impact of a novel wearable sensor on heart failure rehospitalization. JACC Heart Fail 2024;12:2011–2022.
24. Spatz ES, Ginsburg GS, Rumsfeld JS, Turakhia MP. Wearable digital health technologies for monitoring in cardiovascular medicine. N Engl J Med 2024;390:346–356.
25. Oikonomou EK, Khera R. Artificial intelligence-enhanced patient evaluation: bridging art and science. Eur Heart J 2024;45:3204–3218.
26. DeVore AD, Walsh MN, Vardeny O, Albert NM, Desai AS. Digital solutions for the optimization of pharmacologic therapy for heart failure. JACC Heart Fail 2025;13:102385.
27. Man JP, Koole MAC, Meregalli PG, Handoko ML, Stienen S, de Lange FJ, et al. Digital consults in heart failure care: a randomized controlled trial. Nat Med 2024;30:2907–2913.
28. Thangaraj PM, Benson SH, Oikonomou EK, Asselbergs FW, Khera R. Cardiovascular care with digital twin technology in the era of generative artificial intelligence. Eur Heart J 2024;45:4808–4821.
29. Mebazaa A, Davison B, Chioncel O, Cohen-Solal A, Diaz R, Filippatos G, et al. Safety, tolerability and efficacy of up-titration of guideline-directed medical therapies for acute heart failure (STRONG-HF): a multinational, open-label, randomised, trial. Lancet 2022;400:1938–1952.
30. Eurich DT, Johnson JA, Reid KJ, Spertus JA. Assessing responsiveness of generic and specific health related quality of life measures in heart failure. Health Qual Life Outcomes 2006;4:89.

Associated Data

This section collects any data citations, data availability statements, or supplementary materials included in this article.

Supplementary Materials

ztag032_Supplementary_Data

ztag032_supplementary_data.docx^{(49.6KB, docx)}

Data Availability Statement

The data underlying this article cannot be shared publicly due to privacy of individuals who participated in the study. The data will be shared on reasonable request to the corresponding author.

1. Groenewegen A, Rutten FH, Mosterd A, Hoes AW. Epidemiology of heart failure. Eur J Heart Fail 2020;22:1342–1356.

2. Shah KP, Khan SS, Baldridge AS, Grady KL, Cella D, Goyal P, et al. Health status in heart failure and cancer. JACC Heart Fail 2023;12:1166–1178.

3. Authors/Task Force Members; McDonagh TA, Metra M, Adamo M, Gardner RS, Baumbach A, Böhm M, et al. 2021 ESC guidelines for the diagnosis and treatment of acute and chronic heart failure. Eur Heart J 2021;42:3599–3726.

4. Scholte NTB, Gürgöze MT, Aydin D, Theuns DAMJ, Manintveld OC, Ronner E, et al. Telemonitoring for heart failure: a meta-analysis. Eur Heart J 2023;00:1–16.

5. Rebolledo Del Toro M, Herrera Leaño NM, Barahona-Correa JE, Muñoz Velandia OM, Fernández Ávila DG, García Peña ÁA. Effectiveness of mobile telemonitoring applications in heart failure patients: systematic review of literature and meta-analysis. Heart Fail Rev 2023;28:431–452.

6. Cleland JGF, Louis AA, Rigby AS, Janssens U, Balk AHMM. Noninvasive home telemonitoring for patients with heart failure at high risk of recurrent admission and death: The Trans-European Network-Home-Care Management System (TEN-HMS) study. J Am Coll Cardiol 2005;45:1654–1664.

7. Heidenreich PA, Bozkurt B, Aguilar D, Allen LA, Byun JJ, Colvin MM, et al. 2022 AHA/ACC/HFSA guideline for the management of heart failure: a report of the American College of Cardiology/American Heart Association Joint Committee on Clinical Practice Guidelines. Circulation 2022;145:e895–e1032.

8. Abraham WT, Adamson PB, Bourge RC, Aaron MF, Costanzo MR, Stevenson LW, et al. Wireless pulmonary artery haemodynamic monitoring in chronic heart failure: a randomised controlled trial. Lancet 2011;377:658–666.

9. Koehler F, Koehler K, Deckwart O, Prescher S, Wegscheider K, Kirwan BA, et al. Efficacy of telemedical interventional management in patients with heart failure (TIM-HF2): a randomised, controlled, parallel-group, unmasked trial. Lancet 2018;392:1047–1057.

10. Stevenson LW, Ross HJ, Rathman LD, Boehmer JP. Remote monitoring for heart failure management at home. J Am Coll Cardiol 2023;81:2272–2291.

11. Mohebali D, Kittleson MM. Remote monitoring in heart failure: current and emerging technologies in the context of the pandemic. Heart 2021;107:366–372.

12. Khera R, Oikonomou EK, Nadkarni GN, Morley JR, Wiens J, Butte AJ, et al. Transforming cardiovascular care with artificial intelligence: from discovery to practice: JACC state-of-the-art review. J Am Coll Cardiol 2024;84:97–114.

13. Matsukawa R, Okahara A, Tokutome M, Itonaga J, Koga E, Hara A, et al. A scoring evaluation for the practical introduction of guideline-directed medical therapy in heart failure patients. ESC Heart Fail 2023;10:3352–3363.

14. Savarese G, Kishi T, Vardeny O, Adamsson Eryd S, Bodegård J, Lund LH, et al. Heart failure drug treatment – inertia, titration and discontinuation: a multinational observational study (EVOLUTION HF). JACC Heart Fail 2022;11:1–14.

15. Kaihara T, Scherrenberg M, Intan-Goey V, Falter M, Kindermans H, Frederix I, et al. Efficacy of digital health interventions on depression and anxiety in patients with cardiac disease: a systematic review and meta-analysis. Eur Heart J Digit Health 2022;3:445–454.

16. Inglis SC, Clark RA, Dierckx R, Prieto-Merino D, Cleland JGF. Structured telephone support or non-invasive telemonitoring for patients with heart failure. Cochrane Database Syst Rev 2015;10:CD007228.

17. Yun S, Comín-Colet J, Calero-Molina E, Hidalgo E, José-Bazán N, Cobo Marcos M, et al. Evaluation of mobile health technology combining telemonitoring and teleintervention versus usual care in vulnerable-phase heart failure management (HERMeS): a multicentre, randomised controlled trial. Lancet Digit Health 2025;7.

18. Galinier M, Roubille F, Berdague P, Brierre G, Cantie P, Dary P, et al. Telemonitoring versus standard care in heart failure: a randomised multicentre trial. Eur J Heart Fail 2020;22:985–994.

19. Ong MK, Romano PS, Edgington S, Aronow HU, Auerbach AD, Black JT, et al. Effectiveness of remote patient monitoring after discharge of hospitalized patients with heart failure: the better effectiveness after transition-heart failure (BEAT-HF) randomized clinical trial. JAMA Intern Med 2016;176:310–318.

20. Masterson Creber R, Dodson JA, Bidwell J, Breathett K, Lyles C, Harmon Still C, et al. Telehealth and health equity in older adults with heart failure: a scientific statement from the American Heart Association. Circ Cardiovasc Qual Outcomes 2023;16:e000123.

21. Krzesiński P, Jankowska EA, Siebert J, Galas A, Piotrowicz K, Stańczyk A, et al. Effects of an outpatient intervention comprising nurse-led non-invasive assessments, telemedicine support and remote cardiologists’ decisions in patients with heart failure (AMULET study): a randomised controlled trial. Eur J Heart Fail 2022;24:565–577.

22. Hindricks G, Taborsky M, Glikson M, Heinrich U, Schumacher B, Katz A, et al. Implant-based multiparameter telemonitoring of patients with heart failure (IN-TIME): a randomised controlled trial. Lancet 2014;384:583–590.

23. Boehmer JP, Cremer S, Abo-Auda W, Stokes D, Hadi A, McCann P, et al. Impact of a novel wearable sensor on heart failure rehospitalization. JACC Heart Fail 2024;12:2011–2022.

24. Spatz ES, Ginsburg GS, Rumsfeld JS, Turakhia MP. Wearable digital health technologies for monitoring in cardiovascular medicine. N Engl J Med 2024;390:346–356.

25. Oikonomou EK, Khera R. Artificial intelligence-enhanced patient evaluation: bridging art and science. Eur Heart J 2024;45:3204–3218.

26. DeVore AD, Walsh MN, Vardeny O, Albert NM, Desai AS. Digital solutions for the optimization of pharmacologic therapy for heart failure. JACC Heart Fail 2025;13:102385.

27. Man JP, Koole MAC, Meregalli PG, Handoko ML, Stienen S, de Lange FJ, et al. Digital consults in heart failure care: a randomized controlled trial. Nat Med 2024;30:2907–2913.

28. Thangaraj PM, Benson SH, Oikonomou EK, Asselbergs FW, Khera R. Cardiovascular care with digital twin technology in the era of generative artificial intelligence. Eur Heart J 2024;45:4808–4821.

29. Mebazaa A, Davison B, Chioncel O, Cohen-Solal A, Diaz R, Filippatos G, et al. Safety, tolerability and efficacy of up-titration of guideline-directed medical therapies for acute heart failure (STRONG-HF): a multinational, open-label, randomised, trial. Lancet 2022;400:1938–1952.

30. Eurich DT, Johnson JA, Reid KJ, Spertus JA. Assessing responsiveness of generic and specific health related quality of life measures in heart failure. Health Qual Life Outcomes 2006;4:89.
