Abstract
This study describes the deployment process of an AI-driven clinical decision support (CDS) system to support postpartum depression (PPD) prevention, diagnosis, and management. Central to this CDS is an L2-regularized logistic regression model trained on electronic health record (EHR) data at an academic medical center and subsequently refined using a broader dataset from a multi-institutional consortium to ensure its generalizability and fairness. The deployment architecture leveraged Microsoft Azure to provide a scalable, secure, and efficient operational framework. We used Fast Healthcare Interoperability Resources (FHIR) for data extraction and ingestion between the two systems. Continuous Integration/Continuous Deployment pipelines automated the deployment and ongoing maintenance, ensuring the system’s adaptability to evolving clinical data. Alongside the technical preparation, we focused on seamless integration of the CDS within the clinical workflow, presenting risk assessments directly within the clinician schedule and providing options for subsequent actions. The developed CDS is expected to drive a PPD clinical pathway that enables efficient PPD risk management.
Introduction
Postpartum depression (PPD) is a complex and potentially life-threatening mental health condition with adverse maternal and infant health outcomes. Approximately 20% of birthing parents experience PPD.(1) However, underdiagnosis and lack of treatment are common, especially among those with low socioeconomic status.(2, 3) PPD’s etiology is complex. While a history of mental health issues is the foremost risk factor,(4, 5) social determinants of health (SDoH), such as strained marital relations, economic hardship, and stressful life events, amplify the risk.(6) The American College of Obstetricians and Gynecologists (ACOG) recommends that all pregnant individuals be screened for PPD at least once during the perinatal period.(7) Patients with a history of mental health conditions or positive screenings in prior pregnancies need more frequent monitoring. However, because the risk of PPD is commonly screened using manual questionnaires such as the Edinburgh Postnatal Depression Scale (EPDS) and the Patient Health Questionnaire-9 (PHQ-9),(8) screening is not universally performed due to limited clinical resources and inadequate support for mental health in prenatal care.(9) The current screening tool for PPD, the EPDS, is a self-reported questionnaire consisting of ten questions. Clinic staff score and record the results and communicate positive screens to the responsible licensed provider for timely follow-up. Often, initiation of the screening depends on the patient’s self-disclosure, leading to significant gaps in the administration of the EPDS and in routine monitoring of mental health outcomes. For those who screen positive, follow-up care typically requires an established referral network of mental health services. Educational materials are provided during screening, and ongoing support is encouraged for patients, their families, and support networks. At many clinics nationwide, PPD screening during the prenatal period is recommended but remains optional. As a result of these systemic gaps and health disparities, a significant number of PPD cases remain undiagnosed and untreated, often marginalizing patients within the healthcare system and leading them to self-manage without adequate clinical guidance.(2, 10) The Centers for Disease Control and Prevention estimates that over half of PPD cases go untreated, highlighting the urgent need for improved awareness and resources.(11)
Previous literature has shown that identifying at-risk populations and taking an individualized approach can prevent PPD.(2, 12-17) For at-risk patients, proactive and individualized prevention before symptoms manifest significantly lowers the chance of depression recurrence.(16, 18) However, successful prevention relies heavily on healthcare professionals’ ability to identify risk factors and initiate appropriate interventions. A significant obstacle to regular screening is the limited resources and time available to obstetric clinicians and their support teams for coordinating PPD care. This limitation not only impacts the consistency of PPD screening but also leads to generic and insufficient follow-up care for those identified as at risk. PPD risk prediction remains a knowledge gap in existing clinical methods, such as the EPDS and PHQ-9. Given PPD’s complex etiology and risk factors, technology has the potential to assist with precise risk identification to prevent, diagnose, and follow up on maternal emotional well-being during and after pregnancy. Responsible use of technology for PPD risk detection can help transition clinical practice from relying heavily on patient self-disclosure to a more efficient screening approach, ensuring that the most vulnerable populations receive the resources they need.
Addressing this challenge, our prior work developed a risk prediction model for PPD by analyzing retrospective electronic health record (EHR) data from 2015 to 2020 for prenatal care patients at Weill Cornell Medicine (WCM) and NewYork-Presbyterian Hospital (NYPH).(19) This predictive model incorporates over 30 variables from the EHR, encompassing mental health history, medical comorbidities, obstetric complications, medication orders, and patient demographics. They include known predictors, such as previous mental health history and marital status, as well as EHR-derived information, such as vital signs and emergency department visits. Each predictor in the model was reviewed by clinicians for clinical meaningfulness. The model was also validated using data from the INSIGHT clinical research network, a consortium of urban academic medical centers.(20)
The objective of this study is to describe our implementation process, which aims to effectively leverage the predictive model to assist healthcare providers in pinpointing at-risk patients, paving the way for timely and preventative interventions. Our proposed implementation workflow directly targets challenges in implementing the ACOG-recommended PPD screening, while also being attentive to the impact of new and evolving technology on clinical teams. As part of this process, we refined the existing model to ensure equitable outcomes across diverse patient demographics. Furthermore, we developed a clinical decision support (CDS) system integrated within the EHR to streamline the detection of PPD risk. This CDS system uses automated risk identification to target specific challenges in the early detection of PPD. Additionally, it is designed to overcome barriers in delivering comprehensive PPD prevention, particularly for patients with varying clinical and socioeconomic backgrounds, by providing tailored, risk-stratified follow-up care. This strategic approach not only improves the integration of new technologies but also enhances the effectiveness of PPD prevention and diagnosis pathways.
This paper is structured as follows. In the Method section, we describe the technical preparation undertaken to deploy the model as well as the EHR interface design. We present the EHR interface and associated care pathway in the Results section. Challenges and future work are discussed in the Discussion.
Method
Study Setting
The study setting is an urban health system with 11 campuses, including two academic medical centers and community hospitals. The CDS is planned to be part of a workflow in ambulatory prenatal settings that serve patients across a diverse set of geographic locations. The Epic® EHR is used across all campuses. The organizational structure of the implementation is multidisciplinary, including team members from cloud infrastructure, DevOps (software development and IT operations), web development, clinical informatics (EHR), data science, and clinical services (obstetrics & gynecology and psychiatry). A dedicated project management team coordinates team communications, regulatory approvals, and project timelines.
Model information
The initial ML model behind the CDS was peer-reviewed and published but not deployed.(19) The original model is an L2-regularized logistic regression model that calculates the likelihood of a patient developing PPD within 1 year after childbirth. To train and test the model, we used data from 2015 to 2018 covering over 15,000 patients. EHR data from the emergency department, inpatient, and outpatient settings were used. The model was trained with 5-fold cross-validation on 80% of the data, while the remaining 20% was used to validate model performance. The initial model was further validated using data from 2014 to 2018 from a consortium of health organizations (INSIGHT) comprising over 50,000 patients, using the same inclusion criteria.
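As an illustration of this setup, the sketch below reproduces the general training recipe (an L2-regularized logistic regression, an 80/20 split, and 5-fold cross-validation evaluated by AUC) on synthetic stand-in data; it is not the deployed model or its actual features.

```python
# Minimal sketch of the training setup described above; synthetic data only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the EHR-derived feature matrix and PPD labels
X, y = make_classification(n_samples=5000, n_features=32, weights=[0.85], random_state=0)

# 80/20 split; the 20% hold-out is used only for final validation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0)

# L2-regularized logistic regression (penalty="l2" is scikit-learn's default)
model = LogisticRegression(penalty="l2", max_iter=1000)

# 5-fold cross-validation on the training portion
cv_auc = cross_val_score(model, X_train, y_train, cv=5, scoring="roc_auc")
print(f"Cross-validated AUC: {cv_auc.mean():.3f}")

# Final fit and evaluation on the held-out 20%
model.fit(X_train, y_train)
print(f"Held-out AUC: {roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]):.3f}")
```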
Development methodology: To be included in the model cohort, a patient must have 1) been 18 to 45 years old; 2) received prenatal care at the study site; and 3) had an EHR-documented encounter within one year after childbirth. The outcome is defined as having a diagnosis of PPD within one year of childbirth. A PPD diagnosis was defined using ICD-9, ICD-10, and Systematized Nomenclature of Medicine (SNOMED) codes. For those who did not have coded PPD diagnoses, we also used the prescription of antidepressants within 1 year following childbirth. The specific diagnostic codes for the PPD definition are listed in the publication by Zhang et al.(19) The use of antidepressants was defined by Anatomical Therapeutic Chemical (ATC) codes under N06A. To ensure that antidepressants were primarily used for the treatment of mental health conditions, and not for other indications such as pain, we further excluded the following medications: Amitriptyline, Clomipramine, Duloxetine, Flupentixol, and Nortriptyline.
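A minimal sketch of this outcome definition is shown below, assuming hypothetical pandas tables of diagnoses and medication orders; the column names and helper function are illustrative, and the actual code sets are documented in Zhang et al.(19)

```python
# Sketch of the PPD outcome label under the definition above (hypothetical column names).
import pandas as pd

EXCLUDED_ANTIDEPRESSANTS = {
    "amitriptyline", "clomipramine", "duloxetine", "flupentixol", "nortriptyline"}

def label_ppd(diagnoses: pd.DataFrame, meds: pd.DataFrame,
              delivery_date: pd.Timestamp, ppd_codes: set) -> int:
    """Return 1 if the patient meets the PPD outcome definition, else 0."""
    window_end = delivery_date + pd.DateOffset(years=1)

    # Coded PPD diagnosis (ICD-9/ICD-10/SNOMED) within one year of childbirth
    dx_hit = diagnoses[
        diagnoses["code"].isin(ppd_codes)
        & diagnoses["date"].between(delivery_date, window_end)]

    # Otherwise, an antidepressant prescription (ATC N06A), excluding drugs
    # commonly prescribed for non-psychiatric indications
    med_hit = meds[
        meds["atc_code"].str.startswith("N06A")
        & ~meds["drug_name"].str.lower().isin(EXCLUDED_ANTIDEPRESSANTS)
        & meds["order_date"].between(delivery_date, window_end)]

    return int(len(dx_hit) > 0 or len(med_hit) > 0)
```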
With the training population defined, we looked back in time to the first recorded prenatal visit and gathered all relevant data elements for each patient until 1 year after childbirth. We began by extracting over 1,000 potential predictive variables from the EHR, categorizing them into groups such as vital signs, medication orders, lab results, comorbidities, and demographic information, organized by trimester. To manage missing data, we imputed values using the mean of the available data for each variable. The feature selection process used an iterative Sequential Feature Selection (SFS) to identify relevant features, which were then assessed and validated by clinicians to ensure clinical relevance. Irrelevant or unexplainable features were discarded. We optimized the model using a grid search to fine-tune the parameters, aiming to maximize the area under the receiver operating characteristic curve (AUC). The process was iterative: after each training phase, clinicians reviewed the model and removed less meaningful features, and the model was refined through subsequent iterations of feature selection and training. This methodical approach ensured that the final model is both clinically valid and optimized for identifying at-risk patients. SFS was performed separately for patients with and without a mental health history to ensure that the model can predict for both types of patients in actual use. We combined the features selected from both SFS runs into a single feature set so that a single model can be used for patients with and without a history of documented mental illness. In total, we ended up with 32 variables considered most predictive.
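The sketch below illustrates one way to assemble the steps described above (mean imputation, forward sequential feature selection, and an AUC-maximizing grid search) into a single scikit-learn pipeline; the dimensions and parameter grid are reduced for illustration, and the clinician review loop is not shown.

```python
# Sketch of the feature selection and tuning pipeline (settings reduced for illustration).
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic stand-in for the >1,000 candidate EHR variables
X, y = make_classification(n_samples=1000, n_features=40, n_informative=10, random_state=0)

estimator = LogisticRegression(penalty="l2", max_iter=1000)

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="mean")),            # mean imputation of missing values
    ("select", SequentialFeatureSelector(                  # forward sequential feature selection
        estimator, n_features_to_select=8, scoring="roc_auc", cv=5)),
    ("clf", estimator),
])

# Grid search over the regularization strength, maximizing AUC
grid = GridSearchCV(pipeline, {"clf__C": [0.01, 0.1, 1.0]}, scoring="roc_auc", cv=5)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```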
Evaluation and refined model: Since the initial model development used data from 2015 to 2018, we used the following datasets for the post-development evaluation: 1) EHR data from the study site in 2019 (the same site as the original development, but using data from one year post-development to evaluate model performance); 2) EHR data from the same study site in 2020 (the year of the COVID-19 pandemic, to test resilience to disruption); and 3) EHR data from the INSIGHT clinical research network between 2014 and 2020. All three datasets were used for validation purposes only. The clinical research network encompasses a population from the New York tri-state metropolitan area, which intersects with the health system’s catchment area. The same inclusion and exclusion criteria as the initial development were used in the refinement. We evaluated and ultimately refined the model at the request of the study site’s expert reviewers to ensure a safe and equitable integration of AI in patient care. The evaluation included the initial model’s bias, while also considering the net benefit(21) and predictive performance, which are critical criteria for translating AI into CDS.
The refined model was the result of comparing five debiasing approaches, including fairness through blindness and reweighing. The first approach nullified race-related variables in the original logistic regression model. The second approach re-trained the model without race variables. The third, fourth, and fifth approaches used reweighing techniques, with the third and fourth removing race and using different weighting calculations based on expected versus actual probabilities, while the fifth removed race and applied reweighing based on literature-reported probabilities. Predictive performance focused on the area under the curve (AUC) and sensitivity. The evaluation of this refined model at WCM and INSIGHT showed consistently high predictive performance, with AUC values between 0.93 and 0.97.(21) Model fairness was evaluated using metrics including disparate impact (the ratio of positive prediction probabilities between the two groups), equal opportunity difference (the difference in true positive rates between the two groups), and predictive parity difference (the difference in positive predictive values), with White race as the privileged group. These metrics were selected because a positive prediction (risk identification) leads to resource allocation in the workflow. The refinement process revised the model toward equity while maintaining predictive performance. The refined model was still trained on the same dataset as the initial model development but validated using the three datasets described above. The differences between the initial and refined models are the removal of race as a predictor and retrained model coefficients. The refinement did not degrade predictive performance and showed improvements in precision and sensitivity, indicating increased predictive accuracy and true positive rates. The refined model had improved statistical parity difference, equal opportunity difference, and average odds difference, reflecting changes in the model’s fairness. More details on the model refinement can be found in Liu et al.(21)
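For clarity, the sketch below computes the three fairness metrics named above from binary predictions and a protected attribute; the toy data and function name are illustrative, and the actual refinement followed the procedures reported in Liu et al.(21) Disparate impact close to 1 and differences close to 0 indicate parity between groups.

```python
# Sketch of the fairness metrics described above (illustrative implementation).
import numpy as np

def fairness_metrics(y_true, y_pred, group, privileged):
    """group: per-patient group labels; privileged: the privileged group value."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    priv, unpriv = (group == privileged), (group != privileged)

    def rate(mask):                      # positive prediction rate
        return y_pred[mask].mean()

    def tpr(mask):                       # true positive rate (sensitivity)
        return y_pred[mask & (y_true == 1)].mean()

    def ppv(mask):                       # positive predictive value (precision)
        return y_true[mask & (y_pred == 1)].mean()

    return {
        "disparate_impact": rate(unpriv) / rate(priv),
        "equal_opportunity_difference": tpr(unpriv) - tpr(priv),
        "predictive_parity_difference": ppv(unpriv) - ppv(priv),
    }

# Example usage with toy data
print(fairness_metrics(
    y_true=[1, 0, 1, 1, 0, 1], y_pred=[1, 0, 1, 0, 0, 1],
    group=["A", "A", "A", "B", "B", "B"], privileged="A"))
```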
Deployment architecture
The deployment architecture revolves around two primary systems: the EHR system and Microsoft Azure (Figure 1). Central to our cloud-based infrastructure is Microsoft Azure, where Azure Kubernetes Service (AKS) orchestrates the containerization and management of the AI predictive models. Within this framework, AI/ML models are containerized and hosted on AKS, ensuring both scalability and efficient load balancing. Concurrently, Azure Monitor provides system-wide telemetry to facilitate real-time performance tracking. To safeguard model endpoints, robust authentication and authorization protocols are implemented. This is complemented by technical reviews that focus on security, monitoring, and maintaining system consistency. We set up dedicated modules for authentication, data extraction, modeling, and EHR data ingestion, ensuring modular and efficient processing within our architecture. Continuity in the deployment process is maintained via an integrated Continuous Integration/Continuous Deployment (CI/CD) mechanism, a core component of DevOps practices, which automates the delivery and updating of AI models directly into the EHR. This includes the use of Azure’s logging, tracing, and diagnostics functions to interpret potential issues and ensure system resilience. This operational efficiency is complemented by Machine Learning Operations (MLOps) strategies, which ensure the continuous refinement of deployed models in alignment with emergent clinical data and evolving practice requirements. To optimize and streamline these processes, they are containerized into distinct AKS pods via the Azure pipeline.
Figure 1.
Deployment Structure
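As an illustration of how a containerized model endpoint of this kind might be exposed behind AKS, the sketch below uses FastAPI with a hypothetical payload schema, model artifact name, and route; the authentication, logging, and telemetry hooks described above are omitted for brevity.

```python
# Minimal sketch of a containerized scoring endpoint of the kind hosted on AKS
# (route, payload schema, and model file name are hypothetical).
import joblib
import numpy as np
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="PPD risk scoring service")
model = joblib.load("ppd_model.joblib")  # serialized refined logistic regression (placeholder name)

class Features(BaseModel):
    values: list[float]  # ordered model inputs, as defined by the feature specification

@app.post("/score")
def score(features: Features) -> dict:
    """Return the PPD risk probability and a flag against the 30% risk threshold."""
    x = np.asarray(features.values).reshape(1, -1)
    risk = float(model.predict_proba(x)[0, 1])
    return {"risk_score": risk, "high_risk": risk > 0.30}
```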
The role of data science transitions from the retrospective phase of model development to model instantiation and deployment, entailing more focus on MLOps. MLOps ensures the systematic deployment, monitoring, and reiteration of AI in response to emerging data patterns and clinical needs. Our monitoring framework comprises two critical components: issue identification and an escalation protocol. The issue identification component involves continuous assessment of application and model performance metrics, including detection of deviations from expected outcomes and identification of potential biases or data drift. For application monitoring, we use Azure Application Insights to track and analyze application performance and to detect errors and exceptions during deployment. Anticipated issues include FHIR request errors, server connection errors, and data processing anomalies. Vulnerability patching is planned monthly, except for critical vulnerabilities, which are addressed immediately to ensure system security. For model performance monitoring, we record the model input and output, relevant parameters, time of model computation, and eventual outcome for each patient in a study database. While the model is run for each eligible patient once a week, our planned interval for assessing the need for retraining is six months, as PPD, our main outcome, may take time to develop and be diagnosed after childbirth. The escalation protocol component delineates the procedures for addressing identified issues, engaging relevant stakeholders, and implementing necessary adjustments to ensure the model’s integrity and its alignment with clinical objectives. The order of escalation is determined case by case among members of data science, cloud infrastructure, DevOps, web development, clinical informatics (EHR), and clinical services. Lastly, our deployment process was designed with a keen awareness of the latest regulatory standards related to AI fairness and bias. This governance framework is integral to maintaining the ethical integrity of our deployment, upholding the principles of patient safety and trust in the application of AI within healthcare settings.
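A minimal sketch of the per-patient monitoring record described above follows, using SQLite as a stand-in for the study database; the table schema, column names, and dummy model function are assumptions for illustration.

```python
# Sketch of persisting model inputs, outputs, and timing per patient (hypothetical schema).
import json
import sqlite3  # stand-in for the study database
import time
from datetime import datetime, timezone

def log_model_run(conn, patient_id, model_version, inputs, run_model):
    """Run the model for one patient and persist inputs, output, and computation time."""
    start = time.perf_counter()
    risk_score, top_factors = run_model(inputs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    conn.execute(
        "INSERT INTO model_runs (patient_id, model_version, run_time_utc, "
        "inputs_json, risk_score, top_factors_json, compute_ms) "
        "VALUES (?, ?, ?, ?, ?, ?, ?)",
        (patient_id, model_version, datetime.now(timezone.utc).isoformat(),
         json.dumps(inputs), risk_score, json.dumps(top_factors), elapsed_ms),
    )
    conn.commit()
    return risk_score

# Example with an in-memory database and a dummy model function
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE model_runs (patient_id, model_version, run_time_utc, "
             "inputs_json, risk_score, top_factors_json, compute_ms)")
log_model_run(conn, "patient-123", "refined-v1",
              {"dbp_third_trimester": 82}, lambda x: (0.41, ["Anxiety history"]))
```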
FHIR
A critical aspect of our infrastructure is the establishment of FHIR protocols. The EHR system communicates bidirectionally with Azure using the Health Level 7 Fast Healthcare Interoperability Resources (FHIR) standard for data extraction and ingestion, providing a secure and standardized framework for the bidirectional exchange of clinical data between the EHR system and the Azure cloud, pivotal for the real-time processing and integration of healthcare information. This real-time data processing capability is essential for the instantaneous application of AI-derived insights within clinical workflows. These insights are subsequently visualized within the EHR interface, facilitating their immediate use by healthcare practitioners. Upon refinement, the finalized PPD predictive model contains 31 variables, including marital status, current diagnoses, medical history, medication prescriptions, diagnostic results, and encounters. To interface with these model variables, we used the FHIR API endpoints provided by the Epic EHR, as described in Table 1; an illustrative request/response sketch follows the table. We used FHIR version R4. The original PPD risk prediction model was developed under the OMOP (Observational Medical Outcomes Partnership) common data model. To use the FHIR Condition resources, we mapped the model variables that define diagnoses from SNOMED to ICD-10-CM codes. For extraction, we parsed the XML output of FHIR API GET requests and, with clinician review, constructed the model variables as defined by the PPD predictive model. The model output includes a risk score (a probability from 0 to 1) and the three top predictors derived using SHAP (SHapley Additive exPlanations).(22) The output is ingested into the EHR using a FHIR API POST request to flowsheets in the EHR.
Table 1.
FHIR API resources for data extraction
| FHIR resource | Variables |
|---|---|
| Observation / EpisodeOfCare | Cohort definition (pregnancy) |
| Patient | Marital status |
| Condition | Anxiety history, Other disorder history, Mood disorder history, Anxiety in pregnancy, Mental disorder in pregnancy, Palpitations, Vomiting in pregnancy, Hypertensive disorder, Acute pharyngitis, Hemorrhage in early pregnancy antepartum, Diarrhea, Threatened miscarriage, Abdominal pain, Migraine, Hypothyroidism, Placental infarct, Primigravida, Pre-eclampsia, Abnormality of organs and/or soft tissues of pelvis affecting pregnancy, False labor at or after 37 completed weeks of gestation |
| MedicationRequest | Antidepressants, Beta blocking agents, Direct acting antivirals, Other antibacterials, Antihistamines for systemic use |
| Observation | Diastolic blood pressure in third trimester |
| Procedure | Deliveries by cesarean |
| Encounter | Emergency department visits |
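To make the exchange pattern concrete, the sketch below shows generic FHIR R4 REST calls corresponding to the extraction and ingestion steps described above. The base URL, token handling, and Observation payload are illustrative assumptions rather than Epic-specific values, and the sketch uses JSON for brevity whereas our production pipeline parsed XML responses.

```python
# Sketch of FHIR extraction (GET) and ingestion (POST) calls; endpoint and payload
# details are hypothetical and simplified.
import requests

FHIR_BASE = "https://ehr.example.org/api/FHIR/R4"        # hypothetical FHIR base URL
HEADERS = {"Authorization": "Bearer <access-token>", "Accept": "application/fhir+json"}

def get_conditions(patient_id: str) -> dict:
    """Pull Condition resources (e.g., ICD-10-CM-coded diagnoses) for one patient."""
    resp = requests.get(f"{FHIR_BASE}/Condition",
                        params={"patient": patient_id}, headers=HEADERS)
    resp.raise_for_status()
    return resp.json()                                    # FHIR Bundle of Condition resources

def post_risk_score(patient_id: str, risk: float) -> None:
    """Write the model output back as an Observation, analogous to a flowsheet entry."""
    observation = {
        "resourceType": "Observation",
        "status": "final",
        "code": {"text": "Postpartum depression risk score"},
        "subject": {"reference": f"Patient/{patient_id}"},
        "valueQuantity": {"value": risk},
    }
    resp = requests.post(f"{FHIR_BASE}/Observation", json=observation,
                         headers={**HEADERS, "Content-Type": "application/fhir+json"})
    resp.raise_for_status()
```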
CDS Interface Development
The team’s clinician lead directed the CDS interface development. Several considerations were taken to deliver the AI output in an informative yet concise manner. First, the risk information is included in the clinician schedule, allowing clinicians to glance at their panel of patients while planning the daily schedule. The schedule column can be customized so clinicians can opt in and out of the information. Second, a red-colored risk warning sign is assigned exclusively to patients identified as being at risk for PPD, based on the retrospective analysis of the study sites’ data as described above.(21) This analysis established a threshold where a prediction probability above 30% indicates a significant risk of developing PPD. At this threshold, the sensitivity and positive predictive value (PPV) are 0.94 and 0.70 in the 2019 validation data, 0.94 and 0.76 in the 2020 validation data, and 0.94 and 0.60 in the INSIGHT validation data. We deemed this performance satisfactory because sensitivity and PPV are crucial for our workflow, which refers those who receive a positive prediction to additional resources. Third, clinicians interested in learning more about the risk prediction, such as the variables driving the risk score, can hover over the displayed risk information. In this study, the risk factors are derived using SHapley Additive exPlanations (SHAP)(22) to identify and explain the factors that most influence the model’s prediction for each patient.
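As an illustration of how the top contributing factors shown on hover could be derived, the sketch below computes per-patient SHAP values for a linear model and returns the three largest contributions; the synthetic data and feature names are stand-ins, not the deployed model’s variables.

```python
# Sketch of deriving top risk factors with SHAP for a linear model (synthetic data).
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
model = LogisticRegression(penalty="l2", max_iter=1000).fit(X, y)

explainer = shap.LinearExplainer(model, X)   # X serves as the background dataset
shap_values = explainer.shap_values(X)

def top_factors(patient_index: int, k: int = 3) -> list:
    """Return the k features contributing most to this patient's prediction."""
    contributions = shap_values[patient_index]
    order = np.argsort(np.abs(contributions))[::-1][:k]
    return [(feature_names[i], float(contributions[i])) for i in order]

print(top_factors(0))
```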
Based on each risk factor, the clinician lead and the study team summarized subsequent anticipatory actions, including referral to social work, nutrition, education, and behavioral health, in consultation with Psychiatry. We referred to the ACOG clinical practice guideline and existing literature on PPD and pregnancy health, and consulted the Department of Psychiatry on available resources.(7) Recognizing ACOG’s recommendation for educational support alongside PPD risk screening, we compiled pertinent educational content from Healthwise®, a provider of medically reviewed educational resources. This approach enables clinicians to offer targeted educational materials to patients, facilitating an informed start to PPD prevention strategies. Lastly, no hard-stop or pop-up alert is added to the CDS, to prevent alert fatigue. Clinicians make the ultimate clinical judgment after receiving the CDS output, including taking no action toward PPD prevention if the CDS is deemed irrelevant.
Results
The Epic EHR interface is shown in Figure 2. The risk information, “Postpartum Depression Risk,” is communicated to clinicians through the red warning sign located in their schedule. Risk scores are highlighted on clinician schedules as high or low risk. Risk prediction information, including the factors driving the prediction and anticipatory interventions, is available when clinicians click the red warning sign. In addition, more details about the prediction algorithm are listed on the study site’s intranet for transparency. The interface can suggest subsequent interventions such as EPDS screening and referral to social work, nutrition, education, and behavioral health. Following the cadence of prenatal visits, the CDS is scheduled to refresh weekly.
Figure 2.
EHR interface
With the deployment, Figure 3 illustrates the care pathway that is planned to be triggered by the data-driven CDS. The proposed workflow uses the predictive model to classify patients into a risk-tiered care pathway for PPD screening and prevention. Those classified as low risk will continue to receive educational materials through patient portals and other digital means, per the ACOG recommendation. Those classified as high risk will receive an EPDS screening to determine current severity. Those with an EPDS score of 10 or above will be referred to Psychiatry if an existing care relationship exists or, if not, to social work to establish care with Psychiatry. Those with an EPDS score under 10 are referred to social work for follow-up and to establish services such as diet and lifestyle counseling. The pathway will be continuously evaluated through interviews with end-users and assessment of workflow feasibility based on existing care resources.
Figure 3.
Proposed clinical workflow incorporating the CDS.
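For readers interested in the decision logic, the sketch below expresses the tiered pathway in Figure 3 as simple triage rules, using the 30% risk threshold and the EPDS cutoff of 10 described above; the action labels are descriptive placeholders rather than EHR order names.

```python
# Sketch of the proposed risk-tiered pathway as triage logic (labels are illustrative).
def ppd_pathway(risk_score: float, epds_score: int | None = None,
                has_psychiatry_relationship: bool = False) -> str:
    """Map the model output and EPDS result to the next step in the proposed pathway."""
    if risk_score <= 0.30:                       # low risk: education via digital channels
        return "Continue educational materials per ACOG recommendation"
    if epds_score is None:                       # high risk: administer EPDS first
        return "Administer EPDS screening"
    if epds_score >= 10:
        return ("Refer to Psychiatry" if has_psychiatry_relationship
                else "Refer to social work to establish Psychiatry care")
    return "Refer to social work for follow-up (e.g., diet and lifestyle counseling)"
```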
Discussion
This paper describes the deployment process of a predictive model for PPD using the Microsoft Azure platform. Amidst the intense interest in deploying AI/machine learning technologies within the healthcare sector, this study centers on delineating the technical prerequisites essential for a secure and effective implementation. While individual teams bring respective expertise in cloud infrastructure, DevOps, web development, clinical informatics, data science, and medicine, we find that a successful implementation of AI necessitates robust interdisciplinary communication to coordinate the multifaceted components. The infrastructure and pipeline setup are reusable; thus, the knowledge gained will enable multiple future implementation projects in other clinical areas. The risk prediction model in this paper, an L2-regularized logistic regression, is a relatively simple model; thus, it does not require significant computational resources or cost. Clinically, we hypothesize that the CDS will simplify the process of PPD risk detection for healthcare professionals, resulting in more timely and proactive interventions. In the short term, we expect an increase in the number of PPD screenings, subsequently leading to timely diagnoses and treatments. Furthermore, we anticipate that the increased screening frequency, coupled with interventions such as social work referrals, education, and consultations, will play a pivotal role in preventing PPD. If successful, we anticipate the long-term clinical impact to include lower PPD incidence in our patient population, leading to better pregnancy and infant outcomes. Compared to our approach, which focuses on prevention, the current workflow is specifically designed to identify symptoms of depression that are already present.(12) Our approach is also automated rather than requiring self-reported metrics, although self-reported metrics may contain more nuanced and accurate health information. A more detailed comparison between an AI-based approach and the existing workflow is needed as the deployment progresses.
We identified several challenges that may commonly face AI implementation studies. A principal difficulty was the necessity of acquiring a comprehensive understanding of cloud computing infrastructures, such as Microsoft Azure, which was utilized in our study. Although other projects may opt for different platforms, such as Amazon Web Services, Google Cloud Platform, and Epic Nebula, the core challenge remains consistent across settings. Effective collaboration and decision-making within the team, underscored by principles of team science, emerged as critical for navigating these technological complexities. A pervasive challenge encountered in implementation projects is the need to monitor for deprecations and vulnerabilities within programming language libraries. In the context of our study, which employed the Python and JavaScript programming languages, we observed a considerable frequency of library updates over the 1.5-year implementation period. These updates often required modifications to our codebase to accommodate new library versions or to mitigate newly identified security vulnerabilities. Such adjustments required manual intervention, underscoring the need for a proactive and systematic approach to maintaining code integrity and security in the evolving landscape of software dependencies.
From the perspective of data science, the project demanded an adaptation beyond the conventional focus on AI/machine learning model development and validation. The shift toward MLOps required data scientists to broaden their skill set to include knowledge of APIs, containerization practices, and a foundational understanding of cloud infrastructure and DevOps methodologies. This diversification of expertise is essential for the successful translation of models from the development stage to clinical application, thereby realizing the potential of AI to contribute meaningfully to patient care within healthcare settings. In particular, the use of FHIR requires a multidisciplinary effort among data scientists, clinicians, and clinical informaticians. This is critical when aligning the inputs defined by an AI model with the corresponding FHIR API resources, a process that demands meticulous attention to ensure the data obtained through FHIR align with the model’s original specifications as closely as possible. We found the availability of the Epic FHIR testing sandbox and the institutional testing environment to be essential in ensuring the accuracy of model input. In addition, collaboration with clinical informaticians is vital, as we needed multiple rounds of testing to ensure proper ingestion of the AI model output and to design the CDS interface.
AI implementation requires an advanced level of technical understanding. As AI meets patient care, this presents a challenge because it is imperative for all stakeholders, including clinicians, healthcare administrators, and others, to have a clear understanding of AI and how it impacts care. We found that discussions about what data are used, clear interpretation of model output, and objective data demonstrating model fairness were crucial for the technical team to convey to clinical stakeholders. At the same time, the implementation team needed a clear understanding of the clinical workflow of which AI will be a component in order to identify the optimal location for the initial launch. In addition, it is critical that the clinical settings can welcome AI and the resources that it demands. In our study, a main barrier following PPD risk identification is the lack of subsequent interventions. Thus, for our project, the locations that will likely yield the most benefit for patients and the least burden for clinical staff are obstetric clinics with resources for social workers and mental health services.
While out of scope for this paper, ongoing work includes investigating end-user perspectives on the implemented CDS and clinical impact studies. Health IT is known to promote coordinated care, which is needed in the case of PPD.(23) However, introducing new algorithms into healthcare necessitates thoughtful consideration of clinician workload burden and the underlying care models. Clinician burden and burnout are pressing challenges in modern patient care, supported by extensive research.(24-26) Today, clinicians are grappling with an escalating load of clerical tasks spanning billing, quality measures, and compliance metrics. This mounting administrative workload, exacerbated by suboptimal EHR design and information overload, heightens their cognitive workload and makes it challenging to efficiently access and prioritize essential clinical data.(27, 28) While EHR systems have often been pinpointed as culprits in clinician burden, surveys reveal that care models, which intricately shape workload distribution and team structures, wield a substantial influence on mitigating the adverse effects of EHR usage.(24-26) Studies on social preferences further shed light on the allocation of resources. In this PPD use case and beyond, AI is often used for detecting disease risk. Following the detection of such risks, AI systems help in allocating resources to address these risks effectively. Therefore, while AI itself does not directly allocate resources, it informs and supports the decision-making process regarding resource allocation based on the detected risks. Thus, careful consideration of clinician preferences is needed,(29) and it is imperative to recognize the dynamic nature of healthcare delivery, of which AI will become a component.
Future studies that build on this work and other AI implementation studies may explore optimal care models into which AI can be incorporated. For example, there is ample precedent for ensuring successful adoption of technology in healthcare through financial incentives, notably the Meaningful Use program (Medicare and Medicaid EHR Incentive Programs) established by the Centers for Medicare & Medicaid Services (CMS) in the United States to encourage healthcare providers to adopt and proficiently utilize EHRs as a means to elevate the quality of patient care.(30, 31) While evidence from prior evaluation studies presents mixed results on the use of financial incentives,(32) investigating whether reimbursement will affect clinicians’ adoption of AI may lead to a better understanding of end-user uptake. Similarly, common barriers to technology adoption that hinder intention to use are known to relate to effort expectancy, such as time constraints and administrative burdens.(33) Future studies may examine interventions to remove barriers to implementing technology tools, such as those in the Prescription Drug Monitoring Programs (PDMPs),(34, 35) to help reduce the additional workload or challenges associated with integrating AI into the workflow. Lastly, the prediction and decision support capabilities of AI come with patient safety issues concerning patient injury and medical liability.(36, 37) This liability concern is common across forms of CDS that expose clinicians to information on risk.(38) A survey of OBGYN clinicians conducted in our previous study revealed concerns about the lack of subsequent actions related to PPD risk, which is often beyond regular OB care.(39) Thus, future studies should also investigate whether ensuring sufficient subsequent interventions will reduce clinicians’ liability concerns and support their adoption of AI.
Conclusion
This paper describes the implementation process of a predictive model aimed at addressing PPD risk. We highlight the collective expertise of interdisciplinary teams, integrating technical and clinical insights to establish a robust framework for data extraction and ingestion, as well as for designing interfaces that are compatible with the clinical workflow and mindful of alert fatigue. The implementation process also benefited from a thorough review by clinical stakeholders of the model’s fairness, which resulted in a refined model that balances predictive performance and equity. A pivotal aspect of our approach was the adoption of Microsoft Azure as the underlying infrastructure, which was key in ensuring system scalability, implementing advanced security protocols, and facilitating efficient resource management. As the cloud-based system continues to evolve, it may serve as a scalable blueprint for future AI integrations within EHR systems, potentially broadening its applicability across a variety of health systems and EHR vendors. The successful deployment of this model underscores the potential for AI to play a significant role in the clinical management of PPD, with future investigations to explore its integration into clinical workflows and its impact on PPD prevention strategies.
References
1. Curtin SC, Abma JC, Ventura SJ, Henshaw SK. Pregnancy rates for U.S. women continue to drop. NCHS Data Brief. 2013;136:1–8.
2. Werner E, Miller M, Osborne LM, Kuzava S, Monk C. Preventing postpartum depression: review and recommendations. Arch Womens Ment Health. 2015;18(1):41–60. doi: 10.1007/s00737-014-0475-y.
3. Cox EQ, Sowa NA, Meltzer-Brody SE, Gaynes BN. The Perinatal Depression Treatment Cascade: Baby Steps Toward Improving Outcomes. J Clin Psychiatry. 2016;77(9):1189–200. doi: 10.4088/JCP.15r10174.
4. Meltzer-Brody S, Howard LM, Bergink V, Vigod S, Jones I, Munk-Olsen T, et al. Postpartum psychiatric disorders. Nat Rev Dis Primers. 2018;4:18022. doi: 10.1038/nrdp.2018.22.
5. Stewart DE, Vigod S. Postpartum Depression. N Engl J Med. 2016;375(22):2177–86. doi: 10.1056/NEJMcp1607649.
6. Biaggi A, Conroy S, Pawlby S, Pariante CM. Identifying the women at risk of antenatal anxiety and depression: A systematic review. J Affect Disord. 2016;191:62–77. doi: 10.1016/j.jad.2015.11.014.
7. ACOG Committee Opinion No. 757: Screening for Perinatal Depression. Obstet Gynecol. 2018;132(5):e208–e12. doi: 10.1097/AOG.0000000000002927.
8. Wisner KL, Sit DK, McShea MC, Rizzo DM, Zoretich RA, Hughes CL, et al. Onset timing, thoughts of self-harm, and diagnoses in postpartum women with screen-positive depression findings. JAMA Psychiatry. 2013;70(5):490–8. doi: 10.1001/jamapsychiatry.2013.87.
9. O’Connor E, Senger CA, Henninger ML, Coppola E, Gaynes BN. Interventions to Prevent Perinatal Depression: Evidence Report and Systematic Review for the US Preventive Services Task Force. JAMA. 2019;321(6):588–601. doi: 10.1001/jama.2018.20865.
10. Huang R, Yan C, Tian Y, Lei B, Yang D, Liu D, et al. Effectiveness of peer support intervention on perinatal depression: A systematic review and meta-analysis. J Affect Disord. 2020;276:788–96. doi: 10.1016/j.jad.2020.06.048.
11. Ko JY, Rockhill KM, Tong VT, Morrow B, Farr SL. Trends in postpartum depressive symptoms—27 states, 2004, 2008, and 2012. Morbidity and Mortality Weekly Report. 2017;66(6):153. doi: 10.15585/mmwr.mm6606a1.
12. Dennis CL, Creedy D. Psychosocial and psychological interventions for preventing postpartum depression. Cochrane Database Syst Rev. 2004;(4):CD001134. doi: 10.1002/14651858.CD001134.pub2.
13. Leis JA, Mendelson T, Tandon SD, Perry DF. A systematic review of home-based interventions to prevent and treat postpartum depression. Arch Womens Ment Health. 2009;12(1):3–13. doi: 10.1007/s00737-008-0039-0.
14. Dennis CL, Hodnett E, Kenton L, Weston J, Zupancic J, Stewart DE, et al. Effect of peer support on prevention of postnatal depression among high risk women: multisite randomised controlled trial. BMJ. 2009;338:a3064. doi: 10.1136/bmj.a3064.
15. Shorey S, Chee CYI, Ng ED, Lau Y, Dennis CL, Chan YH. Evaluation of a Technology-Based Peer-Support Intervention Program for Preventing Postnatal Depression (Part 1): Randomized Controlled Trial. J Med Internet Res. 2019;21(8):e12410. doi: 10.2196/12410.
16. Sangsawang B, Wacharasin C, Sangsawang N. Interventions for the prevention of postpartum depression in adolescent mothers: a systematic review. Arch Womens Ment Health. 2019;22(2):215–28. doi: 10.1007/s00737-018-0901-7.
17. Shorey S, Ng ED. Evaluation of a Technology-Based Peer-Support Intervention Program for Preventing Postnatal Depression (Part 2): Qualitative Study. J Med Internet Res. 2019;21(8):e12915. doi: 10.2196/12915.
18. Cohen LS, Altshuler LL, Harlow BL, Nonacs R, Newport DJ, Viguera AC, et al. Relapse of major depression during pregnancy in women who maintain or discontinue antidepressant treatment. JAMA. 2006;295(5):499–507. doi: 10.1001/jama.295.5.499.
19. Zhang Y, Wang S, Hermann A, Joly R, Pathak J. Development and validation of a machine learning algorithm for predicting the risk of postpartum depression among pregnant women. J Affect Disord. 2021;279:1–8. doi: 10.1016/j.jad.2020.09.113.
20. Kaushal R, Hripcsak G, Ascheim DD, Bloom T, Campion TR Jr, Caplan AL, et al. Changing the research landscape: the New York City Clinical Data Research Network. J Am Med Inform Assoc. 2014;21(4):587–90. doi: 10.1136/amiajnl-2014-002764.
21. Liu Y, Joly R, Turchioe MR, Benda N, Alison H, Beecy A, et al. Preparing for the Bedside – Optimizing a Postpartum Depression Risk Prediction Model for Clinical Implementation in a Health System. J Am Med Inform Assoc. 2024;00(0):1–12. doi: 10.1093/jamia/ocae056.
22. Lundberg SM, Lee SI. A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems 30 (NIPS 2017). 2017;30.
23. Balio CP, Apathy NC, Danek RL. Health Information Technology and Accountable Care Organizations: A Systematic Review and Future Directions. EGEMS (Wash DC). 2019;7(1):24. doi: 10.5334/egems.261.
24. McPeek-Hinz E, Boazak M, Sexton JB, Adair KC, West V, Goldstein BA, et al. Clinician Burnout Associated With Sex, Clinician Type, Work Culture, and Use of Electronic Health Records. JAMA Netw Open. 2021;4(4):e215686. doi: 10.1001/jamanetworkopen.2021.5686.
25. Kroth PJ, Morioka-Douglas N, Veres S, Babbott S, Poplau S, Qeadan F, et al. Association of Electronic Health Record Design and Use Factors With Clinician Stress and Burnout. JAMA Netw Open. 2019;2(8):e199609. doi: 10.1001/jamanetworkopen.2019.9609.
26. Budd J. Burnout Related to Electronic Health Record Use in Primary Care. J Prim Care Community Health. 2023;14:21501319231166921. doi: 10.1177/21501319231166921.
27. Zhang Y, Padman R, Levin JE. Paving the COWpath: data-driven design of pediatric order sets. J Am Med Inform Assoc. 2014;21(e2):e304–11. doi: 10.1136/amiajnl-2013-002316.
28. Gartner D, Zhang Y, Padman R. Cognitive workload reduction in hospital information systems: Decision support for order set optimization. Health Care Manag Sci. 2017. doi: 10.1007/s10729-017-9406-6.
29. Li J, Casalino LP, Fisman R, Kariv S, Markovits D. Experimental evidence of physician social preferences. Proc Natl Acad Sci U S A. 2022;119(28):e2112726119. doi: 10.1073/pnas.2112726119.
30. Adler-Milstein J, DesRoches CM, Kralovec P, Foster G, Worzala C, Charles D, et al. Electronic Health Record Adoption In US Hospitals: Progress Continues, But Challenges Persist. Health Aff (Millwood). 2015;34(12):2174–80. doi: 10.1377/hlthaff.2015.0992.
31. Markovitz AA, Ramsay PP, Shortell SM, Ryan AM. Financial Incentives and Physician Practice Participation in Medicare’s Value-Based Reforms. Health Serv Res. 2018;53 Suppl 1:3052–69. doi: 10.1111/1475-6773.12743.
32. Scott A, Sivey P, Ait Ouakrim D, Willenberg L, Naccarella L, Furler J, et al. The effect of financial incentives on the quality of health care provided by primary care physicians. Cochrane Database Syst Rev. 2011;9:CD008451. doi: 10.1002/14651858.CD008451.pub2.
33. Venkatesh V, Thong JY, Xu X. Consumer acceptance and use of information technology: extending the unified theory of acceptance and use of technology. MIS Quarterly. 2012:157–78.
34. Norwood CW, Wright ER. Promoting consistent use of prescription drug monitoring programs (PDMP) in outpatient pharmacies: Removing administrative barriers and increasing awareness of Rx drug abuse. Res Social Adm Pharm. 2016;12(3):509–14. doi: 10.1016/j.sapharm.2015.07.008.
35. Robinson A, Wilson MN, Hayden JA, Rhodes E, Campbell S, MacDougall P, et al. Health Care Provider Utilization of Prescription Monitoring Programs: A Systematic Review and Meta-Analysis. Pain Med. 2021;22(7):1570–82. doi: 10.1093/pm/pnaa412.
36. Maliha G, Gerke S, Cohen IG, Parikh RB. Artificial Intelligence and Liability in Medicine: Balancing Safety and Innovation. Milbank Q. 2021;99(3):629–47. doi: 10.1111/1468-0009.12504.
37. Challen R, Denny J, Pitt M, Gompels L, Edwards T, Tsaneva-Atanasova K. Artificial intelligence, bias and clinical safety. BMJ Qual Saf. 2019;28(3):231–7. doi: 10.1136/bmjqs-2018-008370.
38. Marchant G, Barnes M, Evans JP, LeRoy B, Wolf SM, LawSeq Liability Task Force. From Genetics to Genomics: Facing the Liability Implications in Clinical Care. J Law Med Ethics. 2020;48(1):11–43. doi: 10.1177/1073110520916994.
39. Myers AC, Ahsan F, Joly R, Hermann A, Zhang Y, Laskoff M, et al., editors. Provider perspectives on the clinical utility of using a risk prediction tool for postpartum depression. AMIA. 2020.



