AMIA Annual Symposium Proceedings. 2024 Oct 21;2023:2023–2032.

Ethicara for Responsible AI in Healthcare: A System for Bias Detection and AI Risk Management

Maria Kritharidou 1, Georgios Chrysogonidis 1, Tasos Ventouris 1, Vaios Tsarapastsanis 1, Danai Aristeridou 1, Anastasia Karatzia 1, Veena Calambur 2, Ahsan Huda 1, Sabrina Hsueh 1
PMCID: PMC11492113  PMID: 39435256

Abstract

The increasing torrents of health AI innovations hold promise for facilitating the delivery of patient-centered care. Yet the enablement and adoption of AI innovations in the healthcare and life science industries can be challenging given rising concerns about AI risks and the potential harms to health equity. This paper describes Ethicara, a system that enables health AI risk assessment for responsible AI model development. Ethicara works by orchestrating a collection of self-analytics services that detect and mitigate bias and increase model transparency from harmonized data models. Given the current lack of risk controls in the health AI development and deployment process, the self-analytics tools enhanced by Ethicara are expected to provide repeatable and measurable controls to operationalize voluntary risk management frameworks and guidelines (e.g., NIST RMF, FDA GMLP) and regulatory requirements emerging from upcoming AI regulations (e.g., EU AI Act, US Blueprint for an AI Bill of Rights). In addition, Ethicara provides plug-ins through which analytics results are incorporated into healthcare applications. This paper provides an overview of Ethicara’s architecture, pipeline, and technical components, showcases the system’s capability to facilitate responsible AI use, and exemplifies the types of AI risk controls it enables in the healthcare and life science industry.

1. Introduction

Health AI innovations in real-world evidence generation and validation hold promise for facilitating the delivery of patient-centered care1. However, health systems, providers, and patients face challenges when integrating additional insights from AI into clinical workflow and wellness decisions. Multisite studies have demonstrated the varying performance of AI/ML models in real-world settings2, 3. In addition, controversies over racial and gender bias in health AI have sparked an ongoing debate about the ethics and responsibility of applying such systems to patient care in ways that affect outcomes, care quality, and health equity4, 5. Following the 21st Century Cures Act, the FDA released guidelines for Good Machine Learning Practice (GMLP), Software as a Medical Device (SaMD), Real-World Evidence (RWE), and Clinical Decision Support Software. So far, this has led to more than 500 SaMD approvals and the incorporation of RWE in more than 100 regulatory decisions for new drugs and biologics6. However, according to a recent survey of health AI innovation, the adoption of health AI in high-stakes decision-making scenarios is still in its infancy, given the lack of risk controls for enabling the responsible use of AI in healthcare7, 8.

Meanwhile, with increasing staff shortages and clinician burnout, the healthcare industry is going through significant consolidations and transitions, putting AI adoption at the center of business priorities. Despite the emerging evidence on how health AI could help improve patient outcomes, care quality, and health equity, the lack of transparency on how AI insights are produced has impeded their interpretation by clinicians in the workflow. Moreover, there are growing concerns about data and algorithmic bias introduced across the AI lifecycle, from model development and deployment to responsible use. A Gartner report predicted that 85% of AI projects would deliver erroneous outcomes due to bias9. The newly released 2022 AI index report has documented the increase of bias further introduced by generative AI models, showing a 29% increase in elicited toxicity over state-of-the-art as of 201810.

Despite these challenges, the stakeholders in the healthcare ecosystem are becoming increasingly active and engaged. These concerns have been the driving force in understanding how to systematically assess health AI risks in real-world settings. A number of AI regulations and standards have been proposed to formally include bias and model transparency in the risk management framework. For example, the US White House Office of Science and Technology Policy has released the Blueprint for an AI Bill of Rights45. The FDA has released an action plan for AI/ML as a medical device and good machine learning practice. The National Institute of Standards and Technology (NIST) of the U.S. Department of Commerce released the Artificial Intelligence Risk Management Framework (AI RMF 1.0) and bias standard47. These developments warrant evaluating AI bias and maintaining responsible AI use with better transparency and bias mitigation schemes. This paper thus sets out to introduce Ethicara, an enablement tool for bias detection and AI risk management. We show the landscape of the related work, describe our implementation and major technical components, discuss our system architecture, and summarize lessons learned and future work.

2. Related Work

2.1. Types of Potential Health AI Bias

Biases in health AI systems typically arise either from the data or from the algorithm itself. Data bias refers to biases in the data used to train an ML model; these biases persist through algorithm training to the final predictions. Algorithmic bias refers to the biases introduced in algorithms by design choices; it can exist even when the underlying data bias has been mitigated. We focus on the following types of AI biases as studied in the literature11, 12. Representation or sampling bias arises during data collection, when non-representative samples are drawn from a population, or when a non-random sampling approach is introduced. For example, suppose the effectiveness of a drug is determined by a clinical trial that includes predominantly male participants. In that case, the deemed-effective drug can have unintended consequences when prescribed to female patients.

Confounding bias arises when an unmeasured variable correlates with both the dependent and independent variables. Not controlling for confounding variables can induce a false relationship between variables of interest. For example, suppose the effect of a medication on a particular health outcome is being studied without accounting for factors like age and gender. This could lead to an overestimation or underestimation of the drug’s true effect on the outcome.
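
As a minimal sketch of how an uncontrolled confounder can distort an estimate, the Python snippet below compares a crude difference in recovery rates with an age-stratified difference. The counts, stratum names, and helper functions are hypothetical and chosen only for illustration; age is assumed to be measured here so that adjustment is possible.

```python
# Hypothetical counts: (recovered, total) per arm within each age stratum.
# The drug is given mostly to older patients, who recover less often overall.
strata = {
    "younger": {"treated": (9, 10), "control": (32, 40)},
    "older": {"treated": (20, 40), "control": (4, 10)},
}


def crude_difference(strata):
    """Difference in recovery rates, ignoring the stratifying variable."""
    recovered_t = sum(s["treated"][0] for s in strata.values())
    total_t = sum(s["treated"][1] for s in strata.values())
    recovered_c = sum(s["control"][0] for s in strata.values())
    total_c = sum(s["control"][1] for s in strata.values())
    return recovered_t / total_t - recovered_c / total_c


def stratified_difference(strata):
    """Average of within-stratum differences, weighted by stratum size."""
    grand_total = sum(s["treated"][1] + s["control"][1] for s in strata.values())
    adjusted = 0.0
    for s in strata.values():
        size = s["treated"][1] + s["control"][1]
        within = s["treated"][0] / s["treated"][1] - s["control"][0] / s["control"][1]
        adjusted += (size / grand_total) * within
    return adjusted


crude = crude_difference(strata)          # negative: the drug looks harmful
adjusted = stratified_difference(strata)  # positive: the drug helps in each stratum
```

In this toy cohort the crude comparison suggests the drug lowers recovery, while the stratified comparison shows a benefit within each age group; the sign flip comes entirely from the uneven age mix of the two arms.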

Algorithmic bias occurs when bias is introduced due to algorithmic design choices such as optimization functions, regularization, and model selection methods. This can lead to biased algorithmic decisions, even if bias is minimized in the input data. For example, a common design flaw is to use a linear model to describe the relationship between input and output data while the true relationship is non-linear.

2.2. AI Bias Detection Methods

To manage the abovementioned biases, we first need to detect bias that results in discrimination, which refers to the unjust or prejudicial treatment of individuals based on their membership in a group defined by a sensitive attribute, e.g., race, gender, religion, or sexual orientation. In the context of healthcare, this can translate into access to care, e.g., determining whether an elderly patient is eligible for medical claims in Medicare Advantage.

There are two main types of discrimination: direct and indirect. Direct discrimination is intentional and occurs when sensitive attributes (e.g., age, gender, race) are used explicitly in making decisions, often leading to disparate treatment. It is commonly handled through ensuring individual fairness, i.e., similar individuals who differ on sensitive attributes but share common values should be treated similarly18, 19.

On the other hand, indirect discrimination may not stem from explicit discriminatory actions, but can still lead to unequal or inequitable outcomes. Indirect discrimination can be quantified using metrics such as disparate impact, when the dependence on variables correlated with sensitive attributes results in unequal outcomes across groups13, and disparate mistreatment, when misclassification rates differ among sensitive groups14. As indirect discrimination is difficult to observe directly, fairness metrics have been developed to measure disparate impact or mistreatment by measuring group fairness with statistical parity15. Statistical parity requires a similar positive prediction ratio across protected groups and satisfies the independence criterion16, which states that decisions should be independent of any sensitive attribute. One heuristic rule, the 4/5 rule, requires that the positive prediction rate of any group be at least 80% of the rate of the group with the highest rate17.
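
The group fairness measures above can be illustrated with a short Python sketch. The function names, the toy predictions, and the group labels "A" and "B" are hypothetical; the sketch only shows how statistical parity difference, the disparate impact ratio, and the 4/5 rule relate to one another.

```python
def positive_rate(predictions, groups, group):
    """Share of favorable (1) predictions among members of `group`."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)


def statistical_parity_difference(predictions, groups, unprivileged, privileged):
    """Favorable rate of the unprivileged group minus that of the privileged group."""
    return (positive_rate(predictions, groups, unprivileged)
            - positive_rate(predictions, groups, privileged))


def disparate_impact(predictions, groups, unprivileged, privileged):
    """Ratio of favorable rates (unprivileged / privileged)."""
    return (positive_rate(predictions, groups, unprivileged)
            / positive_rate(predictions, groups, privileged))


# Toy data: 1 = favorable prediction (e.g., enrolled in a care program).
predictions = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"]

spd = statistical_parity_difference(predictions, groups, "B", "A")
di = disparate_impact(predictions, groups, "B", "A")

# 4/5 rule: the ratio of favorable rates should be at least 0.8.
passes_four_fifths = di >= 0.8
```

Here group A receives the favorable prediction 80% of the time and group B 20% of the time, so the ratio is 0.25 and the 4/5 rule is violated.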

This paper focuses on two fairness metrics for bias detection. First, Equalized Odds checks if, for any particular outcome class and sensitive attribute, a model predicts that outcome class equally well for all values of that sensitive attribute14, 20. It can surface disparate mistreatment and accommodate various misclassification measures, such as the overall misclassification rate, false positive rate, false negative rate, false omission rate, and false discovery rates14. Second, Odds Ratio quantifies the strength of the relationship between an exposure (i.e., whether a patient is in a protected group) and an outcome (e.g., access to a care program)21, 22.
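
A minimal Python sketch of both metrics follows. The 2x2 counts, the toy labels, and the function names are hypothetical; the confidence interval uses the standard Wald approximation on the log odds ratio, which is one common choice and not necessarily the one used in Ethicara.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio for a 2x2 table with a Wald confidence interval.

    a: protected group with outcome     b: protected group without outcome
    c: reference group with outcome     d: reference group without outcome
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def tpr_fpr(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def equalized_odds_gaps(y_true, y_pred, groups, group_a, group_b):
    """Differences in TPR and FPR between two groups (both zero under Equalized Odds)."""
    def subset(group):
        rows = [(t, p) for t, p, g in zip(y_true, y_pred, groups) if g == group]
        return [t for t, _ in rows], [p for _, p in rows]

    tpr_a, fpr_a = tpr_fpr(*subset(group_a))
    tpr_b, fpr_b = tpr_fpr(*subset(group_b))
    return tpr_a - tpr_b, fpr_a - fpr_b


# Odds of accessing a care program: 20 of 100 protected vs. 40 of 100 reference patients.
odds_ratio, ci_lower, ci_upper = odds_ratio_with_ci(20, 80, 40, 60)

y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
tpr_gap, fpr_gap = equalized_odds_gaps(y_true, y_pred, groups, "A", "B")
```

An odds ratio whose confidence interval lies entirely below 1 suggests that the protected group has lower odds of the outcome; non-zero TPR or FPR gaps indicate disparate mistreatment.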

2.3. AI Bias Mitigation Methods

With the health AI bias detected, the next step is mitigation. Over the last few years, researchers have proposed bias mitigation methods across three stages of the machine learning pipeline: pre-processing, in-processing, and post-processing23. Because the predicted outcome and sensitive attributes are currently all captured in structured data, this subsection focuses on methods that handle tabular data (rather than image and text) and describes a few key algorithms applicable for implementation in each stage.

2.3.1. Pre-processing

Pre-processing bias mitigation methods aim to address bias in the data prior to model training. These methods are recommended for situations where it is acceptable to transform the original dataset. The biggest advantage of pre-processing methods is that they typically minimize the tradeoff between accuracy and fairness37.

  • Matching: This algorithm mitigates confounding bias by pairing treatment and control records based on similar covariate values. As such, it estimates treatment effects while reducing the confounders that might exist in the data, without depending on the selection of a machine learning model24.

  • Disparate impact remover: Whereas matching re-organizes input data without modification, disparate impact remover transforms input data to remove correlations between sensitive and non-sensitive attributes. To preserve the signal of non-sensitive attributes for outcome prediction, researchers propose a partial repair algorithm to move each inverse quantile distribution towards the median distribution based on a selected repair level17.

  • Learning fair representations: This algorithm maintains both group and individual fairness by learning a new representation of the data to map each patient into a probability distribution that obfuscates membership information in the protected group while maintaining the mutual information with the outcome15.

  • Optimized preprocessing: This algorithm aims to maintain group and individual fairness and model utility for outcome prediction. The resulting framework learns a probabilistic transformation function from training data that edits non-sensitive attributes and outcome labels in both the training and testing datasets13.

  • Reweighing: This algorithm assigns different weights to the training samples to remove disparate impact in the training data. It gives higher weights to samples in unprivileged protected groups with favorable outcomes, postulating that group membership and outcome should be statistically independent25.
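
To make the reweighing idea concrete, the sketch below computes, for each (group, outcome) cell, the ratio of the count expected under independence to the observed count. The group names "priv" and "unpriv", the toy labels, and the helper functions are hypothetical, and the snippet is a simplified illustration rather than the implementation used in Ethicara.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight per (group, label) cell: expected count under independence / observed count."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    cell_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * cell_counts[(g, y)])
        for (g, y) in cell_counts
    }


def weighted_favorable_rate(groups, labels, sample_weights, group):
    """Weighted share of favorable (1) labels within one group."""
    rows = [(y, w) for g, y, w in zip(groups, labels, sample_weights) if g == group]
    return sum(w for y, w in rows if y == 1) / sum(w for _, w in rows)


# Toy training labels: 1 = favorable outcome.
groups = ["priv"] * 6 + ["unpriv"] * 4
labels = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
sample_weights = [weights[(g, y)] for g, y in zip(groups, labels)]

rate_priv = weighted_favorable_rate(groups, labels, sample_weights, "priv")
rate_unpriv = weighted_favorable_rate(groups, labels, sample_weights, "unpriv")
```

The single unprivileged record with a favorable outcome receives the largest weight (2.0), and after weighting both groups have the same favorable rate, which is the independence property the method targets.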

2.3.2. In-processing

In-processing bias mitigation methods aim to reduce bias and increase fairness during model training and fall into two major classes: model desensitization and model constraining. The first class seeks to train models with dual goals: increase model utility for outcome prediction and reduce the likelihood of differentiating based on specific sensitive attributes. Examples include the adversarial debiasing algorithm28. The second class performs model selection to satisfy fairness metric constraints during training. Examples include exponentiated gradient reduction methods, which can learn models to satisfy demographic parity, and grid search reduction methods, which train models with the equalized odds moment. The trade-off for in-processing bias mitigation is between model utility and fairness constraint fulfillment26.

  • Rich subgroup fairness: This algorithm learns classifiers focusing on fairness across rich subgroups (defined over the combination of a pre-defined set of sensitive attributes). The goal is to prevent fairness gerrymandering, in which a model appears fair on each group but violates the fairness constraint on one or more structured subgroups defined over the sensitive attributes. Previous work proposes to ensure group fairness and model utility by utilizing cost-sensitive classification oracles to approach the optimal solution27.

  • Adversarial debiasing: The typical setting of algorithms in this category consists of two deep neural networks; the first one, called predictor, is trained to predict the outcome class based on features, and the second, called adversary, is trained to predict the sensitive attribute using the previous network’s output as its input. This architecture aims to maximize the predictor’s accuracy on outcome prediction while simultaneously minimizing the adversary’s ability to identify the sensitive attribute from the predictions. Previous research shows that this architecture helps yield a fairer model regarding ML fairness metrics such as Equality of Odds, since the predictor would carry minimum group discrimination information that an adversary can exploit28.

  • Exponentiated gradient reduction: This line of algorithms learns an ensemble of cost-sensitive, user-specified base estimators trained on the dataset with different sample weights such that the classification error is minimized while simultaneously satisfying the fairness constraint under a true but unknown distribution over the training data. This method casts the task of bias mitigation as a sequential cost-sensitive classification problem that can leverage gradient descent learning; it returns a randomized classifier with the lowest empirical error since it replaces the true classification errors and the true moments with their empirical versions during the training process29.

  • Grid search reduction: This line of algorithms serves as a wrapper for any base estimator, aiming to find the best estimator that minimizes the prediction error empirically while satisfying the user-defined fairness constraint. Practically, in each iteration, the grid search reduction method trains the base estimator on a new, reweighted, and relabeled training data set; as such, users can select the estimator that suits their needs across a wide range of trade-offs between model utility and fairness constraints29.
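
The fairness gerrymandering problem mentioned under rich subgroup fairness can be shown with a small audit sketch. For simplicity it compares positive prediction rates, whereas the method described above targets false positive or false negative rates; the attribute names, values, and helper functions are hypothetical.

```python
from collections import defaultdict


def positive_rates(records, keys):
    """Positive prediction rate for every subgroup defined by the attributes in `keys`."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for record in records:
        subgroup = tuple(record[k] for k in keys)
        totals[subgroup] += 1
        positives[subgroup] += record["pred"]
    return {subgroup: positives[subgroup] / totals[subgroup] for subgroup in totals}


def max_gap(rates):
    """Largest difference in positive rate between any two subgroups."""
    return max(rates.values()) - min(rates.values())


# Toy predictions: each intersectional cell holds two patients with the same prediction.
cells = {("X", "F"): 1, ("X", "M"): 0, ("Y", "F"): 0, ("Y", "M"): 1}
records = [
    {"race": race, "sex": sex, "pred": pred}
    for (race, sex), pred in cells.items()
    for _ in range(2)
]

gap_by_race = max_gap(positive_rates(records, ["race"]))
gap_by_sex = max_gap(positive_rates(records, ["sex"]))
gap_by_intersection = max_gap(positive_rates(records, ["race", "sex"]))
```

The model looks perfectly fair when audited on race alone or sex alone (both gaps are zero), yet the intersectional subgroups receive completely different predictions, which is the situation rich subgroup fairness is designed to prevent.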

2.3.3. Post-processing

Post-processing bias mitigation methods are applied during the model deployment stage to tackle the bias that might occur in predictions. This model-agnostic approach is desirable since re-training a model from scratch is time consuming and consequently costly. Previous reviews have summarized the methods in this stage into three major classes: Outcome Adjustment, Decision Explanation, and Model Adjustment26. First, the Outcome Adjustment method modifies the final predictions to yield a favorable outcome for the protected group regarding fairness constraints based on the proximity of each patient to the decision boundary. Second, Decision Explanation utilizes Explainable Artificial Intelligence (XAI) techniques to explain the final decisions and to detect and mitigate unfairness during the model deployment stage. Lastly, Model Adjustment assumes a two-step method, which trains the model first on the unbiased dataset and fine-tunes it on a new biased dataset.

  • Equalized odds postprocessing: This algorithm modifies the predictions to satisfy the Equalized Odds constraint based on probabilities found by solving a linear program20. Concretely, it equalizes the true positive rates and false positive rates across groups. This guarantees that the same proportion of each group receives the good and bad outcomes. Similarly, the algorithm can satisfy a more relaxed Equal Opportunity constraint, in which the same proportion of each group is guaranteed to receive the “good” outcome.

  • Calibrated equalized odds postprocessing: This algorithm modifies model predictions to minimize disparity across protected and unprotected groups after calibration. Previous research has shown that only a model with perfect accuracy can fulfill Equalized Odds and calibration simultaneously; as a result, a more relaxed notion of Equalized Odds based on the generalized false positive rate or generalized false negative rate is suggested30.

  • Reject option-based classification: This algorithm utilizes the posterior probabilities produced by a multitude of models to perform data labeling that neutralizes the effect of discrimination, assuming that the most discrimination occurs when the model is least certain of the prediction (as determined by its proximity to the decision boundary). Since the goal is to ensure favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups, the algorithm determines a label by selecting the one that would minimize the loss of reversing the prediction-based decision, e.g., access to a care program to a certain protected group31.
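
The core relabeling step of reject option-based classification can be sketched as follows. The threshold and margin are fixed by hand here, whereas a full implementation would typically search over them to satisfy bounds on a fairness metric; the scores, group names, and function name are hypothetical.

```python
def reject_option_classify(scores, groups, unprivileged, threshold=0.5, margin=0.1):
    """Label predictions, overriding those that fall inside the critical region.

    Inside the region |score - threshold| <= margin, unprivileged patients receive
    the favorable label (1) and all other patients the unfavorable label (0).
    Outside the region, the usual threshold rule applies.
    """
    labels = []
    for score, group in zip(scores, groups):
        if abs(score - threshold) <= margin:
            labels.append(1 if group == unprivileged else 0)
        else:
            labels.append(1 if score > threshold else 0)
    return labels


# Toy predicted probabilities of being granted access to a care program.
scores = [0.92, 0.58, 0.45, 0.20, 0.81, 0.55, 0.43, 0.10]
groups = ["priv", "priv", "priv", "priv", "unpriv", "unpriv", "unpriv", "unpriv"]

baseline = [1 if s > 0.5 else 0 for s in scores]
adjusted = reject_option_classify(scores, groups, unprivileged="unpriv")
```

Only the uncertain predictions change: the privileged patient scored 0.58 loses the favorable label and the unprivileged patient scored 0.43 gains it, while confident predictions far from the boundary are left untouched.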

2.4. AI Model Transparency

Besides bias detection and mitigation, model transparency is another prevalent principle in AI Ethics in the literature, aiming to enhance interpretability, explainability, and trust-building in health AI systems32. While Interpretable ML refers to the creation of inherently interpretable models, in this paper we focus more on explainable ML, which focuses on providing post-hoc explanations for black-box models. Models can be considered black-box if they are complex and inscrutable for an individual to understand or are proprietary systems33. There are two notions of explainability – (1) global explainability, which refers to understanding how a model generally behaves over a given population, and (2) local explainability, which refers to understanding how a model will behave for an individual model output35. Local explanation methods, e.g., LIME and SHAP, are the most commonly used and focus on constructing interpretable local approximations34.

LIME, or Local Interpretable Model-Agnostic Explanations, constructs a locally interpretable model to support model explanations. LIME creates perturbed samples from the initial input along with their predictions; it then uses the new samples to train a linear model to uncover the relationship between a patient’s features and the model prediction35. Past studies proposed novel frameworks to ensure consistent and stable explanations, including a Bayesian version of LIME (BayesLIME), which provides local explanations coupled with credible intervals as a measure of uncertainty34. SHAP, or SHapley Additive exPlanations, is a game theory-based framework to interpret predictions and measure the contribution of each input feature to the model’s predicted outcome. The primary objective of the algorithm is to generate a Shapley value for all features. A Shapley value is defined as the average marginal contribution of a feature among all possible combinations. SHAP supports both individual and global explainability by computing the mean absolute contribution of each feature to the final prediction36.
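
The definition of a Shapley value as an average marginal contribution can be illustrated by exact enumeration over feature coalitions. This is a sketch under stated assumptions, not how the SHAP library computes its estimates: features absent from a coalition are replaced by baseline values, the enumeration is feasible only for a handful of features, and `risk_model`, its coefficients, and the feature values are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values for one instance; absent features take baseline values."""
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                total += weight * (value(members | {i}) - value(members))
        contributions.append(total)
    return contributions


def risk_model(x):
    """Hypothetical risk score with an interaction between a lab value and smoking."""
    age, lab, smoker = x
    return 0.02 * age + 0.5 * lab * smoker


instance = [60, 3.0, 1]   # patient to explain
baseline = [40, 2.0, 0]   # reference patient
phi = shapley_values(risk_model, instance, baseline)
prediction_gap = risk_model(instance) - risk_model(baseline)
```

The three contributions sum to the gap between the patient's prediction and the baseline prediction, and the interaction term is split between the lab value and smoking status according to their average marginal effects.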

With the different methods developed to operationalize responsible AI principles for healthcare applications through enablement tools for bias detection, mitigation, and model transparency, the informatics community is now in an advantageous position to create a measurement framework to facilitate the creation of risk controls for enterprise AI risk management. In the following sections of this paper, we describe the development of Ethicara, which is designed to help adopt a risk-based approach to provisioning risk controls as the first step of measuring and managing health AI risks at the organizational level.

3. Ethicara Architecture

The Ethicara architecture leverages highly scalable and secure tools and platforms to create a seamless end-to-end workflow for health data processing, modeling, and visualization for healthcare and life science applications. By utilizing cloud-based data systems, collaborative development tools, powerful dashboarding tools, and a scalable code repository, the proposed architecture enables health data scientists and stakeholders to collaborate effectively, analyze complex data, and make informed decisions based on data and insights. Additionally, the customizable and extensible frameworks for orchestration and automation enable the streamlining of complex workflows, making the architecture a highly effective and efficient data processing and modeling solution.

3.1. Pipeline of Responsible AI for Health

The responsible AI for health framework consisted of 3 major components: bias detection, bias mitigation, and model transparency. Data privacy and risk assessment were part of a broader regulatory review prior to deployment.

The proposed pipeline, depicted in Figure 2, was created to cover the need to combine health AI bias detection, mitigation, and model transparency techniques in one place, facilitating the health data science workflow while providing effective collaboration, productivity, and unlimited experimentation among data scientists/analysts. Several frameworks for bias detection and mitigation have been proposed, with the most popular being aif36037, fairlearn38, aequitas39, fairness40, fairadapt41, fairmodels42 and MatchIt43 written either in Python or R. Aif360, fairlearn and fairadapt are toolsets containing techniques and algorithms for fairness, the aequitas, fairness and fairmodels libraries are mainly focusing on fairness metrics and standardized visualizations to evaluate group fairness, while MatchIt includes matching algorithms to estimate treatment effects in observational studies. However, none of the packages provide a cohesive roadmap with a simplified user interface for collaborative bias detection and mitigation. Ethicara leverages aif36037, aequitas39 and MatchIt43 libraries to facilitate bias detection and mitigation tasks and BayesLIME34, SHAP36 and explainerdashboard44 tools to increase explainability and model transparency.

Figure 2: Pipeline of Responsible AI for Health

To support the full range of health AI models that need fairness and transparency checks for risk controls, Ethicara consists of four major pillars: (1) Data Injection, (2) Data Preparation, (3) Model Training and (4) Model Evaluation. Below we present a detailed description of each, along with an overview of its capabilities.

  1. Data Injection enables users to upload their own data from external files, or extract data from various proprietary and public databases (e.g., EHR, claims, patient-reported outcome, Census for social health determinants).

  2. Data Preparation: To investigate potential sources of health data and algorithmic bias and guide corresponding pre-processing mitigation strategies, Ethicara is provisioned with the following tools and methods.
    • 2.1
      Bias Detection: Using the fairness metrics and detection methods reviewed in the Related Work section, users can identify existing bias in the data and the model.
      Odds ratio plugin/webapp is a bias detection tool for exploratory healthcare data analysis. The user selects the sub-groups of interest, e.g., minority ethno-racial groups or intersectional sub-groups, such as Black women, and the outcome variable of interest (e.g., access to care, early diagnosis of a certain rare disease). The system then outputs the selected fairness metrics, such as odds ratios (including confidence intervals and p-values) and disparate impact, for the predefined sub-groups along with their intersections. Additionally, standardized interactive charts are produced with data visualization tools to elaborate on the results. The tool can be applied to binary and multiclass classification tasks and to multivariate sensitive attributes.
    • 2.2
      Bias Mitigation: Given that mitigation methods across the three stages would be needed for different healthcare applications, users can choose to perform bias mitigation either through the Matching plugin or using the template Jupyter notebooks that contain the pre-processing bias mitigation algorithms.
      Matching plugin adopts the model-free approach to adjust for factors that determine whether there is selection bias or bias related to medical practice. The user first selects the matching technique (including Exact Matching, Coarsened Exact Matching or Nearest Neighbor Propensity Score Matching), the sub-groups of interest, and the target outcome of interest. The system would then generate a sampled dataset for future analysis and modeling and a summary table with matching results including metrics pre- and post-matching such as standardized mean difference and disparate impact that can be used to evaluate the effectiveness and bias of the algorithm. The tool can be used in binary classification tasks with multivariate sensitive attributes.
      Disparate impact remover handles disparate impact through data transformation while preserving rank-ordering within groups in the preprocessing stage. The main hyperparameter is the repair level, which takes values from 0, where no repair is applied, to 1, where a full repair is applied. The user first defines the sensitive attribute; the system then generates a transformed dataset for modeling, along with summary graphs that illustrate the effectiveness of the method with respect to disparate impact and balanced accuracy for different repair levels. The algorithm supports binary classification tasks with binary protected attributes.
      Learning fair representations transforms the input data to guarantee that statistical parity and patient-level individual fairness are achieved while preserving most of the data signal for outcome prediction. The main hyperparameters of the model include Az, Ax, Ay, which control the trade-off between the fairness, the initial data, and the target constraints, respectively, in the objective function. Users first specify the protected sub-groups and main hyperparameters of the algorithm; the system then generates a transformed dataset along with a plot showing the statistical parity difference and balanced accuracy under certain classification thresholds. It is applicable to binary classification tasks with binary or multiple sensitive attributes and their intersection.
      Optimized preprocessing is a probabilistic remapping pre-processing method, in which the training and test datasets are transformed to approximately follow the data distribution in the original cohort, avoid large changes in patient-level features, and remove the detected sub-group discrimination. Important hyperparameters include the e value, which preserves sub-group fairness; the clist value, which ensures statistical closeness between the transformed and the original distribution (X, Y); and a user-provided distortion function to reduce or avoid large deviations between the original and the mapped values. The fairness metric can be applied to the transformed dataset and compared with its value on the original dataset to evaluate the effectiveness of the bias mitigation algorithm. It can support binary classification tasks with binary sensitive attributes and handle multi-class classification tasks using multiple bivariate or multivariate sensitive attributes.
      Reweighing applies weights to different examples in the training dataset to remove any discrimination with respect to protected attributes in the pre-processing stage. The user first specifies the sub-groups of interest; the system then returns a transformed dataset and a graphical representation that illustrates the trade-off between statistical parity difference and balanced accuracy under certain classification thresholds. It supports binary classification tasks with multiple binary sensitive attributes and their intersections.
  3. Model Training: It incorporates additional bias mitigation techniques applicable during model training.
    • 3.1
      Bias Mitigation: Besides the pre-processing methods, Ethicara also provides four in-processing bias mitigation methods using the templates provided in Ethicara’s repository.
      Rich subgroup fairness achieves fairness based on the parity of false positive or false negative rates on rich subgroups during the in-processing stage. The main hyperparameters of the model include C, a maximum L1 norm for the dual variables, and γ, the maximum unfairness allowed between sub-groups. The user first selects the subgroup of interest, e.g., minority ethnic-racial groups, and the fairness metric to optimize for; then, the system generates a plot visualizing the errors against unfairness yielded by the different classifiers on the selected subgroup. This method supports the mitigation of models that are trained to handle binary classification problems and binary sensitive attributes (e.g., a binary variable that records whether a patient belongs to a protected racial group or not) and their combinations.
      Adversarial debiasing adopts the deep learning setting of the two networks -- predictor and the adversary -- aiming to satisfy equality in true positive and false positive rate across sub-groups while preserving classification accuracy. A user specifies the privileged and unprivileged sub-groups; the system then trains the predictor and adversary networks. The user can compare multiple fairness metrics between the model without and with adversarial learning to determine the trade-off. It supports binary classification with binary attributes.
      Exponentiated gradient reduction is a model-agnostic algorithm that iteratively trains a new classifier on reweighted and relabeled training datasets. This process reduces the task of training a machine learning model with fairness constraints to a standardized model training process. The output of this method is a classifier with the best trade-off between classification error and fairness constraint. Considerations include the selection of the base estimator, the accessibility of sensitive attributes during training time (drop_prot_attr), and the eps value, which signifies the allowed violation of the fairness constraint provided by the user. It supports binary classification problems with multiple binary sensitive attributes.
      Grid search reduction is an algorithm that generates several different reweighted and possibly relabeled training datasets and trains a base predictor for each one. This algorithm can be considered as a brute force, or non-iterative version of the Exponentiated gradient reduction algorithm. The user selects the estimator based on the trade-off between model’s performance and fairness metric improvement. The main parameters are the base ML estimator, the fairness constraint to be trained on, and the number of Lagrange multipliers to generate in the grid (grid_size). It supports binary classification and regression problems with binary sensitive attributes.
  4. Model Evaluation: Model Evaluation is the last stage of our data science pipeline. In addition to bias detection and mitigation, it provides a complementary capability: model transparency.
    • 4.1
      Bias Detection: In this phase the user can check existing biases in a trained ML model.
      The Model Bias Evaluation plugin automatically identifies discrepancies in model predictions. This functionality, based on the aequitas library, breaks down model performance differences across segments of the data according to the inputs the analyst provides. The plugin outputs a table with model performance metrics (e.g., sensitivity or recall, specificity or true negative rate, precision) for each audited variable sub-group, along with standardized interactive charts to visualize the results. The tool works for binary classification tasks with binary and multivariate sensitive attributes.
    • 4.2
      Bias Mitigation: This section includes post-processing algorithms available after model training.
      Reject option-based classification uses a base estimator and relabels its predictions close to the decision boundary with the label that minimizes misclassification, assuming that most discrimination occurs when the model is least certain of the prediction. The user first specifies the upper and lower bounds of the fairness metric, the classification thresholds to optimize over, and the ROC margins to be used. The system outputs a dataset with some predictions modified. This supports binary classification problems with binary sensitive attributes and can deal with intersections of sensitive attributes.
      Equalized odds postprocessing changes the predictions probabilistically to optimize equality of odds during the post-processing stage. The user specifies the privileged and unprivileged sub-groups and the dataset to train on; the system returns a dataset with perturbed predicted labels that satisfy the equalized odds constraint. This tool works for binary classification problems with binary sensitive attributes.
      Calibrated equalized odds post-processing achieves fairness based on the Generalized False Positive Rate or Generalized False Negative Rate and on calibration, by editing the model's predictions during the post-processing stage. The user defines the fairness metric and the protected sub-groups; the system generates a plot of the selected fairness metric and balanced accuracy across different classification thresholds. This supports binary classification tasks with binary or multiple sensitive features and their intersection.
    • 4.3
      Model Transparency: Model transparency promotes limiting complexity and in-depth review of possible model outcomes and scenarios prior to deployment through human-in-the-loop ML (HITL-ML).
      BayesLIME plugin provides local explanations accompanied by confidence intervals for any ML model. The user provides the trained model and bootstrapping samples; then, the plugin outputs a summary table with top explanations with their credible intervals for each patient. The tool can handle binary classification problems.
      SHAP plugin is ideal for interpreting predictions at the global level and for certain sub-groups. Users need to provide the trained model and specify the sub-groups of interest, and the tool provides global summary plots and plots of the selected sub-group and their intersections. It works for binary classification and regression problems and is suitable for tree-based models.
      The Model Explainer web app is an interactive web app with multiple tabs, wherein each tab explains a different aspect of the model and its predictions. The main required input is a trained ML model; the app outputs various plots, including SHAP summary plots, interaction plots, and partial dependence plots.
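
The reject option-based classification described under 4.2 can be illustrated with a minimal sketch. It assumes a fixed decision threshold and a fixed margin around it, whereas the tool described above searches over thresholds and margins subject to the user's fairness bounds. The function and variable names are hypothetical and are not part of Ethicara or of any library API.

```python
def reject_option_classify(scores, is_privileged, threshold=0.5, margin=0.1):
    """Relabel predictions that fall inside the critical region.

    scores: predicted probability of the favorable (positive) label.
    is_privileged: True when the instance belongs to the privileged sub-group.
    threshold, margin: fixed here for illustration (an assumption).
    """
    labels = []
    for score, privileged in zip(scores, is_privileged):
        if abs(score - threshold) <= margin:
            # Low-confidence prediction: give the favorable label to the
            # unprivileged sub-group and the unfavorable one to the privileged.
            labels.append(0 if privileged else 1)
        else:
            # Confident prediction: keep the base estimator's decision.
            labels.append(1 if score > threshold else 0)
    return labels


def positive_rate(labels, mask):
    """Share of favorable labels among the instances selected by mask."""
    selected = [label for label, keep in zip(labels, mask) if keep]
    return sum(selected) / len(selected)


# Toy scores from a hypothetical base estimator.
scores = [0.45, 0.55, 0.90, 0.20, 0.58, 0.42, 0.80, 0.10]
is_privileged = [False, True, True, True, False, False, True, False]
is_unprivileged = [not p for p in is_privileged]

baseline = [1 if s > 0.5 else 0 for s in scores]
adjusted = reject_option_classify(scores, is_privileged)

for name, labels in (("baseline", baseline), ("adjusted", adjusted)):
    gap = positive_rate(labels, is_unprivileged) - positive_rate(labels, is_privileged)
    print(name, labels, "statistical parity difference:", gap)
```

On this toy data the fixed margin overshoots parity, which is one reason the margin is normally tuned against a bounded fairness metric rather than set by hand.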

3.2. Plug-in Development: Early Identification of Patients at High Risk

A typical plug-in development scenario is a predictive task for the early identification of patients at high risk of a certain condition. Bias detection and mitigation are especially important for rare diseases, where the patient data in each potential subgroup are often insufficient to develop well-performing models. We chose plug-ins as the go-to approach for our digital solutions because of their flexibility. Plug-ins contain one or more reusable components that extend the functionality of an application and enhance the data science capabilities available to users. Plug-ins are popular for their extensibility, parallel development, and simplicity, and, through sharing, they can facilitate collaboration among users and developers, leading to continuous improvement.

A concrete example of an Ethicara plug-in is given in Figure 3, which illustrates the input data, the user interface, the outputs, and the visualizations of the odds ratio tool for a given cohort. First, the user provides the input data, which include the sensitive attributes related to the sub-groups of interest (e.g., age, gender, ethnic group) and the target outcome (e.g., high-risk condition to be detected, access to care, early diagnosis, potential deterioration, re-admission risk), and initializes the plugin. In the plugin's UI, the user defines the main parameters: the target variable; the required sensitive attributes and their corresponding privileged sub-groups, entered as a key-value mapping; the confidence level; and the intersectionality option, enabled by ticking the intersection checkbox when all possible combinations of sensitive attributes are of interest. Next, the tool returns its main output in tabular format, with summary metrics (such as odds ratios, p-values, confidence intervals, and disparate impact) incorporated in the workflow to facilitate the interpretation of results. Finally, a standardized interactive visualization template is provided to visualize the results and detect potential disparities among sub-populations.
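
To make the summary metrics concrete, the following is a minimal sketch of how an odds ratio, its confidence interval, a p-value, and disparate impact could be computed for one binary sensitive attribute. It is not the plugin's implementation: the Wald (log odds ratio) interval, the two-sided z-test, the choice of the unprivileged sub-group as the numerator, and all names and toy counts are assumptions made for illustration. The intersectionality option is not covered.

```python
import math
from statistics import NormalDist


def odds_ratio_report(records, attribute, privileged_value, target, confidence=0.95):
    """Compare the unprivileged sub-group against the privileged one."""
    # 2x2 table: a/b = unprivileged with/without the outcome,
    #            c/d = privileged with/without the outcome.
    a = b = c = d = 0
    for row in records:
        privileged = row[attribute] == privileged_value
        positive = row[target] == 1
        if not privileged and positive:
            a += 1
        elif not privileged:
            b += 1
        elif positive:
            c += 1
        else:
            d += 1
    if 0 in (a, b, c, d):
        # No continuity correction is applied in this sketch.
        raise ValueError("empty cell in the 2x2 table; the Wald interval is undefined")

    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    std_err = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return {
        "odds_ratio": odds_ratio,
        "ci_low": math.exp(log_or - z * std_err),
        "ci_high": math.exp(log_or + z * std_err),
        "p_value": 2 * (1 - NormalDist().cdf(abs(log_or) / std_err)),
        # Ratio of outcome rates: unprivileged over privileged.
        "disparate_impact": (a / (a + b)) / (c / (c + d)),
    }


# Hypothetical cohort: 100 patients per sub-group.
records = (
    [{"gender": "female", "early_diagnosis": 1}] * 20
    + [{"gender": "female", "early_diagnosis": 0}] * 80
    + [{"gender": "male", "early_diagnosis": 1}] * 40
    + [{"gender": "male", "early_diagnosis": 0}] * 60
)
privileged = {"gender": "male"}  # key-value mapping, as entered in the UI

for attribute, value in privileged.items():
    report = odds_ratio_report(records, attribute, value, "early_diagnosis")
    print(attribute, report)
```

An odds ratio whose confidence interval lies entirely below 1 suggests that the unprivileged sub-group is less likely to receive the outcome in this cohort; it does not by itself establish the cause of the disparity.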

Figure 3.

User Interface/Output of Odds Ratio Plugin

3.3. Web-app Development: Patient Journey Understanding

A typical web-app development scenario is patient journey understanding, which could improve patient experience by integrating the model transparency capability into interactive tools that allow self-analytics for identifying potential disparate impact at the sub-group and individual patient levels. Ethicara provides web apps that give users a seamless experience when constructing their own healthcare applications. These custom applications are hosted on collaborative development platforms and offer advanced visualizations or custom application frontends. Unlike plug-ins, web apps are interactive and update graphs or tables based on user input. Additionally, different functionalities can be organized and presented in separate tabs, making the tool user-friendly and efficient.

An illustrative example is given in Figure 4, which shows the input data, the user interface, and the outputs of the Model Explainer web app. The user provides a trained ML model, a test set, and the number of features for which to produce explanations. The UI consists of tabs for functions such as feature importance, individual predictions, what-if analysis, and feature interactions. Users can interact with each tab by setting the desired parameters, and the visualizations or tables update in real time. The first tab produces feature importance plots based on permutation importance or SHAP values, and the number of features to be visualized can be specified.
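
As a rough illustration of the permutation importance option in the first tab, the sketch below shuffles one feature at a time on a test set and records the average drop in accuracy. It is a simplified stand-in and not the web app's code: the toy model, the data, and every name are hypothetical, and accuracy is used as the score only for simplicity.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the true label."""
    return sum(model(row) == label for row, label in zip(rows, labels)) / len(labels)


def permutation_importance(model, rows, labels, n_features, n_repeats=20, seed=0):
    """Mean drop in accuracy when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [row[:j] + (value,) + row[j + 1:] for row, value in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical trained model: uses feature 0 and ignores feature 1."""
    return 1 if row[0] > 0.5 else 0


# Hypothetical test set of (risk_score, visit_count) pairs.
rows = [
    (0.1, 5.0), (0.9, 3.0), (0.2, 8.0), (0.8, 1.0), (0.3, 4.0),
    (0.7, 9.0), (0.4, 2.0), (0.6, 7.0), (0.05, 6.0), (0.95, 0.0),
]
labels = [toy_model(row) for row in rows]  # the toy model is perfect on this set

importances = permutation_importance(toy_model, rows, labels, n_features=2)
top_k = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)[:1]
print("importances:", importances, "top features:", top_k)
```

The feature the model ignores receives an importance of zero, while shuffling the feature it relies on lowers accuracy; limiting the plot to the top-ranked entries corresponds to the "number of features" setting described above.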

Figure 4.

User Interface/Output of Model Explainer WebApp

The purpose of providing such collaborative application-level support is to evaluate AI/ML models for potential biases and to help identify potential disparities in model performance across patient sub-populations. Developers in the organization can then leverage Ethicara to compare model performance for patients from different age groups, sexes, races, and regions, depending on data availability.
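
A comparison of model performance across sub-populations can be sketched as follows. This is a minimal illustration of a per-sub-group audit of sensitivity, specificity, and precision for a binary classifier; it does not use the aequitas API, and the function name, group labels, and toy predictions are assumptions.

```python
from collections import defaultdict


def subgroup_metrics(y_true, y_pred, groups):
    """Sensitivity, specificity, and precision for each sub-group."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        # "t"/"f" marks a correct or wrong prediction; "p"/"n" the predicted class.
        key = ("t" if pred == truth else "f") + ("p" if pred == 1 else "n")
        counts[group][key] += 1

    def ratio(numerator, denominator):
        # Undefined metrics (empty denominator) are reported as None.
        return numerator / denominator if denominator else None

    return {
        group: {
            "sensitivity": ratio(c["tp"], c["tp"] + c["fn"]),
            "specificity": ratio(c["tn"], c["tn"] + c["fp"]),
            "precision": ratio(c["tp"], c["tp"] + c["fp"]),
        }
        for group, c in counts.items()
    }


# Hypothetical labels and predictions for two age groups.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0, 0, 0]
groups = ["18-40"] * 4 + ["65+"] * 4

report = subgroup_metrics(y_true, y_pred, groups)
for group, metrics in report.items():
    print(group, metrics)
```

In this toy data the model misses half of the true positives in the older group while making no false alarms there, the kind of disparity that a side-by-side table is meant to surface.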

4. Conclusion and Future Work

With a tool developed to support the enablement of responsible AI for healthcare via bias detection, mitigation, and model transparency, we are now in an advantageous position to create controls for health AI risk management. Currently, Ethicara does not handle bias detection and risk management for generative AI. We will report a separate review of AI bias mitigation methods for generative AI models handling unstructured data in another publication. Meanwhile, regulators and standards bodies have actively engaged the community and stakeholders in discussion and have released various guidelines. These frameworks provide guidance on how to integrate risk management activities into the system development life cycle. In the future, we will extend Ethicara to operationalize more risk controls and to adopt industry-standard risk management frameworks to map, measure, and manage health AI risks, which is paramount to initiating and sustaining the responsible use of AI in the healthcare ecosystem. We expect this to be a necessary first step toward operationalizing responsible health AI adoption in practice and addressing the important health equity issues brought by inherent bias and non-transparency. The premise is that, by providing means and an integration framework for risk controls within a data-centric modeling pipeline at the organization level, the enablement mechanism can co-create value by generating more collaborative medical AI applications that consider various biases across their lifecycle, from development to deployment to post-production monitoring.

Figure 1:

Ethicara Architecture

References

  • 1.Shortliffe EH. Artificial Intelligence in Medicine: Weighing the Accomplishments, Hype, and Promise. Yearb Med Inform. (2019);28(1):257–262. doi: 10.1055/s-0039-1677891.
  • 2.Wong A, Otles E, Donnelly JP, et al. External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients. JAMA Intern Med. (2021);181(8):1065–1070. doi: 10.1001/jamainternmed.2021.2626.
  • 3.Reddy S, Rogers W, Makinen V, et al. Evaluation framework to guide implementation of AI systems into healthcare settings. BMJ Health & Care Informatics. (2021);28:e100444. doi: 10.1136/bmjhci-2021-100444.
  • 4.Pierson E, Cutler DM, Leskovec J, Mullainathan S, Obermeyer Z. An algorithmic approach to reducing unexplained pain disparities in underserved populations. Nature Medicine. (2021);27(1):136–140. doi: 10.1038/s41591-020-01192-7.
  • 5.Cerrato P, Halamka J, Pencina M. A proposal for developing a platform that evaluates algorithmic equity and accuracy. BMJ Health Care Inform. (2022);29:e100423. doi: 10.1136/bmjhci-2021-100423.
  • 6.Wu E, Wu K, Daneshjou R, Ouyang D, Ho DE, Zou J. How medical AI devices are evaluated: limitations and recommendations from an analysis of FDA approvals. Nature Medicine. (2021);27(4):582–584. doi: 10.1038/s41591-021-01312-x.
  • 7.Meskó B, Görög M. A short guide for medical professionals in the era of artificial intelligence. Npj Digital Medicine. (2020);3(1). doi: 10.1038/s41746-020-00333-z.
  • 8.Allen B, Agarwal S, Coombs L, Wald C, Dreyer K. 2020 ACR Data Science Institute Artificial Intelligence Survey. Journal of the American College of Radiology. (2021);18(8):1153–1159. doi: 10.1016/j.jacr.2021.04.002.
  • 9.Andrews W, Hare J. Lessons From Early AI Projects. Gartner. (2017).
  • 10.Daniel Z, Nestor M, Erik B, John E, Terah L, et al. The AI Index 2022 Annual Report. AI Index Steering Committee, Stanford Institute for Human-Centered AI, Stanford University, March 2022. (2022).
  • 11.Mehrabi N, Morstatter F, Saxena N, Lerman L, Galstyan A. A survey on bias and fairness in machine learning. arXiv. (2019);1908.096
  • 12.Suresh H, Guttag JV. A Framework for Understanding Unintended Consequences of Machine Learning. Proceedings of Equity and Access in Algorithms, Mechanisms, and Optimization. (2019). ACM.
  • 13.Calmon F, Wei D, Vinzamuri B, Ramamurthy KN, Varshney KR. Optimized pre-processing for discrimination prevention. Proceedings of Advances in Neural Information Processing Systems. (2017). pp. 3992–4001.
  • 14.Zafar M, Valera I, Rodriguez M, Gummadi K. Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment. Proc. Intl. Conference on WWW. (2017). pp. 1171–1180.
  • 15.Zemel R, Wu Y, Swersky K, Pitassi T, Dwork C. Learning fair representations. Proc. Int. Conf. Mach. Learn. (2013). pp. 325–333.
  • 16.Barocas S, Hardt M, Narayanan A. Fairness and Machine Learning. (2019). MIT Press.
  • 17.Feldman M, Friedler S, Moeller J, Scheidegger C, Venkatasubramanian S. Certifying and removing disparate impact. Proc. ACM SIGKDD Int. Conf. Knowl. Disc. Data Min. (2015). pp. 259–268.
  • 18.Castelnovo A, Crupi R, Greco G, Regoli D, Penco IG, Cosentini AC. A clarification of the nuances in the fairness metrics landscape. Scientific Reports. (2022);12(1):1–21. doi: 10.1038/s41598-022-07939-1.
  • 19.Dwork C, Hardt M, Pitassi T, Reingold O, Zemel R. Fairness through awareness. Proceedings of the 3rd Innovations in Theoretical Computer Science Conference. (2012). pp. 214–226. ACM.
  • 20.Hardt M, Price E, Srebro N. Equality of opportunity in supervised learning. Proc. of NIPS. (2016). pp. 3315–3323.
  • 21.Szumilas M. Explaining odds ratios. J Can Acad Child Adolesc Psychiatry. (2010);19(3):227–9.
  • 22.Bland JM, Altman DG. Education and debate-statistics notes: the odds ratio. BMJ. (2000);320:1468–1468. doi: 10.1136/bmj.320.7247.1468.
  • 23.Max H, Zhenpeng C, Jie MZ, Federica S, Mark H. Bias mitigation for machine learning classifiers. CoRR. (2022). abs/2207.07068.
  • 24.Elizabeth AS. Matching methods for causal inference. Journal of the Institute of Mathematical Statistics. (2010);25(1):1. doi: 10.1214/09-STS313.
  • 25.Kamiran F, Calders T. Data preprocessing techniques for classification without discrimination. Knowl Inf Syst. (2012);33:1–33.
  • 26.Feng Q, Du M, Zou N, Hu X. Fair Machine Learning in Healthcare: A Review. arXiv preprint arXiv:2206.14397. (2022).
  • 27.Kearns M, Neel S, Roth A, Wu ZS. Preventing Fairness Gerrymandering: Auditing and Learning for Subgroup Fairness. Proc. International Conference on Machine Learning. pp. 2564–2572.
  • 28.Zhang BH, Lemoine B, Mitchell M. Mitigating Unwanted Biases with Adversarial Learning. Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society (AIES ’18) (2018).
  • 29.Agarwal A, Beygelzimer A, Dudik M, Langford J, Wallach H. A Reductions Approach to Fair Classification. Proc. International Conference on Machine Learning. (2018).
  • 30.Pleiss G, Raghavan M, Wu F, Kleinberg J, Weinberger KQ. On fairness and calibration. Proc. of NIPS. (2017). pp. 5680–5689.
  • 31.Kamiran F, Karim A, Zhang X. Decision Theory for Discrimination-Aware Classification. Proc. IEEE ICDM. (2012).
  • 32.Jobin A, Ienca M, Vayena E. The global landscape of AI ethics guidelines. Nat. Mach. Intell. (2019);1(9):389–393.
  • 33.Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence. (2019);1(5):206–215. doi: 10.1038/s42256-019-0048-x.
  • 34.Slack D, Hilgard A, Singh S, Lakkaraju H. Reliable post hoc explanations: Modeling uncertainty in explainability. Proc. Advances in Neural Information Processing Systems. (2021);34.
  • 35.Ribeiro MT, Singh S, Guestrin C. Why Should I Trust You?: Explaining the Predictions of Any Classifier. Proc. of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. (2016). pp. 1135–1144.
  • 36.Lundberg SM, Lee SI. A Unified Approach to Interpreting Model Predictions. Proc of Adv Neural Inf Process Syst. (2017);30:4765–4774.
  • 37.Bellamy RK, Dey K, Hind M, Hoffman SC, Houde S, Kannan K, Lohia P, Martino J, Mehta S, Mojsilovic A, Nagar S, Ramamurthy KN, Richards J, Saha D, Sattigeri P, Singh M, Varshney KR, Zhang Y. AI Fairness 360: An extensible toolkit for detecting, understanding, and mitigating unwanted algorithmic bias. arXiv preprint arXiv:1810.01943. (2018).
  • 38.Bird S, Dudík M, Edgar R, Horn B, Lutz R, Milan V, Sameki M, Wallach H, Walker K. Fairlearn: A toolkit for assessing and improving fairness in AI. (2020). Technical Report MSR-TR-2020-32, Microsoft.
  • 39.Saleiro P, Kuester B, Stevens A, Anisfeld A, Hinkson L, London J, Ghani R. Aequitas: A bias and fairness audit toolkit. arXiv preprint arXiv:1811.05577. (2018).
  • 40.Kozodoi N, Varga TV. fairness: Algorithmic Fairness Metrics. (2021). CRAN - Package fairness (r-project.org). R package version 1.2.1.
  • 41.Plêcko D, Meinshausen N. Fair data adaptation with quantile preservation. JMIR. (2019);vol. 21:1–44.
  • 42.Wiśniewski J, Biecek P. fairmodels: a Flexible Tool for Bias Detection, Visualization, and Mitigation in Binary Classification Models. arXiv preprint arXiv:2104.00507. (2022). 2021.
  • 43.Ho D, Imai K, King G, Stuart E. Matchit: nonparametric preprocessing for parametric causal inference. J. Stat. Softw. (2011);42(8):1–28.
  • 44.Dijk O. ExplainerDashboard documentation. https://explainerdashboard.readthedocs.io/en/latest/ Accessed: 2023-03-06.
  • 45.OSTP White House Blueprint for an AI Bill of Rights. http://www.whitehouse.gov/ostp/ai-bill-of-rights/ Accessed March 17, 2023.
  • 46.Cameron FK. NIST AI Risk Management Framework plants a flag in the AI debate. (2023). Brookings Institute, Accessed 2023-03-17.
  • 47.Schwartz R, Vassilev A, Greene KK, Perine L, Burt A, Hall P. Towards a Standard for Identifying and Managing Bias in Artificial Intelligence. NIST Special Publication. (2022);127

Articles from AMIA Annual Symposium Proceedings are provided here courtesy of American Medical Informatics Association
