Abstract
Child abuse and neglect are public health issues impacting communities throughout the United States. The broad adoption of electronic health records (EHR) in health care supports the development of machine learning–based models to help identify child abuse and neglect. Employing EHR data for child abuse and neglect detection raises several critical ethical considerations. This article applied a phenomenological approach to discuss and provide recommendations for key ethical issues related to machine learning–based risk model development and evaluation: (1) biases in the data; (2) clinical documentation system design issues; (3) lack of centralized evidence base for child abuse and neglect; (4) lack of “gold standard” in assessment and diagnosis of child abuse and neglect; (5) challenges in evaluation of risk prediction performance; (6) challenges in testing predictive models in practice; and (7) challenges in presentation of machine learning–based prediction to clinicians and patients. We provide recommended solutions to each of the 7 ethical challenges and identify several areas for further policy and research.
Keywords: child abuse and neglect, phenomenological ethics, machine learning–based risk models, pediatric emergency departments, electronic health records
INTRODUCTION
Child abuse and neglect are public health issues that have reached epidemic proportions, with more than 700 000 children abused every year in the United States.1 Child abuse and neglect is defined as any action (physical, emotional, and/or sexual) by a caregiver that results in harm, potential harm, or threat of harm to a child.2 There are limited prognostic tools to assess child abuse and neglect in clinical practice,3 and health professionals frequently do not detect potential cases of abuse or neglect.4 Clinician racial bias has been shown to contribute to overreporting of suspected abuse among communities of color.5 More than 45% of the children with confirmed abuse and neglect reports in the United States in 2019 were Black or Latinx.6
Widespread adoption of electronic health records (EHR) can support the development of clinical decision support systems for identifying child abuse and neglect, thus offering unique insights regarding this phenomenon.7,8 Our team is conducting an exploratory study aimed at developing a machine learning–based risk model to identify potential child abuse and neglect and reduce racial bias in reporting by processing pediatric emergency department (ED) EHR data. Utilizing EHR data for child abuse and neglect detection raises several critical ethical considerations.9 This article discusses key ethical issues related to machine learning–based risk model development and evaluation that emerged during our exploratory study.
We base our analysis on the emerging recommendations for ethical healthcare machine learning development10 and frame our work in phenomenological ethics, which encourages interdisciplinary research11 according to the principles of self-responsibility in the co-constitutive relationship between the new technology and its users.12–14 Some of the key questions we pose: is it possible to create an ethically responsible machine learning–based risk model that will prioritize the well-being of children, their families, and the clinicians working with them? Furthermore, how can we adjust the machine learning–based risk model to account for known racial and other user biases?
To provide insights related to these questions, we used a phenomenological ethical lens to: (1) identify and describe key issues in development of a machine learning–based risk model and (2) provide a set of recommendations to address these ethical challenges.
KEY ISSUES IN MACHINE LEARNING–BASED MODEL DEVELOPMENT WITH ETHICAL IMPLICATIONS
When working with machine learning–based risk models we face 2 main types of ethical concerns: epistemic (do I actually know what I am doing?) and normative (what normative direction is the algorithm taking?).15 Both epistemic and normative concerns were used to identify the 7 key issues described below under 2 general areas.
Area 1—Practicality of machine learning–based risk models: limitations in the available data
Key issue (1)—Biases in the data: Different types of biases, based on a caregiver’s profession, address, educational level, and socioeconomic status, might affect the likelihood of suspicion and reporting of child abuse and neglect.16 Specifically, there is an emerging body of evidence that racial biases contribute to the reporting of child abuse and neglect17,18 and to how it is addressed in the healthcare system.19 Because current reporting trends are skewed towards racial minorities, existing data extracted from secondary databases will likely include inherent racial biases.20 For example, using data from the ED EHR might result in an overestimation of child abuse and neglect cases among Black and Latinx families. This overestimation hinders our ability to create unbiased machine learning–based risk models using secondary data.
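As a minimal illustration of how such skew can be surfaced before any modeling, the sketch below (in Python, with hypothetical column names that are not drawn from our study data) audits documented-suspicion rates by group; large between-group gaps may reflect historical reporting bias rather than true differences in risk and warrant review with clinicians and community stakeholders before model training.

```python
# Minimal bias-audit sketch; column names and toy data are hypothetical, not study data.
import pandas as pd

def suspicion_rate_by_group(df: pd.DataFrame,
                            group_col: str = "race_ethnicity",
                            label_col: str = "abuse_suspicion_documented") -> pd.DataFrame:
    """Return per-group counts and documented-suspicion rates."""
    summary = (
        df.groupby(group_col)[label_col]
          .agg(n="count", suspicion_rate="mean")
          .reset_index()
    )
    # Large between-group gaps here may reflect historical reporting bias rather than
    # true differences in maltreatment risk and should be reviewed before training a model.
    return summary

# Example with toy data:
toy = pd.DataFrame({"race_ethnicity": ["A", "A", "B", "B", "B"],
                    "abuse_suspicion_documented": [1, 0, 1, 1, 0]})
print(suspicion_rate_by_group(toy))
```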
Key issue (2)—Clinical documentation system design issues: EHRs were originally designed to support healthcare billing and clinical workflows.21 As a result, contemporary hospital EHRs have diverse documentation templates, structured and unstructured, to document health and social factors related to child abuse and neglect.22,23 For example, clinicians often use unstructured documentation (eg, social work and physician notes) to indicate suspicion of child abuse and neglect.
Most medical documentation of child abuse and neglect is in unstructured narrative clinical notes, thus requiring natural language processing to identify the suspected cases and potential risk factors. There is little standardization in collected data, and there are different documentation policies adopted across healthcare settings. This affects our ability to create generalizable and easily shareable machine learning–based risk models that might work in various health settings.24
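To illustrate the kind of processing this requires, the sketch below shows a minimal rule-based screen over free-text notes. The phrase list is purely illustrative and is not a validated lexicon; a real pipeline would add negation handling, note-section detection, and clinician review of every hit.

```python
# Minimal rule-based screen over free-text notes; the phrase list is illustrative only.
import re

SUSPICION_PATTERNS = [
    r"non[- ]?accidental trauma",
    r"\bNAT\b",
    r"concern(?:ing)? for (?:physical )?abuse",
    r"injur(?:y|ies) inconsistent with (?:the )?history",
]
COMPILED = [re.compile(p, re.IGNORECASE) for p in SUSPICION_PATTERNS]

def screen_note(note_text: str) -> list[str]:
    """Return the suspicion patterns matched in a single clinical note."""
    return [pat.pattern for pat in COMPILED if pat.search(note_text)]

note = "Exam notable for bruising; injuries inconsistent with history per caregiver report."
print(screen_note(note))  # matches the "injuries inconsistent with history" pattern
```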
Key issue (3)—Lack of centralized evidence base for child abuse and neglect: Although child abuse and neglect are widespread in society, little systematic nationwide evidence exists about their risk factors.25 Child Protective Services—state entities charged with addressing child abuse and neglect—are not required to report their data publicly. Our knowledge of risk factors is mostly based on past studies conducted via expert consensus.26,27 This limited evidence constrains our ability to identify a prespecified list of risk factors potentially associated with child abuse and neglect.
Key issue (4)—Lack of “gold standard” in assessment and diagnosis of child abuse and neglect: Currently available national data compiled by the Children’s Defense Fund indicate vast disparities in the rates of reported child maltreatment cases between states; for example, Kentucky’s average rate per 1000 children is 23.5 cases versus Pennsylvania’s average rate of 1.8 cases.28 This more than 10-fold difference in reported rates of child abuse and neglect is likely not a true difference, but rather a reflection of contrasts in definitions, prevailing culture, child welfare resources, and reporting practices. We also anticipate large differences in numbers of reported cases when EHR data from individual or groups of hospitals are extracted. Coupled with the previous issues (key issues 1–3), this lack of concrete outcomes data greatly affects our ability to build “gold standard” datasets that can serve as ground truth for the development of high-quality machine learning–based risk models.
Area 2—Ethics of risk models: machine learning creation challenges
Key issue (5)—Challenges in evaluation of risk prediction performance: Machine learning–based risk models are often evaluated using predictive performance measures that include precision, recall, and other related metrics. Even a risk model with 99% precision will still err in 1% of the cases it flags. In reality, many clinical machine learning–based risk models do not achieve such high predictive performance; for example, in a recent study developing a child abuse and neglect machine learning–based risk model, only 16% of cases identified as at risk for abuse were reported by clinicians to child protective services.23 In the domain of child abuse and neglect, these machine learning–based risk model errors can potentially introduce severe social and emotional consequences for families wrongly suspected of abuse and neglect.29
Further, traditional machine learning–based risk model performance evaluation metrics (eg, precision, recall, and especially F-score) presume that false positives and false negatives are equally undesirable. This assumed equivalence between the two error types has not been established or validated for the domain of child abuse and neglect.
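As a simple illustration, the sketch below contrasts the F1-score, which weights precision and recall symmetrically, with the more general F-beta score, which lets an evaluation team make the trade-off explicit. The labels and beta values are illustrative only and are not recommendations for this domain.

```python
# F1 treats precision and recall symmetrically; F-beta makes the trade-off explicit.
from sklearn.metrics import fbeta_score, precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]   # toy labels, not real data
y_pred = [1, 0, 1, 0, 1, 0, 0, 0, 0, 0]

precision = precision_score(y_true, y_pred)
recall = recall_score(y_true, y_pred)
f1 = fbeta_score(y_true, y_pred, beta=1.0)    # equal weighting of the two error types
f2 = fbeta_score(y_true, y_pred, beta=2.0)    # weights recall (missed cases) more heavily
f05 = fbeta_score(y_true, y_pred, beta=0.5)   # weights precision (wrongful flags) more heavily
print(precision, recall, f1, f2, f05)
```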
Key issue (6)—Challenges in testing predictive models in practice: Conducting a randomized controlled trial is often referred to as the “gold standard” of medical evidence generation. However, conducting a randomized controlled trial of child abuse and neglect machine learning–based risk models may be unethical; in the control group, where the risk prediction model would not be applied, there is a risk of missing potential child abuse and neglect cases that might have been identified by machine learning–based risk models. The consequences of missing a potential case are severe, as are the consequences of wrongly identifying one. This affects our ability to understand the impact of the implemented predictive models.
Key issue (7)—Challenges in presentation of machine learning–based prediction to clinicians and patients: Some machine learning–based models (eg, logistic regression or decision trees) are explainable, as they produce a list of risk factors associated with negative outcomes. Other models (eg, neural networks) produce predictions of risk with limited ability to identify the specific risk factors associated with negative outcomes. Our preliminary results of qualitative interviews with clinicians24 suggest that suspicion of child abuse and neglect is often documented by clinicians in an indirect manner, via specific and sometimes vague language, and can also be inferred from certain test-ordering patterns (eg, orders for certain X-rays). Moreover, machine learning models can base their risk predictions on variables that do not have any intuitive causal relationship to the outcome. As a result, some predictive risk factors might not be easily explainable (or even make clinical sense) or presentable to the clinicians and patients who will need further clarification of the results of the machine learning–based risk models.
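The sketch below illustrates the explainability contrast for the interpretable case: a logistic regression exposes signed weights that clinicians can inspect and question, whereas a neural network generally does not expose comparable factors. The feature names and toy data are hypothetical and are not drawn from our study.

```python
# Sketch of how an interpretable model can surface candidate risk factors for clinician review.
# Feature names and data are hypothetical; they do not come from the study dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression

feature_names = ["long_bone_xray_ordered", "skeletal_survey_ordered",
                 "social_work_consult", "prior_ed_visits_90d"]
X = np.array([[1, 1, 1, 2],
              [0, 0, 0, 0],
              [1, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 0, 3],
              [0, 0, 0, 1]])
y = np.array([1, 0, 1, 0, 1, 0])            # toy labels for illustration only

model = LogisticRegression().fit(X, y)
# Signed weights clinicians can inspect; a neural network would not expose these directly.
for name, coef in sorted(zip(feature_names, model.coef_[0]),
                         key=lambda pair: abs(pair[1]), reverse=True):
    print(f"{name}: {coef:+.2f}")
```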
Beyond ethical issues, there will be cases where clinicians either disregard the risk prediction or proceed contrary to a high-risk or low-risk prediction. In those cases, mandatory maltreatment reporting laws may leave clinicians who overlook a prediction of high risk in legal jeopardy, particularly if a patient comes to harm. This scenario represents a new set of ethico-legal concerns that will need to be addressed when integrating machine learning–based risk predictions into clinical settings.
DISCUSSION
Considering the key issues identified above, stakeholders involved in machine learning–based risk model development, testing, and implementation need to be aware of, and responsible for, this new technology’s impact. Below, we provide a set of recommendations that can assist in developing such models.
Practicality of creating machine learning–based models
1. Racial biases in the data: Ensuring the diversity of data used to train machine learning–based risk models is central to our ability to decrease machine learning–based risk model biases. However, simply providing diverse training data would not eliminate racial biases. We recommend that researchers design machine learning–based risk models in collaboration with individuals representing diverse lived experiences30 and with different groups of stakeholders, including clinicians, patients, and guardians.31 For example, simulation-based co-creation of models can encourage responsible design of the prediction for all the parties involved by creating user stories and personas, modeling assumptions, and discussing simulated effects.32 Second, artificial intelligence researchers should be thoughtful about applying analytical methods adjusted for biased data, for example, over- and under-sampling33 and model penalization.34 Third, racial and ethnic concordance between patients and clinicians is an important consideration in documentation. Past research has shown that racial concordance can improve communication and information exchange.35 Fourth, stigmatizing language in medical documentation is also a potential source of bias,36 and because of interdisciplinary (physicians, nurses, social workers) differences in documentation styles within the EHR,37 researchers should develop tools that provide real-time clinical feedback and recommendations regarding documentation styles and clinical decisions concerning child abuse and neglect. For example, one study flagged unapproved medication abbreviations in clinical notes, which improved the clarity of clinical documentation.38 Similarly, “flagging” racially biased or otherwise biased language can help to address some of the racial biases. Finally, involving patients in entering their demographic data can help ensure an accurate reflection of their personal identities.39
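A minimal sketch of the resampling and penalization adjustments mentioned above follows; it uses only scikit-learn, and the data handling and settings are illustrative rather than a recommended configuration.

```python
# Two common adjustments for skewed labels: simple random over-sampling of the
# minority class, and built-in class weighting. Illustrative only, not study code.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

def oversample_minority(X: np.ndarray, y: np.ndarray, seed: int = 0):
    """Randomly duplicate minority-class rows until the two classes are balanced."""
    X_pos, y_pos = X[y == 1], y[y == 1]
    X_neg, y_neg = X[y == 0], y[y == 0]
    if len(y_pos) > len(y_neg):                       # keep the smaller class as "minority"
        (X_pos, y_pos), (X_neg, y_neg) = (X_neg, y_neg), (X_pos, y_pos)
    X_up, y_up = resample(X_pos, y_pos, replace=True,
                          n_samples=len(y_neg), random_state=seed)
    return np.vstack([X_neg, X_up]), np.concatenate([y_neg, y_up])

# Alternative: penalize errors on the rarer class instead of duplicating rows.
weighted_model = LogisticRegression(class_weight="balanced", penalty="l2")
```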
2. Clinical documentation system design issues: To improve the efficiency of child abuse and neglect assessment, we recommend involving multiple stakeholders, such as caregivers, social workers, and patient advocates, to assist in designing EHR documentation. The involvement of stakeholders can help improve workflows around medical documentation and redesign how and when data are collected in the ED.40 Furthermore, greater standardization of the data elements used for child abuse and neglect risk assessment is necessary. Additionally, we recommend using semi-automated tools, such as natural language processing, to identify child abuse and neglect risk factors. These applications can help reduce some of the barriers associated with documentation issues.
3. Lack of centralized evidence base for child abuse and neglect: More qualitative and quantitative research, beyond expert consensus, is needed to understand the risk factors for child abuse and neglect. This exploration should be conducted carefully to avoid the impact of current biases and of differences in child abuse and neglect definitions.
4. Lack of “gold standard” in assessment and diagnosis of child abuse and neglect: Healthcare systems (including hospitals and primary care) and Child Protective Services across the United States should establish safe and reliable data sharing pathways and develop standardized definitions of child abuse and neglect.41 Insights based on these data are crucial for understanding current child abuse and neglect rates, describing the risk factors, and analyzing the potential racial disparities in reporting. Further, more efforts should aim to create potential linkages between diverse data sources. For example, clinical data from hospitals need to be linked with outpatient social service data and data from Child Protective Services; such linkages will potentially allow the generation of “gold standard” datasets from which high-quality machine learning–based risk models can be developed. These linkages between data sources need to be implemented in a responsible and ethical manner so that they are not used to further propagate biases (eg, racial) or to enable disproportionate targeting of certain populations.
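As one hypothetical illustration of such a linkage, the sketch below joins a hospital extract and a Child Protective Services extract on a salted hash of shared identifiers so that raw identifiers never leave each institution; the column names, salt handling, and linkage rule are assumptions for illustration, not an endorsed protocol.

```python
# Sketch of a privacy-preserving linkage between two extracts; all names, values,
# and the salted-hash scheme are illustrative assumptions, not the article's protocol.
import hashlib
import pandas as pd

SALT = "replace-with-a-secret-shared-only-between-data-partners"

def link_key(name: str, dob: str) -> str:
    """Hash identifying fields so raw identifiers never leave each institution."""
    return hashlib.sha256(f"{SALT}|{name.lower().strip()}|{dob}".encode()).hexdigest()

hospital = pd.DataFrame({"name": ["Jane Doe"], "dob": ["2015-03-02"], "ed_visits": [3]})
cps = pd.DataFrame({"name": ["Jane Doe"], "dob": ["2015-03-02"], "cps_report": [True]})

for df in (hospital, cps):
    df["link_id"] = [link_key(n, d) for n, d in zip(df["name"], df["dob"])]
    df.drop(columns=["name", "dob"], inplace=True)      # share only hashed keys

linked = hospital.merge(cps, on="link_id", how="left")   # review linkage quality before modeling
print(linked)
```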
Evaluation of machine learning–based risk models
5. Challenges in evaluation of risk prediction performance: To reduce risk prediction error, the evaluation of machine learning–based risk models requires rigorous studies that are valid and reproducible across diverse populations and over time.31 Machine learning–based risk model evaluation also requires developing international standards similar to the STROBE guidelines.42 There are multiple definitions, measures, and metrics for incorporating fairness into the creation and evaluation of machine learning–based models. To evaluate fairness in developing a machine learning–based risk model for child abuse and neglect, one can potentially use an approach that compares groups (eg, racial/ethnic or income groups) while taking into account potential underlying differences between groups; such an approach is also referred to as “confusion matrix-based metrics.”43 Additional fairness approaches should also be examined per the definitions provided by Caton and Haas.43 Finally, machine learning–based risk model developers should allow for independent evaluation and error analysis throughout the development of the model. This evaluation should be conducted by collaborators who can provide feedback and are not part of the machine learning–based risk model development team.44
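A minimal sketch of such confusion matrix-based group comparisons follows; the group labels and arrays are illustrative, and in practice the choice of metrics and acceptable gaps should be made with the stakeholders described above.

```python
# Sketch of confusion matrix-based fairness checks: compute false positive and
# false negative rates separately for each group and compare them. Illustrative data only.
import numpy as np

def group_error_rates(y_true, y_pred, groups):
    """Per-group false positive rate (FPR) and false negative rate (FNR)."""
    rates = {}
    for g in np.unique(groups):
        mask = groups == g
        yt, yp = y_true[mask], y_pred[mask]
        fp = np.sum((yp == 1) & (yt == 0))
        fn = np.sum((yp == 0) & (yt == 1))
        tn = np.sum((yp == 0) & (yt == 0))
        tp = np.sum((yp == 1) & (yt == 1))
        rates[g] = {
            "FPR": fp / (fp + tn) if (fp + tn) else float("nan"),
            "FNR": fn / (fn + tp) if (fn + tp) else float("nan"),
        }
    return rates

y_true = np.array([1, 0, 0, 1, 0, 1, 0, 0])
y_pred = np.array([1, 1, 0, 0, 0, 1, 1, 0])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])
print(group_error_rates(y_true, y_pred, groups))  # large gaps warrant review before deployment
```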
6. Challenges in testing predictive models in the real world: When randomized clinical trials are not feasible, an alternative study design, such as an observational or quasi-experimental design, might be applied when testing machine learning–based risk models. In addition, we recommend first validating the model internally, using cross-validation on the development dataset. Afterward, the machine learning–based risk model should be tested in different clinical settings or in different periods from the same setting.45
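The sketch below outlines this two-step validation with illustrative synthetic data: internal cross-validation on an earlier development period, followed by evaluation on a held-out later period standing in for a different setting or time window.

```python
# Internal cross-validation followed by a temporal hold-out check; synthetic data only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                              # synthetic features
y = rng.integers(0, 2, size=200)                           # synthetic labels
visit_year = np.where(np.arange(200) < 150, 2019, 2020)    # stand-in for a temporal split

dev = visit_year == 2019                                   # development data: earlier period
model = LogisticRegression()
cv_auc = cross_val_score(model, X[dev], y[dev], cv=5, scoring="roc_auc")  # internal validation

model.fit(X[dev], y[dev])                                  # refit on all development data
holdout_auc = roc_auc_score(y[~dev], model.predict_proba(X[~dev])[:, 1])  # later-period check
print(cv_auc.mean(), holdout_auc)
```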
7. Challenges in presentation of prediction results to clinicians: We recommend designing “high-stakes” decision risk models (such as a child abuse and neglect machine learning–based risk model) to be interpretable from the outset.46 Designing machine learning–based risk models with high interpretability and transparency is essential for providing clinicians and caregivers a clear understanding of how the machine learning–based risk models work and why decisions are made.47 Furthermore, it is vital that results be presented as risk estimates, not as absolute recommendations, because false-positive results can have long-lasting negative effects on families assessed for child abuse and neglect.
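As a minimal illustration, the sketch below turns a model probability into a hedged, graded message rather than a binary verdict; the wording is an illustrative choice that would need to be refined with clinicians, families, and ethicists.

```python
# Sketch of presenting a graded risk estimate rather than a binary verdict.
# The message wording is illustrative, not a validated clinical display.
def present_risk(probability: float) -> str:
    """Turn a model probability into a hedged message for the clinician."""
    pct = round(probability * 100)
    return (f"Estimated risk: {pct}%. This is a screening estimate, not a diagnosis "
            f"or a reporting recommendation; please combine it with your clinical "
            f"assessment and document your reasoning.")

# Example with any classifier exposing predict_proba (eg, scikit-learn models):
# risk = model.predict_proba(patient_features)[0, 1]
# print(present_risk(risk))
print(present_risk(0.12))
```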
CONCLUSIONS
In agreement with the principles of phenomenological ethics, we encourage responsible use of technology in order to respect the beneficence of the relationship between patients and clinicians, respect the autonomy of users, and achieve social justice for children who are potential victims of neglect and abuse. We believe that clinicians using the new technology still need to be responsible for their final decisions in major decisional steps, such as reporting to child protective services. Even if the machine learning–based risk model makes the work of clinicians faster and more effective, they will need to remain ethically committed to understanding the relationship between users and technologies48–50 and to recognizing where their role remains crucial in maintaining a compassionate and just social environment.
FUNDING
This work was supported by the Data Science Institute Seed Funds Program at Columbia University. AL is the recipient of a postdoctoral scholarship from the Haruv Institute.
AUTHOR CONTRIBUTIONS
AL wrote the draft and facilitated group discussion. SF, AB, KC, NA, DP, SS, and MT reviewed and refined the draft and finalized the article.
CONFLICT OF INTEREST STATEMENT
None declared.
DATA AVAILABILITY
No new data were generated or analyzed in support of this research.
Contributor Information
Aviv Y Landau, Columbia University Data Science Institute, Columbia University School of Nursing, Columbia University, New York, New York, USA.
Susi Ferrarello, Department of Philosophy & Religious Studies, California State University, Hayward, California, USA.
Ashley Blanchard, New York Presbyterian Morgan Stanley Children’s Hospital, Columbia University Irving Medical Center, New York, New York, USA.
Kenrick Cato, Department of Emergency Medicine, Columbia University School of Nursing, Columbia University, New York, New York, USA.
Nia Atkins, Columbia College, New York, New York, USA.
Stephanie Salazar, Columbia School of Social Work, Columbia University, New York, New York, USA.
Desmond U Patton, Columbia School of Social Work, Columbia University, New York, New York, USA.
Maxim Topaz, Columbia University Data Science Institute, Columbia School of Social Work, Columbia University, New York, New York, USA.
REFERENCES
1. National Child Abuse Statistics from NCA. National Children's Alliance. 2019. https://www.nationalchildrensalliance.org/media-room/national-statistics-on-child-abuse/. Accessed November 19, 2020.
2. Lev-Wiesel R, Eisikovits Z, First M, Gottfried R, Mehlhausen D. Prevalence of child maltreatment in Israel: a national epidemiological study. J Child Adolesc Trauma 2018; 11 (2): 141–50.
3. Pandya NK, Baldwin KD, Wolfgruber H, Drummond DS, Hosalkar HS. Humerus fractures in the pediatric population: an algorithm to identify abuse. J Pediatr Orthop B 2010; 19 (6): 535–41.
4. Lev-Wiesel R, First M, Gottfried R, Eisikovits Z. Reluctance versus urge to disclose child maltreatment: the impact of multi-type maltreatment. J Interpers Violence 2019; 34 (18): 3888–914.
5. Najdowski CJ, Bernstein KM. Race, social class, and child abuse: content and strength of medical professionals' stereotypes. Child Abuse Negl 2018; 86: 217–22.
6. Child Maltreatment 2019. 2020. https://www.acf.hhs.gov/cb/research-data-technology/statistics-research/child-maltreatment. Accessed November 19, 2020.
7. Annapragada AV, Donaruma-Kwoh MM, Annapragada AV, Starosolski ZA. A natural language processing and deep learning approach to identify child abuse from pediatric electronic medical records. PLoS One 2021; 16 (2): e0247404.
8. Berger RP, Saladino RA, Fromkin J, Heineman E, Suresh S, McGinn T. Development of an electronic medical record–based child physical abuse alert system. J Am Med Inform Assoc 2018; 25 (2): 142–9.
9. Cato KD, Bockting W, Larson E. Did I tell you that? Ethical issues related to using computational methods to discover non-disclosed patient characteristics. J Empir Res Hum Res Ethics 2016; 11 (3): 214–9.
10. Char DS, Shah NH, Magnus D. Implementing machine learning in health care – addressing ethical challenges. N Engl J Med 2018; 378 (11): 981–3.
11. Arvidson PS. Interdisciplinary research and phenomenology as parallel processes of consciousness. Issues Interdiscip Stud 2016; 34: 30–51.
12. Heidegger M. The Question Concerning Technology: And Other Essays. New York: Garland; 1977.
13. Dreyfus HL. What Computers Still Can't Do: A Critique of Artificial Reason. Cambridge, MA: MIT Press; 2009.
14. Coeckelbergh M. AI Ethics. Cambridge, MA: MIT Press; 2020.
15. Mittelstadt BD, Allo P, Taddeo M, Wachter S, Floridi L. The ethics of algorithms: mapping the debate. Big Data Soc 2016; 3 (2): 205395171667967.
16. Tiyyagura G, Gawel M, Koziel JR, Asnes A, Bechtel K. Barriers and facilitators to detecting child abuse and neglect in general emergency departments. Ann Emerg Med 2015; 66 (5): 447–54.
17. Nygren P, Nelson HD, Klein J. Screening children for family violence: a review of the evidence for the US Preventive Services Task Force. Ann Fam Med 2004; 2 (2): 161–9.
18. Drake B, Jolley JM, Lanier P, Fluke J, Barth RP, Jonson-Reid M. Racial bias in child protection? A comparison of competing explanations using national data. Pediatrics 2011; 127 (3): 471–8.
19. Benjamin R. Assessing risk, automating racism. Science 2019; 366 (6464): 421–2.
20. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 2019; 366 (6464): 447–53.
21. Cowie MR, Blomster JI, Curtis LH, et al. Electronic health records to facilitate clinical research. Clin Res Cardiol 2017; 106 (1): 1–9.
22. McGinn T, Feldstein DA, Barata I, et al. Dissemination of child abuse clinical decision support: moving beyond a single electronic health record. Int J Med Inform 2021; 147: 104349.
23. Rosenthal B, Skrbin J, Fromkin J, et al. Integration of physical abuse clinical decision support at 2 general emergency departments. J Am Med Inform Assoc 2019; 26 (10): 1020–9.
24. Landau AY, Blanchard A, Salazar S, et al. Considerations for development of child abuse and neglect phenotype with prioritizing reduction of racial bias: a qualitative study. In: AMIA Virtual Clinical Informatics Conference; May 18–20, 2021 (Online). https://www.researchgate.net/publication/353340364_Considerations_for_Development_of_Child_Abuse_and_Neglect_Phenotype_with_Prioritizing_Reduction_of_Racial_Bias_A_Qualitative_Study. Accessed July 19, 2021.
25. Petersen AC, Joseph J, Feit MN. New Directions in Child Abuse and Neglect Research. Washington, DC: The National Academies Press; 2014.
26. Laposata ME, Laposata M. Children with signs of abuse. Am J Clin Pathol 2005; 123: S119–24.
27. Jack G. Discourses of child protection and child welfare. Br J Social Work 1997; 27 (5): 659–78.
28. The state of America's children 2020 – child welfare tables. 2020. https://www.childrensdefense.org/policy/resources/soac-2020-child-welfare-tables/. Accessed September 2, 2021.
29. Krawiec C, Gerard S, Iriana S, Berger R, Levi B. What we can learn from failure: an EHR-based child protection alert system. Child Maltreat 2020; 25 (1): 61–9.
30. Patton DU. Social work thinking for UX and AI design. Interactions 2020; 27 (2): 86–9.
31. Wiens J, Saria S, Sendak M, et al. Do no harm: a roadmap for responsible machine learning for health care. Nat Med 2019; 25 (9): 1337–40.
32. Dennerlein S, Kowald D, Pammer-Schindler V, Lex E, Ley T. Simulation-based co-creation of predictive models. In: CEUR Workshop Proceedings. September 3, 2018; Leeds, United Kingdom. http://www.dominikkowald.info/documents/2018ectel_codesign.pdf. Accessed July 16, 2021.
33. Ong M-S, Magrabi F, Coiera E. Automated identification of extreme-risk events in clinical incident reports. J Am Med Inform Assoc 2012; 19 (e1): e110–8.
34. Greenland S, Mansournia MA. Penalization, bias reduction, and default priors in logistic and related categorical and survival regressions. Stat Med 2015; 34 (23): 3133–43.
35. Shen MJ, Peterson EB, Costas-Muñiz R, et al. The effects of race and racial concordance on patient-physician communication: a systematic review of the literature. J Racial Ethn Health Disparities 2018; 5 (1): 117–40.
36. Goddu AP, O'Conor KJ, Lanzkron S, et al. Do words matter? Stigmatizing language and the transmission of bias in the medical record. J Gen Intern Med 2018; 33 (5): 685–91.
37. Watson A, Weaver M, Jacobs S, Lyon ME. Interdisciplinary communication: documentation of advance care planning and end-of-life care in adolescents and young adults with cancer. J Hosp Palliat Nurs 2019; 21 (3): 215–22.
38. Myers JS, Gojraty S, Yang W, Linsky A, Airan-Javia S, Polomano RC. A randomized-controlled trial of computerized alerts to reduce unapproved medication abbreviation use. J Am Med Inform Assoc 2011; 18 (1): 17–23.
39. EHR Demographic Data Standards Could Improve Patient Matching. 2019. https://healthitanalytics.com/news/ehr-demographic-data-standards-could-improve-patient-matching. Accessed July 16, 2021.
40. Warren LR, Harrison M, Arora S, et al. Working with patients and the public to design an electronic health record interface: a qualitative mixed-methods study. BMC Med Inform Decis Mak 2019; 19 (1): 250.
41. CDC. Developing a suspected child abuse and neglect syndrome for state health departments. 2020. https://www.cdc.gov/nssp/partners/child-abuse-neglect.html. Accessed July 16, 2021.
42. Von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Bull World Health Organ 2007; 85 (11): 867–72.
43. Caton S, Haas C. Fairness in machine learning: a survey. 2020. arXiv:2010.04053v1. Accessed November 24, 2021.
44. Park Y, Jackson GP, Foreman MA, Gruen D, Hu J, Das AK. Evaluating artificial intelligence in medicine: phases of clinical research. JAMIA Open 2020; 3 (3): 326–31.
45. Morley J, Morton C, Karpathakis K, Taddeo M, Floridi L. Towards a framework for evaluating the safety, acceptability and efficacy of AI systems for health: an initial synthesis. 2021. arXiv:2104.06910. Accessed July 16, 2021.
46. Rudin C. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell 2019; 1 (5): 206–15.
47. He J, Baxter SL, Xu J, Xu J, Zhou X, Zhang K. The practical implementation of artificial intelligence technologies in medicine. Nat Med 2019; 25 (1): 30–6.
48. Ihde D. Technology and the Lifeworld: From Garden to Earth. Bloomington, IN: Indiana University Press; 1990.
49. Ihde D. Postphenomenology: Essays in the Postmodern Context. Evanston, IL: Northwestern University Press; 1995.
50. Liberati N. Making out with the world and valuing relationships with humans. Paladyn 2020; 11 (1): 140–6.