Abstract
In the future, artificial intelligence (AI) will have the potential to improve outcomes in diabetes care. With the creation of new sensors for physiological monitoring and the introduction of smart insulin pens, novel data relationships based on personal phenotypic and genotypic information will lead to selections of tailored, effective therapies that will transform health care. However, decision-making processes based exclusively on quantitative metrics that ignore qualitative factors could create a quantitative fallacy. Difficult-to-quantify inputs into AI-based therapeutic decision-making processes include empathy, compassion, experience, and unconscious bias. Failure to consider these “softer” variables could lead to important errors. In other words, that which is not quantified about human health and behavior is still part of the calculus for determining therapeutic interventions.
Keywords: artificial intelligence, big data analytics, quantitative fallacy, human behavior, clinical decision-making
Torture the data and it will confess to anything.
—Ronald Coase
Successful Diabetes Treatment Needs Data
Discussion of the use of artificial intelligence (AI) in health care is ubiquitous in the medical and lay press, reflecting the perception that AI has enormous potential to reduce the personal and global burden of many long-term medical conditions. Currently, diabetes appears to be the poster child for the application of AI in health care for a number of reasons.1
Worldwide, the number of adults and children developing diabetes continues to rise in parallel with global access to smartphone technologies.
On a daily basis, personal data from people living with diabetes are continuously created and logged.
Although the main variable of interest is glucose, with the rise in consumer tracking technologies, glucose data are being supplemented with additional information related to nutrition, physical activity, and sleep.
With the increasing availability of additional sensor technologies for physiological monitoring including smart insulin pens, social media, and records of internet searches, the diabetes data pool will continue to grow.2,3 Moreover, other data-generating comorbidities (eg, hypertension and cardiac arrhythmias) plus information from screening tests for complications (eg, retinopathy) are also adding to this “big data” resource.
The anticipated value from this torrent of data is that it can be analyzed and converted into patterns leading to actionable information, that is, a clear opportunity for AI.4 For clinicians and people with diabetes, examples of actionable information include early prediction of severe hypoglycemia, not just in those with hypoglycemia unawareness, and identification of the most opportune time for insulin initiation and optimization in type 2 diabetes. Existing large population data sets have already been used to predict the onset of type 2 diabetes, with prediction performance that appears better than that of classical diabetes risk prediction algorithms.5 The use of AI to analyze big datasets composed of many data streams (not all of which are human sensor data; some may be behavioral or geographic in origin) is already becoming a reality.6 It is important to note that this type of big data is not analyzed as if it were presented on a very large spreadsheet, because such data are often unstructured (eg, pictures, phone messages, video, email, and text messages) and not amenable to capture, storage, and management by commonly used software tools—analyses of big data typically require distributed computation over a cluster of computers.7,8 The process of assembling a highly detailed set of phenotypic and genotypic data to obtain the most appropriate treatment for individuals with a specific combination of traits is the basis of precision medicine.9 There is a growing belief that novel data relationships based on phenotypic and genotypic information will lead to powerful predictions and accurate selections of tailored therapies that will transform health care in a very positive way.
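The distributed computation mentioned above follows the map-reduce pattern: each machine summarizes its own shard of unstructured data, and the partial results are then merged. The following is a minimal Python sketch of that idea; the patient messages, the two-shard setup, and word-level counting are invented here purely for illustration.

```python
from collections import Counter
from functools import reduce

# Two hypothetical "nodes," each holding a shard of unstructured text messages.
shards = [
    ["feeling low, glucose 58", "took insulin before lunch"],
    ["glucose 240 after pizza", "forgot insulin dose", "glucose back to 110"],
]

def map_shard(messages):
    # Map step: runs independently on each node, producing a partial word count.
    return Counter(word for m in messages for word in m.split())

partials = [map_shard(s) for s in shards]
# Reduce step: merge the partial counts into a cluster-wide total.
totals = reduce(lambda a, b: a + b, partials)
```

In a real cluster, the map step would run in parallel on separate machines and the reduce step would combine their outputs over the network; the shape of the computation, however, is the same.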
The diabetes clinic of the future is likely to be unrecognizable from its current format.10 The anticipated promise from the triumvirate of (1) the internet of medical things, (2) big data, and (3) AI analyzed by way of cloud computing is being welcomed as necessary, inevitable, and beneficial.11 However, this paradigm may turn out instead to be a modern-day “quantitative fallacy.”
Quantitative Fallacies
A quantitative fallacy refers to a flawed decision-making process that is based exclusively on quantitative metrics and that ignores qualitative factors. The most well-known example is the eponymous McNamara Fallacy, named after the US Secretary of Defense during the Vietnam War and summarized as “if it cannot be measured, it is not important.”12 The genesis of a quantitative fallacy requires four erroneous steps13 (Table 1).
Table 1.
1. Measure whatever can easily be measured.
2. Disregard that which cannot be easily measured or give it an arbitrary quantitative value.
3. Presume that what cannot be measured easily is not important.
4. Believe that what cannot be easily measured really does not exist.
In health care, most previous examples of this flawed type of decision process have been based on the mistaken belief that all of clinical practice can be quantified.14 In considering big data analytics and AI in diabetes care, difficult-to-quantify inputs into therapeutic decision-making processes include empathy, compassion, understanding, previous experiences, and unconscious bias.15 Failure to consider these so-called “softer” variables could lead to important errors when AI is used to solve clinical problems. In other words, that which is not quantified about human health and behavior is still part of the calculus for determining therapeutic interventions. For example, continuous care by the same doctor over time is associated with greater patient satisfaction, improved health promotion, increased adherence to medication, reduced hospital use, and a reduced risk of premature death.16 The reasons for these beneficial effects of care from the same clinician over time are likely to be multifactorial, but it is also noteworthy that doctors tend to overestimate their effectiveness when consulting with patients they do not know, and underestimate their effectiveness when consulting with patients they know.17 It remains to be determined whether the same correlation applies to AI-delivered care in the future.
Data Sources: Quantitative and Qualitative
Sources of big data for diabetes include (1) structured data from electronic health records, population registries, clinical trials, and biometric data from an increasingly wide array of physiological and geospatial sensors, and (2) unstructured data from medical images, photos, audio and video recordings, social media content, and consumer search data collated with a smartphone. The diversity of these health care data sources can create methodological challenges for data integration. To date, big data analytics, machine learning, and AI are in their infancy with respect to providing software-generated decision support, but over time these sources of therapeutic recommendations are likely to become increasingly embedded in the health care system. As discussed earlier, unwavering adherence to the mantra of an artificial intelligence “solution” for diabetes care based solely on big data analytics (ie, use of software that learns from patterns in the data) has the potential to create a digital diabetes fallacy if there is sole reliance on the measurable. In addition, there are many methodological challenges to creating useful quantitative datasets, including (1) ensuring data quality, especially from electronic health record sources, (2) maintaining data consistency, and (3) standardizing outcomes data from clinical trials. Moreover, the process of clinical decision making is invariably not recorded.18 Therefore, it is important to consider qualitative factors affecting decision-making algorithms, which are, at present, difficult to capture but important for diabetes care (Table 2).
Table 2.
1. Low-quality quantitative data
2. Language
3. Health beliefs due to cultural, racial, or ethnic influences
Low-Quality Quantitative Data
Quantitative biomedical data can be classified according to their quality. Medical decisions based on artificial intelligence depend on the quality of the input data; in other words, poor-quality quantitative data can lead to poor decision making. A recent review of the health care quality literature generated 96 terms used to describe data quality concepts.19 The six most widely recognized dimensions of biomedical data quality are presented in Table 3.20
Table 3.
| Type of dimension | Definition |
|---|---|
| Relevance | Degree to which the information meets the needs of users |
| Accuracy | Degree to which the information correctly describes what it was designed to measure |
| Timeliness | Delay between the time to which the information pertains and when the information becomes available |
| Accessibility | Ease with which the information can be obtained |
| Interpretability | Availability of supplementary information and metadata necessary to interpret the information |
| Coherence | Degree to which a set of information can be combined with other information |
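In automated pipelines, dimensions such as accuracy and timeliness can be screened programmatically before data reach a decision algorithm. The sketch below is a hypothetical example, not a clinical tool: the glucose records, the plausibility range, and the staleness threshold are all invented for illustration.

```python
from datetime import datetime, timedelta

# Illustrative thresholds only (not clinical guidance):
PLAUSIBLE_MGDL = (20, 600)          # accuracy: values outside are likely sensor artifacts
MAX_STALENESS = timedelta(hours=1)  # timeliness: older readings are flagged as stale

def screen(record, now):
    """Return the data-quality dimensions a glucose record fails."""
    flags = []
    if not PLAUSIBLE_MGDL[0] <= record["glucose_mgdl"] <= PLAUSIBLE_MGDL[1]:
        flags.append("accuracy")
    if now - record["timestamp"] > MAX_STALENESS:
        flags.append("timeliness")
    return flags

now = datetime(2024, 1, 1, 12, 0)
records = [
    {"glucose_mgdl": 110,  "timestamp": now - timedelta(minutes=5)},
    {"glucose_mgdl": 1100, "timestamp": now - timedelta(minutes=5)},  # implausible value
    {"glucose_mgdl": 95,   "timestamp": now - timedelta(hours=3)},    # stale reading
]
flags = [screen(r, now) for r in records]
```

Screens of this kind address only the machine-checkable dimensions in Table 3; relevance, interpretability, and coherence still require human judgment about context.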
The potential limiting factors of big data have been summarized into four features known as the four Vs: volume, velocity, variety, and veracity.21 Limitations in these areas can lead to misinterpretation of data sources. For example, the hype created at the onset of the digital revolution suggested that real-world data from individuals based on their online activities including social media could supplant traditional approaches to public health. It was suggested that identification of an influenza epidemic or an adverse drug effect could be determined by counting web searches of related topics. This proved to be incorrect as a standalone method; however, this approach can provide useful supplemental information.22-25 In day-to-day clinical practice, patient-generated data are invariably unstructured and highly context-dependent, and the impact of illness on an individual’s behavior and cognitive processing has been underappreciated.26 Going forward, it will be necessary to find a way to combine quantitative data from traditional health systems with qualitative patient-generated data.
The use of big data analytics to form conclusions also carries risks: mishandling of the data, or insufficient high-quality data to support robust conclusions. Fallacies in the generation of quantitative data arising from research design, sampling and instrumentation, statistical analysis, and interpretation can result in unrecognized knowledge gaps27,28 (Table 4).
Table 4.
• Stratification of individuals into subgroups in error, eg, misclassification of diabetes type.29
• Variable effects of an illness upon data, which can change over time.30
• Failure to consider the impact of the prevailing glucose level on patient-generated physical, psychological, and behavioral responses, eg, making assessments during or following a hypoglycemic event.31
• Exclusion bias. Absence of data from individuals not using social media can skew the interpretation. Potentially, there can be more or less big data from wealthier and younger communities, as well as geographical bias (ie, urban versus rural populations contributing toward big data).32
• Inappropriate conclusions from novel big data sets without clinical interpretation or statistical governance, which could lead to model overfitting and the belief in spurious relationships between data groups.33
• Patient behavior, including the generation of factitious data.34
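The overfitting risk noted in Table 4 can be made concrete with a deliberately pathological toy example: a "model" that memorizes noise fits its training data perfectly and generalizes at chance. The data and the memorizing model below are contrived for illustration and do not represent any real clinical dataset or method.

```python
import random

random.seed(0)
# Labels are random coin flips, so there is nothing real to learn.
train = [(i, random.choice([0, 1])) for i in range(50)]
test  = [(i + 100, random.choice([0, 1])) for i in range(50)]

memory = dict(train)                  # "training": memorize every example verbatim
predict = lambda x: memory.get(x, 0)  # unseen inputs fall back to a default guess

train_acc = sum(predict(x) == y for x, y in train) / len(train)
test_acc  = sum(predict(x) == y for x, y in test) / len(test)
# train_acc is 1.0 by construction; test_acc hovers near chance (0.5)
```

The same failure mode appears, more subtly, whenever a flexible model is fitted to a big dataset without out-of-sample validation or statistical governance: apparent relationships in the training data may be spurious.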
Mishandling of data also relates to information privacy. A successful doctor-patient relationship is based on the medical practitioner’s ability to keep information confidential—trustworthiness. For AI, data are increasingly being deidentified, which works at a population level, but for personalized decision support, other safeguards are necessary to protect privacy.35
Language
For any AI system to work efficiently and effectively, it will need to understand the nuances of the language of health care from the perspective of people with diabetes and not simply the jargon favored by clinicians.36 Potential confusion could arise with homophones (words that sound the same but have different meanings and spellings, such as cabbage and CABG) and homographs (words that are spelled the same but have different meanings). For example, one man’s emergency department (ED) is another’s erectile dysfunction, and a verbal order for K therapy in the emergency department can result in administration of either potassium or vitamin K. Within a single language, there are also dialectal differences—what would an AI system make of the common Scottish vernacular use of “bampots,” “bevvies,” and “bairns,” or the use of a “stookey” for a broken arm? There is already abundant evidence that many patients encounter barriers to understanding health-related information, and that materials and other content created by clinicians often fail in terms of understandability.37,38 Language barriers can also contribute to health disparities. US Latino patients with diabetes who have limited English language skills have been shown to be at increased risk of poor glycemic control; however, this risk is not present when care is delivered by physicians who speak Spanish.39 It is also worth noting that AI development itself has highlighted the underrecognized clinical challenge of patients’ and doctors’ different understandings of what is being said.40 If technology companies are to create useful AI systems, then they will need to access language from a variety of sources. These will include handwritten notes, letters, and emails (ie, medical records), and presumably (and controversially) they will also listen directly to patients talking with their clinicians.41
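The ED homograph above illustrates why clinical language systems need context to resolve abbreviations. The following minimal sketch (not a real clinical NLP system) scores each candidate sense by counting nearby cue words; the cue lists, sentences, and scoring rule are invented for illustration.

```python
# Hypothetical cue words for each sense of the homograph "ED".
CUES = {
    "emergency department": {"triage", "ambulance", "admitted", "department"},
    "erectile dysfunction": {"sildenafil", "libido", "urology"},
}

def expand_ed(sentence):
    """Pick the sense whose cue words overlap most with the sentence."""
    words = set(sentence.lower().replace(".", "").split())
    scores = {sense: len(words & cues) for sense, cues in CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "ambiguous"

a = expand_ed("Patient arrived by ambulance and was admitted via the ED.")
b = expand_ed("ED improved after starting sildenafil.")
c = expand_ed("History of ED.")  # no context: stays ambiguous
```

Real clinical NLP systems use far richer context than a handful of cue words, but the principle is the same: without surrounding language, the abbreviation alone cannot be resolved.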
Race and Ethnicity
To be ultimately successful, AI requires evidence from clinical trials. In the United States, racial/ethnic minority populations are disproportionately affected by diabetes and its associated complications.42 However, despite the discriminatory nature of diabetes being self-evident, minority participation in technological interventions such as artificial pancreas development in type 1 diabetes and trials of new therapeutic agents in type 2 diabetes has been consistently low.43,44 Failure to recruit adequate numbers of minorities in clinical trials results in (1) poor trial validity, (2) poor generalizability of the results, (3) magnification of inequalities, and (4) concern about failure to detect harm in certain populations. Structured interventions tailored to ethnic minority groups by integrating elements of culture, language, religion, and health literacy skills have been shown to produce a positive impact on a range of patient-important outcomes for individuals with diabetes.45 Similarly, a review of 34 randomized trials testing culturally tailored interventions to prevent diabetes in minority populations noted that such interventions were effective in improving risk factors for progression to diabetes among ethnic minority groups.46 There is also evidence that differences in diabetes beliefs (between low- and high-education African American, American Indian, and white older adults) are due to socioeconomic conditions.47 Culturally focused education programs that, in addition, take changing socioeconomic circumstances into consideration are not easily generated by computers using only quantitative data.
Conclusion
Big data and artificial intelligence will be useful tools for treating diabetes in a precision medicine or precision public health paradigm. The analytic tools used to process diverse large datasets, by their nature, use only quantitative data. At this time, there are many flaws with total dependence on quantitative data, based on the frequently inadequate quality of this type of data as well as on the frequent need to supplement a quantitative approach with a qualitative one. Going forward, the conversion of unstructured data into digitally processible data is the domain of cognitive computing, which is likely to add significant value to AI.48 Factors besides objective data also go into clinical decision making, such as sentiment, intuition, and a physician’s experience, which have been referred to as judgment or a “gut feeling.” Cognitive computing is currently ill equipped to duplicate this subjective part of reaching medical conclusions.49 There remains a need for human physicians to treat diabetes and other diseases, providing judgment, compassion, and context, which will not be available from computers for the foreseeable future.
Footnotes
Abbreviations: AI, artificial intelligence; ED, emergency department.
Declaration of Conflicting Interests: The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding: The author(s) received no financial support for the research, authorship, and/or publication of this article.
References
- 1. Contreras I, Vehi J. Artificial intelligence for diabetes management and decision support: literature review. J Med Internet Res. 2018;20:e10775. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 2. Majumder S, Mondal T, Deen MJ. Wearable sensors for remote health monitoring. Sensors (Basel). 2017;17(1):E130. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 3. Klonoff DC, Kerr D. Digital diabetes communication: there’s an app for that. J Diabetes Sci Technol. 2016;10(5):1003-1005. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 4. Bellazzi R, Dagliati A, Sacchi L, Segagni D. Big data technologies: new opportunities for diabetes management. J Diabetes Sci Technol. 2015;9(5):1119-1125. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 5. Razavian N, Blecker S, Schmidt AM, Smith-McLallen A, Nigam S, Sontag D. Population-level prediction of type 2 diabetes from claims data and analysis of risk factors. Big Data. 2015;3(4):277-287. [DOI] [PubMed] [Google Scholar]
- 6. Azmak O, Bayer H, Caplin A, et al. Using big data to understand the human condition: the Kavli HUMAN project. Big Data. 2015;3(3):173-188. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 7. Andreu-Perez J, Poon CC, Merrifield RD, Wong ST, Yang GZ. Big data for health. IEEE J Biomed Health Inform. 2015;19(4):1193-1208. [DOI] [PubMed] [Google Scholar]
- 8. Peek N, Holmes JH, Sun J. Technical challenges for big data in biomedicine and health: data sources, infrastructure, and analytics. Yearb Med Inform. 2014;9:42-47. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 9. Klonoff DC. Precision medicine for managing diabetes. J Diabetes Sci Technol. 2015;9(1):3-7. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 10. Kerr D, Axelrod C, Hoppe C, Klonoff D. Diabetes and technology 2030: a utopian or dystopian future? Diabet Med. 2018;35(4):498-503. [DOI] [PubMed] [Google Scholar]
- 11. Elhosenya M, Abdelaziz A, Salama AS, Riad AM, Muhammad K, Sangaiah AK. A hybrid model of Internet of Things and cloud computing to manage big data in health services applications. Future Gener Comp Sys. 2018;86:1383-1394. [Google Scholar]
- 12. Cukier K, Mayer-Schönberger V. The dictatorship of data. MIT Technology Review. 2013. Available at: https://www.technologyreview.com/s/514591/the-dictatorship-of-data/.
- 13. Yankelovich D. Corporate Priorities: A Continuing Study of the New Demands on Business. Stanford, CT: Yankelovich Inc; 1972. [Google Scholar]
- 14. O’Mahony S. Medicine and the McNamara fallacy. J R Coll Physicians Edinb. 2017;47(3):281-287. [DOI] [PubMed] [Google Scholar]
- 15. Kneafsey R, Brown S, Sein K, Chamley C, Parsons J. A qualitative study of key stakeholders’ perspectives on compassion in healthcare and the development of a framework for compassionate interpersonal relations. J Clin Nurs. 2016;25(1-2):70-79. [DOI] [PubMed] [Google Scholar]
- 16. Pereira Gray DJ, Sidaway-Lee K, White E, Thorne A, Evans PH. Continuity of care with doctors—a matter of life and death? A systematic review of continuity of care and mortality. BMJ Open. 2018;8:e21161. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 17. Pereira Gray D, Sidaway-Lee K, White E, et al. Improving continuity: the clinical challenge. InnovAiT. 2016;9:635-645. [Google Scholar]
- 18. Warnecke E. The art of communication. Aust Fam Physician. 2014;43(3):156-158. [PubMed] [Google Scholar]
- 19. Johnson SG, Speedie S, Simon G, Kumar V, Westra BL. A data quality ontology for the secondary use of EHR data. AMIA Annu Symp Proc. 2015;2015:1937-1946. [PMC free article] [PubMed] [Google Scholar]
- 20. United Nations Economic Commission for Europe. Conference of European Statisticians Recommendations of the 2010 Censuses of Population and Housing. Prepared in cooperation with the Statistical Office of the European Communities (EUROSTAT) New York and Geneva: United Nations Economic Commission for Europe; 2006. [Google Scholar]
- 21. Kruse CS, Goswamy R, Raval Y, Marawi S. Challenges and opportunities of big data in health care: a systematic review. JMIR Med Inform. 2016;4:e38. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 22. Alessa A, Faezipour M. A review of influenza detection and prediction through social networking sites. Theor Biol Med Model. 2018;15(1):2. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 23. Liu X, Liu J, Chen H. Identifying adverse drug events from health social media: a case study on heart disease discussion forums. In: Zheng X, Zeng D, Chen H, Zhang Y, Xing C, Neill DB. eds. International Conference on Smart Health. ICSH 2014: Smart Health Cham, Switzerland: Springer; 2014:25-36. [Google Scholar]
- 24. Salathé M. Digital pharmacovigilance and disease surveillance: combining traditional and big-data systems for better public health. J Infect Dis. 2016;214(suppl 4):S399-S403. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 25. Pierce CE, Bouri K, Pamer C, et al. Evaluation of Facebook and Twitter monitoring to detect safety signals for medical products: an analysis of recent FDA safety alerts. Drug Saf. 2017;40(4):317-331. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 26. Lawson VL, Bundy C, Belcher J, Harvey JN. Changes in coping behavior and the relationship to personality, health threat communication and illness perceptions from the diagnosis of diabetes: a 2-year prospective longitudinal study. Health Psychol Res. 2013;1:e20. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 27. Wang LL, Watts AS, Anderson RA, Little TD. Common fallacies in quantitative research methodology. In: Little TD. ed. The Oxford Handbook of Quantitative Methods in Psychology: Vol. 2: Statistical Analysis. New York, NY: Oxford University Press; 2013. doi: 10.1093/oxfordhb/9780199934898.013.0031 [DOI] [Google Scholar]
- 28. Dolley S. Big data’s role in precision public health. Front Public Health. 2018;6:68. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 29. Tripathi A, Rizvi AA, Knight LM, Jerrell JM. Prevalence and impact of initial misclassification of pediatric type 1 diabetes mellitus. South Med J. 2012;105(10):513-517. [DOI] [PubMed] [Google Scholar]
- 30. Vos RC, Kasteleyn MJ, Heijmans MJ, et al. Disentangling the effect of illness perceptions on health status in people with type 2 diabetes after an acute coronary event. BMC Fam Pract. 2018;19(1):35. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 31. Kerr D, Macdonald IA, Tattersall RB. Patients with type-1 diabetes adapt acutely to sustained mild hypoglycaemia. Diabetic Med. 1991;8(2):123-128. [DOI] [PubMed] [Google Scholar]
- 32. Pal BR. Social media for diabetes health education—inclusive or exclusive? Cur Diabetes Rev. 2014;10(5):284-290. [DOI] [PubMed] [Google Scholar]
- 33. Babyak MA. What you see may not be what you get: a brief, nontechnical introduction to overfitting in regression-type models. Psychosom Med. 2004;66(3):411-421. [DOI] [PubMed] [Google Scholar]
- 34. Horwitz DL. Factitious and artifactual hypoglycemia. Endocrinol Metab Clin North Am. 1989;18(1):203-210. [PubMed] [Google Scholar]
- 35. Flaumenhaft Y, Ben-Assuli O. Personal health records, global policy and regulation review. Health Policy. 2018;122(8):815-826. [DOI] [PubMed] [Google Scholar]
- 36. Reach G. Linguistic barriers in diabetes care. Diabetologia. 2009;52(8):1461-1463. [DOI] [PubMed] [Google Scholar]
- 37. Schenker Y, Karter AJ, Schillinger D, et al. The impact of limited English proficiency and physician language concordance on reports of clinical interactions among patients with diabetes: the DISTANCE study. Patient Educ Couns. 2010;81(2):222-228. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 38. Hannonen R, Komulainen J, Eklund K, Tolvanen A, Riikonen R, Ahonen T. Verbal and academic skills in children with early-onset type 1 diabetes. Dev Med Child Neurol. 2010;52:e143-e147. [DOI] [PubMed] [Google Scholar]
- 39. Fernandez A, Schillinger D, Warton EM, et al. Language barriers, physician-patient language concordance, and glycemic control among insured Latinos with diabetes: the Diabetes Study of Northern California (DISTANCE). J Gen Intern Med. 2011;26(2):170-176. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 40. Schoenick C, Clark P, Tafjord O, Turney P, Etzioni O. Moving beyond the Turing Test with the Allen AI Science Challenge. Commun ACM. 2017;60(9):60-64. Available at: https://cacm.acm.org/magazines/2017/9/220439-moving-beyond-the-turing-test-with-the-allen-ai-science-challenge/fulltext [Google Scholar]
- 41. Reach G. The “Chinese room” argument and patient education. BMJ. 2008;336:335. [Google Scholar]
- 42. Bullard KM, Cowie CC, Lessem SE, et al. Prevalence of diagnosed diabetes in adults by diabetes type—United States, 2016. MMWR Morb Mortal Wkly Rep. 2018;67:359-361. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 43. Choppe C, Kerr D. Minority underrepresentation in cardiovascular outcome trials (CVOT) for type 2 diabetes? Lancet Diabetes Endocrinol. 2017;5(1):13. [DOI] [PubMed] [Google Scholar]
- 44. Huyett L, Dassau E, Pinsker J, Doyle F, Kerr D. Who is (not) in line for the artificial pancreas? Lancet Diabetes Endocrinol. 2016;4:880-881. [DOI] [PubMed] [Google Scholar]
- 45. Zeh P, Sandhu HK, Cannaby AM, Sturt JA. The impact of culturally competent diabetes care interventions for improving diabetes-related outcomes in ethnic minority groups: a systematic review. Diabet Med. 2012;29:1237-1252. [DOI] [PubMed] [Google Scholar]
- 46. Lagisetty PA, Priyadarshini S, Terrell S, et al. Culturally targeted strategies for diabetes prevention in minority population. Diabetes Educ. 2017;43(1):54-77. [DOI] [PMC free article] [PubMed] [Google Scholar]
- 47. Grzywacz JG, Arcury TA, Ip EH, et al. Cultural basis for diabetes-related beliefs among low- and high-education African American, American Indian, and white older adults. Ethn Dis. 2012;22(4):466-472. [PMC free article] [PubMed] [Google Scholar]
- 48. Chen Y, Elenee Argentinis JD, Weber G. IBM Watson: how cognitive computing can be applied to big data challenges in life sciences research. Clin Ther. 2016;38(4):688-701. [DOI] [PubMed] [Google Scholar]
- 49. Trafton A. Doctors rely on more than just data for medical decision making. Available at: http://news.mit.edu/2018/doctors-rely-gut-feelings-decision-making-0720. Accessed August 1, 2018.