We are living in an era of “big data,” which is characterized by tremendous growth in data production, linkage, and analysis. This growth is shifting the landscape of possible opportunities and harms for public health research and practice, particularly for those affected by mass incarceration. We briefly examine the emerging role and ethical implications of using big data in public health, discuss these issues as they relate to justice-involved persons (JIPs), and suggest initial steps to promote ethical analyses and guideline development in this area.
In the big data era, a host of new data are being generated as people—often unwittingly—enlarge their digital footprint via Internet activity, purchases with credit cards or consumer loyalty cards, smartphones and wearable technologies, and electronic health records and genetic sequencing; more data are also generated as government entities become increasingly efficient and expansive in collecting information. The use of these new data sources by themselves or in combination offers unprecedented potential for public health insight. With varying success, attempts at harnessing these data include using cellphone location data to understand disease transmission, measuring the impact of public health interventions by analyzing Twitter, and predicting disease outbreaks based on search engine results.
Although these methods track patterns across populations rather than focusing on identified individuals, the expansive availability of data could allow a shift toward individual-level intervention. For example, Facebook has deployed artificial intelligence–based algorithms to flag user posts deemed to suggest imminent self-harm.1 When a post is flagged, company staff call the user’s local 911 center to dispatch first responders. In another example, state health departments are funded to use mandatorily reported HIV laboratory results, routinely collected during clinical care, not only to track the HIV epidemic as a whole but also to flag individual patients whose lack of recent laboratory records indicates that they are out of care. With the help of public or fee-based databases, health department staff can then attempt to locate and contact these patients and, if the patients agree, arrange for the resumption of their HIV care.2,3 Although approaches using big data for public health intervention are still rare, they are likely to become increasingly common as data convergence and analytic power grow.
These big data approaches raise ethical questions for public health research and practice: What are the appropriate bounds on data collection for public health? For what health conditions and contexts should public health entities use big data to intervene at the individual level? How will increased data collection and subsequent intervention affect the public’s trust in public health entities? How do considerations of privacy and risks of data breaches change with expanded data collection and linkage? To what extent should public health entities promote data transparency, share data with other entities (e.g., commercial, law enforcement), or provide mechanisms of control for those represented in the data? And, most provocatively, what is the danger that mass data collection for public health could subvert freedoms of speech, movement, or assembly? These questions reflect tensions among some of the most important values that drive and justify public health planning: public health benefit, harm minimization, control, justice, trustworthiness, transparency, accountability, and social value.4–6
Although ethical issues are commonly considered by ethical review committees or institutional review boards, these bodies may be inadequately equipped to address the challenges of big data approaches.7 For example, informed consent and minimal risk, concepts developed for clinical research, may be difficult to apply in the context of big data,6,7 because data may be publicly available and scalability and automation can surreptitiously heighten the risk involved with activities traditionally deemed benign.
These ethical issues, complex in their own right, are further complicated when considering big data applications for JIPs. On the one hand, big data interventions may be particularly beneficial for JIPs, considering this population’s heavy burden of disease and insufficient access to care. In addition, the justice system’s routine, albeit variable, data collection provides opportunities for research and intervention not possible with other vulnerable populations (e.g., homeless persons). Some would argue that not using the justice system’s data to support JIPs’ health is a moral failure. Further, the lack of unified reporting and common barriers to accessing police, court, and incarceration data can leave racial and economic inequalities in the justice system unchecked; big data tools that gather and process existing data could play an essential role in creating a more transparent evidence base to inform strategies to end mass incarceration.
On the other hand, big data approaches may be uniquely harmful or less effective among JIPs. For example, via Web sites featuring mug shots, publicly available court records, and the use of electronic surveillance monitors (i.e., GPS ankle monitors), JIPs are already heavily monitored. Creating public health systems that rely on this tracking may further perpetuate the disproportionate surveillance of persons of color as well as that of low-income persons, who may be least able to curate their digital footprint. Further, awareness of this tracking could induce anxiety or foster avoidance of routine health care, propagating health disparities. Similarly, the effective implementation of big data interventions for JIPs may be curtailed if they are generally wary of contact from government entities, even when the goal of the intervention is to provide services that are otherwise difficult to obtain.
We believe that ethical inquiries involving a wide range of stakeholders across an array of criminal justice and health contexts are needed. Stakeholders could include public health officials, privacy experts, jail or prison administrators, researchers, data scientists, and—most essentially—JIPs. Each perspective contributes critical data of different types, including scientific projections, legal considerations, practical administrative concerns, and experiential insights. Together, these inquiries can provide the building blocks for a more systematic framework for evaluating the benefits and harms of public health research and practice among JIPs in this new data age.
One powerful mechanism to encourage ethical analyses is for funders such as the National Institutes of Health and the National Institute of Justice to make such analyses a common requirement of the projects they support. Doing so would motivate researchers and public health professionals to evaluate the ethical dimensions of their individual projects, while at the same time promoting the development of a corresponding literature of case studies that could be examined and synthesized. In turn, findings could prompt greater academic, government, and public discussion and inform the development of regulatory guidelines.
As the pace of technological advances quickens and norms in data, surveillance, and intervention continue to shift, society is challenged with defining the appropriate boundaries for these activities. Although big data approaches have great potential to advance public health, the risks involved in these approaches may also be heightened. We must proceed carefully with these approaches to ensure that they reduce rather than exacerbate health disparities. Creating greater ethical discourse about big data and public health research and practice for vulnerable populations, including JIPs, is critical to cultivating an equitable and effective public health system.
ACKNOWLEDGMENTS
This work was funded by the National Institutes of Health (NIH; grant R01AI129731) and the University of North Carolina at Chapel Hill Center for AIDS Research (grant P30 AI050410).
Note. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.
CONFLICTS OF INTEREST
The authors have no conflicts of interest to declare.
REFERENCES
1. Kaste M. Facebook increasingly reliant on A.I. to predict suicide risk. 2018. Available at: https://www.npr.org/2018/11/17/668408122/facebook-increasingly-reliant-on-a-i-to-predict-suicide-risk. Accessed October 5, 2019.
2. Sweeney P, Hoyte T, Mulatu MS et al. Implementing a data to care strategy to improve health outcomes for people with HIV: a report from the Care and Prevention in the United States Demonstration Project. Public Health Rep. 2018;133(2 suppl):60S–74S. doi: 10.1177/0033354918805987.
3. US Centers for Disease Control and Prevention. CDC issues funding awards for HIV surveillance and prevention. 2018. Available at: https://content.govdelivery.com/accounts/USCDCNPIN/bulletins/1d0dca3. Accessed October 4, 2019.
4. Ballantyne A. Where is the human in the data? A guide to ethical data use. Gigascience. 2018;7(7). doi: 10.1093/gigascience/giy076.
5. Childress JF, Faden RR, Gaare RD et al. Public health ethics: mapping the terrain. J Law Med Ethics. 2002;30(2):170–178. doi: 10.1111/j.1748-720x.2002.tb00384.x.
6. Ballantyne A. Adjusting the focus: a public health ethics approach to data research. Bioethics. 2019;33(3):357–366. doi: 10.1111/bioe.12551.
7. Ienca M, Ferretti A, Hurst S, Puhan M, Lovis C, Vayena E. Considerations for ethics review of big data health research: a scoping review. PLoS One. 2018;13(10):e0204937. doi: 10.1371/journal.pone.0204937.
