editorial
2024 Nov 26;20(1-2):3–5. doi: 10.1177/15562646241296712

Joint Editorial: Informed Consent and AI Transcription of Qualitative Data

Gabrielle Samuel 1, Doug Wassenaar 2
PMCID: PMC12048736  PMID: 39600037

In both of our everyday university roles, we have independently been approached to advise researchers and/or RECs/IRBs on whether the use of AI tools to transcribe qualitative data (e.g., interviews, focus groups) should require explicit informed consent from participants.

AI tools can be used to transcribe qualitative data through different approaches. On the one hand, discussions occurring via online videoconferencing can be recorded by the software, and transcripts downloaded. On the other hand, face-to-face interviews/focus groups that are recorded digitally can be uploaded to online software (e.g., Otter.ai) that facilitates transcription. In the first instance, participants would likely have already provided consent to have discussions recorded through online platforms, making it less ethically problematic for researchers to access the resulting transcripts by this means. This is not the case in the second instance. Here, interviews/focus group discussions have traditionally been transcribed either by researchers themselves or, with participants’ express permission, through outsourced external transcription services. It is therefore pertinent to ask whether a shift in research practice to using online (AI) transcription tools raises ethical issues and/or whether such practices should be disclosed to participants on participant information sheets.

Given this question, we turned to the literature to gain more clarity. However, we found very few helpful resources. One of us (GS) thought genAI could assist us in conceptualizing the issue – not only to help us, but also to better explore and understand the usefulness of genAI in assisting us with ethical quandaries.

We asked both ChatGPT and Claude.ai to draft a position statement on whether participants should be informed about the use of AI software to transcribe interview/focus group recordings, and drew on their responses in developing our own position statement. Our synthesized version of the responses of ChatGPT and Claude.ai is presented below:

Position

We believe that researchers should inform participants when AI transcription software will be used to process their recorded responses, and obtain explicit consent for this use. While AI transcription can offer significant benefits in terms of efficiency and cost-effectiveness, it also introduces new considerations that participants have a right to be aware of and consider when deciding whether to participate in a study. Our rationale is underpinned by the following considerations:

  1. Informed Consent: The principle of informed consent is fundamental to ethical research. Participants should be fully aware of how their data will be handled, including the use of AI in processing their responses.

  2. Data Privacy and Confidentiality: AI transcription often involves uploading recordings to cloud-based services. Participants should be informed about this data transfer and any associated risks of third-party access. By informing participants of the use of AI, researchers can address concerns regarding data security, the potential for data breaches, and the handling of sensitive information.

  3. Accuracy and Interpretation: AI transcription, while increasingly accurate, can still make errors, especially with accents, technical terms, or poor audio quality. Participants should be aware that their words may initially be processed by an AI system, with potential for misinterpretation or loss of nuance.

  4. Data Retention: AI services may retain data to improve their algorithms. Participants should be informed about data retention policies and any potential for their anonymized data to be used for AI training.

  5. Transparency and Trust in Research Methods: Disclosing the use of AI transcription contributes to overall transparency in research methodology. It also promotes trust between researchers and participants. When individuals understand the tools being used in the research process, they are more likely to feel comfortable sharing their experiences and insights. This trust is essential for gathering accurate and meaningful data.

Comment: While positioning ourselves in this way, we recognise some important aspects that were not retrieved by genAI. First, many of these concerns about the need to disclose this information are particularly pertinent given the current socio-political context of the data/digital landscape in many countries. At present, and following several data scandals (see note 1), public policy narratives emphasise depleted public trust in data technology. Indeed, at least in the UK, a trust deficit has been emphasized in terms of whether institutions (public and private) handle health data responsibly. Consequently, issues of trust are paramount in discussions about data management and research practices.

At the same time, there are issues with disclosure. We note two. First, requiring disclosure may feed a discourse of AI exceptionalism by emphasising the ethical risks associated with AI use compared to other risks and burdens related to the research endeavour. In essence, AI exceptionalism here means informing participants of technology-associated risks while not doing the same for similar human-related risks (e.g., the risk of a transcriber who, having signed a confidentiality agreement, nevertheless discloses information about the interview/focus group discussions to unauthorised third parties). Second, disclosure pushes the burden of decision-making onto individual participants. Asking participants to decide whether they are willing to accept the risks posed by the use of online (AI) transcription services places responsibility on them for any unintended harms that might come from that use, shifting this responsibility away from the researchers. In the event of harm, participants, rather than researchers, would then bear the burden of responsibility. The alternative, placing responsibility on the researchers themselves, also raises issues, since they are unlikely to have expertise in the appropriate data governance and ethical standards associated with the use of this online AI software. Nevertheless, researchers must take on this responsibility if they choose to use the software, as they would for any other research procedures/methods. RECs/IRBs could assist researchers by helping clarify the various ethical issues that need consideration. In essence, it is a shared responsibility across all involved in the research process.

Considering the above, we cautiously recommend the following:

  1. Explicit Consent: Include a specific clause in study information sheets and consent forms about the use of AI transcription software. This includes providing a clear, jargon-free explanation of how AI transcription works and what it means for participants’ data.

  2. Data Security Measures: Researchers should take responsibility for assessing the suitability of the online AI software services they use, to ensure that participants’ data is protected during the transcription process.

  3. Right to Decline or Withdraw: Affirm participants’ right to decline or withdraw consent for AI transcription, even if they consent to participate in the study overall.

  4. Human Verification: Assure participants that AI transcriptions will be reviewed and corrected by human researchers.

  5. Alternative Options: Where feasible, offer participants the option to have their recordings transcribed manually if they are uncomfortable with AI transcription.

In general, we suggest the following generic wording for researchers to insert into standard information sheets for participants to consider before providing informed consent:

“If you agree, this study will make use of an AI service to transcribe interview data that you provide. The AI service we will use is [INSERT HERE], which has a data governance statement that can be found here [INSERT HERE].

If you are uncomfortable with us using online software transcription services, please let us know before you take part in the interview/workshop so that we can ensure that we use a manual (human) transcription service instead.”

We invite, and look forward to, our readers’ comments on the above. It is possible that similar concerns apply to translation services, but that topic is for a future note.

Author Biographies

Gabrielle Samuel, King’s College, London, Associate Editor: JERHRE gabrielle.samuel@kcl.ac.uk.

Doug Wassenaar, SARETI, UKZN, South Africa, Editor-in-Chief: JERHRE wassenaar@ukzn.ac.za.

1. See the Royal Free scandal, which involved the transfer of identifiable patient records across the entire Trust, and the 2014 public relations failure of care.data, an initiative that aimed to improve the use of GP data for research but received harsh public criticism. Such scandals sit within recent incidents in other contexts, including the 2018 Cambridge Analytica scandal, which are seen to have eroded public trust in information technology more broadly. See, for example

Footnotes

Funding: The authors received no financial support for the research, authorship, and/or publication of this article.


Articles from Journal of Empirical Research on Human Research Ethics are provided here courtesy of SAGE Publications
