Journal of Human Lactation. 2024 Mar 14;40(2):211–215. doi: 10.1177/08903344241235160

Ethical Use of Artificial Intelligence for Scientific Writing: Current Trends

Ellen Chetwynd
PMCID: PMC11015711  PMID: 38482810

Artificial intelligence (AI) is a big topic and is evolving rapidly. This About Research article will focus specifically on the use of AI in scientific writing and will not cover the myriad ways that AI is being used in scientific inquiry. It is titled “Current Trends” because its content reflects the time in which it was written, early 2024. As the field evolves, the journal will continue to offer authors the latest guidelines and links to the organizations working on ethics in the use of AI in publishing.

Background

Artificial intelligence (AI) is a general concept that can be applied to specific types of machine-generated computation or learning that has evolved alongside the development of computers. While its origins can be traced to various “beginnings,” most sources suggest that modern development started with the work of Alan Mathison Turing in the early 1950s. The term “artificial intelligence” was coined at a conference organized by Marvin Minsky, John McCarthy, Claude Shannon, and Nathaniel Rochester of International Business Machines Corporation (IBM) in 1956. From there, AI development progressed irregularly. The breakthroughs in the growth of AI have not been concentrated in a single geographic area, but have occurred in changing hotspots around the world. There were periods in which rapid progress, public interest, and funding would peak, raising expectations, and then stall when those expectations could not be met with the computing power available at the time. Progress did not halt during these periods, but it slowed until computing power caught up to the innovations in the field. These periods of slow growth in AI development are called AI winters, and there have been two since the 1950s (Muthukrishnan et al., 2020). The most recent breakthrough, and the focus of this paper, is the publicly accessible use of large language models, such as ChatGPT (Generative Pre-trained Transformer) and others.

One important concept in understanding AI is that it differs from automation, another type of technology that can assist us with productivity and workflow tasks. While automation uses machines to complete a process, the work of automation is based on a finite and explicit set of rules that do not change. AI goes further than automation: AI is the use of intelligent machines that can mimic or exceed human behavior. It does this through several processes. Machine learning is the detection and prediction of patterns using algorithms after the software has been trained on large datasets. AI learns from the datasets and can then generate new information that is not specifically contained within the datasets but is predicted from what was learned. Natural language processing is the method used by AI software that allows the computer to understand, interpret, and generate human language. Large language models (LLMs), or generative AI-based software programs, use both technologies to understand wide bodies of data and generate new text on demand (Committee on Publication Ethics [COPE], 2021).

The reason this matters to the scientific publishing community is that generative AI has the capacity to create research ideas, write complete papers, and assist in a multitude of areas in the production of research articles. This is important to the development of our research base for two reasons: first, ethical standards for authors and publishers need to keep pace with the tools being used by researchers and writers, and second, the scientific community needs to be protected from incorrect or inadequate science (COPE & STM, 2022).

There is an understanding among researchers and publishers that the scientific evidence base is built on ethical standards of practice. Generative AI is new and requires a reset of the current standards of practice in scientific inquiry and publication so that they account for the breadth of capability now easily accessible to individuals through AI. Organizations within the research community, such as the International Committee of Medical Journal Editors (ICMJE) and the Committee on Publication Ethics (COPE), among others, are positioned to work on global consensus statements. Several organizations are actively developing consensus international guidelines for authors and publishers that set standards of accountability, so that we can be assured that we are communicating adequately with each other about our human input and the mechanical processes we might have used to support our work. CANGARU, the ChatGPT and Artificial Intelligence Natural Large Language Models for Accountable Reporting and Use Guidelines (Cacciamani et al., 2023), is one such guideline and can be found on the Equator (Enhancing the QUAlity and Transparency Of health Research) Network site. The type of declaration needed from authors is not new, but an extension of the understandings we currently have about the role of the various sections within a manuscript, specifically what defines authorship, the elements covered in a Methods section, and what is expected to be included in author Acknowledgements.

The second reason this is important is that there are, unfortunately, researchers who will purposefully misrepresent their work. They will fabricate manuscripts, either partially or completely, using technology and hired writers to submit papers that are manufactured without underlying research. These papers do not represent true scientific inquiry and may even be based on fictitious data. One of the particularly disturbing forms of misuse in research is the creation of “deepfakes,” which harness deep learning artificial intelligence algorithms to create pieces of content, such as video or audio, that appear real but are actually completely generated (Lewis et al., 2023). The risk of malfeasance in the world at large with AI is high. In scientific writing, the use of AI for the production of falsified data could disrupt any topic of inquiry. Study results, beyond boosting the credentials of individual scientists, could be skewed or even fabricated to serve the purposes of manufacturing companies.

There is a whole industry built around the development and submission of these false research papers. The companies that manufacture fake papers are called “paper mills.” These are described by COPE and STM (Scientific, Technical and Medical) as “profit-oriented, unofficial and potentially illegal organizations that produce and sell fraudulent manuscripts that seem to resemble genuine research” (COPE & STM, 2022). The risk to scientific inquiry compounds if datasets containing deepfake components become the base upon which AI is trained. Paper mills are heavily dependent on AI-generated text, and as quickly as the publication industry develops algorithms to detect falsified research, the paper mills become more sophisticated at avoiding detection, creating additional work and stress across the publication process (Parkinson & Wykes, 2023).

Thus, individuals working in scientific research and the publication trade work to support efficiency, innovation, and productivity, but this is complicated by the need to protect the evidence base from false narratives. Similarly, ethical authors strive to be fully accountable for their work while using AI for efficiency. The challenge is to avoid limiting the benefits of AI while also creating the structure and strategy needed to guide publication ethics.

The use of AI is not black and white but a continuum. Within the process of researching and publishing research, most authors already use programs that incorporate AI, such as the grammatical corrections that are typically a part of any writing software. The question is not whether to use the technology we now have access to, but where the line needs to be drawn between use and misuse, and, once we have used the technology, how we can agree to communicate about its use in ways that are responsible, ethical, and transparent.

Limitations of AI

AI is not without its limitations, and it is within these limitations that we begin to see how to braid together the use of this technology, while not losing the oversight and creativity inherent in our own human intellect. By understanding the limitations of AI, we know how we must work with it.

AI is trained on datasets created by humans. Those datasets are necessarily part of the past. Both the datasets themselves and the humans who choose them can introduce biases into the software. These biases might be overt or hidden. For example, the language used in AI-generated text might be inequitable or biased, using racist, sexist, or other forms of non-inclusive language more common in the past. The biases might also be more subtle, for example, the inclusion of a single majority language in the dataset used to train the software, which would lift up researchers speaking the most common language in scientific writing while leaving behind researchers speaking less common languages.

Because it learns from data, it is also possible for AI to make mistakes, sometimes called hallucinations. Alkaissi and McFarlane (2023) describe a series of exercises they ran through ChatGPT in which the technology provided them with inaccurate statements within subject areas that were well-researched, as well as full essays in topic areas in which there was no known research. The references provided were false while appearing to be completely legitimate, and when the program was asked to give more current references, it simply gave the same references with updated years. These hallucinations leave scientists vulnerable to legal accountability for AI errors and erode the trust of the general public in scientific research. It is important to note that the study described here was published in February of 2023, and LLMs are becoming better with each update. The possibility that these types of events can occur is still present, but the risks to scientists will evolve along with the technology.

AI may plagiarize the work of others. Consider that the role of AI is to use a dataset of information to answer any question posed to it. While the text it creates in response to a question might be newly written by the AI program, it is looking for common patterns and, in doing so, might unintentionally use the same words as a previous author. One of the ways we guard against plagiarism is to provide credit for the concepts that come from others. This covers more than the repetition of words; it also covers the ideas of others. Depending on AI to do the work of answering a question, without additionally assessing the literature independently, leaves researchers open to the possibility of claiming the ideas of others as their own, or using the ideas of others without appropriate attribution (Dien, 2023).

The predictive algorithms of AI are trained to discover patterns based on their training data. This means they look for common themes or ideas, which biases the program against new ideas. It is more likely to suppress views that are not part of the mainstream and/or ideas that oppose established scientific concepts. The software also does not limit its training to the newest concepts in the field, so it can pull outdated or incomplete information into its responses, missing newer information and keeping the research too attached to the past.

At this point in the development of AI, its ability to reason goes far beyond anything we have seen in the past. Yet, there are subtleties and complexities in human communication that AI might not understand. While scientific knowledge is built slowly, with each idea incrementally extending the ideas that came before it, our logic is not always linear. We may bounce between topic areas, breaking silos to combine ideas that are conceptually disparate but which lead us to leaps in comprehension. AI does not have the same creative potential, so the output received might limit the possibility for higher-level thinking (Sallam, 2023). Additionally, our language can include humor, sarcasm, and irony, all of which we understand through contextual clues; however, it is unlikely that AI would be able to pick up the same subtleties we do. While scientific papers are generally written in serious language without comedic intent, the datasets used to train AI included internet resources outside of the scientific literature. This means that LLMs have within their training datasets both credible and less credible sources and/or disparate datapoints. The patterns they find in the data are likely to be influenced by these sources, so what they create in response to queries can be counterproductive and misleading. Indirectly, this is important when, as with the Journal of Human Lactation (JHL), the research being presented can be used in clinical care. Directly, the issue can have an even more profound effect on our wellbeing as AI is adopted by healthcare providers who rely on it to interpret clinical situations and provide guidance in clinical decision-making.

This highlights one of the black boxes that exist in AI technology. The datasets on which AI was trained, or by which continued training occurs, are kept private by the companies doing the training (van Dis et al., 2023). This is not without reason, since making the training techniques public also provides information that could be used by others to actively influence the training. Regardless, the absence of information means that researchers, authors, and healthcare providers using LLMs do not know how much information the models have in the topic area they are inquiring about. The less input there is in a certain topic area, the more likely the LLM is to provide low-quality responses. The ongoing training of LLMs based on the information coming into them also means that anything that is entered could become part of the algorithm used to provide information to others. What is entered into LLMs should not be considered private.

Despite all of these limitations, AI not only has the potential to bolster efficiency, but it also makes it possible to level the playing field in published scientific research. LLMs have the capacity to reduce language barriers for authors and mitigate the costs associated with language editing services. They can help with the mind-numbing tasks associated with publishing research, such as formatting citations and proofreading, allowing the researcher to spend more time on the content. A counterargument could, of course, be made that over-reliance on AI for these tasks could lead to a decline in writing skills, in the same way that calculators can lead to a reduced ability to perform mathematical calculations independently and autocorrect programs can reduce spelling acumen. Regardless of this risk, the overall effect on the research and publishing industry could be an increase in the diversity of published articles. Because it is an open resource, the tool also has the capacity to bring elevated resources into low-resourced areas, if adequate training on its use is distributed at the same speed as the AI programs themselves. The risk of providing the tool without appropriate training is that larger organizations with more funding will be able to inequitably capitalize on the technology, re-establishing old patterns or possibly even expanding the disparities present (van Dis et al., 2023).

Clearly, this is an evolving field, with many uncertainties still to be worked out. While the technology is progressing, scientists and authors need to be certain that they are transparent in their use of LLMs, that they ensure accuracy in what they submit for publication, and that they take accountability for the work they produce.

Transparency in the Use of AI

Attribution of AI in scientific research is evolving. OpenAI launched ChatGPT to general users in November of 2022, and early adopters struggled to responsibly communicate their use of the technology. One attempt that has since gone out of favor was to list AI as an author, since it might have played an important role in writing the paper (Stokel-Walker, 2023). But being an author implies more than the act of writing; it is imbued with responsibility for the work contained within the article. According to the International Committee of Medical Journal Editors (ICMJE, 2024), in order to qualify for authorship, an author needs to meet all of the following four criteria:

  • Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND

  • Drafting the work or reviewing it critically for important intellectual content; AND

  • Final approval of the version to be published; AND

  • Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Within this existing internationally accepted definition, LLMs do not meet the criteria for being an author and should not be listed as such. The ICMJE gives the following alternatives: If the LLM was used for writing assistance, it can be described in the Acknowledgements so that readers have a sense of what programs were used in the creation of the manuscript. Lubowitz (2023), of the journal Arthroscopy, suggests the use of a statement similar to the following: “During the preparation of this work, the author(s) used [NAME OF MODEL OR TOOL USED AND VERSION AND EXTENSION NUMBERS] in order to [REASON].” If the AI was instead used in data collection, analysis, or figure generation, it is appropriate to describe its use in the Methods section. This is no different from our current practice in the Methods section, in which we disclose all of the equipment used in a study so that the study is reproducible by others. Neither of these attributions is new, but they are being newly applied to AI. The World Association of Medical Editors (WAME, 2023) suggests that if the use of AI is included in the Methods section, it should include “the full prompt used to generate the research results, the time and date of query, and the AI tool used and its version” (WAME Recommendation 2.2).

Responsible Use of AI

Given the risk of errors or biases when using AI, it is essential that all of the work generated using these LLMs be checked and validated by human researchers. This includes studying the field of inquiry well enough to ensure appropriate attribution when the words or ideas of others are used. Authors must check all references to ensure that they are appropriate for the text being provided, and that the references themselves are real and not fabricated by the software. Human authors will need to take final responsibility for the language used in any publication, as it will be held to the standards of the journal and will need to be thoughtful, inclusive, and unbiased. Because this technology is changing so rapidly, and the results of misuse could lead to very public recriminations, it is vital that authors stay abreast of developments in ethical use and appropriate attribution in the use of AI (Sage, n.d.). Worldwide application of appropriate oversight and adherence to ethical guidelines are essential to our consensual use of this powerful technology (Table 1).

Table 1.

Responsible Use of AI: Before Submitting.

• Keep records of the system used, the date it was used, and the queries entered
• Check all references for accuracy
• Assure that all concepts are appropriately attributed
• Check paper for plagiarism
• Assure that the language used is unbiased and inclusive
• Study the field of inquiry independently to assure the validity of AI generated information
• Check the journal and/or publishers’ guidelines for appropriate forms of attribution
• Check current guidelines on attribution and ethical use of AI on international publishing ethics sites (e.g., ICMJE, COPE, Equator Network, and WAME)

Note. AI = Artificial Intelligence; ICMJE = International Committee of Medical Journal Editors; COPE = Committee on Publication Ethics; Equator Network = Enhancing the QUAlity and Transparency Of health Research; WAME = World Association of Medical Editors. Adapted from: Sage (n.d.) Author Guidelines on Using Generative AI and Large Language Models. https://learningresources.sagepub.com/author-guidelines-on-using-generative-ai-and-large-language-models.

Conclusion

The use of AI is seductive. It can improve efficiency and productivity. It has the potential to increase equity in scientific publishing. It has the power to create well-written and elegant articles geared toward scientific journals. Yet, researchers and authors cannot be complacent in using this tool; instead, they must actively engage in the output created to ensure that it is correct and current in every aspect, as all authors will be held fully accountable for the material they submit for publication.

Acknowledgments

The author acknowledges the use of AI to explore the landscape of this topic prior to writing the paper. No AI was used to generate or edit text. The author takes full responsibility for the content of this paper. The author also thanks Zelalem Haile for his technical review and contribution to this publication.

Footnotes

Author Contributions: Ellen Chetwynd: Conceptualization; Resources; Writing – original draft; Writing – review & editing.

The author declared the following potential conflicts of interest with respect to the research, authorship, and/or publication of this article: The author held a paid position as the Editor in Chief of the Journal of Human Lactation at the time this publication was written.

Funding: The author received no financial support for the research, authorship, and/or publication of this article.

ORCID iD: Ellen Chetwynd https://orcid.org/0000-0001-5611-8778

References

  1. Alkaissi H., McFarlane S. I. (2023). Artificial hallucinations in ChatGPT: Implications in scientific writing. Cureus, 15(2), Article e35179. 10.7759/cureus.35179
  2. Committee on Publication Ethics Council. (2021). Artificial intelligence (AI) in decision making. COPE Discussion Document—English. 10.24318/9kvAgrnJ
  3. Committee on Publication Ethics (COPE), & Scientific, Technical, and Medical (STM). (2022). Paper mills research. Research report from COPE & STM—English. 10.24318/jtbG8IHL
  4. Cacciamani G. E., Gill I. S., Collins G. S. (2023). ChatGPT: Standard reporting guidelines for responsible use. Nature, 618(7964), 238. 10.1038/d41586-023-01853-w
  5. Dien J. (2023). Editorial: Generative artificial intelligence as a plagiarism problem. Biological Psychology, 181, Article 108621. 10.1016/j.biopsycho.2023.108621
  6. International Committee of Medical Journal Editors. (2024). Defining the role of authors and contributors. https://www.icmje.org/recommendations/browse/roles-and-responsibilities/defining-the-role-of-authors-and-contributors.html
  7. Lewis A., Vu P., Duch R. M., Chowdhury A. (2023). Deepfake detection with and without content warnings. Royal Society Open Science, 10(11), Article 231214. 10.1098/rsos.231214
  8. Lubowitz J. H. (2023). Guidelines for the use of generative artificial intelligence tools for biomedical journal authors and reviewers. Arthroscopy: The Journal of Arthroscopic & Related Surgery. Advance online publication. 10.1016/j.arthro.2023.10.037
  9. Muthukrishnan N., Maleki F., Ovens K., Reinhold C., Forghani B., Forghani R. (2020). Brief history of artificial intelligence. Neuroimaging Clinics of North America, 30(4), 393–399. 10.1016/j.nic.2020.07.004
  10. Parkinson A., Wykes T. (2023). The anxiety of the lone editor: Fraud, paper mills and the protection of the scientific record. Journal of Mental Health, 32(5), 865–868. 10.1080/09638237.2023.2232217
  11. Sage. (n.d.). Author guidelines on using generative AI and large language models. https://group.sagepub.com/assistive-and-generative-ai-guidelines-for-authors
  12. Sallam M. (2023). ChatGPT utility in healthcare education, research, and practice: Systematic review on the promising perspectives and valid concerns. Healthcare, 11(6), Article 887. 10.3390/healthcare11060887
  13. Stokel-Walker C. (2023). ChatGPT listed as author on research papers: Many scientists disapprove. Nature, 613(7945), 620–621. 10.1038/d41586-023-00107-z
  14. van Dis E. A. M., Bollen J., Zuidema W., van Rooij R., Bockting C. L. (2023). ChatGPT: Five priorities for research. Nature, 614(7947), 224–226. 10.1038/d41586-023-00288-7
  15. World Association of Medical Editors. (2023). Chatbots, generative AI, and scholarly manuscripts.
