There are multiple guidelines from publishers and organizations on the use of artificial intelligence (AI) in publishing.1-5 However, none are specific to family medicine. Most journals have basic AI use recommendations for authors, but more explicit direction is needed, as not all AI tools are the same.
As family medicine journal editors, we want to provide a unified statement about AI in academic publishing for authors, editors, publishers, and peer reviewers based on our current understanding of the field. The technology is advancing rapidly: while text generated by early large language models (LLMs) was relatively easy to identify, text from newer versions imitates human language progressively better and is more challenging to detect. Our goal is to develop a unified framework for managing AI in family medicine journals. As this is a rapidly evolving environment, we acknowledge that any such framework must continue to evolve; we nevertheless feel it is important to provide some guidance for where we are today.
Definitions
Artificial intelligence is a broad field in which computers perform tasks historically thought to require human intelligence. Large language models are a recent breakthrough in AI that allow computers to generate text that appears to come from a human. Large language models deal with language generation, while the broader term generative AI also encompasses AI-generated images and figures. ChatGPT is one of the earliest and most widely used LLM-based products, but other companies have developed similar tools. Large language models learn to perform a multifaceted analysis of word sequences in a massive text training database and generate new sequences of words using a complex probability model. The model has a random component, so responses to the exact same prompt submitted multiple times will not be identical. Large language models can generate text that looks like a medical journal article in response to a prompt, but the article’s content may or may not be accurate. Large language models may “confabulate,” generating convincing text that includes false information.6-8 Large language models do not search the Internet for answers to questions; however, they have been paired with search engines in increasingly sophisticated ways. For the rest of this editorial, we will use the broad term AI synonymously with LLMs.
Role of LLMs in academic writing and research
As LLM tools are updated and authors and researchers become familiar with them, they will undoubtedly become more useful in assisting the research and writing process by improving efficiency and consistency. However, current research on the best use of these tools in publication is lacking. A systematic review exploring the role of ChatGPT in literature searches found that most articles on the topic are commentaries, blog posts, and editorials, with little peer-reviewed research.9 Some studies demonstrated benefit in narrowing the scope of a literature review when AI tools were applied to databases of studies and prompted to evaluate them for inclusion based on title and abstract. Another paper reported that AI had 70% accuracy in identifying relevant studies compared with human researchers, and that it may reduce time spent and provide a less subjective approach to literature review.10-12 In another study, LLM-generated background sections were rated as good as, if not better than, those written by human researchers, but the citations were consistently false.13 When generating citations, large language models frequently fail to provide real papers or to correctly match authors to their own papers, and they therefore risk creating fictitious citations that appear convincing despite containing incorrect information, including digital object identifier numbers.6,14
Studies evaluating perceptions of AI use in academic journals, and the strengths and weaknesses of these tools, revealed no agreement on how to report the use of AI tools.15 There are many tools; for example, some are used to improve grammar and others to generate content, yet parameters distinguishing substantive from nonsubstantive use are lacking. Furthermore, current AI detection tools cannot adequately distinguish these use types.15 Reported benefits include reduced workload and the ability to summarize data efficiently, whereas weaknesses include variable accuracy, plagiarism, and deficient application of evidence-based medicine standards.7,16
Guidelines on appropriate AI use exist, such as the Living Guidelines on the Responsible Use of Generative AI in Research produced by the European Commission.17 These guidelines include steps for researchers, organizations, and funders. The fundamental principles for researchers are to maintain ultimate responsibility for content; apply AI tools transparently; ensure careful evaluation of privacy, intellectual property, and applicable legislation; continuously learn how best to use AI tools; and refrain from using tools on activities that directly affect other researchers and groups.17 While these are helpful starting points, family medicine publishers can collaborate on best practices for using AI tools and help define substantive reportable use while acknowledging the current limitations of various tools and understanding they will continue to evolve. Family medicine journals do not have unique AI needs as compared with other journals, but the effort of editors to jointly present principles related to AI is a unique model.
Guidance for using LLMs and AI in family medicine publications
The core principles of scientific publishing will remain essentially unchanged by AI. For example, authorship criteria will remain the same. Authors must still be active participants in conceptualizing and producing scientific work; writers and editors of manuscripts will be held accountable for the product.
Authors must still appropriately cite the work of others when creating scientific research. Citation practices will likely change over time as the use of AI in publishing matures. It is impossible to accurately list all sources used to train a given AI product; however, it remains possible to cite where a fact originated or who conceived a particular idea. Similarly, authors will still need to ensure their final drafts are sufficiently original and do not inadvertently plagiarize the work of others.1,18 Authors must be well versed in the existing literature of a given field.
Impact on diversity, equity, and inclusion efforts
Since LLMs model text generation on a training data set, there is an inherent concern that they will discover biased arguments and repeat them, thereby compounding bias.19 Since LLMs mimic human-created content, and there is a preponderance of biased, sexist, racist, and otherwise discriminatory content on the Internet, this is a considerable risk.20 Some companies now work in the LLM/AI space to eliminate biases from these models, but these efforts are in their infancy. Equality AI, for example, is developing responsible AI to “solve healthcare’s most challenging problems: inequity, bias and unfairness.”21 More investment is necessary to further remove bias from LLM/AI models. While some authors have touted AI and LLMs as bias elimination tools, the fact that the results of these tools are not consistently reproducible has scholars questioning their utility. Successful deployment of an unbiased LLM/AI tool will depend on carefully examining and revising existing algorithms and the data used to train them.22 Excellent, unbiased algorithms have not yet been developed but might be in the future.23 At the same time, AI tools can serve as de facto editorial assistants that may help globalize the publication process by helping nonfluent English speakers publish in English-language journals.
Future directions
The use of LLMs and broader AI tools is expanding rapidly. There are opportunities at all levels of research, writing, and publishing to use AI to enhance our work (Box 1).24 A key goal for all family medicine journals is to require authors to identify the use of LLMs and to ensure the LLMs used provide highly accurate information and minimize the frequency of confabulation. Research is ongoing to develop methods to determine the accuracy of LLM output.25 Editors and publishers must continue to advocate for accurate tools to validate the work of LLMs. Researchers should assess the performance of tools used in the writing process. For example, they should study the extent to which LLMs plagiarize, provide false citations, or generate false statements. They should also study tools that detect these events.
Guiding principles for using AI in family medicine research and publishing
For authors
Disclose any use of AI or LLMs in the research or writing process and describe how it was used (eg, “I used ChatGPT to reduce the word count of my paper from 2700 to 2450.”). Standard disclosure statements may be helpful (eg, https://jamanetwork.com/journals/jama/fullarticle/2816213)
Be accountable to ensure work is original and accurate. For example, when using LLMs to generate text, authors can unwittingly plagiarize existing work. Authors are ultimately responsible for ensuring their work is original
Understand the limitations of LLMs (eg, erroneous citations)
Be aware of the potential for AI or LLMs to perpetuate bias
For journals and editorial teams
Explore ways AI can streamline the publication process at various stages
Develop clear, transparent guidelines for authors and reviewers before using LLMs in publishing
Do not allow LLMs to be cited as authors on manuscripts
Develop a method to accurately evaluate the use of LLMs in the writing process (eg, determine plagiarism, assess validity of references, fact check statements)24
AI—artificial intelligence, LLM—large language model.
Artificial intelligence tools are already being used by some publishers and editors for initial manuscript screening and to match potential reviewers with submitted papers. The complex interplay between AI tools and humans is evolving.26 While AI is not likely to replace human researchers, authors, reviewers, or editors, it continues to contribute to the publication process in myriad ways. We want to know more: How can LLMs contribute to the publication process? Can authors ask LLMs to do literature searches or draft a paper? Can we train AI to contribute to a revision of a paper, or to review a paper? Probably, but we must scrutinize any AI-generated references, and we likely cannot train AI to evaluate conclusions or determine the impact of a specific paper in the field. Family medicine journals are publishing important papers on AI—not only about its use in research and publishing, but also about its use in clinical practice—and this editorial is a call for more scholarship in this area.27-33
Acknowledgement
We acknowledge Dan Parente, Steven Lin, Winston Liaw, Renee Crichlow, Octavia Amaechi, Brandi White, and Sam Grammer for their helpful suggestions.
Footnotes
Copublished in American Family Physician, Annals of Family Medicine, Canadian Family Physician, Evidence-Based Practice by Family Physicians Inquiries Network, Family Medicine, Family Medicine and Community Health, FP Essentials, FPM, Journal of the American Board of Family Medicine, and PRiMER.
Copyright © 2025 the College of Family Physicians of Canada