Retractions, AI, and Peer Review: More Transparency Needed!
One piece of science news gaining traction on social media is the retraction of an infamous research paper that popularized the unproven and dangerous use of hydroxychloroquine as a COVID-19 treatment. The paper, published in 2020 in the International Journal of Antimicrobial Agents, was pulled due to "ethical concerns and methodological problems."
Of course, retracting the paper does not automatically erase citations and references to it, nor does it make pre-prints and downloaded pages suddenly vanish. The Internet has a long memory. The residue of this publication will persist for quite some time.
This raises the question: Isn’t the peer review system employed by reputable scientific and medical journals supposed to prevent this sort of thing from happening?
In theory, yes. In practice, however, peer review depends largely on volunteer effort, so mistakes are bound to happen. Additionally, there is a rising tide of AI-generated papers, notes, and comments that human editors are struggling to combat. Some organizations, such as the NIH, are prohibiting the use of AI tools in the peer review of research funding requests (see Tapping the Brakes on AI Use in Peer Review).
I fear that the genie is out of the bottle. AI tools can be quite useful if you know what you’re doing and don’t surrender control. For example, I asked ChatGPT to summarize comments made by BlueSky users regarding the article referenced at the top of this post. Here was ChatGPT’s response:
Summary of Comments on Scientific Paper Retraction
The series of comments reflects widespread frustration over issues within scientific publishing and peer review processes, sparked by the retraction of a controversial paper. Commenters expressed concerns about systemic problems, including conflicts of interest, pay-to-play models, and insufficient peer review, which undermine the credibility of scientific journals.
Several themes emerged:
Conflicts of Interest and Junk Science: Instances of authors failing to disclose conflicts of interest, such as owning businesses related to their studies, were highlighted. Many commenters criticized the publication of biased or poorly conducted studies that propagate misinformation.
Peer Review Concerns: Commenters lamented the erosion of rigorous peer review, suggesting some journals prioritize profit over scientific integrity. There were calls for reform, with suggestions like open, transparent peer review akin to a social media model.
Reproducibility and Methodology Issues: Concerns about irreproducible studies, small sample sizes, and methodological flaws were prevalent. Many emphasized that science relies on reproducibility and rigorous testing of conventional wisdom.
Systemic and Cultural Problems: Several participants criticized the scientific community for succumbing to financial and political pressures, with journals sometimes prioritizing speed or novelty over quality. The infiltration of institutions by bad actors and the over-commercialization of publishing were also noted.
Public Misinformation: Many expressed frustration over how initial, flawed studies often gain widespread attention, while retractions or corrections receive little notice, perpetuating public misunderstanding. Others noted how such papers are exploited for political or ideological narratives.
Calls for Reform: Suggestions ranged from stricter peer review processes to broader structural changes in publishing models. The idea of moving toward more open and interactive review systems was also discussed.
Overall, the comments reflect a deep concern about the state of scientific publishing and its implications for trust in science, with a shared commitment to addressing these issues to safeguard the integrity of research.
After comparing this summary against the stream of comments from BlueSky, I concluded that the summary was remarkably accurate. It struck a balance between alarmist “hair on fire” despair about the state of scientific journals and an inappropriate dismissal of legitimate concerns.
The last paragraph of the summary does offer a moderate example of ChatGPT’s tendency to editorialize, given its reference to a "…shared commitment to addressing these issues to safeguard the integrity of research." While such a conclusion might be a reasonable takeaway from the comments, my own more conservative approach would have balanced this sentiment against the difficulty of keeping fake science out of the review process.
This is why I believe it is unwise to use AI tools for content generation without carefully evaluating their performance. I have used them extensively and found them helpful for a variety of applications:
Editing my own writing for grammar, clarity, and spelling.
Summarizing long and complex documents.
Searching the web using detailed plain English queries.
Generating descriptive statistics from structured numeric data tables.
Cleaning tables of contact information prior to mail-merge based promotional campaigns.
Generating detailed project plan outlines from written requirements.
Categorizing and summarizing organizational capabilities in alignment with RFP requirements.
Creating coloring book pages for my grandchildren.
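One item in the list above, generating descriptive statistics from structured numeric tables, is also the easiest to verify by hand, which is part of why I trust AI output there. As a minimal sketch of what "descriptive statistics" means in that context, here is how the same result can be produced locally with pandas (the table below is hypothetical sample data, not data from any document mentioned in this post):

```python
import pandas as pd

# Hypothetical campaign metrics table (illustrative only)
df = pd.DataFrame({
    "emails_sent": [120, 95, 210, 180],
    "responses": [12, 8, 30, 22],
})

# describe() returns count, mean, std, min, quartiles, and max per column
stats = df.describe()
mean_sent = stats.loc["mean", "emails_sent"]
print(stats)
print("Mean emails sent:", mean_sent)
```

Checking a few such numbers by hand (the mean here is simply (120 + 95 + 210 + 180) / 4 = 151.25) is exactly the kind of spot-check I recommend applying to any AI-generated analysis.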
Granted, I could have performed most of the above tasks manually (except for the coloring book pages), but they would have taken longer. The ability to revise and iterate by clarifying instructions or analyzing uploaded documents is also a significant time-saver.
In conclusion, anyone using AI tools to create or analyze content must understand their limitations. While these tools can save time, they do not exempt users from responsibility for the final output. This applies to blog posts, government proposals, and even peer-reviewed journals.
Rather than outright banning AI tools at any step in the research and publishing process, I would instead advocate for honest transparency about how these tools are used throughout the process.
Copyright 2024 by Dennis D. McDonald. ChatGPT generated the image in response to a prompt asking for a display of Pandora letting loose AI demons. In honor of hydroxychloroquine I asked that the demons be changed to horses.