Excessive use of words like ‘commendable’ and ‘meticulous’ suggests ChatGPT has been used in thousands of scientific studies
A London librarian has analyzed millions of articles in search of uncommon terms abused by artificial intelligence programs
Librarian Andrew Gray has made a “very surprising” discovery. He analyzed five million scientific studies published last year and detected a sudden rise in the use of certain words, such as meticulously (up 137%), intricate (117%), commendable (83%) and meticulous (59%). The librarian from University College London can find only one explanation for this rise: tens of thousands of researchers are using ChatGPT — or other similar large language model tools — to write their studies, or at least to “polish” them.
There are blatant examples. A team of Chinese scientists published a study on lithium batteries on February 17. The work — published in a specialized journal from the Elsevier publishing house — begins like this: “Certainly, here is a possible introduction for your topic:Lithium-metal batteries are promising candidates for….” The authors apparently asked ChatGPT for an introduction and accidentally copied it in as is. A separate article in a different Elsevier journal, published by Israeli researchers on March 8, includes the text: “In summary, the management of bilateral iatrogenic I’m very sorry, but I don’t have access to real-time information or patient-specific data, as I am an AI language model.” And, a couple of months ago, three Chinese scientists published an absurd drawing of a rat with a kind of giant penis, an image generated with artificial intelligence, in a study on sperm precursor cells.
Andrew Gray estimates that at least 60,000 scientific studies (more than 1% of those analyzed in 2023) were written with the help of ChatGPT — a tool launched at the end of 2022 — or similar programs. “I think extreme cases of someone writing an entire study with ChatGPT are rare,” says Gray, a 41-year-old Scottish librarian. In his opinion, in most cases artificial intelligence is used appropriately to “polish” the text, by identifying typos or facilitating translation into English. But there is a large gray area in which some scientists take ChatGPT’s assistance further, without verifying the results. “Right now it is impossible to know how big this gray area is, because scientific journals do not require authors to declare the use of ChatGPT; there is very little transparency,” he laments.
Artificial intelligence language models use certain words disproportionately, as demonstrated by James Zou’s team at Stanford University. These tend to be terms with positive connotations, such as commendable, meticulous, intricate, innovative and versatile. Zou and his colleagues warned in March that the reviewers of scientific studies are themselves using these programs to write their evaluations, prior to the publication of the works. The Stanford group analyzed peer reviews of studies presented at two international artificial intelligence conferences and found that the probability of the word meticulous appearing had increased 35-fold.
Zou’s team, on the other hand, did not detect significant traces of ChatGPT in peer reviews at the prestigious journals of the Nature group. Where ChatGPT was used, it was associated with lower-quality peer reviews. “I find it really worrying,” explains Gray. “If we know that using these tools to write reviews produces lower quality results, we must reflect on how they are being used to write studies and what that implies,” says the librarian at University College London. A year after the launch of ChatGPT, one in three scientists acknowledged using the tool to write their studies, according to a survey in the journal Nature.
Gray’s analysis shows that the word “intricate” appeared in 109,000 studies in 2023, more than double the average of 50,000 in previous years. The term “meticulously” went from appearing in about 12,300 studies in 2022 to more than 28,000 in 2023, while instances of “commendable” rose from 6,500 to almost 12,000. The researcher jokes that his colleagues have congratulated him on the meticulousness of his report, still a draft pending publication in a specialized journal.
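The year-on-year jumps Gray reports amount to simple arithmetic on term counts. A minimal sketch using the rounded figures quoted above (the counts come from the article; the helper function and its name are illustrative, and the results differ slightly from Gray’s published percentages because his were computed on exact counts):

```python
def percent_increase(before: float, after: float) -> float:
    """Return the percentage increase from `before` to `after`."""
    return (after - before) / before * 100

# Counts of studies using each term, as quoted in the article (rounded).
counts = {
    "intricate": (50_000, 109_000),    # prior-year average -> 2023
    "meticulously": (12_300, 28_000),  # 2022 -> 2023
    "commendable": (6_500, 12_000),    # 2022 -> 2023 (approximate)
}

for word, (before, after) in counts.items():
    print(f"{word}: +{percent_increase(before, after):.0f}%")
```

On these rounded inputs, “intricate” comes out at roughly +118%, close to the 117% figure cited earlier in the article.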
Very few studies disclose whether they have used artificial intelligence. Gray warns of the danger of “a vicious circle,” in which subsequent versions of ChatGPT are trained on scientific articles written by the old versions, giving rise to increasingly commendable, intricate, meticulous and, above all, insubstantial studies.
Documentation professor Ángel María Delgado Vázquez highlights that the new analysis is focused on English-language studies. “Researchers who do not speak native English are using ChatGPT a lot, as an aid to writing and to improve their English,” says Delgado Vázquez, a researcher from the Pablo de Olavide University in Seville, Spain. “In my environment, people are using ChatGPT mainly for a first translation, or even to use that translation directly,” he says. The Spanish professor says he would like to see an analysis of the origin of the authors who use the unusual terms.
Another one of AI’s favorite words is “delve.” Researcher Jeremy Nguyen, from the Swinburne University of Technology (Australia), has calculated that “delve” appears in more than 0.5% of medical studies, where before ChatGPT it appeared in less than 0.04%. Thousands of researchers are suddenly delving.
Librarian Andrew Gray warns there is a risk of broader society becoming infected with this meticulously artificial new language. Nguyen himself admitted on the social network X that it happens to him: “I actually find myself using ‘delve’ lately in my own language—probably because I spend so much time talking to GPT.” On April 8, the official ChatGPT account on X chimed in: “I just love delving what can I say?”