www.ddmcd.com

Larry the Cat’s Tale of Online Scientific Influence

By Dennis D. McDonald

"Publish or perish" remains a guiding principle among academic researchers. Its significance has been remarkably consistent over the years: if you want to advance in the academic research world, you need to get your work published in respected peer-reviewed journals. Furthermore, the more your research is cited by other scholars, the better your prospects for future promotions and job offers.

Despite the shift to online publishing, the importance of citation counts has persisted, though there are complaints that these metrics are not perfect methods for measuring "influence," whatever that might mean. Still, numbers are numbers, and researchers value them.

Google Scholar is a search engine provided by Google that focuses on scholarly publications, including journal articles, research reports, theses and dissertations, and other technical documents. It supplies various data in response to search queries, which may include links to articles published in peer-reviewed journals as well as "gray literature" such as technical or research reports not published in journals.

While Google Scholar may not be a "perfect" tool compared to more specialized academic search engines, it is free, widely available, and capable of providing useful information. It also includes a “citation ranking” feature based on the number of citations a paper has received from other scholarly or academic publications. This feature can offer some insight into the relative "influence" a particular linked item might have.

Two questions arise: how accurate is Google Scholar’s citation ranking feature, and how easily can it be artificially manipulated?

An article in the July 31, 2024, issue of Science magazine reports on two researchers' attempts to experimentally manipulate statistics provided by Google Scholar. In the article, "How easy is it to fudge your scientific rank? Meet Larry, the world's most cited cat," author Christie Wilcox describes an experiment conducted by researchers from Northwestern University and Cambridge University.

The researchers used a method popularized by commercial “citation-boosting services,” one of which they found advertised on Facebook. First, they created a series of fake math papers, some listing “Larry the Cat” as the author and others citing those papers. They then posted PDFs of the papers on a server known to be indexed by Google Scholar. After Google Scholar indexed the papers, the researchers deleted them from the server.

By that time, Google Scholar had already calculated citation rankings, and just like that, Larry the Cat became a published (and cited) author! "I asked Larry what his reaction was over the phone," Richardson, one of the researchers, told Science. "I can only assume he was too stunned to speak."

While this is certainly an entertaining story, it reminds us that numbers can be manipulated, even those that claim to be objective or free from bias. We should be particularly skeptical of anything that claims to measure sentiment, feelings, or amorphous concepts like “influence” or “voting preferences.”

Having once made a living as a number-generator and number-cruncher while studying how scientists and other researchers communicate their findings, I’ve learned to be skeptical whenever anyone uses survey data to “predict” the future.

Nowadays, it pays to be skeptical of just about anything communicated online, especially given the increasingly sophisticated abilities of AI tools to mimic reality.

Some might view this as a sad reality of our times. On the other hand, it emphasizes the continued importance of identifying and trusting those who have earned our confidence—lest we be fooled by someone’s cat!

Copyright (c) 2024 by Dennis D. McDonald