
What's wrong with citation analysis?




Other than your papers not being cited enough, what's wrong with measuring scientific influence by citation count? Decisions about grants, promotions, etc. are increasingly based on citation analysis because, among other things, it's considered "unbiased": it produces numbers even non-professionals can understand, which supposedly helps them make the best and most accurate decisions.

What I wrote above is polite fiction. Why? First of all, citation analysis can only work with actual, written citations, but being influenced by something doesn't mean you're automatically going to cite it. One of the basic assumptions behind citation analysis is that all, or at least most, influences are cited in articles. It doesn't work that way. MacRoberts and MacRoberts (2010) define influence as follows: "When it is evident in the text that an author makes use of another's work either directly or through secondary sources he or she has been influenced by that work." According to a series of studies they conducted, only about 30% of influences are cited.




Secondary sources - Goodbye, citations. Once your article has been covered in a review or two, your findings will often be credited to the review article rather than to your own. I'm citing only two of MacRoberts and MacRoberts' articles, one of them a review, because (a) those are the ones I've read and (b) I'm too lazy to read and cite all the research they refer to. That's okay for informal scientific writing. However, if this were a peer-reviewed article, all the authors and articles I didn't cite individually would have lost a citation. There's a reason review articles are cited so often.

No informal citations. Those important conversations you had with your dissertation advisor or over lunch at a conference are forever gone, even though you might have gotten some of your best ideas from them. The paper that impressed you but that you couldn't find a place to cite suffers the same fate. To quote MacRoberts and MacRoberts (1996) again:

"If one wants to know what influence has gone into a particular bit of research, there is only one way to proceed: head for the lab bench, stick close to the scientist as he works and interacts with colleagues, examine his lab notebooks, pay close attention to what he reads, and consider carefully his cultural milieu."

They're right, but their suggestion is hardly practical. That is why, in the last few years, bibliometricians have been trying to come up with metrics based on academic social media mentions. As the Altmetrics manifesto (2010) puts it, "...that dog-eared (but uncited) article that used to live on a shelf now lives in Mendeley, CiteULike, or Zotero–where we can see and count it." Unfortunately, altmetrics indices are still far from accurate (not that citation indices are, but we're stuck with them). If we're to add new metrics to the mix, they had better be good.

Limited databases. I've mentioned it before on this blog, but it's worth repeating: citation databases cover only a fraction of scientific publications, mostly peer-reviewed journals. I have six Google Scholar citations for my blog-characterization article, but only two in Scopus. That's one of the reasons your GS indices are usually higher than your Web of Science and Scopus ones. My dissertation advisor, Prof. Mike Thelwall, has an h-index of 47 in GS, 31 in Scopus, and 25 in WoS. All are correct, and all are wrong; it depends on each database's coverage and speed of updating.
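To make that concrete: the h-index is the largest number h such that an author has h papers with at least h citations each, so a database that indexes fewer of an author's papers (or sees fewer of the citations to them) reports a lower h-index. Here's a minimal sketch in Python; the citation counts are invented for illustration and aren't taken from any real database.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    h = 0
    for rank, count in enumerate(sorted(citations, reverse=True), start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

# Hypothetical citation counts for the same author as seen by two databases
# with different coverage (illustrative numbers only):
google_scholar = [90, 80, 70, 60, 50, 40, 30, 20, 10, 5]  # indexes more document types
scopus = [80, 60, 40, 20, 10, 5]                           # mostly peer-reviewed journals

print(h_index(google_scholar))  # 9
print(h_index(scopus))          # 5 -- same author, narrower coverage, lower h-index
```

Neither number is "the" h-index; each is the h-index of whatever slice of the literature that particular database happens to see.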

The Matthew Effect – or "the rich get richer." People tend to cite already well-cited material by well-known researchers, either because that's what they've read, because they're appealing to the authority of the better known, or both.
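To see how fast that feedback loop concentrates citations, here's a toy preferential-attachment simulation (my own illustration, not taken from any of the sources cited here): each new citation goes to a paper with probability proportional to the citations it already has.

```python
import random

random.seed(42)  # reproducible toy run

# 100 papers start uncited; 10,000 citations are handed out one at a time.
# Each citation picks a paper with probability proportional to its current
# count plus one (the +1 gives still-uncited papers a chance to be found).
papers = [0] * 100
for _ in range(10_000):
    weights = [count + 1 for count in papers]
    cited = random.choices(range(len(papers)), weights=weights)[0]
    papers[cited] += 1

papers.sort(reverse=True)
share = sum(papers[:10]) / sum(papers)
print(f"The 10 most-cited papers hold {share:.0%} of all citations")
```

Run it a few times: a handful of early "winners" ends up with a large share of the citations even though all the papers are, by construction, equally good.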

Multiple motives - as this helpful comic shows, there are multiple motives for citing, many of which have less in common with "giving credit where credit is due" than we would like to think.

Real Impact Factor by Jorge Cham, PhD Comics

 

Even if one is trying to be as honest and accurate as possible in her citations, she can only cite what she's familiar with, and the number of articles one can read is limited (again, a factor in the popularity of review articles). She's going to cite her professors, her co-authors, the people she heard at conferences, and the big names in her field, but she is bound to miss some relevant material no matter what.

As attractive as citation analysis seems, it's not as accurate as we would like to believe. It represents only part of the scientific world and should not be taken as gospel.

ETA: I'm afraid there was a mistake in the original post - the Web of Science h-index for Mike Thelwall is 25 and not 15.

MacRoberts, M., & MacRoberts, B. (1996). Problems of citation analysis. Scientometrics, 36(3), 435-444. DOI: 10.1007/BF02129604

MacRoberts, M., & MacRoberts, B. (2010). Problems of citation analysis: A study of uncited and seldom-cited influences. Journal of the American Society for Information Science and Technology, 61(1), 1-12. DOI: 10.1002/asi.21228

Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). altmetrics: a manifesto. http://altmetrics.org/manifesto/