Information Culture

Thoughts and analysis related to science information, data, publication and culture.

What’s wrong with citation analysis?

The views expressed are those of the author and are not necessarily those of Scientific American.

Other than your papers not being cited enough, what’s wrong with measuring scientific influence based on citation count? Citation analysis-based decisions concerning grants, promotions, etc. have become popular because, among other things, they’re considered “unbiased.” After all, such analysis gives numbers even non-professionals can understand, helping them make the best and most accurate decisions.

The above is a polite fiction. Why? First of all, citation analysis can only work with actual, written citations, but being influenced by something doesn’t mean you’re automatically going to refer to it. One of the basic assumptions behind citation analysis is that all, or at least most, influences are cited in articles. It doesn’t work that way. MacRoberts and MacRoberts (2010) define influence as “When it is evident in the text that an author makes use of another’s work either directly or through secondary sources he or she has been influenced by that work.” According to a series of studies they conducted, only about 30% of influences are cited.

Secondary sources – Goodbye, citations. Once your article has been covered in a review or two, your findings will often be credited to the review article rather than to your own. I’m citing only two of MacRoberts and MacRoberts’ articles, one of them a review, because A. those are the ones I’ve read and B. I’m too lazy to read and cite all the research they refer to. That’s okay for informal scientific literature. However, if this were a peer-reviewed article, all the authors and articles not individually cited would have lost a citation. There’s a reason review articles are cited so often.

No informal citations. Those important conversations you had with your dissertation advisor or at a conference over lunch are gone forever, even though you might have gotten some of your best ideas from them. The paper you were impressed with but couldn’t find a place to cite suffers the same fate. To quote MacRoberts and MacRoberts (1996) again:

“If one wants to know what influence has gone into a particular bit of research, there is only one way to proceed: head for the lab bench, stick close to the scientist as he works and interacts with colleagues, examine his lab notebooks, pay close attention to what he reads, and consider carefully his cultural milieu.”

They’re right, but their suggestion is hardly practical. That is why, in the last few years, bibliometricians have been trying to come up with metrics of academic social media cites. As the Altmetrics manifesto (2010) says, “…that dog-eared (but uncited) article that used to live on a shelf now lives in Mendeley, CiteULike, or Zotero–where we can see and count it.” Unfortunately, Altmetrics indices are still far from accurate (not that citation indices are, but we’re stuck with them). If we’re to add new metrics to the mix, they had better be good.

Limited databases. I’ve mentioned it before on this blog, but it’s worth repeating: citation databases are painfully limited to a fraction of scientific publications, most of those covered being peer-reviewed journals. I have six Google Scholar (GS) citations for my blogs characterization article, but only two in Scopus. That’s one of the reasons your GS indices are usually higher than your Web of Science (WoS) and Scopus ones. My dissertation advisor, Prof. Mike Thelwall, has an h-index of 47 in GS, 31 in Scopus, and 25 in WoS. All are correct, all are wrong. It depends on the coverage and the speed of update.
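To see why coverage alone can move the number, here is a minimal sketch of the usual h-index definition (the largest h such that h papers have at least h citations each). The citation counts are made up for illustration and are not taken from any real database; the function name `h_index` is my own.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Hypothetical counts for the same five papers, as indexed by a
# broad-coverage database and by a narrower one (illustrative numbers only).
broad_coverage = [12, 9, 6, 4, 1]
narrow_coverage = [7, 5, 2, 1, 0]

print("broad coverage h-index:", h_index(broad_coverage))
print("narrow coverage h-index:", h_index(narrow_coverage))
```

The papers and the author are the same in both lists; only the set of citing documents each database can see differs, and that is enough to change the index.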

The Matthew Effect – or “the rich get richer.” People tend to cite already well-cited material by well-known researchers, either because that’s what they’ve read, because they’re appealing to the authority of the better known, or both.

Multiple motives – as this helpful comic shows, there are multiple motives for citations, many of which have less in common with “giving credit where credit is due” than we would like to think.

Real Impact Factor by Jorge Cham, PhD Comics

Even if one is trying to be as honest and accurate as possible in her citations, she can only cite what she’s familiar with, and the number of articles one can read is limited (again, a factor in the popularity of review articles). She’s going to cite her professors, her co-authors, the people she heard at conferences and the big names in her field, but she is bound to miss some relevant material no matter what.

As attractive as citation analysis seems, it’s not as accurate as we would like to believe. It represents only a part of the scientific world, and should not be taken as gospel.

ETA: I’m afraid there was a mistake in the original post – the Web of Science h-index for Mike Thelwall is 25 and not 15.

MacRoberts, M., & MacRoberts, B. (1996). Problems of citation analysis. Scientometrics, 36(3), 435-444. DOI: 10.1007/BF02129604

MacRoberts, M., & MacRoberts, B. (2010). Problems of citation analysis: A study of uncited and seldom-cited influences. Journal of the American Society for Information Science and Technology, 61(1), 1-12. DOI: 10.1002/asi.21228

Priem, J., Taraborelli, D., Groth, P., & Neylon, C. (2010). altmetrics: a manifesto.


About the Author: Hadas Shema is an information specialist at the Israeli Inter-University Center for E-Learning (Hebrew acronym: MEITAL). She has a B.Sc. in the Life Sciences and an MA and a PhD in Library & Information Science from Bar-Ilan University, Israel. Hadas tweets at @Hadas_Shema.

4 Comments

  1. gmperkins 4:38 pm 01/2/2013

    Good summary though you are probably preaching to the choir. Citation analysis has always seemed to be a tool for administrators who don’t want to trust the scientists/engineers/researchers working for them, that such-and-such’s work is good, etc.
    The bean counters need things to count.

  2. kapsar 8:53 am 01/8/2013

    I think you’re missing a big point of bibliometrics (or scientometrics): the network analysis that typically accompanies the citation analysis. Citation rates alone are interesting, but they aren’t that interesting. However, understanding the underlying networks associated with the citations or collaborations makes these analyses a lot more interesting.

    For information about that I suggest reading: Spatial scientometrics: Towards a cumulative research program, by Koen Frenken, Sjoerd Hardeman, and Jarno Hoekman, Journal of Informetrics, 2009.

  3. jmartiniii1968 12:25 am 06/20/2013

    I think it is important to clearly distinguish between bibliometrics, scientometrics, citation analysis, citation network analysis, co-citation analysis, and co-citation network analysis. It seems too often all forms of bibliometric/scientometric methods get dropped into the “citation analysis as academic rating system” drawer, which has little or nothing to do with the use of citation and co-citation networks as tools to chart and reveal the structure of scientific/academic disciplines. And when used as a means to uncover latent knowledge hiding within these vast networks, it is even quite capable of generating new discoveries.

    As someone who works with Scopus and WoS doing co-citation analyses, I would also point out that an oft overlooked weakness of these databases is that the data is rarely very “clean”. I have spent countless hours trying to correct the myriad of name and title variations, as well as outright misspellings. For a single author, for example, I have found at least 8 different ways their name appears in WoS. And when dealing with thousands of source articles and hundreds of thousands of citations, it is practically impossible to completely clean the data.

    And perhaps one day someone will develop a way to easily combine Scopus and WoS export formats so that both database outputs can be analyzed together... but I hope for too much.

  4. Hadas Shema in reply to Hadas Shema 4:15 pm 06/23/2013

    I think easily combined Scopus and WoS export formats might be against Elsevier and Thomson-Reuters’ nature, but yes, it would be excellent. I completely understand your difficulty with those databases, and I am waiting for ORCID to become popular, so I can tell whether the “John Smith” here and the “John C. Smith” there are actually the same person.
