Providing context for the metrics used to evaluate the scientific literature

This article was published in Scientific American's former blog network and reflects the views of the author, not necessarily those of Scientific American.


As the sole science librarian at a small liberal arts college, I work with faculty and students in a variety of disciplines. This means that I need to understand the literature of those disciplines, and understanding the literature means knowing at least a little bit about the metrics that are used to measure it: impact factors, h-indexes and altmetrics can all be interesting and useful, but establishing context can be difficult.

For example, is an h-index of 9 good, bad or indifferent? (A researcher's h-index is the largest number h such that h of their papers have each been cited at least h times; a quick computational sketch follows the list below.)

  • It can depend on discipline. Citation patterns vary: mathematicians cite fewer papers than earth scientists, who cite fewer papers than those in biomedicine (see Podlubny, 2005). Co-authorship traditions vary: publications in high-energy physics or genetics tend to have more authors than those in paleontology. All this makes it difficult to compare h-indexes across disciplines.

  • It can depend on expectations. At a primarily undergraduate institution like mine, expectations for research output are lower than at major research universities (we are also less likely to rely on metrics like the h-index).

  • It can depend on the stage of the researcher's career. An assistant professor just two years into their first permanent position can't be expected to have an h-index as high as that of a researcher who was just promoted to full professor.
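
As a concrete illustration of that definition, here is a minimal sketch in Python (the citation counts are invented for the example and don't come from any real researcher):

    def h_index(citation_counts):
        """Return the largest h such that h papers have at least h citations each."""
        counts = sorted(citation_counts, reverse=True)
        h = 0
        for rank, cites in enumerate(counts, start=1):
            if cites >= rank:
                h = rank
            else:
                break
        return h

    # Hypothetical citation counts for one researcher's papers.
    papers = [45, 22, 17, 12, 11, 9, 9, 8, 3, 2, 1, 0]
    print(h_index(papers))  # 8: eight papers have at least 8 citations each

Note that the most highly cited paper here (45 citations) counts no more toward the final number than the paper with 8 citations, which is part of why the number needs so much context.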


Likewise, it is hard to compare impact factors across disciplines (one of many problems with the IF). The flagship journal of the American Chemical Society (JACS) has an impact factor of 9.707. The flagship journal of the Geological Society of America (GSA Bulletin) has an impact factor of 3.787. We can't compare the two like this, and we certainly can't use these numbers to compare researchers from the two disciplines.

In a recent article in EMBO Reports, Bornmann and Marx (2013) argue for the greater use of percentiles in evaluating researchers, institutions and publications. You remember percentiles, right? When you took the SAT or the GRE, your results came back with a score and a percentile: if you were at the 85th percentile, you scored higher than 85% of your peers.

Percentiles can provide important context in one easy-to-read number. You know that the top is 100 and the bottom is 0. The median is 50, and this helps us make sense of things even if the data set is skewed. However, it becomes incredibly important to select the right comparison group. Typically, subject area and publication year define reasonable groups.
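
As a rough sketch of how a citation percentile could be computed against such a group (Python again, with an invented set of citation counts for articles from the same subject and year; Bornmann and Marx discuss more careful ways of handling ties and reference sets, which this ignores):

    def citation_percentile(article_cites, comparison_cites):
        """Percentage of the comparison group that this article outperforms."""
        below = sum(1 for c in comparison_cites if c < article_cites)
        return 100.0 * below / len(comparison_cites)

    # Hypothetical citation counts for articles in the same subject area and year.
    group = [0, 0, 1, 2, 2, 3, 4, 5, 7, 9, 12, 15, 21, 30, 48, 55, 80, 102, 150, 400]
    print(citation_percentile(12, group))  # 50.0: more citations than half the group

The same raw count of 12 citations could land near the top or near the bottom depending on which group it is compared against, which is exactly the point.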

The folks who look at research metrics beyond citations face the same challenge of providing context. Altmetrics examine the different ways that people interact with journal articles (in addition to citations). Are they talking about the article on Twitter? Saving the article to a bookmarking site like CiteULike or Mendeley? Is the public citing the article on Wikipedia?

Once again, context is vital. An article was tweeted about twice. Is this good, bad or indifferent? Sixty folks on Mendeley have added it to their libraries. But what does that mean? Impact Story, the premier tool for easily displaying altmetrics, can provide a bit of context for these numbers by calculating percentiles based on a comparison group of randomly selected items from the same publication year. Right now, it doesn't appear that Impact Story is taking advantage of subject categories (which is more difficult). As a result, articles in some disciplines will automatically have lower percentiles simply as an artifact of lower average levels of this activity in those disciplines.
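
The random same-year comparison group the author describes might be sketched roughly like this (the sampling approach and the tweet counts are invented for illustration; this is not Impact Story's actual code):

    import random

    def altmetric_percentile(article_count, year_counts, sample_size=1000, seed=1):
        """Percentile rank of one article's altmetric count (tweets, saves, etc.)
        against a random sample of items from the same publication year."""
        rng = random.Random(seed)
        sample = rng.sample(year_counts, min(sample_size, len(year_counts)))
        below = sum(1 for c in sample if c < article_count)
        return 100.0 * below / len(sample)

    # Hypothetical tweet counts for articles published in the same year.
    counts = [0] * 700 + [1] * 200 + [2] * 60 + [5] * 30 + [20] * 10
    print(altmetric_percentile(2, counts))  # 90.0: two tweets beats 90% of this made-up year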

Using any metric to evaluate scientific research is tricky: you are trying to boil down the intellectually complicated act of advancing human knowledge into a single number. But these metrics are increasingly used by tenure and promotion committees, institutional advisory boards, grant review committees and others. If folks choose to use metrics (like percentiles) that can provide reasonable and reliable context, we can avoid at least a couple of the standard pitfalls.

See also:

  • Bornmann, L., & Marx, W. (2013). How good is research really? Measuring the citation impact of publications with percentiles increases correct assessments and fair comparisons. EMBO Reports. doi:10.1038/embor.2013.9

  • Podlubny, I. (2005). Comparison of scientific impact expressed by the number of citations in different fields of science. Scientometrics, 64(1), 95–99. doi:10.1007/s11192-005-0240-0

  • Elsewhere on this blog, my excellent co-blogger Hadas Shema has talked about some of the key concepts that underlie citation analysis and bibliometrics. See her posts about the impact factor, problems with citation analysis and negative citations, for example.

 

About Bonnie Swoger

Bonnie J. M. Swoger is a Science and Technology Librarian at a small public undergraduate institution in upstate New York, SUNY Geneseo. She teaches students about the science literature, helps faculty and students with library research questions and leads library assessment efforts. She has a BS in Geology from St. Lawrence University, an MS in Geology from Kent State University and an MLS from the University at Buffalo. She would love to have some free time in which to indulge in hobbies. She blogs at the Undergraduate Science Librarian and can be found on Twitter @bonnieswoger.
