When in trouble or in doubt, invent new words. We have bibliometrics and scientometrics from the Age of Print. Now they are joined by informetrics, cybermetrics, webometrics and altmetrics, which might not be an accurate term, but it’s sticky (more than “social media-based complementary metrics,” that’s for sure). Between the PLoS journals’ Article-Level Metrics (ALM), Nature adding the colorful altmetric.com donut to every publication and the Elsevier giant piloting the donuts in some of its journals, I think we can say altmetrics are taking over the world. Psychologically speaking, there’s something exciting about alternative metrics. You can watch the metrics go up daily, an almost instant gratification, while citation-based gratification can take years. On the other hand, that’s a lot of pressure. Is my article being covered in blogs, tweeted, bookmarked? And what does it even mean?
The answer to the last question is a) “we’re not sure yet” and b) “depends on the metrics source.” It seems the use of alternative metrics started before we had much of a chance to look into them. It’s a lot like the way citations were used in the seventies, before research about citation behavior took off in earnest. Now, Mike Taylor has written an excellent article about the uses of alternative metrics and their motivations. Go read! (After you finish this post, of course.)
Taylor lists a number of uses for alternative metrics:
- Prediction of ultimate citation – here there’s an interesting issue: do we correlate, say, tweets and citations for articles that have been tweeted at least once, or do we compare the citations of articles that were tweeted with those of similar articles that were not? Most articles I’ve seen so far take the first approach and correlate number of citations with number of appearances in social media. However, in our research on articles that were covered in blog posts, we compared them to articles from the same journal and the same year that were not covered in our sample at all. We ignored the number of appearances in favor of a binary, yes/no approach. Both approaches work to some extent, but we cannot tell whether they work for disciplines other than those studied.
- Measuring/recognizing component re-use/preparatory work/reproducibility – Data, code, etc. – anything goes, as long as you can reuse it in a reliable way. However, Taylor mentions that material can be used more because of its availability than because of scholarly need (I’m reminded of Open Access articles, which tend to receive more citations than articles behind a paywall).
- Hidden impact (impact without citation) – That’s a good one. Taylor mentions the most bookmarked article in Mendeley (over 43,000 readers), “How to choose a good scientific problem,” which has only been cited 4 times in Scopus. The same can be said (although not to that extent) about articles like “Ten simple rules for getting grants,” which has over 300 readers in Mendeley, but only 3 citations in Scopus. These are useful articles, but not the kind that get cited. In this case, alternative metrics are truly complementary metrics, helping us record the full impact of an article beyond formal citations.
- Real-time filtering/real-time evaluation – Receiving data about articles’ impact in real time. Taylor warns that “it is unknown if there is sufficient data to make this work at a sufficiently fine granularity, whether this is of use to scholars and whether they would trust such a system.” In short, more-research-is-needed. I think this will probably be subject to the Matthew Effect (the rich get richer), but that can be said about many metrics – after all, the term was coined to describe the phenomenon in traditional scholarly discourse.
- Measuring social reach/estimating social impact – Alternative metrics can help us measure the dissemination of scholarly material to the public.
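To make the two study designs from the citation-prediction item concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ARTICLES` numbers are made-up toy data, and `pearson`, `count_based` and `binary_comparison` are hypothetical helper names, not anything from Taylor’s article or from our study. It also skips the matching by journal and year that a real comparison would need.

```python
from math import sqrt
from statistics import mean, median

# Hypothetical toy records of (tweets, citations). Not real data.
ARTICLES = [
    (0, 2), (0, 5), (0, 1), (0, 3),
    (1, 4), (3, 6), (5, 9), (12, 15),
]


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def count_based(articles):
    """Approach 1: correlate mention counts with citations,
    using only articles that were mentioned at least once."""
    mentioned = [(t, c) for t, c in articles if t > 0]
    tweets, cites = zip(*mentioned)
    return pearson(tweets, cites)


def binary_comparison(articles):
    """Approach 2: yes/no coverage. Compare median citations of
    covered articles with those of uncovered ones, ignoring counts."""
    covered = [c for t, c in articles if t > 0]
    uncovered = [c for t, c in articles if t == 0]
    return median(covered), median(uncovered)


if __name__ == "__main__":
    print("count-based r:", round(count_based(ARTICLES), 3))
    print("median citations (covered, uncovered):", binary_comparison(ARTICLES))
```

Note that the first function never sees the articles with zero mentions, while the second throws away the counts entirely, so the two designs answer slightly different questions.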
Not all alternative metrics sources were born equal. I’m very biased here, but I don’t think one can learn a lot from the number of tweets. An article can be tweeted because it has a catchy title or won an Ig-Nobel prize, but that’s not going to tell us much about its scholarly impact. Now that I think of it, winning an Ig-Nobel can also earn you blog coverage, so maybe it’s not a good example, but in general, I think that genuine (not promotional/spam) blog coverage says more about an article’s future impact than tweets. Twitter is about dissemination of news; it does not require deep thought and consideration.
Though Mendeley’s bookmarking doesn’t require much effort either, it has moderate correlations with citation counts. There’s an r=0.55 correlation, for example, between Mendeley bookmarks and Web of Science citations for 2007 Nature and Science articles. Other academic bookmarking services have lower correlations, perhaps due to lower popularity (Connotea, another bookmarking service, closed recently).
F1000 recommendations, on the other hand, are (supposed to be) well thought out and written by professionals in the relevant field. However, the correlation between an article’s recommendation rating and its number of citations is rather weak. Perhaps that is to be expected, because only 2% of biomedical articles are even recommended in F1000. As the authors put it, “In a sense, F1000 recommendations cannot be expected to correlate very strongly with citations, simply because about 98% of all biomedical publications do not have any recommendation at all.” It could be that some F1000 articles are another example of useful-but-less-cited articles, since there are F1000 reviewers who are clinicians.
I said it in my first altmetric post, but it bears repeating: the biggest problem with altmetrics is that data isn’t always sustainable. A journal can last centuries and be well-documented; blog posts can one day disappear. Another problem is the relevance of data. We do not know if the data will have any meaning five, ten, twenty years from now. Alternative metrics are transient by nature and might obsolesce while journal citations, already well established, carry on.
Li, X., Thelwall, M., & Giustini, D. (2012). Validating online reference managers for scholarly impact measurement. Scientometrics, 91, 461-471. DOI: 10.1007/s11192-011-0580-x
Waltman, L., & Costas, R. (2013). F1000 recommendations as a new data source for research evaluation: A comparison with citations. JASIST. arXiv: 1303.3875v1
Taylor, M. (2013). Towards a common model of citation: some thoughts on merging altmetrics and bibliometrics. Research Trends