November 23, 2012

Interview with Dr. Victor Henning, Mendeley

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

This time (no, I haven't gone interview-only; one more after this one and we're back to regular posting) I'm interviewing Dr. Victor Henning. Dr. Henning has a PhD in Psychology from the Bauhaus-University of Weimar, Germany, and is co-founder and CEO of Mendeley, a program for managing and sharing research articles. Founded in late 2007, Mendeley has now reached 2 million users (and published an interesting global research report, with graphs showing activity by discipline, average number of articles collected per researcher, etc.).

You co-founded (with Jan Reichelt and Paul Föckler) Mendeley in 2007. What was your vision for Mendeley then, and has it changed since?

When we started Mendeley, we primarily wanted to solve our own problems: We were Ph.D. students and each had hundreds of research papers in PDF format that we needed to manage. We wondered why there wasn’t an easier way to manage those PDFs and keep them linked to bibliographic references, to enable you to create citations in the papers you’re writing. That was the first key idea behind Mendeley: Let’s develop a desktop app that can automatically turn my loose collection of PDFs into a structured research paper database that I can read, annotate, cite, and share with others.

The second key idea was: If we can get hundreds of thousands of researchers to use this desktop app to automatically extract information from research papers, why not crowdsource all of this data into an anonymized, open database? We realized that if the data was streaming in from researchers all over the globe in real time, we’d be able to analyze research trends across academic disciplines, connect researchers with similar interests, show readership statistics for individual research papers, and generate collaborative filtering recommendations similar to Amazon’s, i.e. “people who have read research paper A have also read research paper B”.

That has been our vision from the start, and it hasn’t changed significantly since then. The biggest change was probably that we were initially quite focused on creating visualizations of research connections – e.g., you could visualize the citation network contained in your research paper collection as a 3D map within the Mendeley Desktop app. Later on, we dropped this because it was a massive task in itself, and we realized that we could let our community build visualization tools themselves if they wanted to. So our vision shifted to enabling third-party developers to tap into our crowdsourced data – for free, under a Creative Commons “CC-by” license – to build their own apps. Since the release of our Open API (http://dev.mendeley.com), more than 260 third-party research apps have been built – visualization tools, collaboration apps, semantic annotation tools, raw genome data mashups, expert finder services, Kindle and Android apps, search interfaces, you name it. Our vision is now focused on turning Mendeley into a platform to let this ecosystem flourish.

The new Mendeley report shows that the biggest group of Mendeley users (31%) comes from the biology and medicine disciplines. Is it because you targeted those disciplines, because of their dominance in the global market of science in general, or are there other reasons?

We discussed whether to target specific disciplines in the beginning, but decided against it because we felt that Mendeley Desktop would be useful for anyone trying to manage academic knowledge – in the sciences as well as the humanities. I believe that the demographics of the Mendeley user base are possibly quite representative of the overall demographics of research: There are simply a lot more biologists and doctors than there are linguists and philosophers.

We did, however, target specific geographic regions first. Because all three of us Mendeley founders are German, our personal academic network was predominantly German, too. We wanted to avoid giving US and UK academics the impression that Mendeley was a German network, so we were careful about inviting our friends first. Instead, when we launched, I did a speaking tour along the US East Coast and presented Mendeley at places like Princeton, NYU, Yale, Brown, MIT, Harvard, and Dartmouth – and it really took off from there.

Mendeley now has an institutional edition. What are its main functions, and what is its value for academic institutes?

In a bit more detail: Institutions that subscribe to the Mendeley Institutional Edition (MIE) can upgrade all of their students and faculty to a Mendeley premium account, meaning they receive more cloud storage/sync space, personalized research recommendations, enhanced group functionality, and premium support. The data dashboard tracks which journals are being read the most by the institution’s faculty and students, allowing their library to optimize their subscriptions – meaning, they can cancel journals that are not getting used, and subscribe to journals which are popular, but not yet provided by the library. This enables librarians to squeeze more value out of ever-scarcer resources and provide a better service to their researchers.

Moreover, the MIE data dashboard tracks the faculty’s publications: In which journals are they publishing, and how much impact – measured in global readership – do these publications have? This helps institutions gather information for research excellence assessments and other reports, as well as highlighting to them who their current and future star researchers are.

Lastly, MIE contains a “social” tab which shows the librarian the public groups in which their faculty are most active. Subject librarians can then potentially join in the discussion, re-establish connections with their constituency, and help them discover relevant research content. They can also set the OpenURL library resolver for all researchers at their institution to ensure that, once they discover interesting citations on Mendeley, they receive full-text access to the content their library has paid for.

Your impact index (in the institutional edition) is a new way of measuring articles' influence. Have you heard of any cases where it has been used in institutional decision-making? What do you think the index reflects?

MIE is still young – we only started rolling it out across campuses this fall semester – so I’ve only heard anecdotal evidence of it influencing institutional decision-making. All of our customers are excited about the prospect of making better subscription decisions while cutting subscription costs, and one librarian at a North American institution has told me specifically that he plans to use the data as negotiation leverage when discussing journal subscription renewals. It’ll be interesting to see how this plays out over the next few months, as we’re gradually expanding coverage to more campuses.

As for the meaning of Mendeley’s real-time readership index: I believe it is simply a very good measure of research impact, and a leading indicator of citation metrics. Logically, you should have read a piece of research before you cite it – though there are studies which suggest that, far too often, research gets cited without having been read[1]. Yet, we’ve never seen it as our job at Mendeley to interpret the data: We want to be a neutral data provider, and leave it up to each field of research to determine which metrics are important to them and how to interpret them.

Interestingly, there have been four bibliometric studies in the past year alone which have found a significant correlation of Mendeley’s readership data with Thomson Reuters’ Impact Factor, the Scopus Citation Index, and Google Scholar citations [2], [3], [4], [5], as well as pointing out Mendeley’s remarkable breadth of coverage.

Have there been any attempts to manipulate Mendeley's numbers?

Not to our knowledge, and it would be very difficult to do so. Because of the scale of Mendeley – we now track the data of more than two million users – and its distributed nature, you would have to create thousands of fake “puppet” accounts from different computers, and have all of those puppet accounts read papers by specific authors. We do have measures in place to prevent bots from creating accounts, and we’d certainly notice odd usage patterns like the one I just described. So, it’s much harder to manipulate our data than it is to manipulate citations, which – as I’m sure you know – is a fairly common problem. Earlier this year, Thomson Reuters kicked a few journals out of their citation index because their editors had gamed their citation metrics.

I'd like to thank Dr. Henning for the interview. Photos courtesy of Mendeley.

Disclaimer: even though most of the articles cited were written by my thesis advisors, they were cited by Dr. Henning - we're simply working a lot on alt-metrics at the moment.

[1] Simkin, M.V. & Roychowdhury, V.P. (2003). Read before you cite. Complex Systems. 14:269-74.

[2] Li, X., Thelwall, M., & Giustini, D. (2011). Validating online reference managers for scholarly impact measurement. Scientometrics.

[3] Li, X., & Thelwall, M. (2012). F1000, Mendeley and Traditional Bibliometric Indicators. 17th International Conference on Science and Technology Indicators.

[4] Priem, J., Piwowar, H.A., & Hemminger, B.M. (2012). Altmetrics in the wild: Using social media to explore scholarly impact. http://arxiv.org/abs/1203.4745

[5] Bar-Ilan, J. (2012). JASIST@mendeley. ACM Web Science Conference 2012 Workshop. Evanston, IL.