June 28, 2012
I was a bit amused when I read a press release headline this week: “Scientists struggle with mathematical details.” I expected the story to be about occasions when scientists had misunderstood, misinterpreted, or misapplied mathematical formulas in their published research, but instead the study in the June 25 issue of Proceedings of the National Academy of Sciences reports that research papers with lots of mathematical details are cited by other scholars less often than papers with fewer. Clearly, this disparity could slow scientific progress if important but technical papers are ignored.
Being a mathematician, I wanted to know how someone would go about quantifying such citational differences. I learned that it’s not an exact science, but the research provides some interesting data and suggestions about how to improve scientific communication.
The study featured an analysis of 649 papers from 1998 about ecology and evolution, fields that at times employ complicated equations. The researchers used the number of equations per page as a proxy for technical level. Equations were counted only if they were printed on a separate line from text, not if they were in line; if two equations were printed on the same line, they were counted separately.
The study authors, biologists Tim Fawcett and Andrew D. Higginson of the University of Bristol, assessed the impact of a paper by drawing on citation data from Thomson Reuters, excluding self-citation, which was defined as shared surnames between any authors of the two papers. Fawcett and Higginson note that this approach may have led to a few spurious self-citations (two unrelated Smiths, perhaps) but think that these errors probably exerted only a very small effect. (They don’t mention spurious non-self-citations that may have arisen from name changes because of marriage, spy activity or the witness protection program, but I’m sure that problem is even smaller.)
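To make the surname rule concrete, here is a minimal sketch in Python of how such a self-citation check could work. It is not the authors' code. The function names are hypothetical, and I am assuming each author is given as a plain string whose last word is the surname, which will not hold for every naming convention.

```python
def surnames(authors):
    """Collect lowercased surnames, assuming the last word of each name is the surname."""
    return {name.split()[-1].lower() for name in authors if name.split()}


def is_self_citation(citing_authors, cited_authors):
    """Flag a citation as a self-citation if the two author lists share any surname."""
    return bool(surnames(citing_authors) & surnames(cited_authors))


cited = ["Tim Fawcett", "Andrew D. Higginson"]

# A genuine self-citation: one author appears on both papers.
print(is_self_citation(["Andrew D. Higginson", "Jane Doe"], cited))

# A spurious one: two unrelated Smiths are still flagged.
print(is_self_citation(["John Smith"], ["Mary Smith"]))

# No shared surname, so this citation is kept.
print(is_self_citation(["Jane Doe"], cited))
```

The second call shows the false positive the authors acknowledge: the rule cannot tell two unrelated Smiths apart. An author who changed surnames between papers would slip through in the opposite direction.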
Fawcett and Higginson found on average a 22 percent decrease in citations for every additional equation per page. They attribute this drop mainly to fewer citations in non-theoretical papers. Their proxy for whether a citing paper is theoretical is whether the word “model” appears in its title or abstract, after excluding common phrases such as “model species” and “experimental model.” This choice seems like a pretty rough estimate, and they note that a random sample of articles indicated that 84.5 percent were correctly classified. That hit rate sounds like a low level of accuracy, but categorizing every one of the 28,068 citing papers by hand was impossible.
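The keyword proxy is simple enough to sketch in a few lines of Python. This is only an illustration of the rule as described above, not the authors' implementation. The function name and the exclusion list are my own, the list holds just the two phrases mentioned here, and I am assuming that only the whole word “model” counts (so “models” and “modelling” would not).

```python
import re

# Only the two phrases named in the text; the study's full list is not given here.
EXCLUDED_PHRASES = ("model species", "experimental model")


def is_theoretical(title, abstract):
    """Classify a paper as theoretical if 'model' appears outside the excluded phrases."""
    text = f"{title} {abstract}".lower()
    # Blank out the excluded phrases so their 'model' does not count.
    for phrase in EXCLUDED_PHRASES:
        text = text.replace(phrase, " ")
    # Assumption: match 'model' as a whole word only.
    return re.search(r"\bmodel\b", text) is not None


print(is_theoretical("A model of sexual selection", ""))
print(is_theoretical("Drosophila as a model species", "We measured wing length."))
```

A rule this blunt will misfile some papers in both directions, which fits with the reported accuracy falling well short of 100 percent.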
In any case, Fawcett and Higginson found little correlation between equation density and citations by theoretical papers, and a large reduction in citations by non-theoretical papers for papers with more than 0.5 equations per page. They found that equations appearing in appendices had no effect on citation rate.
Some of the proxies seem a bit problematic to me, but overall this correlation may be real. Of course, the elephant in the room is how to measure, in some objective way, the “true” impact each paper should have and compare it with the citation data. Unfortunately, that would require some sort of voodoo.
Taking this study at face value, I think it might provide a lot of us with a bit of relief. Scientists are just people like us who prefer reading words to wading through equations. Frankly, I’m curious about a similar study for mathematical papers. I don’t think I’m the only mathematician whose eyes sometimes glaze over when presented with a solid page of equations, nary an English sentence in sight.
The question, though, is what to do with the technical details. Even if people don’t like reading them, that is where the science happens. And writing a paper based on your analysis without including the details of that analysis seems like a recipe for disaster.
The researchers suggest two main approaches: better mathematical education for scientists and better explanations of mathematics in papers with lots of equations. They stress that the latter is more immediate and easier to implement, noting that better education would take quite a while to trickle down and be effective. But I wonder whether better education would actually help at all. I am never one to take a stand against further math education for scientists, but the problem doesn’t seem to be poor understanding; people just don’t like reading equations.
Based on my own experiences as a mathematician, I can understand that. Even if I can wade through an equation in a paper, a sentence explaining it is valuable. The better-explanation approach would probably lead to more readable papers, but adding text for readability makes papers longer, and space is at a premium in print journals. The authors also note that technical details can be relegated to appendices, either in print or online, with the caveat that theoreticians should make all assumptions explicit in the main text. Otherwise, those who don’t travel to the appendix to see the nitty-gritty details might misinterpret the results. Many journals already leave some technical details to appendices, so the authors are just advocating this approach on a larger scale.
I have another suggestion: Eye tracking and mild electroshock therapy. If scientists skim over pages of equations or stare into space for too long while reading a technical paper, they get a gentle jolt of electricity to bring them back to the important equations at hand. Just kidding. Papers shouldn’t be literally electrifying, but more readable papers might help ensure that the best science makes an impact.