Talking back

A science blog, sans blague

Statistical Flaw Punctuates Brain Research in Elite Journals

The views expressed are those of the author and are not necessarily those of Scientific American.


Neuroscientists need a statistics refresher.

That is the message of a new analysis in Nature Neuroscience showing that more than half of 314 neuroscience articles published in elite journals during an 18-month period failed to take adequate measures to ensure that statistically significant study results were not, in fact, erroneous. Consequently, at least some of the results from papers in journals like Nature, Science, Nature Neuroscience and Cell were likely to be false positives, even after surviving the arduous peer-review gauntlet.

The problem of false positives appears to be rooted in the growing sophistication of neuroscientists’ tools and observations. The increasing complexity challenges one of the fundamental assumptions of statistical testing: that each observation, perhaps an electrical signal from a particular neuron, has nothing to do with a subsequent observation, such as another signal from that same neuron.

In fact, though, it is common in neuroscience experiments—and in studies in other areas of biology—to produce readings that are not independent of one another. Signals from the same neuron are often more similar than signals from different neurons, and statisticians therefore call such data points clustered, or “nested.” The authors, from VU University Medical Center and other Dutch institutions, argue that a technique called multilevel analysis is needed to take this clustering of data points into account.

Of the 314 papers surveyed from 2012 and the first half of 2013, 53 percent contained clustered data, and none of them made an adequate correction. “We didn’t see any of the studies use the correct multi-level analysis,” says Sophie van der Sluis, the lead researcher. Seven percent of the studies did take steps to account for clustering, but these methods were much less sensitive than multilevel analysis in detecting actual biological effects. The researchers note that some of the studies surveyed probably report false-positive results, although they couldn’t extract enough information to quantify precisely how many. Failure to statistically correct for the clustering in the data can increase the probability of false-positive findings to as high as 80 percent—a risk of no more than 5 percent is normally deemed acceptable.
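A quick simulation makes the inflation concrete. The sketch below uses illustrative numbers of my own choosing (four animals per condition, twenty neurons per animal, equal within- and between-animal variability), not figures from the study, and it demonstrates the simplest cluster-aware correction—averaging within each cluster—rather than the full multilevel analysis the authors recommend. It generates data with no true group difference, so every “significant” result is a false positive:

```python
# Simulate nested neural data: neurons (observations) nested within
# animals (clusters). There is NO true difference between the two
# groups, so any significant test result is a false positive.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_sims = 2000
clusters_per_group = 4   # e.g., animals per condition
obs_per_cluster = 20     # e.g., neurons recorded per animal
cluster_sd = 1.0         # between-animal variability
noise_sd = 1.0           # within-animal (neuron-to-neuron) variability

naive_hits = 0
cluster_hits = 0
for _ in range(n_sims):
    groups = []
    for _g in range(2):
        # Each animal gets its own random offset, shared by its neurons.
        cluster_means = rng.normal(0.0, cluster_sd, clusters_per_group)
        data = cluster_means[:, None] + rng.normal(
            0.0, noise_sd, (clusters_per_group, obs_per_cluster))
        groups.append(data)
    a, b = groups
    # Naive analysis: pool all neurons, ignoring which animal they came from.
    _, p_naive = stats.ttest_ind(a.ravel(), b.ravel())
    # Cluster-aware analysis: reduce each animal to one summary value.
    _, p_cluster = stats.ttest_ind(a.mean(axis=1), b.mean(axis=1))
    naive_hits += p_naive < 0.05
    cluster_hits += p_cluster < 0.05

print(f"naive false-positive rate:        {naive_hits / n_sims:.2f}")
print(f"cluster-mean false-positive rate: {cluster_hits / n_sims:.2f}")
```

With these settings the naive test rejects far more often than the nominal 5 percent, because neurons from the same animal move together and so carry less independent information than the pooled test assumes, while the cluster-mean test stays near 5 percent.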

Jonathan D. Victor, a professor of neuroscience at Weill Cornell Medical College, had praise for the study, saying it “raises consciousness about the pitfalls specific to a nested design and then counsels you as to how to create a good nested design given limited resources.”

Emery N. Brown, a professor of computational neuroscience in the department of brain and cognitive sciences at the MIT-Harvard Division of Health Sciences and Technology, points to a dire need to bolster the level of statistical sophistication brought to bear in neuroscience studies. “There’s a fundamental flaw in the system and the fundamental flaw is basically that neuroscientists don’t know enough statistics to do the right things and there’s not enough statisticians working in neuroscience to help that.”

The issue of reproducibility of research results has preoccupied the editors of many top journals in recent years. The Nature journals have instituted a checklist to help authors report the methods used in their research, including whether the statistical objectives for a particular study were met. (Scientific American is part of the Nature Publishing Group.) The one clear message from studies like that of van der Sluis and others is that the statistician will take on an increasingly pivotal role as the field moves ahead in deciphering ever denser networks of neural signaling.

Image Source: Zache

About the Author: Gary Stix, a senior editor, commissions, writes, and edits features, news articles and Web blogs for SCIENTIFIC AMERICAN. His area of coverage is neuroscience. He also has frequently been the issue or section editor for special issues or reports on topics ranging from nanotechnology to obesity. He has worked for more than 20 years at SCIENTIFIC AMERICAN, following three years as a science journalist at IEEE Spectrum, the flagship publication of the Institute of Electrical and Electronics Engineers. He has an undergraduate degree in journalism from New York University. With his wife, Miriam Lacob, he wrote a general primer on technology called Who Gives a Gigabyte? Follow on Twitter @gstix1.


Comments
  1. Jerzy v. 3.0. 8:09 am 03/28/2014

    Not just brain research, most fields of medical research.

  3. MSkigen 7:11 am 03/29/2014

    I especially have noticed the studies where they take people who are already diagnosed with something, especially something that only really manifests outwardly through behavior, and mistake the identified population (people who have behaved in this way significantly enough for a diagnosis to be pursued) for everyone who has that situation. Therefore, weird stuff like “there are more autistic boys than girls,” when girls’ behaviors are less likely to be pursued, are more inwardly directed and therefore less noticeable, etc.

  4. fustbariclation 12:08 am 03/30/2014

    This isn’t quite as clear as it could be. It says ‘Failure to statistically correct for the clustering in the data can increase the probability of false-positive findings to as high as 80 percent—a risk of no more than 5 percent is normally deemed acceptable.’ Forgetting the split infinitive (‘failure statistically to correct’ would be better), it raises a few questions for me.

    Firstly, the 5% deeming seems quite arbitrary. In Bayesian statistics, you decide on the ‘acceptable’ level based on likely outcomes, not on an arbitrary figure.

    The comment about the false positive isn’t clear. There are four outcomes possible from an experiment – true positive, false positive, true negative, false negative. If these were equiprobable then the chance of a false positive would be .25. They obviously, though, aren’t equiprobable. You need to consider experimental design, limits of measurement and the various sources of error. This will give you a picture of the relative probabilities. If, as they’re saying, the probability of a false positive is .8, then that leaves only .2 between the other probabilities. So, if they’re equiprobable, it’d mean that true positives, true negatives and false negatives would each be .07 probable.

    Now, if false negatives are .07 probable, which isn’t all that far from .05, then there’s a simple solution. Simply frame the null hypothesis the other way around and, hey-presto, suddenly you’ll have really accurate results. What were your unacceptably high false positives are now your quite reasonable small level of false negatives.

    This doesn’t make a lot of sense, so there must be more to what this .8 probability means than the article is saying.

