August 6, 2010

Tim White's response to "scientific commons" blog post

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American

Editor's Note: This post is a response to a Scientific American Observations blog post, "When should a scientist's data be liberated for all to see?"

I was very surprised to find that we have (yet again) been singled out, this time in your online Observations section: "When should a scientist's data be liberated for all to see?"

Your coverage concerned a Science opinion piece entitled "Prepublication Data Release, Latency, and Genome Commons." That article was specifically about genomic data. It even noted that:

"Scientific information commons are under discussion in areas ranging from microbiology (30) to global climate change (31) to molecular chemistry (32). Policy designers in these fields cannot adopt wholesale the approaches taken in the genome commons."

The author did not specifically mention paleoanthropology, yet you chose to single it out in your report. Why? Should this be read as yet another "cheap shot" at paleoanthropology? Given the history of your magazine's ill-conceived and poorly timed September 2009 editorial (to which your new article links), and my subsequently published reply (to which it doesn't link), it was surprising to once again read your portrayal of paleoanthropology as THE bad actor when it comes to data sharing. I would have thought it obvious that genomic and paleoanthropological data constitute proverbial "apples and oranges."

Please let me try to explain why I now throw another yellow flag in the direction of your editorial content.

In 1995, why didn't anybody publish the base-pair sequence for Chromosome 15? After all, everybody had a copy of the chromosome in every one of their cells, so the data were all "available." But we first needed the human genome project to map the chromosome, and the sequence data were (sort of) published after a certain amount of verification.

In 1995, why didn't anybody publish the skull of Ardipithecus ramidus? After all, we had excavated the fragments of the skull in matrix the year before. But it took more than a decade to extract the fossils and reassemble them accurately, and the data on them were then published after a certain amount of verification (granted, we can't "post" the fossils online as strings of digital data, but we did publish over 600 manuscript pages on these fossils in a single issue of Science).

So far, the picture looks pretty similar: To extract data takes time; to publish verified data takes more time; peer-reviewed publication takes additional time, but is the time-tested method for advancing science.

Genomic data are simply the sequence of four different bases. In your words, "easily stored data."

Paleoanthropological data are substantially more complex, in content and in context. The "experiment" of human evolution was a one-time affair whose fossil remnants constitute unique, fragile, and national/international patrimony.

If you re-read your article with these facts in mind, perhaps you will understand my concerns. Through your superficial comparison, you actually encourage premature publication of paleoanthropological results. This penalizes the researchers trying to do the work correctly, generating reliable results on irreplaceable antiquities.

As the "data generators," we field and laboratory workers in human evolutionary studies are obliged to place valid, reliable data and interpretations into the "commons." The author you featured clearly understands that the equivalent of Hardin's ecological "tragedy of the commons" has a high risk of occurring in fields that involve long-term studies. If user-demands for premature data release continue to escalate, research fields such as ecology, demography, anthropology, and hominid paleobiology will suffer a parallel tragedy because the "data generators" will simply stop functioning.

So your apparent presumption that process and data in human paleobiology are indistinguishable from those in genomics was neither accurate nor helpful. And by continuing to be both unfair and inaccurate to our discipline, you continue to misinform Scientific American's readers. Sharing digital data is easy…when the data are…only digital.

ABOUT THE AUTHOR

At the University of California at Berkeley, Tim White directs the Human Evolution Research Center and is professor of Integrative Biology. In Ethiopia, he is co-PI of the Middle Awash project. He has conducted research in Africa, Europe, Asia, and North America. He is the author of Human Osteology, The Human Bone Manual, and Prehistoric Cannibalism at Mancos. He was elected to the U.S. National Academy of Sciences and the Royal Academy of South Africa, and was named to the TIME 100 (2010). His research interest is human evolution, in all its dimensions.
