Cross-Check

Critical views of science in the news

Are “Big Data” Sucking Scientific Talent into Big Business?***

The views expressed are those of the author and are not necessarily those of Scientific American.


Over the last few years, we’ve heard a lot about how “Big Data,” which as far as I can tell is just data mining in a glossy new wrapper, are going to revolutionize science and help us create a better world.* These claims strike me as all too familiar. They remind me of the hype generated in the 1980s by chaos and in the 1990s by complexity (which was just chaos in a glossy new wrapper). Chaos and complexity enthusiasts promised (and are still promising) that ever-more-powerful computers plus jazzy new software and math were going to crack riddles that resisted more traditional scientific methods.

Is the enthusiasm of Big Business for Big Data causing a "brain drain" from science?

Advances in data-collection, computation and search programs have led to impressive gains in certain realms, notably speech recognition, language-translation and other traditional problems of artificial intelligence. So some of the enthusiasm for Big Data may turn out to be warranted. But in keeping with my crabby, glass-half-empty persona, in this post I’ll suggest that Big Data might be harming science, by luring smart young people away from the pursuit of scientific truth and toward the pursuit of profits.

My attention was drawn to this issue by a postdoc in neuroscience, whose research involves lots of data crunching. He prefers to remain anonymous, so I’ll call him Fred. After reading my recent remarks on the shakiness of the scientific literature, he wrote me to suggest that I look into a trend that could be exacerbating science’s woes.

“I think the big science journalism story of 2014 will be the brain drain from science to industry ‘data science,’” Fred writes. “Up until a few years ago, at least in my field, the best grad students got jobs as professors, and the less successful grad students took jobs in industry. It is now the reverse. It’s a real trend, and it’s a big deal. One reason is that science tends not to reward the graduate students who are best at developing good software, which is exactly what science needs right now…

“Another reason, especially important for me, is the quality of research in academia and in industry. In academia, the journals tend to want the most interesting results and are not so concerned about whether the results are true. In industry data science, [your] boss just wants the truth. That’s a much more inspiring environment to work in. I like writing code and analyzing data. In industry, I can do that for most of the day. In academia, it seems like faculty have to spend most of their time writing grants and responding to emails.”

Fred sent me a link to a blog post, “The Big Data Brain Drain: Why Science is in Trouble,” that expands on his concerns. The blogger, Jake VanderPlas, a postdoc in astrophysics at the University of Washington, claims that Big Data is, or should be, the future of science. He writes that “in a wide array of academic fields, the ability to effectively process data is superseding other more classical modes of research… From particle physics to genomics to biochemistry to neuroscience to oceanography to atmospheric physics and everywhere in-between, research is increasingly data-driven, and the pace of data collection shows no sign of abating.”

VanderPlas suggests that the growing unreliability of peer-reviewed scientific results, to which I alluded in my last post, may stem in part from the dependence of many research results on poorly written and poorly documented software. The “crisis of irreproducibility” could be ameliorated, VanderPlas contends, by researchers who are adept at data-analysis and can share their methods with others.

The problem, VanderPlas says, is that academia is way behind Big Business in recognizing the value of data-analysis talent. “The skills required to be a successful scientific researcher are increasingly indistinguishable from the skills required to be successful in industry. While academia, with typical inertia, gradually shifts to accommodate this, the rest of the world has already begun to embrace and reward these skills to a much greater degree. The unfortunate result is that some of the most promising upcoming researchers are finding no place for themselves in the academic community, while the for-profit world of industry stands by with deep pockets and open arms.”

VanderPlas and Fred, who are apparently software whizzes themselves, perhaps overstate the scientific potential of data crunching just a tad. And Fred’s aforementioned claim that industry “just wants the truth” strikes me as almost comically naïve. [**See Fred's clarification below.] For businesses, peddling products trumps truth–which makes the brain drain described by Fred and VanderPlas even more disturbing.

Fred is a case in point. Increasingly despondent about his prospects in brain research, he signed up for training from Insight Data Science, which trains science Ph.D.s in data-manipulation skills that are desirable to industry (and claims to have a 100 percent job placement record). The investment paid off for Fred, who just got a job at Facebook.

*Should “Big Data” be treated as plural or singular? I polled my students, and they said plural, so I went with plural.

**Re his comment about industry bosses wanting “truth,” “Fred” just emailed me this clarification: “I think there is a distinction, which I perhaps should have made clearer, between ‘marketing’ and ‘analytics.’ When it comes to marketing a product to consumers, I agree it’s pretty obvious that business incentives are not aligned with truth telling. No one disputes that. But when it comes to the business’s internal ‘analytics’ team, the incentives are very aligned with truth telling. Analytics teams do stuff like: determining how users are interacting with the product, measuring trends in user engagement or sales, analyzing failure points in the product. This is the type of work that most data scientists do.”

***A couple of afterthoughts on this topic: First, Lee Vinsel, my Stevens colleague and former friend, points out in a comment below that industry has long lured scientists away from academia with promises of filthy lucre and freedom from the grind of tenure-and-grant-chasing. Yup. Wall Street “quants” are just one manifestation of this age-old phenomenon. So what’s new about the Big Data Brain Drain? Does it differ in degree or kind from previous academia-to-business brain drains? Good questions, Lee. I have no idea, but I bet Big Data can provide the answer! (Unless of course it’s subject to some sort of Gödelian limit on self-analysis.)

Second, a fascinating implication of the rise of Big Data is that science may increasingly deliver power—that is, solutions to problems—without understanding. Big Data can, for example, help artificial intelligence researchers build programs that play chess, recognize faces and converse without knowing how human brains accomplish these tasks. The same could be true of problems in biology, physics and other fields. If science doesn’t yield insight, is it really science? (For a smart rebuttal of the notion that Big Data could bring about “the end of theory,” see the smart blog post mentioned below by Sabine Hossenfelder.)

Image: Defense Advanced Research Projects Agency via Wikimedia Commons, http://commons.wikimedia.org/wiki/File:DARPA_Big_Data.jpg.

About the Author: Every week, hockey-playing science writer John Horgan takes a puckish, provocative look at breaking science. A teacher at Stevens Institute of Technology, Horgan is the author of four books, including The End of Science (Addison Wesley, 1996) and The End of War (McSweeney's, 2012). Follow on Twitter @Horganism.

12 Comments

  1. Jerzy v. 3.0. 4:14 pm 11/8/2013

    Everything sucks scientific talent from academia. Ph.D. students and postdocs face poor pay, 19th-century working hours, no job security, no chance to work on their own project until the average age of 42 and about a 1% chance of becoming a tenured professor. Compare it with the 100% placement rate in data science. No wonder that intelligent people wake up and leave the system.

  2. Lee Vinsel 8:05 pm 11/8/2013

    Uh, Horgan, I’m sorry, but I don’t get this. What evidence do we have that “Big Data” is leading to bigger brain drain than corporate practices, including R&D, have done for the last century? I’m not seeing any numbers in your post. You also write, “The problem, VanderPlas says, is that academia is way behind Big Business in recognizing the value of data-analysis talent.” Have you tried to verify this in any way? All I hear about in academia is data-analysis, whether that takes the form of the “Digital Humanities” or better predicting severe weather. Have you done the work to examine this “problem”? Or are you just passing on contrived controversies?

  3. gs_chandy 9:53 pm 11/8/2013

    I second Lee Vinsel’s first question:

    “What evidence do we have that “Big Data” is leading to bigger brain drain than corporate practices, including R&D, have done for the last century?”

    I don’t ‘get it’ either.

    GSC

  4. gs_chandy 10:36 pm 11/8/2013

    Mr Horgan:

    Apropos of your piece on ‘Big Data’, it would be useful if you’d explore, in better detail, some of the ‘fashions in science’ as they’ve developed over the last 6-10 decades (or more). You have noted a couple of them:

    Chaos – 1980s
    Complexity theory – 1990s

    I recall Martin Gardner had written a wonderful book (to which I do not currently have access) “Fads and Fallacies in the Name of Science” which I believe explored many relevant issues – and which would have provided some useful background for this piece of yours.

    I observe that your thoughts on ‘complexity’ are not highly knowledgeable, in particular when you claim it’s just “chaos in a glossy new wrapper”. It is not – though an understanding of ‘complex systems’ could possibly enable us to better understand ‘chaos’ (and why the initial efforts in this direction failed).

    To understand complexity, we do need to acquaint ourselves with the contributions of the late John N. Warfield to ‘systems science’ through which he made it possible for individuals and groups to explore scientifically and hopefully come to understand ‘complex systems’ of specific interest (including our ‘societal systems’ such as ‘education systems’, ‘systems of governance’ – and why they fail so very often).

    In a comment (No. 43) to your blog post on “A Dig through Old Files Reminds Me Why I’m So Critical of Science”, I’ve provided some background about Warfield’s work on complex systems and on some developments from that work which – when adequately applied to the issues of ‘Big Data’ (and how we use it) – would help resolve many of the issues you have raised here.

    For instance, it seems to be becoming clear that Google may well be ‘barking up the wrong tree’ entirely (!!) – despite their initial quite dazzling successes in handling big data. (Warfield’s work may well help them direct their aim rather better than they have been doing lately).

    GSC

  5. John Horgan in reply to John Horgan 7:33 am 11/9/2013

    Lee, I think you know the answer to your final question. Now get back to your baby (or book)!

  6. Bee 8:31 am 11/9/2013

    I just wrote a piece on big data in physics a few days ago, maybe you find it interesting:

    http://backreaction.blogspot.com/2013/11/big-data-meets-eye.html

  7. Peter_C 9:13 am 11/9/2013

    Is “Big Data” a field of study, or is it more than one field? Or should I say, are “Big Data” a field or more than one field?

  8. Arbeiter 10:42 am 11/9/2013

    A single failed reaction is a setback. A million failed reactions are a combinatorial library. The former cannot compete with the latter on a unit cost basis. Management obsesses on what is measurable instead of promoting what is important. Science is now administered not performed.

    Believing that applied research produces advancing knowledge is believing electricity comes from your wall outlet. What gains issue from funding cultures of failure? Sales commissions! Rather than foster brilliance we allocate for its suppression.

  9. Andrei Kirilyuk 2:24 pm 11/9/2013

    When first “computer simulation” and now “big data” totally replace and reduce to zero human intellect participation in science (and elsewhere), it only clearly demonstrates the real perfection of destruction. No more brain for the drain. However, since everything is determined by the big money and the related, absolutely unbalanced subjective interests (especially in fundamental science), nothing can be changed within that kind of system, irrespective of any “real problem solutions”, “genuine, universal complexity concept” or whatever other truly efficient new results. What is actually of interest to the “established professionals” is mere preservation of the dominating destructive corruption nicely nourishing them, with all its meaningless word plays, all those “big data”, “nanotechnologies”, “quantum computers”, “hidden dimensions”, “parallel universes”, “quantum gravities” and other (conveniently) undetectable “dark matters” … The famous “bullshit science” (c) John Horgan has now definitely won everywhere, in all the “best places”, congratulations.

  10. rshoff2 1:54 pm 11/11/2013

    Extremely valid topic. As important as global warming! (yes, what we do with science has implications beyond our capitalistic short-term “gain is good” perspective). We may use it to serve the human race, or we may obliterate the human race by using it to serve only our personal interests (which drives capitalism). Our choice completely. We are free.

    That being said, I think data is one great example. Jerzy mentioned R&D creating a vacuum (and I’ll add driven by capitalism and not by human interests).

    I think it’s funny that everyone’s assumption is that ‘academia’ is an appropriate place. They are beholden to government interests, which is controlled by special interests and big business lobbyists and are themselves prone to live in Academic Ivory Towers.

    So, the solution? There can’t be a systemic one. We as individuals must make the choice to value science and knowledge and show interest in how that knowledge is applied.

    Sounds rather utopian and impractical, right? Well, you know, we do have a social instinct. My proof is the city sidewalk on which I walk and the park bench on which I sit. Our goal should be to better balance our social survival instincts with our individual survival instincts.

    Then the appropriate focus of science will happen. But we must do it from within.

    Of course literal mathematicians, engineers, and other scientists will not be able to swallow, much less digest, my comment. I expect some quiet ridicule and an anonymous snicker. But there, I said it, and if you read it, you cannot purge it from a recess of the mind.

  11. rshoff2 2:00 pm 11/11/2013

    By the way, I do think we have the strength of character to do this AND successfully participate in our capitalistic economic distribution system -which has done a lot of good in this world.

  12. jsheats 1:54 am 11/12/2013

    Seems to me to be tilting at windmills. Every new topic comes in with a lot of hype (just as true in academia these days at least as it is in industry), and gradually the grain is separated from the chaff. “Big Data” is nothing but empirical analysis of data on a large scale, and is not qualitatively different from what has gone before (though the quantitative differences may be very important). Empirical study and hypothesis-driven study are complementary and both are required.

    I am annoyed, however, by the implicit assertion (made explicit by Fred) that the best intellects should go into academia and the second raters into industry. This has never really been true except as a statistical average, and is that only in the U.S. So we have excellent universities (still!), and lousy commercialization, while other countries have the reverse. People need to learn balance.
