As a young researcher, I was very drawn to correlations. I felt that they could give us hints at how the world functioned. There are few experiences more exhilarating to budding young social scientists than running one’s first linear regression. Poof! The correlation pops up and tells you how something like income might be related to health. In those days, I played with a lot of these correlations and published quite a few of them.
When I joined the faculty at Columbia University’s Mailman School of Public Health, I was lucky enough to land in a department filled with economists, political scientists, health services researchers, and even a lawyer or two. Being in a rich, interdisciplinary environment helped me grow as a researcher. But no one influenced my thinking more than the economists. This was particularly true of my mentor, Sherry Glied, a healthcare economist now dean of NYU’s Robert F. Wagner Graduate School of Public Service, who would wander into my office at least once a week to ask me a critical question or two that helped shake the foundations of my correlations-based world view.
Even having grown up in poverty and having seen its devastating effects on my friends and family, I began to wonder whether correlations could tell us anything about the very complex world in which we live. After all, even when they measure “real” phenomena, the utility of such measures lies in whether we can actually do something about them. If income is correlated with health, can we improve health with an anti-poverty policy? One problem is that sick people become poor. So, some of what we might be measuring is just the effect of illness on income, rather than income on illness.
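To make the reverse-causation worry concrete, here is a minimal sketch in Python. It is a toy simulation under invented assumptions, not an analysis of real data: the function names (`pearson`, `simulate`), the effect size of 0.5, and the sample size are all hypothetical choices made for illustration. In this made-up world, income has no effect on health at all; illness alone lowers income. A clear correlation between the two appears anyway.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Toy world (assumption): health drives income, never the reverse."""
    rng = random.Random(seed)
    # Health is drawn first and depends on nothing else in the model.
    health = [rng.gauss(0, 1) for _ in range(n)]
    # Income is partly determined by health (sick people earn less) plus noise.
    income = [0.5 * h + rng.gauss(0, 1) for h in health]
    return income, health


income, health = simulate()
r = pearson(income, health)
print(f"correlation between income and health: {r:.2f}")

# A hypothetical cash transfer raises every income by the same amount.
# Health is untouched, because nothing in this toy world lets income
# feed back into health; the correlation alone could not have told us that.
income_after_transfer = [x + 1.0 for x in income]
r_after = pearson(income_after_transfer, health)
print(f"correlation after the transfer:        {r_after:.2f}")
```

With these particular assumptions the correlation should land somewhere around 0.45, even though an anti-poverty transfer in this toy world would improve nobody's health. The real world surely contains effects running in both directions, which is exactly why the correlation alone cannot tell us how much of each we are seeing.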
Correlations of things that have no meaningful relationship happen all the time. Did you know that one of the strongest correlations observed is the association between the proliferation of organic food and autism? Pirate attacks have declined precipitously with global warming; we just can’t figure out why there were spikes in the 1990s near Somalia, one of the warmest places on earth.
The correlation between income and health was first studied in the time of Hippocrates, and the first serious studies of it were conducted by Rudolf Virchow in the 1800s. Perhaps hundreds of thousands of such studies followed over time. We have come to our current state of knowledge by studying the effect of poverty on genes, cell components that influence who we are and how we age, the chemicals that allow cells to communicate with each other, and the effect of poverty-associated environmental toxins (such as lead) on our organs. We have “learned” how poverty influences our interpersonal relationships, our participation in society, and, ultimately, the government institutions that hold this society together. I put “learned” in quotes because we do not really know any of this. Only together does this research begin to paint a rich picture.
Our geneticist and biologist colleagues have helped us learn that the effects of poverty, even its effects on adult behavior, can influence the sperm and egg before conception. Most anti-poverty programs are targeted toward adults. Even if they work, they will take a generation to produce any meaningful impact on health.
What we need to know now is how to actually create intergenerational change that will build a better future society through research-based social policy.
Raj Chetty has led a number of pioneering studies in health and the social sciences that have brought us closer to this future. His willingness to tackle immensely large projects is admirable, but what he does with the data he collects is more admirable still. Some years back, my colleague and I had the idea of examining whether an old randomized controlled trial of reduced class size might be linked to government data. By linking identifiers in the old dataset to Social Security and mortality data, we could “electronically” follow kids who attended reduced-size classes and those who did not throughout their lives.
Chetty and his team had the same idea. But what he did with it was truly amazing. He asked questions that mere mortals would not have considered, such as whether one could dig out the effect of teacher quality on adult outcomes from this morass of data. He is also quite talented at presenting his data. Not only was his paper beautiful and clear, but he enlisted contacts within the media to present his data in new ways. My papers paled in comparison and languished in more technical journals.
He and David Cutler, another Harvard economist, recently joined forces to lean on Chetty’s Internal Revenue Service data to take a much closer look at the association between income and mortality in America. The paper they produced (published in JAMA, covered in many news outlets, and commented on by many of the nation’s leading thinkers in this area) details the association between income and health across localities, by country of birth, by gender, and in many other ways.
A very similar picture, actually an identical one, could have been very easily pieced together with an extensive review of the literature. There is nothing new in “The Association Between Income and Life Expectancy in the United States, 2001-2014.” What is new is the ability of brand-name social scientists to repackage studies of things we already know. Angus Deaton, an economist and Nobel laureate I greatly admire, penned a commentary for Chetty and Cutler’s study. Deaton himself (along with his wife) recently received a huge amount of press for calling attention to a problem that has been known for nearly a decade: some groups and some geographic regions in the US are experiencing increases in mortality.
This is not a criticism of these researchers. It is a sign of our times. We social scientists all grew up with a research toolkit that was not designed for today’s data or market, and we are running out of things to do with it. Just as “today” has no Beatles, Rolling Stones, Sex Pistols, Smiths, Nirvana, or Arcade Fire, the great social scientists are becoming little more than aging rock stars. That is not to say that there is nothing new happening with social science research. Just the opposite. It is just not happening in academia.
Radical re-conceptualizations of data, and tools for using these data, have been created that have unimaginable power for both understanding our world and changing it. However, these tools are held within corporate America and are generally not available for the free use of the public. Academia is hamstrung by ethics boards, risk-averse legal teams, and institutions that are not incentivized to mobilize their talent for financial gain through modern management protocols. The big data corporations, such as Google, are so incentivized. And they do not concern themselves with ethics protocols or conflicts of interest. They do not even concern themselves with correlations or population means. They are busy at work understanding who we are as individuals: whether we like everything bagels, how likely we are to buy steel appliances.
They do this with our GPS and IP coordinates, our shopping behaviors, our searches. They know our political leanings, our music tastes, whether we are thin or fat. And they are learning how to change these characteristics, how to shape those behaviors through our friends, newsfeeds, devices. Some of the tools they use are old. Bayesian analytics have been around, well, since Bayes. Google conducted 12,000 randomized controlled trials in 2012 alone, according to one industry paper. But they are putting these tools together to understand people, not populations. And causes, not correlations. Sadly, this unimaginable power is generally not being used to do what our social scientists hold the potential to do: to build a better society through research-based social policy. We need to give our social scientists these tools, too. Now, if we could only figure out how.