Predicting Future Crowd Behavior with Big Public Data

This article was published in Scientific American’s former blog network and reflects the views of the author, not necessarily those of Scientific American


This is a part of a series of Q&As with mathematicians and computer scientists participating at the 1st Heidelberg Laureate Forum, September 22-27, 2013. More than 40 Laureates (Abel Prize, Fields Medal, Nevanlinna Prize, Turing Award) will attend the forum together with 200 young researchers. For a full week Heidelberg in Germany will be the hot spot of mathematics and computer science. Six of the young scientists told us about their current research and their expectations before the meeting.

Meet Nathan Kallus in this short Q&A series with 6 out of 200 young researchers:

Name?


Nathan Kallus

Nationality?

American and Israeli

Where are you based?

Operations Research Center, Massachusetts Institute of Technology

What is your current position?

PhD candidate

What is the focus of your research?

The intersection of mathematical optimization with statistics. My work has revolved both around using statistical tools in optimal decision making under uncertainty and around using modern optimization theory and methods for statistical testing and experimental design. More generally, I am very interested in data-driven methods and in the power of analytical thinking in data-rich environments, which are now becoming ubiquitous.

Why did you become a mathematician/computer scientist?

Epistemologically, most things cannot be known with certainty and even if one has certainty it's hard to convey and share it with others. In my own life, this is one reason why I always like to keep an open mind toward new things and never judge others for something I do not personally fully understand. Statistics is one tool to understand things that are not absolute and is one of the best tools for scientists to share such knowledge. Mathematical optimization is one of the most elegant, and at the same time most widely applicable, subjects in the fields of math and CS. I find it absolutely thrilling to optimize on our knowledge, to maximize our understanding, with two of the most fascinating and beautiful topics in math and computer sciences, mathematical optimization and statistics.

Anything like a favorite project?

I really like all my projects or I couldn't bring myself to work on them. I have already said a bit about the focus of my current theoretical research, so let me say a bit about a recent application-focused project on using data for a particular purpose: predicting future crowd behavior with big public data from the web. In today's online age, much of the public consciousness and the comings together that bring about crowd actions like mass demonstrations and cyber hacktivism have a significant presence online, where issues of concern are discussed and calls to arms are publicized, and where online news media discuss recent events and provide context.

I think that there is great predictive power in online-accessible information for forecasting such crowd behavior but even if the information is publicly available, gathering and analyzing the huge mass of unstructured plain text online is a formidable task. So this past summer I teamed up with Recorded Future, a company specializing in gathering and analyzing huge amounts of varied public data from the web.

Using the data they collect from all over the World Wide Web, I was able to accurately predict future significant protests by country or by city around the world and hacktivist campaigns by perpetrating group or by nation-level targets. The challenge is to distill the data so that predicting such events is like tracking the trajectory of a system of clouds to forecast the weather. Here the amount of data was staggering and large trends could be seen by the numbers. This, I think, is the main challenge in applying predictive analytics to newly data-rich application areas, but once trajectories can be resolved in massive unstructured datasets, where they're heading is clear.

Why did you apply for the HLF13?

When a former colleague forwarded me the announcement for the Heidelberg Laureate Forum, I immediately knew I had to apply. On the one hand, my gut reaction was spontaneous excitement at the mention of the biggest awards in our fields. On the other, I knew that meeting the laureates and fellow motivated young researchers would be a rare and extraordinary experience that may shape my research career to come.

I was very excited that you chose to invite laureates and young researchers from across the research spectrum of math and CS. Our fields are more intertwined than ever and I am excited to be working in their intersection. I have always been interested in advancing research through interdisciplinary dialogue and I think our best ideas and greatest progress have been the result of interdisciplinary cross-pollination. My own research in fact revolves around a two-way dialogue between modern optimization theory and practice, and applied and mathematical statistics and probability theory.

Inspired by my mentors over the years, in particular my current advisor, I always judge the merit of my own work through the lens of its potential societal impact. Before even trying to solve a problem, I ask myself: suppose I could, very hypothetically, completely and utterly solve the problem at hand. Would it then change the way we do things? The way we understand the world? Would it improve any aspect of our society, albeit a minuscule one? If so, then my contribution to its solution, however small, has merit and I will make my best effort.

Not everyone in the mathematical sciences is motivated in this way; many are motivated by the incredible beauty of math. Their contributions are brilliant and tremendous, but are often understood by, and of benefit to, only a handful of people worldwide. It is very often through interdisciplinary dialogue that these incredible contributions can fulfill their potential for societal impact, and I would be so excited to be part of this intellectual dialogue at the Forum.

The luminaries of our fields became so by being visionaries, by seeing new possibilities where others did not. I want to hear what they think about the future directions of our fields. At the same time, I want to better understand this ability of theirs. Meeting all of them in person will be a great opportunity for that.

Do you have any Laureates on your list you would love to talk to?

I am so excited that a wide range of researchers and laureates will be present at the Forum, and I think it is dialogue amongst us all that will make the Forum great. I am very excited to talk to as many of the laureates as possible, especially those whose work I am less familiar with. Some of the laureates in attendance whose work I am more familiar with, and who I already know I would be glad to get a chance to talk to, are Dick Karp, Adi Shamir, Terence Tao, Donald Knuth, Peter Shor, Daniel Spielman, and Andrew Wiles.

.....

This blog post originates from the official blog of the 1st Heidelberg Laureate Forum (HLF) which takes place September 22 - 27, 2013 in Heidelberg, Germany. 40 Abel, Fields, and Turing Laureates will gather to meet a select group of 200 young researchers. Beatrice Lugger is a member of the HLF blog team. Please find all her postings on the HLF blog.