Guest Blog

Commentary invited by editors of Scientific American

WeatherSignal: Big Data Meets Forecasting

The views expressed are those of the author and are not necessarily those of Scientific American.

Smartphone-collected Big Data has the potential to transform the way we understand and predict weather systems. Five months ago, we at OpenSignal (a project to map global cell phone signal coverage) launched an app called WeatherSignal to collect atmospheric data from smartphones. WeatherSignal repurposes the sensors that already exist in Android devices to build a live map of atmospheric readings.

A live Pressure Map

The most recent Galaxy phone, the S4, contains a barometer, a hygrometer (humidity), an ambient thermometer and a light meter – all of which provide important data for meteorology. While the S4 is the most advanced phone in terms of sensors, valuable readings can be gathered from many other phones as well. The prospect of a granular network of millions of inter-connected weather stations is an exciting one for meteorology.

We are often asked how we can trust the data, as mobile phones are often indoors or in pockets. The answer is twofold. First, we can combine sensor readings (for instance, if the light reading is below some threshold x, the phone is probably not outdoors). Second, given sufficient volume we can arrive at valid averages – an answer that gets to the heart of what Big Data really means.
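The two-part answer above can be sketched in a few lines of Python. This is only an illustration, not WeatherSignal's actual pipeline: the function name `outdoor_average`, the `(lux, temperature)` tuple layout and the threshold value are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical threshold "x": readings dimmer than this are treated as
# indoors or in a pocket. The value is an assumption, not an OpenSignal figure.
LUX_OUTDOOR_MIN = 1000.0


def outdoor_average(readings, lux_min=LUX_OUTDOOR_MIN):
    """Average temperature over readings that look like they were taken outdoors.

    readings: iterable of (lux, temperature_c) tuples from many phones.
    Returns None when no reading passes the light filter.
    """
    # Step 1: combine sensors, using the light reading to discard
    # temperature readings that were probably taken indoors.
    kept = [temp for lux, temp in readings if lux >= lux_min]
    # Step 2: average what remains, so individual errors partly cancel out.
    return mean(kept) if kept else None


# Made-up sample: two bright (outdoor) readings and one dim (indoor) reading.
sample = [(5000.0, 10.0), (20.0, 22.0), (8000.0, 12.0)]
print(outdoor_average(sample))
```

A light threshold on its own would reject every night-time reading, so a real filter would presumably need further signals such as the time of day.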

The philosophy of Big Data is that insights can be drawn from a large volume of ‘dirty’ (or ‘noisy’) data, rather than simply relying on a small number of precise observations – a subject covered in detail by Viktor Mayer-Schönberger and Kenneth Cukier in their recent book ‘Big Data’. One good example of the success of the ‘Big Data’ approach is Google’s Flu Trends, which uses Google searches to track the spread of flu outbreaks worldwide. Despite the inevitable noise, the sheer volume of Google search data means that flu outbreaks can be identified and tracked in near real-time. In comparison, relying on doctors to report flu cases as they are observed results in a lag of up to two weeks in the identification of outbreaks. The system is not perfect, however. Flu Trends recently greatly overestimated an epidemic in the US – possibly because increased media coverage led to an increase in false-positive searches for flu symptoms. It is also important to remember that Big Data, when used on its own, can only provide probabilistic insights based on correlation.

The WeatherSignal Dashboard

The true benefit of Big Data is that it drives correlative insights, which are achieved through the comparison of independent datasets. It is this that buttresses the Big Data philosophy of ‘more data is better data’; you do not necessarily know what use the data you are collecting will have until you can investigate and compare it with other datasets.

One good example of this is the experiment which ultimately led to our creating WeatherSignal. We had been collecting readings of battery temperature from our connection-toolkit app called OpenSignal. On investigation, we identified a historical correlation between averaged smartphone battery temperatures and the ambient temperature readings made by dedicated weather stations.

Working from this starting point we published a paper in conjunction with the Royal Netherlands Meteorological Institute (KNMI) that developed an algorithmic approach to converting battery temperature readings to ambient temperature. We use this approach to create averaged ambient temperature readings from phones that don’t contain an external thermometer. This result would never have been possible if we hadn’t collected battery temperature readings and compared them with historical atmospheric data.
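To give a feel for how a correlation between two datasets becomes a conversion, here is a minimal sketch that fits a straight line between averaged battery temperatures and weather-station temperatures, then uses it to estimate ambient temperature. It is not the algorithm from the paper: ordinary least squares stands in as an assumed technique, and the function names and all the numbers are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_ambient(battery_avg_c, slope, intercept):
    """Convert an averaged battery temperature into an ambient estimate."""
    return slope * battery_avg_c + intercept


# Invented daily averages (degrees C): phone batteries vs. a weather station.
battery = [30.0, 32.0, 34.0, 36.0]
station = [10.0, 14.0, 18.0, 22.0]

slope, intercept = fit_line(battery, station)
print(estimate_ambient(33.0, slope, intercept))
```

The fit is only meaningful on averages over many phones. A single battery reading is dominated by what that phone is doing at the time, such as charging or heavy use.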

The ‘Big Data’ approach has already begun to be incorporated into weather nowcasting, and the Flu Trends example provides an excellent analogy for where it can initially prove most useful. The UK Met Office has started making use of various non-traditional sources to track the spread of snowfall, including geo-located tweets mentioning snow. It is instructive here to think of snowfall as an ‘outbreak’: a (relatively) unpredictable high-impact event that can be better managed with more immediate and granular data. For such high-impact events, having more data, however dirty, is especially useful in limiting the consequences through more effective immediate decision-making. Initially, we believe that the WeatherSignal data will be most useful for ‘outbreak’ type events, using pressure readings for short-term storm forecasting and surface temperature readings to determine the spread of snowfall.

The next step lies in proving empirically that smartphone data has an important role to play in the future of weather forecasting. We already have an exciting group of collaborators lined up, and we are looking for more academic partners to come forward and make use of our data. We are working with the Birmingham Urban Climate Lab (BUCL) to prove that crowdsourced smartphone sensor readings can be useful in studying urban climate. BUCL have established a dense network of weather stations and temperature sensors in their city, which will be used to test the crowdsourced readings from the WeatherSignal network.

We have also begun to supply pressure readings to the University of Washington to help prove their use in atmospheric modelling, and have announced plans to share our data with the Met Office. The benefits that a crowdsourced approach can bring to the science of meteorology are only just becoming apparent, but the winds of change are blowing.

About the Author: Samuel Johnston works for London-based tech start-up OpenSignal, where he manages the academic outreach for WeatherSignal. He has a first-class degree in Politics and Social Theory from Cambridge University and frequently writes about crowdsourcing, privacy and mobile trends on the OpenSignal company blog. Follow on Twitter @samuelbjohnston.

2 Comments

  1. sqiar 11:06 am 10/14/2013

    Great article outlining one of the many possibilities on how BigData can be tamed into useful insights. There are however no generic approaches towards cleansing “dirty data” and every time you are set out to investigate a new entity new algorithms and map/reduce code has to be written. I think this is where most of the companies are struggling and hence a slow upward trend in BigData. Good for companies consulting in BigData though :)

  2. Jerzy v. 3.0. 5:16 pm 10/15/2013

    In my experience, big data is often a siren song. So much data… turns to be all white noise, and the possible conclusions are bl**dy obvious.
