Air Quality: How can Open Data and Mobile Data provide actionable insights?

Monday, 12 December 2016


By Javier Carro and Pedro de Alarcón, PhD. Data Scientists at LUCA.

Figure 1: Air Quality: How can Open Data and Mobile Data provide actionable insights?

Today on our blog we've decided to take the mobility and traffic Big Data analysis we started here a little bit further, looking at the relationship between commuting and air pollution. Air quality is clearly a major challenge for large urban areas and, according to the WHO, it is also a serious health risk. This is concerning given that in 2014, 92% of the world population was living in places where the WHO air quality guideline levels were not met.

Reducing road traffic to improve air quality is proving a struggle for local governments. To address this issue, they are monitoring a range of harmful gases on a day-to-day basis. One of these is Nitrogen Dioxide (NO2), whose production correlates directly with the density of motorised vehicle traffic, as well as with atmospheric conditions.

To investigate this, we decided to visualize our Smart Steps data on mobility in Madrid alongside Open Data about NO2 measurements from the Madrid "Datos Abiertos" website. Here, we could find pollution measurement data from a range of air pollution sensors throughout the city, including the locations and the types of stations. You can also find local government policies on pollution protocol when NO2 levels are too high, citizen advice relating to air quality and information on their mobile application here.

In our study, we focused on hourly NO2 readings registered at the 24 available stations from January to September 2016. To get the whole picture, it was important for us to find out (1) how often the stations exceeded the alarm level (200 micrograms/m3), (2) the average levels (below 40 micrograms/m3 is considered a normal average) and (3) the type of each station (close to roads, residential areas, underground stations).
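The first two indicators can be sketched in a few lines of Python. This is only an illustration of the idea, not the processing we actually ran: the function name, the input layout (one station id and one NO2 value per hourly reading) and the sample values are all assumptions made up for the example.

```python
from collections import defaultdict
from statistics import mean

ALARM_LEVEL = 200   # micrograms/m3, hourly alarm level mentioned in the post
AVERAGE_LIMIT = 40  # micrograms/m3, "normal" average level mentioned in the post


def station_kpis(readings):
    """Summarise hourly NO2 readings per station.

    `readings` is an iterable of (station_id, no2_value) pairs, one per hour.
    Returns {station_id: {"mean", "hours_over_alarm", "above_average_limit"}}.
    """
    by_station = defaultdict(list)
    for station_id, value in readings:
        by_station[station_id].append(value)

    kpis = {}
    for station_id, values in by_station.items():
        avg = mean(values)
        kpis[station_id] = {
            "mean": avg,
            # Number of hourly readings strictly above the alarm level.
            "hours_over_alarm": sum(1 for v in values if v > ALARM_LEVEL),
            # True when the station's average exceeds the 40 micrograms/m3 level.
            "above_average_limit": avg > AVERAGE_LIMIT,
        }
    return kpis


# Made-up readings for two hypothetical stations, for illustration only.
sample = [
    ("station_a", 35), ("station_a", 210), ("station_a", 55),
    ("station_b", 20), ("station_b", 30), ("station_b", 25),
]
print(station_kpis(sample))
```

The station type, the third indicator, is metadata published with the station list rather than something computed from the readings, so it would simply be joined to this summary by station id.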

Once the data was processed, it was relatively straightforward to build a dashboard to analyze the behaviour of each of the stations across the period as you can see below:

Figure 2: Dashboard (TIBCO Spotfire) with the KPIs about the NO2 measurements in Madrid.

The dashboard in figure 2 reveals some clear insights: 

  1. There is a significant increase of NO2 pollution in September, probably due to the lack of rain and wind (top right). 
  2. Unsurprisingly, there are clear increases during rush hours (from 7:00 to 9:00 in the morning and 20:00 to 22:00 in the evening), although interestingly, during the evening rush hour NO2 pollution levels are very similar regardless of whether it is a weekday or the weekend (figure 3).

Figure 3: Hourly average NO2 levels per day of the week.
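For readers who want to reproduce this kind of hourly profile, here is a minimal sketch. It assumes the readings are available as (timestamp, value) pairs and, for simplicity, groups them into weekday and weekend rather than by each day of the week as in figure 3. The function name and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def hourly_profile(readings):
    """Average NO2 per (day type, hour of day).

    `readings` is an iterable of (timestamp, no2_value) pairs, where
    timestamp is a datetime. Returns {("weekday" | "weekend", hour): mean}.
    """
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # datetime.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
        day_type = "weekend" if timestamp.weekday() >= 5 else "weekday"
        buckets[(day_type, timestamp.hour)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


# Made-up values for illustration only, not real measurements.
sample = [
    (datetime(2016, 9, 5, 8), 60),    # Monday, 08:00
    (datetime(2016, 9, 6, 8), 80),    # Tuesday, 08:00
    (datetime(2016, 9, 10, 8), 30),   # Saturday, 08:00
    (datetime(2016, 9, 10, 21), 70),  # Saturday, 21:00
]
profile = hourly_profile(sample)
print(profile[("weekday", 8)], profile[("weekend", 8)])
```

Comparing the weekday and weekend values for the same hour is what reveals patterns such as the similar evening peaks described above.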

As a next step, we overlaid average NO2 levels on top of the density of workers in each postcode. As you can see in figure 4, there is a clear correlation between NO2 levels and both the type of station and the density of traffic. Furthermore, there is a “green dot” in the middle of Madrid which represents the 350-acre Retiro Park. According to the data, this is unsurprisingly the best place to go if you want some fresh city air, at any time on any day of the week.

Figure 4: Postcodes in central Madrid represented as coloured polygons according to the density of workers (red denoting a greater density). The markers represent air pollution sensors across the city.

In the image above, the markers denote air pollution monitoring stations. A red marker shows that the station exceeded the average city pollution level for most of the months in the dataset. Compared with other cities around the world, the air quality in Madrid isn't actually that bad, as only 4 out of 24 stations exceeded the recommended threshold of 40 micrograms/m3 (on average). One should also consider that 3 out of these 4 stations are close to a main street or motorway, which results in higher measurements.

One should also highlight that there is one station in Madrid’s city centre (in the Plaza del Carmen) which registers high NO2 levels even though it is located close to a pedestrianized zone. However, when we take a closer look at Google Maps, we can see it is located between two car parks, which explains the above-average NO2 readings.

The video we have prepared below shows how the usual home-work-home routes are closely related to pollution patterns. You will also see how areas with higher worker density have higher pollution levels, even though not all highly polluted areas have high worker density.

As we only analyzed pollution data for the centre of Madrid, we were curious to look at the rest of the Madrid region. We found a 2013 official report with a heat map of areas surpassing the maximum 200 micrograms/m3 threshold and compared it visually with the density map generated from Smart Steps mobility patterns (figure 5 and figure 6). As you can see below, the two show considerable similarities.

Figure 5: Air quality (NO2) in Madrid Region. Geographic distribution of the number of hours with values greater than 200 micrograms/m3.

Figure 6: Heat map generated from density of workers in the region of Madrid.

Overall, it is important to mention one limitation when assessing the statistical significance of such "visual" correlation: the low number of available stations. Although a data-driven approach to pollution is extremely important for society, it also isn't affordable to place dozens of stations across the city to measure harmful gases such as NO2. One cheaper alternative would be to place mobile stations to monitor NO2 levels, which is one of the main objectives of the EU Japan collaboration project. The local government of Madrid is currently starting to deploy this mobile solution across the city's bus network as you can see in this article.
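To make the limitation concrete, here is a rough sketch of how one could move from a "visual" correlation to a number: a Pearson coefficient between worker density and average NO2 per station, plus an exact permutation test that asks how often a random pairing of the same values would look at least as correlated. This is not an analysis we ran. The function names are hypothetical and the values are made up purely for illustration.

```python
from itertools import permutations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the variance is zero).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def permutation_p_value(xs, ys):
    """Exact two-sided permutation test.

    Enumerates every reordering of `ys`, so it is only practical for a
    handful of stations; the number of reorderings grows factorially.
    """
    observed = abs(pearson(xs, ys))
    count = 0
    total = 0
    for shuffled in permutations(ys):
        total += 1
        # Small tolerance so that ties are not lost to rounding error.
        if abs(pearson(xs, shuffled)) >= observed - 1e-12:
            count += 1
    return count / total


# Made-up values for five hypothetical stations, not real measurements.
worker_density = [1, 2, 3, 4, 5]
mean_no2 = [20, 30, 25, 45, 50]
print(pearson(worker_density, mean_no2))
print(permutation_p_value(worker_density, mean_no2))
```

With all 24 stations, enumerating every reordering is no longer feasible, so one would draw a random sample of permutations instead. Either way, the fewer the stations, the harder it is for even a strong-looking correlation to reach significance.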

We would love to play with the data collected from those mobile sensors in order to create a correlation model of traffic and pollution. However, in the meantime, we have Smart Steps data as a powerful complementary source to find out which areas of the city are affected by NO2 and, more importantly, to make short-term forecasts so that policy makers can act accordingly.

Needless to say, we would encourage all of our Madrid-based readers to use public transport, car sharing services and cleaner vehicles to ensure we start to reduce traffic and create a healthier city for the future, in line with the UN Sustainable Development Goals.
