How much do London firefighters spend on saving helpless kittens? The Open Data answers

Wednesday, April 26, 2017

Original post in Spanish by Paloma Recuerdo

In this second post, a continuation of "Smart Cities: Squeezing Open Data with Power BI", we will analyse the problem and look at the measures the London firefighters took in response to the results to try to alleviate the situation.

Hypothesis

The time has come to consider what information we want to extract from the data, what answers we are looking for. Some questions may be clear from the beginning of the analysis. Others, however, will emerge as the data reveals more information.

The problem: The alarming signal that led us to consider the analysis is the increase in the number of interventions by the fire department to perform these services, and the associated cost.

Image of campaign launch with squirrel
Figure 1: Image of the campaign launched in 2016.

We begin to hypothesize what we will try to show with the data. With the conclusions we obtain, we will look for strategies or define initiatives that allow us to solve the problem or reduce / minimize its effects.

  • Hypothesis 1: The number of services increases each year. If no corrective measure is considered, the cost will continue to increase.
  • Hypothesis 2: The type of animal involved in the incident is essential when discriminating whether or not the intervention of the fire department is really necessary. The location of the incident (rural or urban) may also be related to the type of animal.

The first step is to take a look at the data. Load them as a table, create the names of the fields (some of them will be descriptive, others not), and try some filters. This brief preliminary exploration will help us choose which fields can provide the most relevant information for each report.
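
Outside Power BI, the same preliminary exploration can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column names are the ones mentioned in this post, but the sample rows and their values are hypothetical, not taken from the real dataset.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the article.
SAMPLE_CSV = """CalYear,AnimalGroupParent,Borough,PumpCount,IncidentNominalCost
2015,Cat,Camden,1,295
2016,Dog,Croydon,1,298
2016,Cat,City of London,2,596
2016,Cow,Havering,1,298
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, converting the numeric fields."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["CalYear"] = int(row["CalYear"])
        row["PumpCount"] = int(row["PumpCount"])
        row["IncidentNominalCost"] = float(row["IncidentNominalCost"])
        rows.append(row)
    return rows


rows = load_rows(SAMPLE_CSV)

# List the field names, then try a simple filter: cat incidents in 2016.
field_names = list(rows[0].keys())
cats_2016 = [
    r for r in rows
    if r["CalYear"] == 2016 and r["AnimalGroupParent"] == "Cat"
]

print(field_names)
print(cats_2016)
```

With a real export, the in-memory string would be replaced by the downloaded file.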

* In a more complex analysis, we would be in the phase of selecting which attributes provide us with a greater information gain, allowing us to segment the data more efficiently. For example, what are the attributes that allow us to group and predict the values of the "IncidentNominalCost" field (service cost)?
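
As a rough illustration of what "information gain" means here, the sketch below computes it for two attributes against a cost band. Everything in it is an assumption made for the example: the records are hypothetical, and the cost has been pre-binned into "low" and "high" because entropy-based information gain needs a categorical target.

```python
import math
from collections import Counter, defaultdict

# Hypothetical incidents: (AnimalGroupParent, Borough, cost band).
# The cost band stands in for a binned "IncidentNominalCost".
INCIDENTS = [
    ("Cat", "Camden", "low"),
    ("Cat", "Croydon", "low"),
    ("Bird", "Camden", "low"),
    ("Horse", "Croydon", "high"),
    ("Cow", "Camden", "high"),
    ("Horse", "Camden", "high"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(labels).values()
    )


def information_gain(records, attr_index, label_index=-1):
    """Entropy of the labels minus the weighted entropy after splitting
    the records by the attribute at attr_index."""
    labels = [r[label_index] for r in records]
    groups = defaultdict(list)
    for r in records:
        groups[r[attr_index]].append(r[label_index])
    remainder = sum(
        len(g) / len(records) * entropy(g) for g in groups.values()
    )
    return entropy(labels) - remainder


gain_animal = information_gain(INCIDENTS, 0)
gain_borough = information_gain(INCIDENTS, 1)
print(f"AnimalGroupParent: {gain_animal:.3f}, Borough: {gain_borough:.3f}")
```

In this toy sample the animal type separates the cost bands perfectly while the borough does not, so the animal type would be the more useful attribute for segmenting the data.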

To work on Hypothesis 1, we choose the "Line Chart" visualization. We select the sum field CalYear (calendar year) and drag it under the "Axis" label so that it is represented on the horizontal axis (abscissa), and we drag the PumpCount field under the "Values" label so that it appears on the vertical axis (ordinate). If we simply select the fields in the list, they may be added in an order that is not the one we want, which is why it is better to drag them directly to their final position.

Image of creation of line chart and selection of CalYear
Figure 2: The field CalYear (year) must appear under the label "Axis", while the field "PumpCount" (number of cases) must appear under the label "Values".

Thus, we obtain the first graph of the report, which shows the evolution of the number of rescue services per year.

Graph showing number of cases a year from 2008 to 2018
Figure 3: Evolution of the number of cases per year.

To analyse this graph more effectively, we need to apply some filters. For 2017 we only have data from the first quarter, so we will restrict the graph to full years. We apply the filter:

Example of the advanced filter.
Figure 4: Example of the advanced filter. It shows only full years, up to and including 2016.

With the new applied filter, the graph would now be the following:


graph showing evolution of the number of cases per year
Figure 5: Evolution of the number of cases per year (filtered).
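
The aggregation behind this filtered chart can be sketched as follows. The data are hypothetical, and the sketch assumes that 2016 is the last full year (the text says only first-quarter data exist for 2017) and that summing PumpCount stands for the number of cases, as this post treats it.

```python
from collections import defaultdict

# Hypothetical (CalYear, PumpCount) pairs, not real figures.
RECORDS = [(2015, 1), (2015, 2), (2016, 1), (2016, 1), (2017, 1)]

# Assumption: 2016 is the last year with complete data.
LAST_FULL_YEAR = 2016


def services_per_year(records, last_full_year):
    """Sum PumpCount per calendar year, ignoring incomplete years."""
    totals = defaultdict(int)
    for year, pump_count in records:
        if year <= last_full_year:
            totals[year] += pump_count
    return dict(sorted(totals.items()))


yearly_totals = services_per_year(RECORDS, LAST_FULL_YEAR)
print(yearly_totals)
```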


We can try another visualization, "Funnel", which makes it easier to appreciate the total number of services performed per year. Changing from one to the other is as easy as selecting the new format in the visualizations panel. Power BI will do the rest of the work.

Horizontal bar graph image of Funnel example
Figure 6: Display example "Funnel".

There is a clear increase in interventions between 2009 and 2011; from that point the number begins to decline (we will see why), only to rise again in 2016. Although the sample is not very large, the overall trend is a progressive increase in the number of cases.

The evolution of the cost of the service per year is not exactly proportional to the number of services, since the cost of each service is determined by the number of hours dedicated and may vary from case to case. Clearly, there was a change of trend in the evolution of the cost in 2011, which was reversed in 2015.

Fill graph showing the evolution of the service cost per year.
Figure 7: Evolution of the service cost per year.

To work on Hypothesis 2, we choose the "Pie Chart" visualization. We drag the AnimalGroupParent field under the "Legend" label, so that each sector represents one type of animal, and the PumpCount field under the "Values" label, so that the size of each sector reflects the number of services.

Pie chart showing: Number of services by type of animal.
Figure 8: Number of services by type of animal.

Clearly, most of the incidents involve small animals. Hovering the pointer over each sector of the diagram shows the exact figures and the percentage of the total (49.61% cats, 17.91% birds, and 17.79% dogs).
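
The percentages shown by the pie chart are simple shares of the total. A minimal sketch of that calculation, using a hypothetical list of incidents and counting one service per row (the chart in this post sums PumpCount instead):

```python
from collections import Counter

# Hypothetical AnimalGroupParent values, one per incident.
ANIMALS = [
    "Cat", "Cat", "Cat", "Cat",
    "Bird", "Bird",
    "Dog", "Dog",
    "Fox",
    "Horse",
]


def share_by_animal(animals):
    """Return the percentage of incidents for each type of animal."""
    counts = Counter(animals)
    total = sum(counts.values())
    return {
        animal: round(100 * n / total, 2)
        for animal, n in counts.most_common()
    }


shares = share_by_animal(ANIMALS)
for animal, pct in shares.items():
    print(f"{animal}: {pct}%")
```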



If we translate this into costs and return to the "Clustered Column Chart", we can see the public expenditure dedicated to rescuing cats in the period 2009-2016: GBP 866,834, of which GBP 115,404 was spent in 2016.

Vertical bar chart showing: Cost of the service by type of animal in 2016
Figure 9: Cost of the service by type of animal in 2016.

In this same period (2009-2016), GBP 307,418 was dedicated to "rescuing" birds and GBP 11,084 to rescuing squirrels. In 2009-2010 alone, the fire department had to rescue these small rodents in distress 34 times.
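
Totals like these come from summing the IncidentNominalCost field per type of animal over a range of years. The following sketch shows the idea; the incidents and amounts are hypothetical and do not reproduce the figures above.

```python
from collections import defaultdict

# Hypothetical (CalYear, AnimalGroupParent, IncidentNominalCost) records.
INCIDENTS = [
    (2009, "Cat", 260.0),
    (2016, "Cat", 298.0),
    (2016, "Bird", 298.0),
    (2016, "Cat", 596.0),
    (2017, "Cat", 326.0),
]


def cost_by_animal(incidents, first_year, last_year):
    """Sum the cost per type of animal for years in [first_year, last_year]."""
    totals = defaultdict(float)
    for year, animal, cost in incidents:
        if first_year <= year <= last_year:
            totals[animal] += cost
    return dict(totals)


cost_2009_2016 = cost_by_animal(INCIDENTS, 2009, 2016)
cost_2016 = cost_by_animal(INCIDENTS, 2016, 2016)
print(cost_2009_2016)
print(cost_2016)
```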

Analysing the distribution of notices according to the "AnimalGroupParent" parameter has revealed information of great interest. We will finish this analysis by looking at the geographical distribution of the notices.

To analyse the geographical distribution, we choose the "Map" visualization. We drag the Borough field under the "Location" label, the AnimalGroupParent field under the "Legend" label, and the PumpCount field under the "Size" label.

We see the distribution of notices regarding cats and dogs is fairly homogeneous in the most urban areas:

Map graph showing the geographic distribution of notices regarding the rescue of cats
Figure 10: Geographic distribution of notices regarding the rescue of cats (in yellow) and dogs (in orange).

For larger animals such as cows, bulls and deer, the distribution is more dispersed and associated with rural areas, as expected. In these cases, it is most likely that the firefighters' participation is essential to resolving the situation.

Map graph showing the Geographical distribution of warnings regarding the rescue of large animals
Figure 11: Geographical distribution of warnings regarding the rescue of large animals such as bulls (in red), cows (in gray) and deer (in blue).

Therefore, as mentioned previously, the parameter "AnimalGroupParent" is emerging as one of the parameters that provides more information when discriminating or at least prioritizing services.
  
If some points appear "out of field", this may be due to duplications or errors in postal codes, or to city names that exist on both sides of the Atlantic. In those cases, we can click directly on those points and exclude them from the graph.

Map showing anomalous data
Figure 12: Example of clearly erroneous data due to duplications of names, numerical codes etc.

Power BI also allows us to segment the data. We can segment the report by a specific value, for example by year or by geographic location. As an example, we will segment the number of services performed in 2016 related to "small" animals (dogs, cats, birds, hedgehogs, hamsters, squirrels, ducks, etc.).

To perform this segmentation, we chose the "Slicer" visualization. Select the Borough field (District) and, automatically, all the other panels of the visualisations will show the information corresponding to that specific segment, which, in this case, corresponds to a district.

Graph showing the appearance of two different visualizations after segmentation

Figure 13: Appearance of two different visualizations (table and funnel) after applying the segmentation Borough = "City of London".



We could go deeper still with Power BI and analyse the text included in the "FinalDescription" field. For example, we could group and analyse in greater detail those incidents in which a prior intervention by the RSPCA is mentioned ("... ASSIST RSPCA ..."), or in which words such as "Trapped" or "Stuck" occur. This type of "Text Analytics" can also be carried out with Power BI thanks to its native integration with R.
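
To give an idea of what such a keyword analysis involves, here is a minimal sketch. It uses plain Python rather than the R integration mentioned above, purely for illustration, and the descriptions are hypothetical examples written in the style of the "FinalDescription" field.

```python
import re
from collections import Counter

# Hypothetical FinalDescription values, not real records.
DESCRIPTIONS = [
    "CAT TRAPPED BEHIND WALL",
    "ASSIST RSPCA WITH BIRD STUCK IN NETTING",
    "DOG STUCK IN RAILINGS",
    "ASSIST RSPCA WITH CAT ON ROOF",
]

KEYWORDS = ["ASSIST RSPCA", "TRAPPED", "STUCK"]


def keyword_counts(descriptions, keywords):
    """Count how many descriptions mention each keyword (whole words only)."""
    counts = Counter()
    for text in descriptions:
        for keyword in keywords:
            pattern = r"\b" + re.escape(keyword) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[keyword] += 1
    return counts


counts = keyword_counts(DESCRIPTIONS, KEYWORDS)
print(counts)
```

A description can match several keywords, so the counts may add up to more than the number of incidents.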

Conclusions: Measures need to be taken

The analysis above has served to confirm Hypothesis 1, which could be reformulated as:

 "If no measures are taken, the inappropriate use of public resources for this type of service will continue to increase year by year"

If we add the following data to this:
  • The citizens of London feel a great love for animals (it may seem a cliché, but the data we have analysed corroborates this)
  • The good citizen who calls the firefighters to help an animal does not pay for the cost of the service out of their own pocket... or do they?

It seems obvious to conclude that the citizen who makes the call is NOT aware of the cost it incurs. They do not realize the waste of public money (and the misuse of an emergency resource) involved in calling the firefighters to rescue a hamster, free a dove trapped in a line, or help a cat down from a roof.
  
This poses a problem for whoever has to come up with a solution, which should result in a more efficient use of public resources and a better service to citizens. In this case, a public awareness campaign was proposed.


2012: The Campaign

In July 2012, the London Fire Department launched the campaign:

Poster of London Firefighters Campaign 2012
Figure 14: London Firefighters Campaign 2012: "I am an animal, get me out of here".

The objective of the campaign was to educate the citizens on how to continue being "good Samaritans" in the case of finding animals in delicate situations without, therefore, misusing public resources.

The campaign had two axes:
  • On the one hand, to inform people about the general cost to the citizens of giving this type of warning directly to the fire brigade
  • And on the other hand, to show the most appropriate alternative in this type of situation: in this case, calling the RSPCA (Royal Society for the Prevention of Cruelty to Animals).

This campaign had an immediate positive effect on the population, which is reflected in the decrease in the number of calls registered from 2012 onwards.

News article on the decline of animal rescue services.
Figure 15: News about the decline of animal rescue services.

However, in 2015 there was a new trend change with a rapid increase in the number of cases. The firefighters used social networks to spread the campaign.

Poster encouraging citizens to call the RSPCA
Figure 16: Poster encouraging citizens to call the RSPCA.
News article on the BBC about the campaign
Figure 17: BBC campaign

And, in February of 2017, an interactive map was published:


Interactive Map
Figure 18: Interactive Map

It is clear that these types of campaigns must be repeated periodically in order to maintain effectiveness.



Final conclusion

The greater availability of open data on public services, together with the various tools that allow these data to be combined, helps us to analyse patterns and create visual models that can quickly translate into cost savings and leave citizens more satisfied and involved with their environment.

This has been a very simple example, but one with tangible results. If we discounted the Fire Department's 2016 call-outs for mishaps involving domestic or small animals, which we considered "avoidable", the savings would have been £215,160.

If we take into account the potential of applying Data Science to the entire arsenal of data collected and stored by institutions and companies today, we realize the great opportunity we have to improve our environment and our lives. Let's take it!


