I spent Tuesday at Hacks & Hackers Hack. Organised by Scraperwiki, and sponsored by the Guardian Open Platform, NUJ Dublin Freelance Branch and Innovation Dublin, H&HH was a one-day exercise uniting Hacks (journalists) and Hackers (computer coders) to extract data from public databases.
Since the introduction of safety camera zones has been in the news, our group decided to look at some road safety information. Some of the data was pretty easy to locate and transfer to a spreadsheet. From the Central Statistics Office (CSO), we found population figures, broken down by county, in the 2006 census. From the Road Safety Authority (RSA), we found data on the number of road deaths by county for 2009 (preliminary figures). Finally, from An Garda Síochána, we got the number of camera zones in each county.
Then came the tricky part. Thanks to our hacker, Victor Akujobi, we were able to extract from the RSA website a breakdown of penalty points by county for 2009.
Those numbers aren’t very fine-grained. It would be nice to get details of deaths, penalty points and cameras down to the level of individual district electoral divisions (DEDs), for example, and roads (or road types, be it national, regional or local) within the DEDs, but we work with what we’ve got. In addition, the population figures from the census are total population, taking no account of age profiles (under 16s are included) or numbers of adult non-drivers.
When I got home on Tuesday evening, I tracked down some further road death/county statistics, covering the years 2000-2008, to check if the patterns we’d seen using the 2009 figures were a fluke. I was worried that in a single year, a few “blips” might throw off the findings. Over ten years, however, a long-term pattern should emerge.
The number crunching isn’t very sophisticated. Using an OpenOffice spreadsheet, I generated a number of scatter diagrams. The patterns I see by simple visual examination aren’t verified by any sophisticated statistical tools, so if anyone wants to do some fancy number crunching and verify that what my eyes show me is true, feel free.
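For anyone who does fancy taking up that invitation, here is a minimal sketch in Python of one common way to put a number on what a scatter diagram shows: the Pearson correlation coefficient. It is not what we used on the day (we only eyeballed the graphs), the function name is my own, and the figures in it are placeholders, not the real county data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Placeholder values only, NOT the real county figures.
camera_zones = [10, 20, 30, 40, 50]
road_deaths = [12, 9, 15, 11, 14]

r = pearson_r(camera_zones, road_deaths)
print(f"r = {r:.2f}")
```

A value near +1 or -1 suggests a strong straight-line relationship, while a value near 0 suggests none. With only a couple of dozen counties, a single unusual county can move the figure a lot, so it is a complement to looking at the graph, not a replacement.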
With those reservations in mind, our objective was to test the publicly stated reasoning behind the introduction of the speed cameras. Justice minister Dermot Ahern has said that the cameras will reduce deaths and injuries on the road, and were deployed at locations “which have been identified as having a high incidence of speed related collisions.” While no details are given at county level on speed-related deaths, we should expect to see correlations in the following comparisons.
(1) Camera zones and death rates
At the most basic level, we would expect to see a correlation between these two variables. Using 2009 data, no apparent pattern emerged. The data seems a little less scattered when ten-year data on fatalities is used, but there is still no clear pattern.
(2) Penalty points and death rates
I didn’t do this at H&HH, but thought it worth examining with ten-year data. You can see the result below.
(3) Camera zones and penalty points
When we looked at the number of camera zones per county, compared with the number of penalty points, a pattern seemed to emerge. However, one of the things that also stood out was the size of one of the outliers. That solitary dot in the top right corner represents Dublin. The same pattern can be seen in both the 2009 figures and over the ten-year period 2000-2009.
(5) Population and Penalty Points
We also looked at populations and penalty points. With a few exceptions, the dots seem to cluster in the lower left-hand corner. However, I emphasise this is based on a visual inspection only. The population-to-points ratio varies from 3.3 (Tipperary) to over 10 (Donegal and Galway). That is, in Donegal, there is one penalty point for every ten people, while in Tipperary there is one penalty point for every three people.
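The ratio quoted above is simply population divided by penalty points, in other words people per point. A minimal sketch of the arithmetic in Python follows. The county names and figures are made up for illustration and chosen to land near the two ends of the range mentioned; the real numbers are in the spreadsheet.

```python
def people_per_point(population, penalty_points):
    """Population divided by penalty points: how many people per point."""
    if penalty_points <= 0:
        raise ValueError("penalty_points must be positive")
    return population / penalty_points


# Hypothetical counties: (population, penalty points). NOT real data.
counties = {
    "County A": (150_000, 45_000),
    "County B": (150_000, 15_000),
}

for name, (population, points) in counties.items():
    ratio = people_per_point(population, points)
    print(f"{name}: one penalty point for every {ratio:.1f} people")
```

Note the direction of the figure: a lower number means more penalty points per head of population, not fewer.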
To be honest, I’m not sure quite what this tells us.
(6) Camera zones and penalty points revisited
Going back to one of the earlier graphs, we noted that Dublin seemed to be an outlier for some reason. We decided to run the numbers again, only this time excluding Dublin, to see if any clearer pattern emerged. Again, from visual inspection only, it appears there is a good correlation between the number of penalty points collected in a given county and the number of speed zones announced this week. This pattern holds both on the 2009 numbers alone and over the ten-year period.
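Rerunning the numbers without one county is easy to do in code as well as in a spreadsheet. The sketch below, in Python, computes a Pearson correlation with and without a named outlier. The helper names and every figure are hypothetical placeholders, deliberately constructed so that one point sits far from the rest; they say nothing about the real Dublin figures.

```python
from math import sqrt


def pearson_r(pairs):
    """Pearson correlation for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


def correlation_excluding(data, excluded=()):
    """Correlation over every county whose name is not in `excluded`."""
    pairs = [values for name, values in data.items() if name not in excluded]
    return pearson_r(pairs)


# Hypothetical (camera zones, penalty points) per county. NOT real data.
data = {
    "County A": (10, 1200),
    "County B": (20, 2100),
    "County C": (30, 2900),
    "County D": (40, 4200),
    "Outlier": (60, 30000),
}

r_all = correlation_excluding(data)
r_trimmed = correlation_excluding(data, excluded={"Outlier"})
print(f"all counties: r = {r_all:.2f}")
print(f"outlier excluded: r = {r_trimmed:.2f}")
```

Dropping a point changes the question being asked, so it is worth reporting both figures side by side, as we did with the graphs, so readers can see what the exclusion did.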
So what does this mean? Some possible conclusions:
Perhaps penalty points primarily target speeders, and speed-related deaths are not typical of all deaths. This would explain the apparent correlation between points and camera zones, and the lack of correlation between cameras/points and deaths.
Secondly, the Dublin “spike” in penalty points is interesting. Maybe there are more speeders in the capital. Or maybe there are more “fast” roads to tempt drivers. Or maybe there are simply more Traffic Corps Guards and more speed traps.
Alternatively, perhaps both points and cameras are, in practice, revenue-generating exercises, picking on easy targets. While that was the first possible conclusion that we reached in our presentation at the end of Hacks & Hackers, we should add that it would be nice to have the more detailed information the RSA and Gardaí used in reaching their decision. Perhaps at a finer level of detail, patterns emerge that make more sense than those we found.
There is a widespread distrust of many government decisions in this country at present, fuelled by a suspicion that many policies are driven, not by a wish to obtain worthwhile ends (a reduction in road deaths, or a switch to greener cars, for example) but by a simple desire to raise more money in taxation. The patterns we found may not stand up to sustained statistical testing (it’s been too long since I studied statistics for me to remember how to calculate best fit lines, never mind significance levels), but I can make the raw numbers available to anyone who wants them for such tests. Who knows, maybe you can even find better numbers than we did (the number of provisional and full licence holders, for example) which shed more light on the number and location of camera zones.
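For anyone whose statistics is as rusty as mine, here is a minimal sketch in Python of an ordinary least-squares best fit line, along with the t-statistic for its slope, which is the usual starting point for a significance test. The function name is my own and the figures are placeholders, not the county data.

```python
from math import sqrt


def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x, plus the slope's t-statistic."""
    n = len(xs)
    if n != len(ys) or n < 3:
        raise ValueError("need two sequences of equal length, at least 3 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope, using n - 2 degrees of freedom.
    se_slope = sqrt(sse / (n - 2) / sxx)
    t_stat = slope / se_slope if se_slope > 0 else float("inf")
    return slope, intercept, t_stat


# Placeholder values only, NOT the real county figures.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, t_stat = fit_line(xs, ys)
print(f"y = {intercept:.2f} + {slope:.2f}x, t = {t_stat:.1f}")
```

The t-statistic is compared against Student's t distribution with n - 2 degrees of freedom to get a significance level. The larger its absolute value, the less likely it is that the slope is just noise.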
Sometimes, you get the feeling the government doesn’t trust the public to share the information paid for by their taxes. But stop and think for a minute how transparent government would be if that data were put on the web for all to see. Imagine anyone could crunch the numbers and reassure themselves that the speed cameras really were in the right places. Fingal county council won kudos yesterday for the Fingal Open Data initiative (see http://www.fingal.ie/opendata/). Scraperwiki, the software used to power many of the projects, is open source. Open source works in part on the Many Eyes principle. Just as a journalist’s work is sub-edited to catch errors, open source code is there for anyone to review, spot errors, and make improvements. Imagine open government on the same principles, where anyone could check the data. I think that would make for a better, more involved citizenry, and a true republic, of, by and for the people.
And it’d be cheaper than a consultant’s report too.
[Click here to download graphs in PDF format]
[Click here for the numbers used to generate the graphs (GoogleDocs)]
Gerard,
It was nice to meet you the other day at Hacks Hackers. There were certainly a lot of interesting ideas at the event, and hopefully it will lead to more public interest journalism. Inviting all those journalists certainly produced a lot of online and offline copy 🙂
My own humble effort is called Technical Nous With a Nose for News.
John: