Monday, July 13, 2015

Building a Better Crime Map: Learning QGIS for Mapping in Tableau




Hover over the map above to view crime rates by census tract. Use the zoom tools in the upper left to better locate your neighborhood.


I don't usually post how-to Tableau blog posts because there are so many good ones out there, but I learned some new tools for this one and wanted to share.

A friend of mine was recently complaining about local hackathons: "Great, just what we need, another crime map." Although I'm not usually confused by sarcasm, I decided to build a crime map anyway. The Atlanta neighborhood crime maps I found were based on counts instead of rates, which is misleading when neighborhoods have different populations. There were other weaknesses too: some maps don't reveal their methodology, and most offer only a map, without the kind of actions linking to other data that are easy in Tableau.

The Atlanta Police Department shares a great crime dataset on its website that includes location coordinates, crime type, time, date, and neighborhood for the past five-plus years. The problem is that Atlanta neighborhoods don't have good population data, making a crime rate per person difficult to calculate. So instead of using neighborhoods I used census tracts. To show crime rate per census tract in Tableau, I first had to merge the crime points to census shapes, something I had never done.

QGIS

I understood that I'd have to use GIS software (which I had never used) to merge points to shapes. I downloaded QGIS and did a couple of beginner tutorials. QGIS is free, open source software, and I found it easy to use and very powerful.

I then tried a points in polygon analysis with my data. I merged the APD data with Georgia census tracts. I had to use a subset of the crime data because this merge was very slow. The result had the number of crimes per census tract, but I actually needed the census tract for each individual crime so I could take advantage of Tableau's aggregation and filtering features.

A spatial join allowed me to perform the merge I needed. The APD data was my target vector layer and the census data was my join vector layer. This technique was also much faster (not sure why) so the full data set was not a problem.
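To make the idea concrete, here is a minimal sketch of what a point-to-polygon spatial join produces: every crime point gets tagged with the ID of the tract that contains it. This is plain Python with a simple ray-casting test, not what QGIS runs internally, and the tract shapes, coordinates, and field names are all made up for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) falls inside the polygon.

    polygon is a list of (x, y) vertices; the last vertex connects to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(points, tracts):
    """Return a copy of each point with a 'tract' key (None if no tract matches)."""
    joined = []
    for point in points:
        tract_id = None
        for geoid, polygon in tracts.items():
            if point_in_polygon(point["x"], point["y"], polygon):
                tract_id = geoid
                break
        joined.append({**point, "tract": tract_id})
    return joined


# Hypothetical toy data: two square "tracts" and three "crimes".
tracts = {
    "tract_A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "tract_B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
crimes = [
    {"id": 1, "x": 1.0, "y": 1.0},
    {"id": 2, "x": 3.0, "y": 0.5},
    {"id": 3, "x": 5.0, "y": 5.0},  # falls outside both tracts
]

joined = spatial_join(crimes, tracts)
for row in joined:
    print(row["id"], row["tract"])
```

The output keeps one row per crime, which is the shape Tableau needs for filtering and aggregation. A points-in-polygon count, by contrast, collapses everything to one row per tract.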

Excel

I then saved the crime data as an Excel file and merged in income and population data by census tract, using Excel because sometimes I'm lazy. I used census tract population and income data from the 2013 American Community Survey 5-year data set, available on American FactFinder.
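For anyone who would rather script this step than do it in Excel, the merge is a lookup on the tract ID. The sketch below is one way to do it in plain Python. The GEOIDs, field names, and numbers are hypothetical placeholders, not real ACS values.

```python
# Hypothetical tract-level table keyed by tract GEOID (values are made up).
tract_stats = {
    "13121001000": {"population": 4000, "median_income": 52000},
    "13121001100": {"population": 2500, "median_income": 38000},
}

# Hypothetical crime rows, each already tagged with a tract GEOID.
crimes = [
    {"offense": "BURGLARY", "geoid": "13121001000"},
    {"offense": "LARCENY", "geoid": "13121001100"},
    {"offense": "ROBBERY", "geoid": "13121999999"},  # no matching tract
]


def attach_tract_stats(rows, stats):
    """Left join: keep every crime row, fill None where the tract is unknown."""
    empty = {"population": None, "median_income": None}
    return [{**row, **stats.get(row["geoid"], empty)} for row in rows]


merged = attach_tract_stats(crimes, tract_stats)

# Rows that found no tract are worth checking before computing any rates.
unmatched = [row for row in merged if row["population"] is None]
print(len(merged), "rows merged,", len(unmatched), "unmatched")
```

Keeping the GEOID as text rather than a number is a deliberate choice here, since spreadsheet tools tend to reformat long numeric IDs.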

Alteryx

Although I merged census tract to points, I still needed to load the census shapes in Tableau for the visual. To do this, I used the Tableau Shapefile to Polygon Converter from Alteryx. There are a few ways to do this, but I find the Alteryx solution very easy, as long as the shapefile isn't too huge.

Tableau

I then loaded the census tract file in Tableau to make a polygon-shaded map and blended in the crime data on census tract geoID to make the color variable. The resulting visual, with some additional graphs as hover actions, is at the top of the post.

Counts vs. Rates

To see the value of switching from counts to rates, see the two maps below using counts on the right and rates on the left. The maps are noticeably different, with the highest crime area moving from downtown to a southeast tract.
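The arithmetic behind that shift is simple. Here is a small sketch with made-up numbers, assuming the rate is expressed as crimes per 1,000 residents (the scaling is my assumption for the example). The tract names and populations are hypothetical.

```python
from collections import Counter

# Hypothetical data: one tract label per recorded crime.
tract_of_each_crime = ["downtown"] * 300 + ["southeast"] * 120

# Hypothetical residential populations.
population = {"downtown": 12000, "southeast": 2000}

# Counts: how many crimes fall in each tract.
counts = Counter(tract_of_each_crime)

# Rates: crimes per 1,000 residents.
rates = {tract: 1000 * counts[tract] / population[tract] for tract in population}

top_by_count = max(counts, key=counts.get)
top_by_rate = max(rates, key=rates.get)

print("Highest count:", top_by_count, counts[top_by_count])
print("Highest rate:", top_by_rate, rates[top_by_rate])
```

With these numbers the larger tract has more crimes, but the smaller one has more than double the rate per resident, so the "worst" tract depends entirely on which measure is mapped.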



Understanding the Data

I also investigated the resulting product based on my curiosity about crime in my city. I eventually realized that the larceny variable had a big impact on the map. Larceny is the most common and mildest crime in the dataset: theft without the use of force or trespassing. This includes shoplifting, so the larceny variable was highlighting commercial neighborhoods, which is misleading because my rate only controls for residential population. Dropping the larceny variable made the map a much more accurate picture of the residential crime rate. I would have missed this if I hadn't spent time investigating the resulting visual. See below. Including the larceny offenses really highlights downtown, which I don't think is accurate.
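Because each crime row carries its own tract, excluding an offense type is just a filter applied before counting. A rough sketch follows, again assuming a per-1,000-residents rate. The offense labels, tract names, and numbers are hypothetical, not taken from the APD data.

```python
from collections import Counter


def rates_per_1000(rows, population, exclude=()):
    """Crimes per 1,000 residents by tract, skipping any offense in `exclude`."""
    counts = Counter(
        row["tract"] for row in rows if row["offense"] not in exclude
    )
    # Counter returns 0 for tracts with no remaining crimes.
    return {tract: 1000 * counts[tract] / pop for tract, pop in population.items()}


# Hypothetical rows: a commercial tract dominated by larceny,
# and a residential tract with fewer larcenies but more burglaries.
rows = (
    [{"offense": "LARCENY", "tract": "downtown"}] * 90
    + [{"offense": "BURGLARY", "tract": "downtown"}] * 10
    + [{"offense": "LARCENY", "tract": "residential"}] * 20
    + [{"offense": "BURGLARY", "tract": "residential"}] * 40
)
population = {"downtown": 2000, "residential": 2000}

all_rates = rates_per_1000(rows, population)
no_larceny = rates_per_1000(rows, population, exclude={"LARCENY"})

print("All offenses:", all_rates)
print("Without larceny:", no_larceny)
```

In this toy example the ranking flips once larceny is excluded, which is the same effect the filter has on the map.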



I also added a couple of other visuals with the crime data that didn't map census tracts. I put the visuals together in this post, geared towards a less technical audience.
