Monday, July 13, 2015

Building a Better Crime Map: Learning QGIS for Mapping in Tableau




Hover over the map above to view crime rates by census tract. Use the zoom tools in the upper left to better locate your neighborhood.


I don't usually post how-to Tableau blog posts because there are so many good ones out there, but I learned some new tools for this one and wanted to share.

A friend of mine was recently complaining about local hackathons: "Great, just what we need, another crime map." Although I'm not usually confused by sarcasm, I decided to build a crime map. The Atlanta neighborhood crime maps I found were based on counts instead of rates, which is misleading when neighborhoods have different populations. There were other weaknesses too: some maps don't reveal their methodology, and most offer only a map, without the kind of actions linking to other data that are easy in Tableau.

The Atlanta Police Department shares a great crime dataset on its website that includes location coordinates, crime type, time, date, and neighborhood for the past five-plus years. The problem is that Atlanta neighborhoods don't have good population data, making a crime rate per person difficult to calculate. So instead of using neighborhoods, I used census tracts. To show crime rate per census tract in Tableau, I first had to merge the crime points to census shapes, something I had never done.

QGIS

I understood that I'd have to use GIS software (which I had never used) to merge points to shapes. I downloaded QGIS and did a couple of beginner tutorials. QGIS is free, open-source software, and I found it easy to use and very powerful.

I then tried a points-in-polygon analysis with my data, merging the APD data with Georgia census tracts. I had to use a subset of the crime data because this merge was very slow. The result had the number of crimes per census tract, but I actually needed the census tract for each individual crime so I could take advantage of Tableau's aggregation and filtering features.

A spatial join allowed me to perform the merge I needed. The APD data was my target vector layer and the census data was my join vector layer. This technique was also much faster (not sure why) so the full data set was not a problem.
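For readers curious about what a spatial join of points to polygons does conceptually, here is a minimal pure-Python sketch. It is not what QGIS runs internally; it assumes each tract is a simple polygon given as a list of (longitude, latitude) vertices, and the field names (`lon`, `lat`, `geoid`), the tract IDs, and the coordinates are all made up for illustration.

```python
from typing import Optional

Point = tuple[float, float]   # (longitude, latitude)
Polygon = list[Point]         # vertices in order; the closing vertex is not repeated


def point_in_polygon(point: Point, polygon: Polygon) -> bool:
    """Ray-casting test: count edge crossings of a ray going right from the point."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def spatial_join(crimes: list[dict], tracts: dict[str, Polygon]) -> list[dict]:
    """Return a copy of each crime row with the ID of the tract that contains it."""
    joined = []
    for crime in crimes:
        geoid: Optional[str] = None
        for tract_id, polygon in tracts.items():
            if point_in_polygon((crime["lon"], crime["lat"]), polygon):
                geoid = tract_id
                break
        joined.append({**crime, "geoid": geoid})
    return joined


# Hypothetical data: two square "tracts" side by side and three crime points.
tracts = {
    "T1": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "T2": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
crimes = [
    {"offense": "BURGLARY", "lon": 0.5, "lat": 0.5},
    {"offense": "ROBBERY", "lon": 1.5, "lat": 0.25},
    {"offense": "AUTO THEFT", "lon": 5.0, "lat": 5.0},  # outside both tracts
]
joined = spatial_join(crimes, tracts)
```

The output keeps one row per crime, each tagged with a tract ID (or `None` if it falls outside every tract), which is the shape Tableau needs for aggregation and filtering. This naive version checks every polygon for every point; real GIS tools typically use a spatial index to avoid that.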

Excel

I then saved the crime data as an Excel file and merged in income and population data by census tract, using Excel because sometimes I'm lazy. I used census tract population and income data from the 2013 American Community Survey 5-year data set, available on American FactFinder.
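The same lookup-style merge can be sketched in a few lines of Python for anyone who would rather script it than use Excel. This is only an illustration: the column names (`geoid`, `population`, `median_income`) and all values are hypothetical, not taken from the APD or ACS files.

```python
def merge_by_tract(crime_rows, acs_by_tract):
    """Attach tract-level population and income to each crime row (a left join)."""
    merged = []
    for row in crime_rows:
        # Tracts missing from the ACS table yield None rather than dropping the crime.
        tract = acs_by_tract.get(row["geoid"], {})
        merged.append({
            **row,
            "population": tract.get("population"),
            "median_income": tract.get("median_income"),
        })
    return merged


# Hypothetical inputs: crimes already tagged with a tract ID, plus an ACS lookup table.
crime_rows = [
    {"offense": "BURGLARY", "geoid": "T1"},
    {"offense": "ROBBERY", "geoid": "T2"},
    {"offense": "AUTO THEFT", "geoid": "T9"},  # no ACS record for this tract
]
acs_by_tract = {
    "T1": {"population": 4000, "median_income": 52000},
    "T2": {"population": 2500, "median_income": 31000},
}
merged_rows = merge_by_tract(crime_rows, acs_by_tract)
```

Keeping unmatched crimes with empty values, rather than silently dropping them, makes it easier to spot tract IDs that failed to match.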

Alteryx

Although I merged census tracts to points, I still needed to load the census shapes in Tableau for the visual. To do this, I used the Tableau Shapefile to Polygon Converter from Alteryx. There are a few ways to do this, but I find the Alteryx solution very easy, as long as the shapefile isn't too huge.
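To give a sense of what this kind of conversion produces, here is a rough Python sketch of flattening polygons into a table with one row per vertex, which is the general layout Tableau's polygon mark type works with (a polygon ID plus a point-order field for the path). It does not reproduce the Alteryx tool's actual output; the column names and the triangle coordinates are assumptions for illustration, and it does not read a real shapefile.

```python
import csv
import io


def polygon_rows(geoid, ring):
    """One output row per vertex, numbered in drawing order."""
    return [
        {"geoid": geoid, "point_order": order, "lon": lon, "lat": lat}
        for order, (lon, lat) in enumerate(ring, start=1)
    ]


def tracts_to_csv(tracts):
    """Flatten {geoid: [(lon, lat), ...]} into CSV text that Tableau can load."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["geoid", "point_order", "lon", "lat"],
        lineterminator="\n",
    )
    writer.writeheader()
    for geoid, ring in tracts.items():
        writer.writerows(polygon_rows(geoid, ring))
    return buffer.getvalue()


# Hypothetical tract: a single triangle.
example_tracts = {"T1": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]}
csv_text = tracts_to_csv(example_tracts)
```

In Tableau, the point-order column would go on Path and the tract ID on Detail, so each tract is drawn as its own closed shape.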

Tableau

I then loaded the census tract file in Tableau to make a polygon-shaded map and blended in the crime data on census tract geoID to make the color variable. The resulting visual, with some additional graphs as hover actions, is at the top of the post.

Counts vs. Rates

To see the value of switching from counts to rates, compare the two maps below, with counts on the right and rates on the left. The maps are noticeably different, with the highest crime area moving from downtown to a southeast tract.
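A small worked example shows why the ranking can flip. The numbers below are invented purely to illustrate the arithmetic (rate per 1000 residents = 1000 × crimes / population); they are not figures from the Atlanta data.

```python
from collections import Counter


def crime_rates(crime_geoids, population, per=1000):
    """Crimes per `per` residents for every tract with a positive population."""
    counts = Counter(crime_geoids)
    return {
        tract: per * counts[tract] / pop
        for tract, pop in population.items()
        if pop > 0
    }


# Hypothetical tracts: "A" has more crimes, but also far more residents than "B".
crime_geoids = ["A"] * 300 + ["B"] * 120
population = {"A": 6000, "B": 1000}

counts = Counter(crime_geoids)
rates = crime_rates(crime_geoids, population)

highest_by_count = max(counts, key=counts.get)
highest_by_rate = max(rates, key=rates.get)
```

Here tract A leads on raw counts (300 versus 120), but tract B has the higher rate (120 per 1000 versus 50 per 1000), so the "worst" tract depends entirely on which measure the map uses.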



Understanding the Data

I also investigated the resulting product out of curiosity about crime in my city. I eventually realized that the larceny variable had a big impact on the map. Larceny is the most common and mildest crime in the dataset: theft without the use of force or trespassing. This includes shoplifting, so the larceny variable was highlighting commercial neighborhoods, which is misleading because my rate only controls for residential population. Dropping the larceny variable made the map a much more accurate picture of the residential crime rate. I would have missed this if I hadn't spent time investigating the resulting visual. See below: including the larceny offenses really highlights downtown, which I don't think is accurate.



I also added a couple of other visuals with the crime data that didn't map census tracts. I put the visuals together in this post, geared toward a less technical audience.

Sunday, July 12, 2015

Atlanta Crime Map



Crime maps often show only counts, but this map uses census data to show crime as a rate, giving a more accurate picture. 

Hover over the map above to view crime rates by census tract. Use the zoom tools in the upper left to better locate your neighborhood. 

The city has large variations in crime rates. The northern Atlanta neighborhood with the lowest crime rate in the city (between Northside Parkway and Peachtree Road) has a crime rate of 2.8 per 1000 residents, while some neighborhoods have crime rates of over 100 per 1000 residents.

To better understand the map, consider the graph below, with my own neighborhood highlighted.

In the "2014 Crime Rate by Type" graph, I'm glad to see that my neighborhood's crime rate is below the city average for each category. Burglary is the most common crime and closest to the city average as a percentage, which seems to mirror the concerns I see on my neighborhood Facebook page. However, that rate is about 10 per 1000 persons (1%), so the risk of being burglarized in a given year is not high.

The "Crimes per Year" graph also has promising trends for my neighborhood. Last year (2014) was a relatively low year for crime in my neighborhood, and there has been a large drop since 2009.



These maps include all crime recorded in the public Atlanta Police Department dataset, with the exception of larceny (theft without unlawful entry or threat of force). Larceny was excluded because it includes shoplifting; with it included, the maps show the highest crime rates in commercial areas, and differing levels of residential crime are harder to see.

The maps above use census tracts instead of city neighborhoods because census tracts have better population data. The visual below uses neighborhoods to show changes in crime rates and to map the location of each crime. Larceny is included in these maps. Again, my own neighborhood is highlighted, but the user can click on any neighborhood using the graph on the left.

The map shows exact locations of crimes and I'm able to see there were only three robberies (the category that includes muggings) in Kirkwood during the first half of 2015 (January-May 18). I also found it useful to pick a longer time range (slider on top right) and a single crime type (drop down on the top left), then see what times of day that crime occurs. In Kirkwood, robberies tend to happen in the evenings, but burglaries are more likely to happen in the morning.



Update (7/12): A couple of people asked about the relationship between income and crime. It's pretty strong; see the graph below. Census tract income data during non-census years is not very precise (I'm using the ACS 2013 five-year file); otherwise the relationship would be even stronger.




For more information on how to build these visuals, see this post.