Chapter 40 Plotting Maps with R: An Example-Based Tutorial

Jonathan Santoso and Kevin Wibisono

In this short tutorial, we introduce several ways of plotting choropleth maps in R. A choropleth map uses differences in shading, colouring, or the placing of symbols within areas to indicate a particular quantity associated with each area. The data set used throughout this tutorial is the 2015 to 2019 crime data for the city of Milwaukee, Wisconsin (obtained from https://data.milwaukee.gov/dataset/wibr). The variables of interest are as follows:

  1. ReportedYear, which takes integer values from 2015 to 2019.
  2. ALD (Aldermanic District), which takes integer values from 1 to 15. Each of these districts is represented by an area in our choropleth maps. The data set contains some observations whose ALD is 0 or NA; we exclude these observations from our exploratory data analysis and visualisation.
  3. Arson, which takes binary values. It has a value of 1 if and only if the crime can be categorised as arson.
  4. AssaultOffense, which takes binary values. It has a value of 1 if and only if the crime can be categorised as an assault offence.
  5. CriminalDamage, which takes binary values. It has a value of 1 if and only if the crime can be categorised as criminal damage to property.
  6. LockedVehicle, which takes binary values. It has a value of 1 if and only if the crime can be categorised as a locked vehicle entry.
  7. VehicleTheft, which takes binary values. It has a value of 1 if and only if the crime can be categorised as a vehicle theft.

We note that a crime can fall into more than one category. For example, the 27th row of the data set refers to a crime categorised as both an assault offence and criminal damage to property.
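Because the category columns are 0/1 flags, rows that belong to several categories can be found by summing the flags across each row. The following is a minimal sketch on an invented four-row data frame (the values are illustrative and are not taken from the Milwaukee data):

```r
# Toy stand-in for the crime data; all values are invented for illustration.
crimes <- data.frame(
  Arson          = c(0, 0, 0, 0),
  AssaultOffense = c(1, 0, 1, 0),
  CriminalDamage = c(1, 0, 0, 1),
  LockedVehicle  = c(0, 1, 0, 0),
  VehicleTheft   = c(0, 1, 0, 0)
)

flags <- c("Arson", "AssaultOffense", "CriminalDamage",
           "LockedVehicle", "VehicleTheft")

# Number of categories each crime falls into.
n_categories <- rowSums(crimes[, flags])

# Row indices of crimes recorded under more than one category.
multi <- which(n_categories > 1)
print(crimes[multi, ])
```

One consequence worth keeping in mind is that per-category counts may add up to more than the number of rows.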

As a first step, we load all the necessary libraries.
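The original code chunk is not reproduced here, so the package list below is an assumption based on the packages named in this tutorial (rgdal, ggplot2, leaflet and tmap). As a sketch, the following checks which of them are installed before attaching them. Note that rgdal has since been retired from CRAN, so it may need to be installed from an archive; sf is the usual replacement for reading shapefiles.

```r
# Packages named in this tutorial; the exact list the authors loaded
# is an assumption.
pkgs <- c("rgdal", "ggplot2", "leaflet", "tmap")

# TRUE for every package that can be loaded on this machine.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

if (any(!installed)) {
  message("Not installed: ", paste(pkgs[!installed], collapse = ", "))
}

# Attach only the packages that are available.
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```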

In order to plot custom map boundaries, we need a .shp file for the boundaries, which can be obtained from https://data.milwaukee.gov/dataset/aldermanic-districts. This file contains coordinates, labels and shapes, and can be read (and automatically parsed) using the ‘rgdal’ package. The attribute table of the resulting object is accessed through its data slot, shapefile@data. We also use the fortify function to convert the object read from the .shp file into a dataframe.

Reading the .shp file and converting it into a dataframe produces the following log:

## OGR data source with driver: ESRI Shapefile 
## Source: "/home/travis/build/jtr13/cc19/resources/plotting_maps_tutorial/alderman_coord.shp", layer: "alderman_coord"
## with 15 features
## It has 6 fields
## Integer64 fields read as strings:  ALD COLORCAT
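The log above is the kind of message printed by rgdal::readOGR for a layer named alderman_coord. A hedged sketch of such a read is shown below. The relative directory is an assumption, read_districts is a hypothetical helper introduced only for illustration, and the helper returns NULL when the file or the rgdal package is unavailable. The fortify methods for sp objects are deprecated in recent ggplot2 releases, so this step may warn or fail depending on the installed version.

```r
# Hypothetical helper: read the district boundaries if everything needed
# is available, otherwise return NULL.
read_districts <- function(dsn, layer = "alderman_coord") {
  shp_file <- file.path(dsn, paste0(layer, ".shp"))
  if (!file.exists(shp_file) || !requireNamespace("rgdal", quietly = TRUE)) {
    return(NULL)
  }
  rgdal::readOGR(dsn = dsn, layer = layer)
}

# Assumed location of the shapefile relative to the working directory.
shapefile <- read_districts("resources/plotting_maps_tutorial")

if (!is.null(shapefile)) {
  # Attribute table: one row per district, including the ALD column.
  print(head(shapefile@data))

  # Long-format dataframe of polygon vertices (long, lat, id, group, ...).
  shapefile_df <- ggplot2::fortify(shapefile)
}
```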

Now, we load the crime data, focussing on the seven columns mentioned above. Note that we delete rows whose ALD values are 0 or NA.
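The filtering step can be sketched as follows. The data frame here is an invented stand-in; with the real data it would come from something like read.csv on the downloaded file, whose name is not given in this tutorial.

```r
# Toy stand-in for the downloaded crime data; values are invented.
crimes_raw <- data.frame(
  ReportedYear   = c(2018, 2018, 2019, 2017, 2018),
  ALD            = c(3, 0, NA, 12, 4),
  Arson          = c(0, 0, 0, 1, 0),
  AssaultOffense = c(1, 0, 0, 0, 0),
  CriminalDamage = c(0, 1, 0, 0, 0),
  LockedVehicle  = c(0, 0, 1, 0, 1),
  VehicleTheft   = c(0, 0, 0, 0, 1),
  Extra          = letters[1:5]  # a column we do not need
)

keep_cols <- c("ReportedYear", "ALD", "Arson", "AssaultOffense",
               "CriminalDamage", "LockedVehicle", "VehicleTheft")

# Keep the seven columns of interest and drop rows whose ALD is 0 or NA.
valid <- !is.na(crimes_raw$ALD) & crimes_raw$ALD != 0
crimes <- crimes_raw[valid, keep_cols]
print(crimes)
```

Testing for NA first matters: comparing NA with 0 yields NA rather than FALSE, and NA in a logical row index produces an all-NA row instead of dropping it.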

In order to label the plots nicely, we place each label at the centroid of its polygon, so the centroids must be calculated first. We also need to map the ID column in the .shp file to our desired labels, which are the ALD values. The mapping can be found in the .shp data file, where the row names correspond to ID and the ALD column corresponds to our labels.

Reference: https://stackoverflow.com/questions/28962453/how-can-i-add-labels-to-a-choropleth-map-created-using-ggplot2.
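One simple way to obtain label positions from a fortified dataframe is to take the midpoint of each polygon's bounding box, which approximates the centroid for compact shapes. The sketch below uses two invented squares and a made-up id-to-ALD lookup; with the real data the lookup would be built from the row names and the ALD column of shapefile@data.

```r
# Toy fortified outline: two unit squares with ids "0" and "1"
# (invented coordinates).
shapefile_df <- data.frame(
  long = c(0, 1, 1, 0, 0,  2, 3, 3, 2, 2),
  lat  = c(0, 0, 1, 1, 0,  0, 0, 1, 1, 0),
  id   = rep(c("0", "1"), each = 5)
)

# Hypothetical id-to-ALD lookup; in the tutorial this would be something like
# data.frame(id = rownames(shapefile@data), ALD = shapefile@data$ALD).
id_to_ald <- data.frame(id = c("0", "1"), ALD = c(7, 12))

# Midpoint of the bounding box of each polygon, used as the label position.
label_pos <- aggregate(cbind(long, lat) ~ id, data = shapefile_df,
                       FUN = function(x) mean(range(x)))

# Attach the district label to each position.
label_pos <- merge(label_pos, id_to_ald, by = "id")
print(label_pos)
```

For strongly non-convex districts the bounding-box midpoint can fall outside the polygon, in which case a true area-weighted centroid may be preferable.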

40.1 Plotting using base R

Now, let’s use base R to visualise the number of vehicle-related crimes in each of the fifteen districts in 2018.

##  [1] (400,600]   (200,400]   (400,600]   (400,600]   (0,200]     (200,400]  
##  [7] (400,600]   (600,800]   (400,600]   (200,400]   (800,1e+03] (400,600]  
## [13] (600,800]   (400,600]   (600,800]  
## 7 Levels: (0,200] (200,400] (400,600] (600,800] ... (1.2e+03,1.4e+03]

Plotting maps in base R can sometimes be frustrating. Even though relatively little code is needed, we have to define the colour scheme manually. Moreover, we need to modify the data inside the object read from the .shp file, since base R plots the map directly from that S4 object.
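The binning printed above and the manual colour assignment can be sketched as follows. The breaks are inferred from the printed factor levels (seven bins of width 200), while the counts and the colour ramp are invented for illustration.

```r
# Invented per-district counts (not the Milwaukee figures).
counts <- c(`1` = 450, `2` = 310, `3` = 1250, `4` = 90)

# Seven bins of width 200, matching the levels printed above.
breaks <- seq(0, 1400, by = 200)
bins <- cut(counts, breaks = breaks)

# One manually defined colour per bin, from light to dark.
shades <- colorRampPalette(c("lightyellow", "darkred"))(nlevels(bins))

# Colour for each district, looked up by its bin index.
district_cols <- shades[as.integer(bins)]
print(data.frame(district = names(counts), bin = bins, colour = district_cols))

# With the real SpatialPolygonsDataFrame, a vector like district_cols could be
# passed as plot(shapefile, col = district_cols), followed by a legend() call
# that pairs levels(bins) with shades.
```

The colour vector must be in the same order as the polygons in the spatial object, which is why the ID-to-ALD mapping matters here.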

40.3 Plotting interactively using leaflet

Now, we create our map using leaflet, which provides interactive features such as panning, zooming and hover labels. For this plot, we use the same dataframe.
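A minimal leaflet choropleth might look like the sketch below. It does not use the Milwaukee shapefile: the two square "districts", their counts and the colour scale domain are all invented, and each polygon is added separately so that the example needs only the leaflet package.

```r
library(leaflet)

# Two invented square "districts" near Milwaukee with made-up counts.
districts <- list(
  list(ald = 1, crimes = 450,
       lng = c(-87.95, -87.90, -87.90, -87.95),
       lat = c(43.00, 43.00, 43.05, 43.05)),
  list(ald = 2, crimes = 310,
       lng = c(-87.90, -87.85, -87.85, -87.90),
       lat = c(43.00, 43.00, 43.05, 43.05))
)

# Continuous colour scale over an assumed range of counts.
pal <- colorNumeric(palette = "YlOrRd", domain = c(0, 500))

m <- addTiles(leaflet())
for (d in districts) {
  m <- addPolygons(
    m, lng = d$lng, lat = d$lat,
    fillColor = pal(d$crimes), fillOpacity = 0.7,
    color = "black", weight = 1,
    label = paste0("District ", d$ald, ": ", d$crimes)
  )
}
m <- addLegend(m, pal = pal, values = c(0, 500), title = "Crimes")
```

Printing m in an interactive session or an R Markdown chunk displays the map. With a spatial object, the usual pattern is a single addPolygons call whose fillColor is a formula over a count column.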

40.4 Plotting using tmap

We will now use tmap to generate a faceted map of the total number of vehicle-related crimes in each year from 2016 to 2019.
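A faceted tmap plot is built by adding a facet specification to an ordinary map. The sketch below assumes that the sf and tmap packages are installed and uses an invented sf object with two square "districts" observed in two years. The column names and values are illustrative only.

```r
library(sf)
library(tmap)

# Helper for this sketch only: a closed square with its lower-left corner
# at (x0, y0) and side length 0.05 degrees.
square <- function(x0, y0, side = 0.05) {
  st_polygon(list(rbind(
    c(x0, y0), c(x0 + side, y0), c(x0 + side, y0 + side),
    c(x0, y0 + side), c(x0, y0)
  )))
}

# Long format: one row per district and year (values are invented).
geom <- st_sfc(square(-87.95, 43), square(-87.90, 43),
               square(-87.95, 43), square(-87.90, 43), crs = 4326)
toy <- st_sf(
  ALD      = c(1, 2, 1, 2),
  year     = c(2018, 2018, 2019, 2019),
  crimes   = c(450, 310, 380, 260),
  geometry = geom
)

# One panel per year, each district filled according to its count.
tm <- tm_shape(toy) + tm_polygons("crimes") + tm_facets(by = "year")
```

Printing tm draws the panels. The long layout, with the geometry repeated for each year, is what lets tm_facets split the data by the year column.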

We can also generate an animated map based on the faceted map above.

The tmap_animation function will automatically generate a .gif file named ‘edav.gif’ in the same working directory as this .Rmd file. In order to display the .gif file, one may need to upload the file to giphy and insert the link in the plain text of the .Rmd file. The link for this .gif file is https://media.giphy.com/media/lp6S5QJA78fMFfnUAh/giphy.gif.

From these plots, at least two insights can be drawn:

  1. Vehicle-related crimes happened more often in downtown districts (e.g. 3, 4, 6 and 12). This trend is consistent across the years.
  2. Using the faceted map or the animated map, we can clearly see that the number of vehicle-related crimes in most districts decreased quite significantly over the years.

In conclusion, ggplot2 offers a practical yet powerful way to plot maps. The same holds for leaflet, which adds interactivity. One may also consider tmap, a “powerful and flexible map-making package” which allows for a broader range of spatial classes. As tmap is built on a grammar of graphics, users already familiar with ggplot2 should be able to learn this versatile package easily. In the future, this tutorial could be expanded to create interactive plots that display how crime varies across years and, potentially, selectors to visualise different crimes in a single plot.

Happy coding with R!