Chapter 42 Using Stamen Maps for Plotting Spatial Data

Kumari Nishu and Neelam Patodia

Objective: We intend to highlight the usability of Stamen Maps for visualizing spatial data.

Approach: We conduct a comparative study of different types of graphs to identify the visualization tool that best presents spatial data. We use the publicly available data on the number of vehicle collisions in New York for this demonstration.

For the purpose of our analysis, we look for patterns associated with the number of persons injured. For this we have selected the following potential parameters: Time, Latitude, Longitude, Borough and Contributing Factor with respect to vehicles. Since the data records the exact time of each collision, we have created broader buckets that categorise time into 4 slots.
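The time bucketing described above can be sketched in base R as follows. This is only an illustration: the cut points, the slot labels, the helper name time_slot and the "H:MM" input format are assumptions, since the boundaries actually used are not stated here.

```r
# Hypothetical helper: map an hour of day (0-23) to one of 4 time slots.
# The boundaries and labels are assumed, not taken from the original analysis.
time_slot <- function(hour) {
  cut(hour,
      breaks = c(0, 6, 12, 18, 24),
      labels = c("Night", "Morning", "Afternoon", "Evening"),
      right = FALSE,          # intervals are [0,6), [6,12), [12,18), [18,24]
      include.lowest = TRUE)
}

# Toy collision times in an assumed "H:MM" format (invented for illustration)
crash_time <- c("0:05", "8:30", "13:45", "23:59")

# Keep only the hour part, then assign each record to its slot
hours <- as.integer(sub(":.*", "", crash_time))
slots <- time_slot(hours)
print(slots)
```

Keeping the slots as a factor means they can later be mapped directly to a colour scale.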

Using the plots above, we had to plot 5 graphs (3rd GGPLOT) to understand the pattern of accidents occurring across different Boroughs at different times and the factor contributing the most to the accidents. However, this means assimilating information from multiple graphs, which is tedious. Additionally, if we were to further analyse the sub-localities within each Borough, plotting the latitudes and longitudes on an x-y axis would not convey much information. We thus try a different approach: overlaying these plots on an actual map of the area recorded.

  • We have used Stamen maps as opposed to Google maps, as the latter requires an API key and costs money beyond a certain number of views per day.

  • The qmplot function from the ggmap package is used to add the Stamen map background.

  • We get the base map of New York based on the latitudes and longitudes (Step 1).

  • Similar to ggplot, we can add layers on the maps, which enables a better visualization of events occurring in a locality.

  • Step 2 shows how the map would look if we plotted all the accidents in the raw format based solely on the latitudes and longitudes.

  • In the next steps, we therefore group the data by potential parameters such as Borough and Contributing Factor. Additionally, instead of plotting every data point at its own latitude and longitude, we use the mean latitude and longitude of each group to give a better perspective of the number of accidents in a locality.

  • Stamen Maps offers various map backgrounds such as toner, watercolor, burning, terrain, etc. for varied requirements. For our analysis, toner seemed to be the best fit as it highlights the cities and intersections, which are vital for understanding vehicle collisions on the road.

  • We vary size and colour to highlight 2 parameters, i.e. the size of the bubble denotes whether more or fewer accidents occurred in a region (based on latitude and longitude), and the colour denotes whether there is a pattern across different time slots. The colour gradient can be changed to accommodate a different parameter.

  • Using maps has thus enabled us to seamlessly identify patterns in the data by accommodating multiple parameters in one graph. Overall, it offers a better visualization of spatial data than normal graphs.
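The grouping and bubble-map steps in the list above can be sketched roughly as follows. The data frame, its column names and the helper plot_on_map are hypothetical, and the rows are toy values invented for illustration rather than real collision records. The sketch groups by Borough and time slot; Contributing Factor could be added to the grouping in the same way. The qmplot call is wrapped in a function and left uncalled because it needs the ggmap package and network access; depending on the ggmap version, Stamen tiles may be served through Stadia Maps and may require an API key and a different maptype name.

```r
# Toy data (invented for illustration; column names are assumptions)
collisions <- data.frame(
  borough   = c("BROOKLYN", "BROOKLYN", "QUEENS", "QUEENS", "QUEENS"),
  time_slot = c("Morning", "Morning", "Evening", "Evening", "Evening"),
  latitude  = c(40.65, 40.67, 40.72, 40.74, 40.73),
  longitude = c(-73.95, -73.93, -73.80, -73.82, -73.81)
)

# Mean latitude and longitude per (borough, time_slot) group
centres <- aggregate(cbind(latitude, longitude) ~ borough + time_slot,
                     data = collisions, FUN = mean)

# Number of accidents per group (row count), renamed for clarity
counts <- aggregate(latitude ~ borough + time_slot,
                    data = collisions, FUN = length)
names(counts)[names(counts) == "latitude"] <- "accidents"

# One row per group: mean position plus accident count
summary_df <- merge(centres, counts, by = c("borough", "time_slot"))
print(summary_df)

# Hypothetical wrapper: bubble size shows the accident count,
# colour shows the time slot, drawn on a toner background.
plot_on_map <- function(df) {
  ggmap::qmplot(longitude, latitude, data = df,
                maptype = "toner", geom = "point",
                size = accidents, colour = time_slot)
}
# plot_on_map(summary_df)  # requires ggmap and network access
```

Plotting one bubble per group instead of one point per collision keeps the map readable while still showing where accidents concentrate.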