10 Chart: Bar Chart
10.1 Overview
This section covers how to make bar charts
10.2 tl;dr
I want a nice example. Not tomorrow, not after breakfast. NOW!
Here’s a bar chart showing the survival rates of passengers aboard the RMS Titanic:
And here’s the code:
library(datasets) # data
library(ggplot2) # plotting
library(dplyr) # manipulation
# Combine Children and Adult stats together
<- as.data.frame(Titanic) %>%
ship_grouped group_by(Class, Sex, Survived) %>%
summarise(Total = sum(Freq))
ggplot(ship_grouped, aes(x = Survived, y = Total, fill = Sex)) +
geom_bar(position = "dodge", stat = "identity") +
geom_text(aes(label = Total), position = position_dodge(width = 0.9),
vjust = -0.4, color = "grey68") +
facet_wrap(~Class) +
# formatting
ylim(0, 750) +
ggtitle("Don't Be A Crew Member On The Titanic",
subtitle = "Survival Rates of Titanic Passengers by Class and Gender") +
scale_fill_manual(values = c("#b2df8a", "#a6cee3")) +
labs(y = "Passenger Count", caption = "Source: titanic::titanic_train") +
theme(plot.title = element_text(face = "bold")) +
theme(plot.subtitle = element_text(face = "bold", color = "grey35")) +
theme(plot.caption = element_text(color = "grey68"))
For more info on this dataset, type ?datasets::Titanic
into the console.
10.3 Simple examples
My eyes were bigger than my stomach. Much simpler please!
Let’s use the HairEyeColor
dataset. To start, we will just look at the different categories of hair color among females:
<- as.data.frame(HairEyeColor)
colors
# just female hair color, using dplyr
<- colors %>%
colors_female_hair filter(Sex == "Female") %>%
group_by(Hair) %>%
summarise(Total = sum(Freq))
# take a look at data
head(colors_female_hair)
## # A tibble: 4 × 2
## Hair Total
## <fct> <dbl>
## 1 Black 52
## 2 Brown 143
## 3 Red 37
## 4 Blond 81
Now let’s make some graphs with this data.
10.3.1 Bar graph using base R
barplot(colors_female_hair[["Total"]],
names.arg = colors_female_hair[["Hair"]],
main = "Bar Graph Using Base R")
We recommend using Base R only for simple bar graphs for yourself. Like all of Base R, it is simple to setup. Note: Base R expects a vector or matrix, hence the double brackets in the barplot call (gets columns as lists).
10.3.2 Bar graph using ggplot2
library(ggplot2) # plotting
ggplot(colors_female_hair, aes(x = Hair, y = Total)) +
geom_bar(stat = "identity") +
ggtitle("Bar Graph Using ggplot2")
Bar plots are very easy in ggplot2
. You pass in a dataframe and let it know which parts you want to map to different aesthetics. Note: In this case, we have a table of values and want to plot them as explicit bar heights. Because of this, we specify the y aesthetic as the Total
column, but we also have to specify stat = "identity"
in geom_bar()
so it knows to plot them correctly. Often you will have datasets where each row is one observation and you want to group them into bars. In that case, the y aesthetic and stat = "identity"
do not have to be specified.
10.4 Theory
- For more info about plotting categorical data, check out Chapter 4 of the textbook.
10.5 When to use
Bar Charts are best for categorical data. Often you will have a collection of factors that you want to split into different groups.
10.6 Considerations
10.6.1 Not for continuous data
If you are finding that your bar graphs aren’t looking right, make sure your data is categorical and not continuous. If you want to plot continuous data using bars, that is what histograms are for!
10.7 Modifications
These modifications assume you are using ggplot2
.
10.7.1 Flip Bars
To flip the orientation, just tack on coord_flip()
:
ggplot(colors_female_hair, aes(x = Hair, y = Total)) +
geom_bar(stat = "identity") +
ggtitle("Bar Graph Using ggplot2") +
coord_flip()
10.7.2 Reorder the bars
With both base R and ggplot2 bars are drawn in alphabetical order for character data and in the order of factor levels for factor data. However, since the default order of levels for factor data is alphabetical, the bars will be alphabetical in both cases. Please see this tutorial for a detailed explanation on how bars should be ordered in a bar chart, and how the forcats package can help you accomplish the reordering.
10.7.3 Facet Wrap
You can split the graph into small multiples using facet_wrap()
(don’t forget the tilde, ~):
ggplot(colors, aes(x = Sex, y = Freq)) +
geom_bar(stat = "identity") +
facet_wrap(~Hair)
10.8 External resources
- Cookbook for R: Discussion on reordering the levels of a factor.
- DataCamp Exercise: Simple exercise on making bar graphs with
ggplot2
. - ggplot2 cheatsheet: Always good to have close by.
with