library(tidyverse)
library(gganimate)
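# Build the rows for one step of the animation: the full distribution for
# n trials, plus the two half-height copies that stack into n + 1 trials.
make_df <- function(n) {
  # Full binomial distribution for n trials (purple bars)
  all <- tibble(
    x = 0:n,
    size = paste("n =", n),
    prob = dbinom(0:n, n, .5),
    stage = paste("n =", n),
    group = 1:(n + 1),
    fill = "p(x) n trials",
    column = "n"
  )
  # Half of each probability, shifted one to the right: the next flip is heads
  heads <- tibble(
    x = 1:(n + 1),
    size = paste("n =", n + 1),
    prob = dbinom(0:n, n, .5) / 2,
    stage = paste("n =", n, "transitioning to n =", n + 1),
    group = 1:(n + 1),
    fill = "p(x-1) n-1 trials",
    column = "n - 1"
  )
  # Half of each probability, left in place: the next flip is tails
  tails <- tibble(
    x = 0:n,
    size = paste("n =", n + 1),
    prob = dbinom(0:n, n, .5) / 2,
    stage = paste("n =", n, "transitioning to n =", n + 1),
    group = 1:(n + 1),
    fill = "p(x) n-1 trials",
    column = "n - 1"
  )
  # Fix the factor order so the fill colors and facet columns are stable
  rbind(all, heads, tails) %>%
    mutate(
      fill = fct_relevel(
        factor(fill),
        "p(x-1) n-1 trials",
        "p(x) n-1 trials",
        "p(x) n trials"
      ),
      column = fct_relevel(column, "n - 1")
    )
}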
num <- 12
df <- purrr::map_dfr(1:num, make_df) %>%
  mutate(size = factor(size, levels = unique(size)),
         stage = factor(stage, levels = unique(stage)))
If \(p = .5\), half of the sum \(P(X = x) + P(X = x-1)\) for \(n\) trials gives us \(P(X = x)\) for \(n+1\) trials. (For a more thorough explanation of why this is true, see Understanding the Binomial Recursively.)
That is,
\[\frac{b(x; n, .5)}{2} + \frac{b(x-1; n, .5)}{2} = b(x; n+1, .5)\]
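Before animating anything, the identity can be checked numerically. The snippet below is a minimal sketch in base R; `check_recursion` is a made-up helper name for this post, not part of any package. It relies on `dbinom()` returning 0 for values of \(x\) outside \(0, \dots, n\), which covers the two end points.

```r
# Hypothetical helper: compare both sides of the recursion for one value of n.
check_recursion <- function(n) {
  x <- 0:(n + 1)
  # Left side: half of b(x; n, .5) plus half of b(x - 1; n, .5)
  lhs <- dbinom(x, n, 0.5) / 2 + dbinom(x - 1, n, 0.5) / 2
  # Right side: b(x; n + 1, .5)
  rhs <- dbinom(x, n + 1, 0.5)
  # all.equal() compares within floating-point tolerance
  isTRUE(all.equal(lhs, rhs))
}

# Check every n used in the plots
all(sapply(1:12, check_recursion))
```

If the recursion holds, the last line should come back `TRUE` for all twelve values of \(n\).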
This animation demonstrates the recursion visually for \(n = 2\) to \(n = 12\): the red and blue bars each represent half of the probabilities for \(n\) trials, with the red bars shifted one unit to the right, while the purple bars represent the binomial probabilities for \(n+1\) trials. (Note that the blue bars are half the height of the purple bars; the \(x\) values are the same, so the purple bars shrink into the blue ones for \(n+1\) trials.)
g <-
  ggplot(df, aes(x, prob, fill = fill, group = group)) +
  geom_col() +
  scale_fill_manual(values = c("red", "blue", "purple")) +
  scale_x_continuous(breaks = 0:max(df$x)) +
  labs(title = 'Binomial Distribution: {closest_state}', y = 'probability') +
  theme(legend.position = "bottom",
        legend.title = element_blank())
anim <-
  g + transition_states(stage) + exit_recolor(fill = "purple")
animate(anim, fps = 5)
Animations can be hard to digest, so here’s a static version:
df %>%
  mutate(sizenum = parse_number(as.character(size))) %>%
  filter(sizenum != 1, sizenum != max(sizenum)) %>%
  ggplot(aes(x, prob, fill = fill)) +
  geom_col() +
  scale_x_continuous(breaks = 0:max(df$x)) +
  scale_fill_manual(values = c("red", "blue", "purple")) +
  facet_grid(size ~ column) +
  ggtitle("Binomial distribution") +
  theme(legend.position = "bottom",
        legend.title = element_blank())
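To read one row of the static plot by hand, take the \(n = 3\) row as an example. In the left panel, the bar at \(x = 1\) is a blue piece of height \(b(1; 2, .5)/2\) stacked with a red piece of height \(b(0; 2, .5)/2\), and together they should match the purple bar at \(x = 1\) in the right panel:

\[\frac{b(1; 2, .5)}{2} + \frac{b(0; 2, .5)}{2} = \frac{1/2}{2} + \frac{1/4}{2} = \frac{3}{8} = b(1; 3, .5)\]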