Undergraduate students are often given poor explanations of the logic behind the binomial coefficient. I had someone tell me they understood such “combinatorial methods” when it was explained to them in high school, but that their statistics courses left them confused. I checked my old undergraduate textbook in mathematical statistics and, I have to say, it’s terrible. University textbooks in mathematics and statistics, often costing $100 or more, should provide logic and intuition behind derivations. What follows is my attempt at making the derivation of the binomial coefficient more clear.
Let’s say you have n unique blog posts to read and you want to decide in which order to read them. In how many ways can you order the blog posts? The first item you read can be chosen from n blog posts. The second item you read can be chosen from n-1 blog posts, because you already selected the first item you intend to read (there’s one less to chose from). And so on, leading to n (n-1) (n-2)…3 x 2 x 1 different ways you can order your reading list. The shorthand for this is known as “n factorial”, usually written n ! in equations.
Example: If there are five blog posts then there are n ! = 5 ! = 120 ways to order your reading list.
Now let’s say you only have time to read r blog posts out of your list of n blog posts, where r is less than n. In how many ways can you select and order r blog posts to read? We know from above that there are n ! ways to order all the blog posts. If you’re going to read r blog posts from this list then you have (n–r) blog posts that you’re not going to read. Therefore you need to divide out the (n–r) ! different ways you can order the articles you are not going to read from the n ! ways to order all the blog posts. In other words you have n ! / (n–r) ! ways to select and order the r blog posts to read.
Example: If there are five blog posts and you only have time to read two of them then there are n ! = 5 ! = 120 ways to order the entire reading list, (n–r) ! = (5-2) ! = 6 ways to order the articles you are not going to read, and therefore n ! / (n–r) ! = 120 / 6 = 20 ways to select and order two articles to read.
But now let’s say you don’t care in which order you read the r blog posts. In how many ways can you select r blog posts to read from the n blog posts available to read? The problem is the same as above except that now you don’t care about the order of the r blog posts, and therefore you also need to divide out the r ! different ways you can order articles you can read. In other words you have n ! / (n–r) ! / r ! ways to select the r blog posts to read.
Example: If there are five blog posts and you only have time to read two articles, but you don’t care in what order, then there are n ! = 5 ! = 120 ways to order the entire reading list, (n–r) ! = (5-2) ! = 6 ways to order articles you are not going to read, r ! = 2 ! = 2 ways to order two articles you can read, and therefore n ! / (n–r) ! / r ! = 120 / 6 / 2 = 10 ways to select two articles to read.
The last derivation is the binomial coefficient, sometimes referred to by saying n choose r, since you have n distinct items from which you can choose r at a time. It comes up in the study of the binomial distribution, when we want to determine the probability of getting r successes in n independent trials and the probability of success doesn’t change for each trial.
The classic example is a coin toss. Say there are n tosses of a coin, and you’re betting on the number of heads that come up—let this be r. Now r ! is the number of ways you can order heads (two heads are better than one, but re-ordering them is pointless), and (n–r)! is the number of ways you can order tails. The point is you’re dividing these terms out of n !, the total number of ways you can order the heads and tails, since you don’t care about the repeated arrangements of heads or the repeated arrangements of tails (it’s the unique arrangements of heads and tails that matter to you).