Luk Arbuckle

Binomial coefficient explained simply

In probability on 18 June 2008 at 12:40 am

Undergraduate students are often given poor explanations of the logic behind the binomial coefficient. I had someone tell me they understood such “combinatorial methods” when it was explained to them in high school, but that their statistics courses left them confused. I checked my old undergraduate textbook in mathematical statistics and, I have to say, it’s terrible. University textbooks in mathematics and statistics, often costing $100 or more, should provide logic and intuition behind derivations. What follows is my attempt at making the derivation of the binomial coefficient more clear.

Let’s say you have n unique blog posts to read and you want to decide in which order to read them. In how many ways can you order the blog posts? The first item you read can be chosen from n blog posts. The second item you read can be chosen from n-1 blog posts, because you already selected the first item you intend to read (there’s one less to chose from). And so on, leading to n (n-1) (n-2)…3 x 2 x 1 different ways you can order your reading list. The shorthand for this is known as “n factorial”, usually written n ! in equations.

Example: If there are five blog posts then there are n ! = 5 ! = 120 ways to order your reading list.

Now let’s say you only have time to read r blog posts out of your list of n blog posts, where r is less than n. In how many ways can you select and order r blog posts to read? We know from above that there are n ! ways to order all the blog posts. If you’re going to read r blog posts from this list then you have (n-r) blog posts that you’re not going to read. Therefore you need to divide out the (n-r) ! different ways you can order the articles you are not going to read from the n ! ways to order all the blog posts. In other words you have n ! / (n-r) ! ways to select and order the r blog posts to read.

Example: If there are five blog posts and you only have time to read two of them then there are n ! = 5 ! = 120 ways to order the entire reading list, (n-r) ! = (5-2) ! = 6 ways to order the articles you are not going to read, and therefore n ! / (n-r) ! = 120 / 6 = 20 ways to select and order two articles to read.

But now let’s say you don’t care in which order you read the r blog posts. In how many ways can you select r blog posts to read from the n blog posts available to read? The problem is the same as above except that now you don’t care about the order of the r blog posts, and therefore you also need to divide out the r ! different ways you can order articles you can read. In other words you have n ! / (n-r) ! / r ! ways to select the r blog posts to read.

Example: If there are five blog posts and you only have time to read two articles, but you don’t care in what order, then there are n ! = 5 ! = 120 ways to order the entire reading list, (n-r) ! = (5-2) ! = 6 ways to order articles you are not going to read, r ! = 2 ! = 2 ways to order two articles you can read, and therefore n ! / (n-r) ! / r ! = 120 / 6 / 2 = 10 ways to select two articles to read.

The last derivation is the binomial coefficient, sometimes referred to by saying n choose r, since you have n distinct items from which you can choose r at a time. It comes up in the study of the binomial distribution, when we want to determine the probability of getting r successes in n independent trials and the probability of success doesn’t change for each trial.

The classic example is a coin toss. Say there are n tosses of a coin, and you’re betting on the number of heads that come up—let this be r. Now r ! is the number of ways you can order heads (two heads are better than one, but re-ordering them is pointless), and (n-r)! is the number of ways you can order tails. The point is you’re dividing these terms out of n !, the total number of ways you can order the heads and tails, since you don’t care about the repeated arrangements of heads or the repeated arrangements of tails (it’s the unique arrangements of heads and tails that matter to you).

  1. Thanks for the most excellent explanation of combinatorics. Clear explanations like this are far too rare.

  2. Right on my learning curve. Thank you.

  3. Thanks for trying (honestly), but I have to say that this explanation is extremely lacking, and doesn’t actually explain anything besides why there are n! ways of ordering n blog posts. If I want to know how many ways to order r blog posts chosen from a possible n, why should the unchosen blog posts matter? Why would I take the possible ways of ordering the (n-r) blog posts and divide n! by that number? You are leaving out the most important steps… can hardly call this an explanation, anymore than it would be if you were to just state the result.

  4. “Why should the unchosen blog posts matter?” Because you know there are n! orderings, but you need to remove the blog posts you are not going to read from consideration. Remember that you are choosing r blog posts at random, from a collection of n blog posts, so you can’t just consider r! alone (which assumes there are only r blog posts to begin with). Take a dozen books (n), and try selecting three (r) in every way possible. Play with the ideas until they make sense to you. Each step of the process was broken down and explained to some degree. The rest is up to you.

  5. I have to agree with Jim. I really appreciate someone taking the time to explain this, and the dividing out r! made sense, just not the “divide out the ones you aren’t considering” step…could you elaborate please?

    And yeah, I’ve already wasted a lot of paper writing out examples, and the penny still has potential energy…

  6. The discussion of the blog posts is clear, I find. But I don’t see how the coin toss situation is analogous. While there are n different blog posts and n different coin tosses, each blog post is whatever it is; while by contrast, each coin toss can result in one of two different outcomes, heads or tails. What would be nice then is a derivation of the binomial coefficient purely in the context of the coin toss situation. Thanks.

  7. There are two kinds of coin toss outcomes. Or: there are two kinds of outcome.
    Analogously, there are two kinds of blog posts statusses. The read ones and the unread ones.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.