Undergraduate students are often given poor explanations of the logic behind the binomial coefficient. I had someone tell me they understood such “combinatorial methods” when it was explained to them in high school, but that their statistics courses left them confused. I checked my old undergraduate textbook in mathematical statistics and, I have to say, it’s terrible. University textbooks in mathematics and statistics, often costing $100 or more, should provide logic and intuition behind derivations. What follows is my attempt at making the derivation of the binomial coefficient more clear.

Let’s say you have *n* unique blog posts to read and you want to decide in which order to read them. In how many ways can you order the blog posts? The first item you read can be chosen from *n* blog posts. The second item you read can be chosen from *n*-1 blog posts, because you already selected the first item you intend to read (there’s one less to chose from). And so on, leading to *n* (*n*-1) (*n*-2)…3 x 2 x 1 different ways you can order your reading list. The shorthand for this is known as “*n* factorial”, usually written *n* ! in equations.

**Example**: If there are five blog posts then there are *n* ! = 5 ! = 120 ways to order your reading list.

Now let’s say you only have time to read *r* blog posts out of your list of *n* blog posts, where *r* is less than *n*. In how many ways can you select and order *r* blog posts to read? We know from above that there are *n* ! ways to order all the blog posts. If you’re going to read *r* blog posts from this list then you have (*n*–*r*) blog posts that you’re *not* going to read. Therefore you need to divide out the (*n*–*r*) ! different ways you can order the articles you are *not* going to read from the *n* ! ways to order all the blog posts. In other words you have *n* ! / (*n*–*r*) ! ways to select and order the *r* blog posts to read.

**Example**: If there are five blog posts and you only have time to read two of them then there are *n* ! = 5 ! = 120 ways to order the entire reading list, (*n*–*r*) ! = (5-2) ! = 6 ways to order the articles you are not going to read, and therefore *n* ! / (*n*–*r*) ! = 120 / 6 = 20 ways to select and order two articles to read.

But now let’s say you don’t care in which order you read the *r* blog posts. In how many ways can you select *r* blog posts to read from the *n* blog posts available to read? The problem is the same as above except that now you don’t care about the order of the *r* blog posts, and therefore you also need to divide out the *r* ! different ways you can order articles you can read. In other words you have *n* ! / (*n*–*r*) ! / *r* ! ways to select the *r* blog posts to read.

**Example**: If there are five blog posts and you only have time to read two articles, but you don’t care in what order, then there are *n* ! = 5 ! = 120 ways to order the entire reading list, (*n*–*r*) ! = (5-2) ! = 6 ways to order articles you are not going to read, *r* ! = 2 ! = 2 ways to order two articles you can read, and therefore *n* ! / (*n*–*r*) ! / *r* ! = 120 / 6 / 2 = 10 ways to select two articles to read.

The last derivation is the binomial coefficient, sometimes referred to by saying *n choose r*, since you have *n* distinct items from which you can *choose* *r* at a time. It comes up in the study of the binomial distribution, when we want to determine the probability of getting *r* successes in *n* independent trials and the probability of success doesn’t change for each trial.

The classic example is a coin toss. Say there are *n* tosses of a coin, and you’re betting on the number of heads that come up—let this be *r*. Now *r* ! is the number of ways you can order heads (two heads are better than one, but re-ordering them is pointless), and (*n*–*r*)! is the number of ways you can order tails. The point is you’re dividing these terms out of *n* !, the total number of ways you can order the heads and tails, since you don’t care about the repeated arrangements of heads or the repeated arrangements of tails (it’s the unique arrangements of heads and tails that matter to you).

Thanks for the most excellent explanation of combinatorics. Clear explanations like this are far too rare.

11 December 2010at4pmRight on my learning curve. Thank you.

7 January 2011at4pmThanks for trying (honestly), but I have to say that this explanation is extremely lacking, and doesn’t actually explain anything besides why there are n! ways of ordering n blog posts. If I want to know how many ways to order r blog posts chosen from a possible n, why should the unchosen blog posts matter? Why would I take the possible ways of ordering the (n-r) blog posts and divide n! by that number? You are leaving out the most important steps… can hardly call this an explanation, anymore than it would be if you were to just state the result.

24 October 2011at5pm“Why should the unchosen blog posts matter?” Because you know there are n! orderings, but you need to remove the blog posts you are not going to read from consideration. Remember that you are choosing r blog posts at random, from a collection of n blog posts, so you can’t just consider r! alone (which assumes there are only r blog posts to begin with). Take a dozen books (n), and try selecting three (r) in every way possible. Play with the ideas until they make sense to you. Each step of the process was broken down and explained to some degree. The rest is up to you.

27 October 2011at2pmI have to agree with Jim. I really appreciate someone taking the time to explain this, and the dividing out r! made sense, just not the “divide out the ones you aren’t considering” step…could you elaborate please?

And yeah, I’ve already wasted a lot of paper writing out examples, and the penny still has potential energy…

22 November 2011at9amThe discussion of the blog posts is clear, I find. But I don’t see how the coin toss situation is analogous. While there are n different blog posts and n different coin tosses, each blog post is whatever it is; while by contrast, each coin toss can result in one of two different outcomes, heads or tails. What would be nice then is a derivation of the binomial coefficient purely in the context of the coin toss situation. Thanks.

23 April 2012at4amThere are two kinds of coin toss outcomes. Or: there are two kinds of outcome.

Analogously, there are two kinds of blog posts statusses. The read ones and the unread ones.

11 May 2012at8amClear and concise. This is exactly the kind of explanation I was looking for. Thank you!

17 October 2013at12pmI third Jim’s comment. And it is not helpful to say “play around with the ideas and think for yourself” — people would not come here if they were not having trouble playing around with ideas. There is not enough space for me to give a truly adequate explanation, but I will try to help by sharing the result of my own mathematical struggles.

1) If you have n objects, we already know they can be arranged in n! ways (where order matters) How many ways can you choose r objects from n, with order mattering? Well, this is n*(n-1)*(n-2)*…*(n-r)*(n-(r-1)) = n!/(n-r)!, since every member of the product after (n-(r-1)) cancels.

2) We now have to get rid of the “order matters” part. We need the number of ways r objects can be sampled from n, but in no specific order. Dividing by r! someone cancels the repetitions of these sample orderings (ex: ABC, ACB, BCA, etc. are all repetitions of a sampling consisting of A,B,C) to give us the answer.

So how does dividing by r! do this? Well, r! is “the number of ways that r things can be arranged”. n*(n-1)*(n-2)*…*(n-r)*(n-(r-1))/r! is “the number of times that (the number of ways that r things can be arranged) fits into the total ways our sampling of r things from n can be arranged, with order mattering”. But there are r! orderings of each unique sample of r things from n. Each unique sample of r things from n consequently has r! arrangements occurring in n*(n-1)*(n-2)*…*(n-r)*(n-(r-1)). Therefore, n*(n-1)*(n-2)*…*(n-r)*(n-(r-1))/r! is “the number of unique samples of r things in n”.

Perhaps thinking about it from the answer backwards is more helpful. Suppose n*(n-1)*(n-2)*…*(n-r)*(n-(r-1))/r! = k. This means that “there are k unique populations of r objects that can be chosen from n objects”. So, each population k has r elements in it. r elements can be arranged in r! ways. So, what is the sum of the total ways that each population in k can be ordered? This is k*r!, which we know from the answer is (n-2)*…*(n-r)*(n-(r-1)). So, to find k, the number of unique populations of r objects taken from n total objects, we would divide (n-2)*…*(n-r)*(n-(r-1)) by r!.

The key for me was recognizing that r! stands for a unique sampling of r things. Thus (n-2)*…*(n-r)*(n-(r-1))/r! is the number of such unique samples in (n-2)*…*(n-r)*(n-(r-1)), which is equal to n*(n-1)*(n-2)*…*(n-r)*(n-(r-1)) = n!/(n-r)! as I explain in pt. 1

12 December 2013at8amIt’s a prerequisite in math that you “play around with the ideas”. More often than not you’ll need to put in the work until something “clicks”. It may require reading explanations from multiple sources, working on problems, asking questions, formulating your own answers, etc. I did my best to provide a clear and concise answer, based on my playing around with the ideas, and I appreciate your taking the time to try and add more to the discussion.

12 December 2013at1pmNailed it! Great explanation.

7 April 2014at10pm