Luk Arbuckle

Posts Tagged ‘distributions’

Binomial coefficient explained simply

In probability on 18 June 2008 at 12:40 am

Undergraduate students are often given poor explanations of the logic behind the binomial coefficient. I had someone tell me they understood such “combinatorial methods” when it was explained to them in high school, but that their statistics courses left them confused. I checked my old undergraduate textbook in mathematical statistics and, I have to say, it’s terrible. University textbooks in mathematics and statistics, often costing $100 or more, should provide logic and intuition behind derivations. What follows is my attempt at making the derivation of the binomial coefficient more clear.

Let’s say you have n unique blog posts to read and you want to decide in which order to read them. In how many ways can you order the blog posts? The first item you read can be chosen from n blog posts. The second item you read can be chosen from n-1 blog posts, because you already selected the first item you intend to read (there’s one less to chose from). And so on, leading to n (n-1) (n-2)…3 x 2 x 1 different ways you can order your reading list. The shorthand for this is known as “n factorial”, usually written n ! in equations.

Example: If there are five blog posts then there are n ! = 5 ! = 120 ways to order your reading list.

Now let’s say you only have time to read r blog posts out of your list of n blog posts, where r is less than n. In how many ways can you select and order r blog posts to read? We know from above that there are n ! ways to order all the blog posts. If you’re going to read r blog posts from this list then you have (nr) blog posts that you’re not going to read. Therefore you need to divide out the (nr) ! different ways you can order the articles you are not going to read from the n ! ways to order all the blog posts. In other words you have n ! / (nr) ! ways to select and order the r blog posts to read.

Example: If there are five blog posts and you only have time to read two of them then there are n ! = 5 ! = 120 ways to order the entire reading list, (nr) ! = (5-2) ! = 6 ways to order the articles you are not going to read, and therefore n ! / (nr) ! = 120 / 6 = 20 ways to select and order two articles to read.

But now let’s say you don’t care in which order you read the r blog posts. In how many ways can you select r blog posts to read from the n blog posts available to read? The problem is the same as above except that now you don’t care about the order of the r blog posts, and therefore you also need to divide out the r ! different ways you can order articles you can read. In other words you have n ! / (nr) ! / r ! ways to select the r blog posts to read.

Example: If there are five blog posts and you only have time to read two articles, but you don’t care in what order, then there are n ! = 5 ! = 120 ways to order the entire reading list, (nr) ! = (5-2) ! = 6 ways to order articles you are not going to read, r ! = 2 ! = 2 ways to order two articles you can read, and therefore n ! / (nr) ! / r ! = 120 / 6 / 2 = 10 ways to select two articles to read.

The last derivation is the binomial coefficient, sometimes referred to by saying n choose r, since you have n distinct items from which you can choose r at a time. It comes up in the study of the binomial distribution, when we want to determine the probability of getting r successes in n independent trials and the probability of success doesn’t change for each trial.

The classic example is a coin toss. Say there are n tosses of a coin, and you’re betting on the number of heads that come up—let this be r. Now r ! is the number of ways you can order heads (two heads are better than one, but re-ordering them is pointless), and (nr)! is the number of ways you can order tails. The point is you’re dividing these terms out of n !, the total number of ways you can order the heads and tails, since you don’t care about the repeated arrangements of heads or the repeated arrangements of tails (it’s the unique arrangements of heads and tails that matter to you).

Debunking the “law” of averages

In probability on 12 June 2008 at 2:40 am

One example of the “law of averages”—not to be confused with the law of large numbers—describes the belief that a particular event becomes more or less likely to occur in order to move a series of random events closer to the long-run average.  A coin toss, for example, is believed more likely to come up tails if ten heads have been successively thrown beforehand.  The belief is that the long-run average of 50/50 must be made up for, and therefore the probability of the single event must change to suit that long-run average.  

It’s not surprising that anyone would believe the law of averages, even if it is misguided.  Although the coin isn’t aware of what was thrown beforehand, there is the notion of a long-run average to contend with.  The binomial distribution, used to consider success and failure experiments like coin tosses, can be used to show why this law is false.  But we’ll skip the math and only discuss results.

Watch your heads
The variability of the number of successes increases with an increasing number of trials.  You toss a coin 10 times and get 10 heads: 5 heads more than average.  You toss a coin 100 times and get 60 heads: 10 heads more than average.  That’s an increase in the variability of the number of successes from 5 to 10.  But clearly you’re getting closer to the 50/50 split of heads and tails.  The problem, then, is that we’re considering the wrong measure.

The variability of the proportion of successes, however, decreases with an increasing number of trials.  You toss a coin 10 times and get 10 heads: a proportion of 10 over 10, or 1.  You toss a coin 100 times and get 60 heads: a proportion of 60 over 100, or 0.6.  As we toss the coin more times, the proportion of heads (heads divided by the number of tosses) gets closer to the 50/50 average we’re expecting.  

Throwing ten heads in succession will not change the probability of getting heads in the eleventh coin toss: it’s still 50/50 heads or tails.  But as the coin is repeatedly thrown those ten heads will have less effect on the proportion of heads that come up.