People often confuse the meaning of the probability (or confidence) associated with a confidence interval—the probability is not that the parameter is in a particular interval, but that the intervals in repeated experiments will contain the parameter. No wonder people get confused, as it sounds like the same thing if you’re not paying close attention to the wording. Even then I’m not sure that it’s clear.
Take polling data for elections. When a poll reports that a political party has a specified level of support (say 37%), with an accuracy of plus or minus some amount (say 2 percentage points), the pollster normally states that the results are accurate 19 times out of 20 (that’s a 95% confidence level). This means that if the poll were repeated many times, about 19 out of every 20 of the resulting intervals would contain the true level of support for that party. It does not mean that there’s a 95% chance that the true level of support is within the range quoted (35 to 39%) in that specific poll.
The intervals, they are a changin’
The point is that the probability statement is about the interval, not the parameter. Let’s say you’re building a confidence interval for the mean. The population mean is an unknown constant, not a random variable. The random variables are the sample mean and sample variance used to build the interval, which vary between experiments. In other words, it is the interval that varies, and it can be considered a “random variable” of sorts. Once values for the sample mean and sample variance have been calculated for an interval, it’s not correct to make probability statements about the population mean, because that would imply that it’s a random variable. The population mean is a constant that either is or isn’t in the interval.
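To see the "interval varies, parameter stays put" idea in action, here is a minimal simulation sketch. It assumes a true support level of 37% (known to us only because we made it up), 1,000 respondents per poll, and the simple normal-approximation interval; the function names and the sample sizes are illustrative choices, not anything a real pollster necessarily uses.

```python
import math
import random


def poll_interval(true_support, n_respondents, rng, z=1.96):
    """Simulate one poll and return its approximate 95% interval (low, high)."""
    # Each respondent independently supports the party with probability true_support.
    yes = sum(rng.random() < true_support for _ in range(n_respondents))
    p_hat = yes / n_respondents
    # Normal-approximation margin of error (an assumption of this sketch).
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n_respondents)
    return p_hat - margin, p_hat + margin


def coverage(true_support=0.37, n_respondents=1000, n_polls=2000, seed=1):
    """Fraction of simulated polls whose interval contains the fixed true value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_polls):
        low, high = poll_interval(true_support, n_respondents, rng)
        if low <= true_support <= high:
            hits += 1
    return hits / n_polls


if __name__ == "__main__":
    print(f"Share of intervals containing the true value: {coverage():.3f}")
```

The true value never moves between runs; only the intervals do. The printed share should land close to 0.95, which is what "19 times out of 20" is describing. Any single interval from the loop either contains 0.37 or it doesn’t.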
Another point to keep in mind is that all values in a confidence interval are plausible. So although support for a political party may be reported as 37% (as in the previous example), with an accuracy of plus or minus 2 percentage points the true level of support at the time of polling could be anything from 35 to 39% (with a confidence of 95%). And if you want to compare the level of support between two parties, in general you don’t want much overlap between their intervals (as a rough guide, for statistical significance at the 5% level the overlap should be no more than about 1 percentage point in our example).
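Where does the "about 1" figure come from? Here is a rough sketch of the arithmetic, under two simplifying assumptions that are mine rather than anything stated above: the two estimates are independent, and both have the same margin of error. (Two parties measured in the same poll are not strictly independent, so treat this as a rule of thumb.) With a common margin m, the difference between the estimates is significant at the 5% level when it exceeds sqrt(2) * m, and two intervals whose centres are d apart overlap by 2 * m - d.

```python
import math


def max_overlap_for_significance(margin):
    """Largest overlap of two 95% intervals that still implies a significant
    difference at the 5% level, assuming independent estimates that share
    the same margin of error."""
    # Smallest significant gap between the two estimates.
    min_gap = math.sqrt(2) * margin
    # Intervals of half-width `margin` whose centres are `min_gap` apart
    # overlap by this much.
    return 2 * margin - min_gap


if __name__ == "__main__":
    # Margin of 2 percentage points, as in the polling example.
    print(round(max_overlap_for_significance(2.0), 2))  # about 1.17
```

So with a margin of 2 percentage points, the intervals can overlap by a little over 1 percentage point before the difference stops being significant, which is in line with the rough figure quoted above.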