Luk Arbuckle

Statistically relevant or statistically significant?

In hypothesis testing on 31 July 2008 at 10:25 am

I came across the use of “statistically relevant” in something I was reading online and, since I had never heard of it before, decided to look it up. But its usage varies. Some use it to mean statistically significant, which seems wrong since we already have a precise definition of that, and in other cases I’m not sure exactly what they mean.

I asked a few people in applied statistics and they had never seen “statistically relevant” used, or come across a formal definition. A long conversation ensued as we attempted to figure out its precise meaning. The term practical significance came up, meaning something that is statistically significant and also of practical use. Medical or health scientists sometimes call this biological significance. The term practical (or biological) relevance also came up for the case where something is not statistically significant but still of practical use.

Enter philosophy
As it happens, the definition of statistical relevance is from philosophy (bear with me). The property C is statistically relevant to B within A if and only if P(B, A&C) does not equal P(B, A-C). Here P(B, X) is the probability of B within the class X (a conditional probability, not a joint one), A&C is the part of A that has property C, and A-C is the part of A that does not. The definition is then combined with a partitioning of A via the property C to create a model stating that if P(B, A&C) > P(B, A) then C explains B. It’s a model trying to define what constitutes a “good” explanation.

We can say that “copper (C) is statistically relevant to things that melt at 1083 degrees Celsius (B) within the class of metals (A)”. Considering the definition, we have that P(B, A&C) = 1 (the probability that a metal melts at 1083 given that it is copper) and, given that no other metal melts at 1083 degrees, P(B, A-C) = 0 (the probability that a metal melts at 1083 given that it is not copper). The two probabilities differ, which implies statistical relevance.

Note that property C in the above example partitions the reference set A into (A&C) and (A-C), and P(B, A&C) = 1 > P(B, A). Since copper is the only metal that melts at 1083, and there are currently 86 known metals, the probability that a metal picked at random melts at 1083 is 1/86. Therefore, using this model of a good explanation, we can say that it melts at 1083 degrees because it is copper (or, following the language in the model, that it is copper explains why it melts at 1083).
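To make the arithmetic concrete, here is a minimal Python sketch of the copper example. It is only an illustration under stated assumptions: every metal is weighted equally, the 85 non-copper metals are unnamed placeholders, and the names `metals`, `prob_b`, `statistically_relevant` and `c_explains_b` are mine, not part of the philosophical model.

```python
from fractions import Fraction

# Toy reference class A: 86 metals, following the count used in the post.
# Only copper is named; the other 85 are placeholders, each assumed
# (as in the post) not to melt at 1083 degrees Celsius.
metals = [{"name": "copper", "melts_at_1083": True}]
metals += [{"name": f"metal_{i}", "melts_at_1083": False} for i in range(85)]


def prob_b(members):
    """P(B, class): the fraction of the given class that melts at 1083."""
    return Fraction(sum(m["melts_at_1083"] for m in members), len(members))


a_and_c = [m for m in metals if m["name"] == "copper"]  # A&C
a_not_c = [m for m in metals if m["name"] != "copper"]  # A-C

p_b_a_and_c = prob_b(a_and_c)  # 1
p_b_a_not_c = prob_b(a_not_c)  # 0
p_b_a = prob_b(metals)         # 1/86

# Definition: C is statistically relevant to B within A
# if and only if P(B, A&C) != P(B, A-C).
statistically_relevant = p_b_a_and_c != p_b_a_not_c

# Model of a good explanation: C explains B if P(B, A&C) > P(B, A).
c_explains_b = p_b_a_and_c > p_b_a

print(statistically_relevant, c_explains_b, p_b_a)
```

Both checks come out true, and P(B, A) is 1/86, matching the figures above.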

Correlation is not causation
What I’ve found is that people familiar with this definition from philosophy use “A is statistically relevant to B” to mean two things: (i) A is related to B (correlated), (ii) B is explained by A (causal). The definition supports (i), but I believe they’re using it incorrectly in (ii) with the model of a good explanation in mind (which, by the way, is due to the philosopher Wesley Salmon).

I’m no philosophy major, but I think it’s safe to say that the term statistically relevant should not be confused with statistically significant. Extremely low probability events can be statistically relevant, and since the term says nothing more than “there’s a slight correlation”, it’s not really saying all that much in the context of statistics. Terms such as practical significance, or practical relevance, seem appropriate in the contexts described above, but avoid using statistically relevant unless you, and your readers, know the definition.
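A small sketch of that last point, with counts I made up purely for illustration: two of 100,000 members of A with property C have B, against one of 100,000 without C. Treating those observed frequencies as the probabilities (an assumption), C is statistically relevant to B, yet a one-sided Fisher exact test, worked out here by hand from the hypergeometric distribution, is nowhere near significant.

```python
from fractions import Fraction
from math import comb

# Hypothetical counts, invented for illustration only.
n_with_c, b_with_c = 100_000, 2        # members of A with C, and how many have B
n_without_c, b_without_c = 100_000, 1  # members of A without C, and how many have B

p_b_given_a_and_c = Fraction(b_with_c, n_with_c)        # P(B, A&C)
p_b_given_a_not_c = Fraction(b_without_c, n_without_c)  # P(B, A-C)

# Statistically relevant in the philosophical sense: the two values differ.
relevant = p_b_given_a_and_c != p_b_given_a_not_c

# One-sided Fisher exact test: if C made no difference, how likely is it
# that at least b_with_c of the B cases would fall in the C group?
total = n_with_c + n_without_c
total_b = b_with_c + b_without_c
p_value = sum(
    Fraction(comb(n_with_c, k) * comb(n_without_c, total_b - k),
             comb(total, total_b))
    for k in range(b_with_c, total_b + 1)
)

print(relevant, p_b_given_a_and_c, p_b_given_a_not_c, p_value)
```

With these numbers the p-value is exactly 1/2, because the two groups are the same size: relevant by the definition, but far from significant at any conventional level.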

  1. I have just found your blog coming from here (http://mastersinhealthinformatics.com/2009/top-50-health-informatics-blogs/) and I’m really impressed. Congratulations it’s awesome.
