Friday, March 5, 2010

Statistics versus Causality - A predictable impasse

My undignified reply to Andrew Gelman's take on Causality and Statistical Learning


The causality people and the statistics people are talking past each other, your [Andrew Gelman's] 12-page magnum opus included.

Point 0) Sense of responsibility → decision → commitment to action/inaction → action/inaction.  Following that chain implies you possess a general description of reality, unless you are limiting yourself to a very narrow sphere of responsibility.

Point 1) Statistics cannot be the basis for a general description of reality because of Simpson's Paradox.  When it arises, the paradox can only be eliminated by an appeal to plausible causality, directly or indirectly.  Also, no statistical test exists, for a static situation, that predicts what relationships would prevail if conditions change; again, only an appeal to causality can do that.  (See Judea Pearl's book Causality, chapter 6)
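
For readers who have not met Simpson's Paradox, here is a minimal Python sketch of the reversal.  The counts are not from Gelman's paper or from anything above; they are the commonly quoted kidney-stone treatment figures, used here only as an illustration, and the names (counts, rate, pooled_rate) are my own.

```python
from fractions import Fraction

# Illustrative counts (assumption: the commonly quoted kidney-stone example),
# stored as (successes, trials) per treatment and per stone-size stratum.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, trials):
    """Exact success rate for one cell."""
    return Fraction(successes, trials)


def pooled_rate(strata):
    """Success rate after throwing away the stratum label."""
    total_successes = sum(s for s, _ in strata.values())
    total_trials = sum(n for _, n in strata.values())
    return Fraction(total_successes, total_trials)


# Which treatment looks better inside each stratum?
per_stratum_winner = {
    stratum: max(counts, key=lambda t: rate(*counts[t][stratum]))
    for stratum in ("small", "large")
}

# Which treatment looks better once the strata are pooled?
pooled_winner = max(counts, key=lambda t: pooled_rate(counts[t]))

for stratum, winner in per_stratum_winner.items():
    print(f"{stratum} stones: treatment {winner} has the higher success rate")
print(f"pooled: treatment {pooled_winner} has the higher success rate")
```

The same table backs both conclusions, and nothing inside the table says which one to act on.  Choosing between the stratified and the pooled comparison takes a story about why the strata differ, which is the appeal to causality made in Point 1.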

Point 2) Causality cannot be the basis for a general description of reality because reality violates the assertion of independent variables needed for effective causal analysis ("no true zeroes" as you put it).  Reality doesn't even adhere to the laws of conditional probability [ http://www.stat.columbia.edu/~cook/movabletype/archives/2009/09/the_laws_of_con.html ] much less the structure of independence needed for causal analysis.

Point 3) There are no other contenders for general descriptions of reality besides statistics or causality.

Conclusion) SOL

So people, under the burden of responsibility, must maintain several models of reality, over smaller and larger domains of applicability, some statistical, some causal, some based on symmetry & curve fitting, some based on the laws of probability, some based on scientific laws, some based on economic laws, some based on rules of thumb, some based on multiple simulation runs, some hybrids.  These models compete against each other, at the cost of maintenance, data collection, computation, and comparison.  The benefit is either correct probabilistic predictions of the consequences of action/inaction, or a demonstration that the range of uncertainty is broad enough to swamp any discernible difference in effect between decisions.

And the sense of responsibility is made of shifting sands, and human values and goals are not static.  So you could pay all the costs for a model, just to dispense with it.

But all this *still* can be done for individuals or small groups.  Once you get past 30 members, what is rewarded are techniques for rubber-stamping decisions already taken by the politically powerful, under the name of "objective analysis" for political cover.

So "small" decisions can be made quite well, with effort.  And "large" decisions are made quite poorly, because evidence of a cold calculated analysis would be blood on the hands of the politically powerful (besides, the ability to perform such analysis is in opposition to dumb loyalty, which is the most prized character trait of the privileged in-group).  But these "large" lousy decisions possess notoriety, and thus human appeal.  So a thousand theories, each described over a thousand pages, chase after a relatively small number of very poor decision-making processes.

The consequences of all this may dim my sparkling optimism, so I must leave that as an exercise for others.


