Monday, September 27, 2010

Fetishizing p-Values - The Cult of Statistical Significance

Fetishizing p-Values; Tom Leinster - The n-Category Cafe
Recovering the insight of "Student" Gosset from the over-simplification of Ronald A. Fisher
Leinster: Now there’s a whole book making the same point: The Cult of Statistical Significance, by two economists, Stephen T. Ziliak and Deirdre N. McCloskey. You can see their argument in this 15-page paper with the same title. Just because they’re economists doesn’t mean their prose is sober: according to one subheading, ‘Precision is Nice but Oomph is the Bomb’.
Leinster: it is true that p-value does not measure the magnitude of the effect (but then, anyone who has taken at least one course in statistics should know that)
I think Jost, Ziliak and McCloskey would completely agree that anyone who has taken at least one course in statistics should know that. They’re pointing out, open-mouthed, that this incredibly basic mistake is being made on a massive scale, including by many people who should know much, much better. Bane used the term ‘collective self-deception’; one might go further and say ‘mass delusion’. It’s a situation where a fundamental mistake has become so ingrained in how science is done that it’s hard to get your paper accepted if you don’t perpetuate that mistake.
That last statement is probably putting it too strongly, but as I understand it, the point they’re making is along those lines.
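To make the magnitude point concrete, here is a minimal Python sketch. It is my own illustration, not something from Ziliak and McCloskey: it assumes a two-sided z-test with a known standard deviation, and the function name and all the numbers are hypothetical. A trivially small effect measured on a huge sample gets a vanishing p-value, while a large effect measured on a small sample does not reach the usual 0.05 threshold.

```python
import math

def two_sided_p_value(effect, sigma, n):
    """Two-sided z-test p-value for an observed mean `effect` against zero.

    Assumes the standard deviation `sigma` is known and the n observations
    are independent (illustrative assumptions, not from the original post).
    """
    z = effect / (sigma / math.sqrt(n))
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))

# Tiny effect (1% of a standard deviation), one million observations.
p_tiny_effect = two_sided_p_value(effect=0.01, sigma=1.0, n=1_000_000)

# Large effect (half a standard deviation), only ten observations.
p_large_effect = two_sided_p_value(effect=0.5, sigma=1.0, n=10)

print(f"tiny effect, huge sample:   p = {p_tiny_effect:.3g}")
print(f"large effect, small sample: p = {p_large_effect:.3g}")
```

The p-value ranks the tiny effect as far more "significant" than the large one, which is exactly why it cannot stand in for oomph.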
From the 15-page paper "The Cult of Statistical Significance":
In 1937 Gosset, the inventor and original calculator of “Student’s” t-table told Egon, then editor of Biometrika, that a significant finding is by itself “nearly valueless”:
...obviously the important thing in such is to have a low real error, not to have a "significant" result at a particular station. The latter seems to me to be nearly valueless in itself. . . . Experiments at a single station [that is, tests of statistical significance on a single set of data] are almost valueless. . . . What you really want is a low real error. You want to be able to say not only "We have significant evidence that if farmers in general do this they will make money by it", but also "we have found it so in nineteen cases out of twenty and we are finding out why it doesn't work in the twentieth.” To do that you have to be as sure as possible which is the 20th—your real error must be small...
Gosset to E. S. Pearson 1937, in Pearson 1939, p. 244.
Gosset, we have noted, is unknown to most users of statistics, including economists. Yet he was proposing and using in his own work at Guinness a characteristically economic way of looking at the acquisition of knowledge and the meaning of “error.” The inventor of small sample econometrics focused on the opportunity cost of each observation; he tried to minimize random and non-random errors, real errors.
Edit 11/12/10
A very nice write-up here, along the same lines: Significance Tests in Climate Science -- Maarten H. P. Ambaum -- http://www.met.reading.ac.uk/~sws97mha/Publications/jclim_ambaum_rev2.pdf
Consider a scientist who is interested in measuring some effect and who does an experiment in the lab. Now consider the following thought process that the scientist goes through:
  1. My measurement stands out from the noise.
  2. So my measurement is not likely to be caused by noise.
  3. It is therefore unlikely that what I am seeing is noise.
  4. The measurement is therefore positive evidence that there is really something happening.
  5. This provides evidence for my theory.
This apparently innocuous train of thought contains a serious logical fallacy, and it appears at a spot where not many people notice it.
To the surprise of most, the logical fallacy occurs between step 2 and step 3. Step 2 says that there is a low probability of finding our specific measurement if our system would just produce noise. Step 3 says that there is a low probability that the system just produces noise. These sound the same but they are entirely different.
This can be compactly described using Bayesian statistics...
This comes from a summary of the paper: How significance tests are misused in climate science -- Guest post by Dr Maarten H. P. Ambaum from the Department of Meteorology, University of Reading, U.K. -- http://www.skepticalscience.com/news.php?n=456#
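The gap between step 2 and step 3 can be sketched with Bayes' theorem. This is my own minimal Python illustration, not code from Ambaum's paper; the function name, the prior, and the probability of the data under a real signal are all made-up assumptions. Step 2 is P(data | noise); step 3 is P(noise | data), which also depends on how plausible a real effect was to begin with.

```python
def posterior_noise(p_data_given_noise, p_data_given_signal, prior_noise):
    """P(noise | data) from Bayes' theorem for two exclusive hypotheses.

    All three inputs are hypothetical numbers chosen for illustration.
    """
    prior_signal = 1.0 - prior_noise
    evidence = (p_data_given_noise * prior_noise
                + p_data_given_signal * prior_signal)
    return p_data_given_noise * prior_noise / evidence

# Step 2: the measurement is unlikely if the system only produces noise.
p_data_given_noise = 0.05

# Assumed: a real effect would produce such a measurement half the time,
# and real effects are rare a priori (1 in 100).
skeptical = posterior_noise(p_data_given_noise, 0.5, prior_noise=0.99)

# Same data, but with even prior odds between noise and signal.
even_odds = posterior_noise(p_data_given_noise, 0.5, prior_noise=0.5)

print(f"P(noise | data), skeptical prior: {skeptical:.3f}")
print(f"P(noise | data), even prior:      {even_odds:.3f}")
```

With the skeptical prior, the system is still most likely just producing noise even though the measurement "stands out" at the 5% level.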
Edit 11/21/10

Significance Tests, frequentist vs. bayesian

When we perform a test of statistical significance, what we
would really like to ask is “what is the probability that the
alternative hypothesis is true?”. A frequentist analysis
fundamentally cannot give a direct answer to that question, as
it cannot meaningfully talk of the probability of a hypothesis
being true – it is not a random variable, it is either true or
false and has no “long run frequency”. Instead, the frequentist
gives a rather indirect answer to the question by telling you the
likelihood of the observations assuming the null hypothesis is
true and leaving it up to you to decide what to conclude from
that. A Bayesian on the other hand can answer the question
directly as the Bayesian definition of probability is not based
on long run frequencies but on the state of knowledge of the
truth of a proposition. The problem with frequentist statistical
tests is that there is a tendency to interpret the result as if it
were the result of a Bayesian test, which is natural as that is
the form of answer we generally want, but still wrong.

The frequentist approach avoids the “subjectivity” of the
Bayesian approach (although the extent of that “subjectivity” is
debatable), but this is only achieved at the expense of not
answering the question we would most like to ask. It could be
argued that the frequentist approach merely shifts the
subjectivity from the analysis to the interpretation (what should
we conclude based on our p-value). Which form of analysis you
should use depends on whether you find the “subjectivity” of the
Bayesian approach or the “indirectness” of the frequentist
approach most abhorrent! ;o)

At the end of the day, as long as the interpretation is
consistent with the formulation, there is no problem and both
forms of analysis are useful.
This was my favorite comment; the whole sub-thread underneath is interesting. The original Open Mind | tamino.wordpress.com article has good qualifications to Dr Maarten H. P. Ambaum's Skeptical Science post.
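A rough Python sketch of the two kinds of answer the comment contrasts, computed on the same data. Everything here is my own hypothetical setup, not from the comment thread: a coin flipped 100 times showing 60 heads, a null of a fair coin, a single alternative of a 0.6 coin, and even prior odds between them.

```python
import math

def one_sided_p_value(heads, flips, p_null=0.5):
    """Frequentist answer: P(at least `heads` heads | null is true)."""
    return sum(
        math.comb(flips, k) * p_null**k * (1 - p_null) ** (flips - k)
        for k in range(heads, flips + 1)
    )

def posterior_alternative(heads, flips, p_null=0.5, p_alt=0.6, prior_alt=0.5):
    """Bayesian answer: P(alternative | data) for two point hypotheses.

    The binomial coefficient is the same under both hypotheses, so it
    cancels and is left out of the likelihoods.
    """
    def likelihood(p):
        return p**heads * (1 - p) ** (flips - heads)

    weight_alt = likelihood(p_alt) * prior_alt
    weight_null = likelihood(p_null) * (1 - prior_alt)
    return weight_alt / (weight_alt + weight_null)

p_value = one_sided_p_value(60, 100)
posterior = posterior_alternative(60, 100)

print(f"p-value under the null:       {p_value:.3f}")
print(f"P(alternative | data):        {posterior:.3f}")
print(f"P(null | data) = 1 - that:    {1 - posterior:.3f}")
```

Under these assumptions the p-value and the posterior probability of the null are different numbers answering different questions, and the Bayesian number moves if the prior or the alternative is changed. That is the "subjectivity" versus "indirectness" trade-off in miniature.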


Tuesday, September 21, 2010

Valuing stewardship of the environment for future generations, or not

Attempted to post comment to Offsetting Behaviour - Eric Crampton : Yer either fer us or agin us
[ I am not speaking to the game-theoretic analysis of New Zealand leaving Kyoto -- Bjorn's swaying this way or that notwithstanding, there is no rational reason for NZ to stay inside Kyoto, unless it was seen as the price for signaling environmental concern. ]
The same climate scientists that Lomborg disparaged for stating evidence of high sensitivities for carbon emissions are now the same climate scientists he will trust to run geo-engineering. This is the most embarrassing contradiction of Lomborg's evolving stance.
The Copenhagen Consensus cost-benefit analysis put carbon taxes at the bottom by valuing stewardship of the environment for future generations at zero. The same way pre-school for my toddler would be at the bottom of a cost-benefit analysis of all uses of my money, if I valued his own future earnings and quality of life at zero.
If you are standing on the train tracks with a freight train coming in five minutes, you have the choice to leap off the tracks. A "compromise" position of shifting over a few inches will have no effect, no matter how much you value "moderation and reasonableness". If you limit your analysis to only the next four minutes and fifty-nine seconds, the energy expended in the leap is a waste.
I wish we had the choice to live in a "warmer average" world -- it would be nice. If you put two bullets in a six-chambered gun to play Russian roulette, on "average", you are still alive but with a headache. But the "average" is an abstraction, and in reality you have to deal with the consequences of the spun barrel. The risk is not a warmer world -- the risk is an over-energetic world that no longer has the climate stability that allowed civilization and large-scale agriculture and inexpensive & quick transportation to be developed and maintained.
It is fine to consider all possible humanitarian uses of scarce capital. The weight given to stewardship of the environment for future generations should not be infinite, lest you indulge in pointless profligacy towards but a single goal. But that does not imply that stewardship of the environment for future generations should be weighted at zero.
[ This implies value placed on trying to give future generations a "western/first world" standard of living much like what we currently enjoy. If we are satisfied with a few hundred thousand on each continent living under conditions like indigenous peoples, living along the new raised coastlines and grasslands freed from permafrost, with climate instability but the net warmth & wetness still giving the ability to feed from the meat of small grazing animals, the costs we would bear would be slight. ]
Edit 9/21/10: Reply via Google Buzz from Eric Crampton:
If investing in tech reduces more warming per dollar spent than do other things, what's the problem with redirecting spending towards tech?
Copenhagen valued future generations the same way that cost-benefit analysis typically values future generations: by applying a standard discount rate. That doesn't say that future people don't count; rather, it says that future people might prefer being given cash.
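For readers unfamiliar with discounting, a minimal Python sketch of what "applying a standard discount rate" does to far-future harms. This is my own illustration: the rates, the 100-year horizon, and the damage figure are hypothetical and are not the numbers used by the Copenhagen Consensus.

```python
def present_value(future_amount, annual_rate, years):
    """Value today of `future_amount` received (or lost) `years` from now."""
    return future_amount / (1.0 + annual_rate) ** years

# A hypothetical damage of 100 units occurring a century from now.
damage = 100.0
pv_low_rate = present_value(damage, annual_rate=0.01, years=100)
pv_high_rate = present_value(damage, annual_rate=0.05, years=100)

print(f"present value at 1% per year: {pv_low_rate:.2f}")
print(f"present value at 5% per year: {pv_high_rate:.2f}")
```

At 1% the future damage still counts for roughly 37 units today; at 5% it counts for less than 1. The choice of rate does most of the work in deciding how much future generations weigh in the analysis.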
My reply:
"""That doesn't say that future people don't count; rather, it says that future people might prefer being given cash"""
If I am the victim of blunt trauma, I may not value a cash disbursement later over a medical intervention now. There is a rational case to be made that the two are hardly substitutes in some circumstances.
I agree that I should have been more careful and said "valuing stewardship of the environment for future generations, *particularly* in reducing the risk of the very worst outcomes". I will be more careful in future.
"""If investing in tech reduces more warming per dollar spent than do other things, what's the problem with redirecting spending towards tech?"""
No argument here. But the lack of breakthrough tech *now* implies non-zero carbon taxes *now* (and there is a moral argument for quite substantial taxes now). I am certain it will take a few decades of people seeing global military preparation for the worst possible outcomes of climate disruption before it is plain that environmental stewardship may be worth 5 or more points of global economic activity. It is not surprising that substantial carbon taxes have near zero political traction in the two largest economies, now.

Monday, September 13, 2010

Selling Fantasies: Breakthrough Institute

Breakthrough Institute works like the Copy Protection technology wizards selling their tech to record companies. It cannot work, because the pirates will always find a workaround for copy protection - you are merely punishing your customers and training them to be pirates when they try to use your product in convenient ways. The Copy Protection technology wizards are not selling a working solution - because a solution is impossible - they are selling a pleasant fantasy to the record companies in the few years their business model has left.
People do not confine themselves to buying working products. Sometimes they will purchase fantasies. Look at the exercise gizmos that people buy from TV.
The Breakthrough Institute doesn't have to provide solutions that work - it will provide fantasies that it can sell. So let's try and figure out who their customers are.
If you are in the top 0.5% of incomes, you are intelligent and you may be slightly distressed that your great grandchildren will be born into a boiling world (when you can be bothered to consider the issue). You have the ability to direct funding, and in these few years before the climate disruption really hits human agriculture and infrastructure, you are in the market for fantasies, sold to you by the semi-knowledgeable folk (who are probably sincere, because their confidence in their tech solutions surpasses their scientific capabilities). That is what people like the Breakthrough Institute are selling. For example, Warren Buffett doesn't consider himself a bad person, and he cares for his grandchildren. But he has also made a huge bet on coal transport infrastructure. He would love to support the Breakthrough Institute by some means, to reconcile his position on the responsibility of environmental stewardship for future generations.
"The Breakthrough Institute, a project of Rockefeller Philanthropy Advisors, Inc." lets you know about the customers they are after. How did good ol' John D. make his money?
Let's predict their structure. They will rarely speak in absolute moral terms - they will never flatly state that it is craven to leave future generations a boiling world just because a handful of generations could not bear to lower their standard of living. The absolute moral issues will always be left unspoken. Those that talk about the moral issues will be marginalized as "un-serious" or "alarmist".
They will strive to distance themselves from the worst of the denialists. Pielke Jr and Fuller practically fell over their own feet trying to run away from Virginia State Attorney General Cuccinelli. But they will take "warmist" commentators that have a record of limiting themselves to the published science, like Romm, and equate them with denialists that spout off bat-shit nonsense - even though the implication of equivalence is ridiculous. But you will know them by their actions, because they will spend most of their energy arguing against those with the clearest grasp of the facts, and moral issues, and political challenges.
It is the foolish "moderate" position of shifting your stance a few inches when you are standing on tracks, freight train coming. The half measure doesn't leave you just half-dead.
All you can do is make the case to ethical decision makers that they are being sold a bill of goods, by comparing the statements and techniques and rhetorical stances of the Breakthrough Institute to bunglers that stood in the way of decisions of moral courage, and the weavers of the Emperor's New Clothes. These are the "moderate" apologists for moral failures - like those who stood in the way of eradicating slavery, or were the audience for the Letter from a Birmingham Jail, or were willing to negotiate with Hitler, or were willing to overlook Stalin's crimes. In all these cases, you could find "moderates" that participated in moral failures, and argued for positions with shabby facts and shabby rhetorical devices.
Edit: 09/14/10
Moe, Rockefeller is the BTI fiscal sponsor; the main funder throughout has been the Nathan Cummings Foundation. By itself the fiscal sponsorship doesn't mean much, although it may well in this instance.