Statistical Inference Versus
Substantive Inference
[The Cult of Statistical Significance: How Standard Error Costs Us Jobs, Justice, and Lives, by Stephen T. Ziliak and Deirdre N. McCloskey (Ann Arbor: University of Michigan Press, ISBN-13: 978-472-05007-9, 2007) http://www.cs.trinity.edu/~rjensen/temp/DeirdreMcCloskey/StatisticalSignificance01.htm] [Page 206] "Like scientists
today in medical and economic and other sizeless
[sic] sciences, Pearson mistook a large sample size for the definite,
substantive significance — evidence as Hayek put it — of
"wholes." But it was, as Hayek declared, "just an
illusion." Pearson's columns of sparkling asterisks, though quantitative
in appearance and as appealing as is the simple truth of the sky, signified
nothing."
A scholar writing under the name Centurion comments as follows on the article "One Economist's Mission to Redeem the Field of Finance," by Robert Shiller, Chronicle of Higher Education, April 8, 2012 --- http://chronicle.com/article/Robert-Shillers-Mission-to/131456/
Economics as a "science" is no different from Sociology, Psychology, Criminal Justice, Political Science, etc.
To those in the "hard sciences" [physics, biology, chemistry, mathematics], these "soft sciences" are dens of thieves: thieves who have stolen the "scientific method" and abused it. These soft sciences all apply the scientific method to biased and insufficient data sets, then claim to be "scientific" and assert their opinions and biases as scientific results. They point to "correlations" drawn even though they know they do not know all the forces and factors involved, nor the relative weight of each factor's effect.
They know their
mathematical formulas and models are like taking only a few pieces of evidence
from a crime scene and then constructing an elaborate "what happened"
prosecution and defense. Yet neither side has any real idea, other than in
the general sense, what happened. They certainly have no idea which factors and human behaviors were involved, nor the true motives.
Hence the growing
awareness of the limitations of all the quantitative models that led to the
financial crisis/financial WMDs going off.
Take for example the now thoroughly discredited financial
and economic models that claimed validity through the use of the same
mathematics used to make atomic weapons: Monte Carlo simulation. Monte Carlo worked
on the Manhattan Project because real scientists, who obeyed the laws of
science when it came to using data, were applying the mathematics to a valid
data set.
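The point about data discipline can be made concrete: the Monte Carlo machinery itself is simple and indifferent to its inputs, so a biased sampling assumption yields a confidently wrong answer. A minimal Python sketch, using the standard toy example of estimating pi (an illustration of the method only, unrelated to any particular financial model):

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate pi by sampling points uniformly in the unit square
    and counting the fraction that land inside the quarter circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples

def biased_estimate_pi(n_samples, seed=0):
    """Same machinery, but with a deliberately biased sampler:
    y is drawn only from the lower half of the square, so the
    estimate converges smoothly to the wrong value."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random() * 0.5  # biased input distribution
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples
```

With a correct uniform sampler the estimate converges to pi; with the biased sampler the identical formula converges, just as confidently, to roughly 3.8. The mathematics never complains about the data it is fed.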
Economists and Wall
Street Quants threw out the data set disciplines of
science. The Quants of Wall Street and those scientists who claimed the data proved man-made global warming share the same sin of deception. Why? For the
same reason, doing so allowed them to continue their work in the lab. They got
to continue to experiment and "do science". Science paid for by those
with a deep vested financial interest in the false correlations proclaimed by these soft
science dogmas.
If you take away a
child's crayons and give him oil paints used by Michelangelo, you're not going
to get the Sistine Chapel. You're just going to get a bigger mess. If Behavioral Finance proves anything, it is how far behind the other Social Sciences
economists really are. And if the "successes" of the Social Sciences
are any indication, a lot bigger messes are waiting down the road.
Centurion

"The Standard Error of Regressions," by Deirdre N. McCloskey and Stephen T. Ziliak, Journal of Economic Literature, 1996, pp. 97-114
THE IDEA OF
statistical significance is old, as old as Cicero writing on forecasts (Cicero,
De Divinatione, I. xiii. 23). In 1773 Laplace used it
to test whether comets came from outside the solar system (Elizabeth Scott
1953, p. 20). The first use of the very word "significance" in a
statistical context seems to be John Venn's, in 1888, speaking of differences
expressed in units of probable error. [See Venn diagrams]
They inform us which of the differences in the above tables are permanent and
significant, in the sense that we may be tolerably confident that if we took
another similar batch we should find a similar difference; and which are merely
transient and insignificant, in the sense that another similar batch is about
as likely as not to reverse the conclusion we have obtained. (Venn, quoted in
Lancelot Hogben 1968, p. 325).
Statistical
significance has been much used since Venn, and especially since Ronald Fisher.
The problem, and our main point, is that
a difference can be permanent (as Venn put it) without being
"significant" in other senses, such as for science or policy. And
a difference can be significant for science or policy and yet be insignificant
statistically, ignored by the less thoughtful researchers. In the 1930s Jerzy Neyman and Egon S. Pearson, and then more explicitly Abraham Wald, argued that actual investigations
should depend on substantive not merely statistical significance. In 1933 Neyman and Pearson wrote of type I and type II errors:
Is it more serious to convict
an innocent man or to acquit a guilty? That will depend on the consequences of
the error; is the punishment death or fine; what is the danger to the community
of released criminals; what are the current ethical views on punishment? From
the point of view of mathematical theory all that we can do is to show how the
risk of errors may be controlled and minimised. The
use of these statistical tools in any given case, in determining just how the
balance should be struck, must be left to the investigator. (Neyman and Pearson 1933, p. 296; italics supplied)
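The balance Neyman and Pearson leave to the investigator can be computed directly. A minimal sketch with assumed values (a one-sided z-test of H0: mu = 0 against H1: mu = 1, known sigma = 1, sample size n = 9; none of these numbers come from the paper), showing that tightening the rejection threshold trades type I risk for type II risk:

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def error_rates(critical_value, mu_alt=1.0, n=9):
    """Type I and type II error rates for a one-sided z-test of
    H0: mu = 0 against H1: mu = mu_alt, with known sigma = 1 and
    sample size n. H0 is rejected when the sample mean exceeds
    critical_value."""
    se = 1.0 / sqrt(n)
    type_i = 1.0 - normal_cdf(critical_value / se)        # reject, H0 true
    type_ii = normal_cdf((critical_value - mu_alt) / se)  # accept, H1 true
    return type_i, type_ii

# A stricter threshold lowers type I risk at the cost of type II risk:
for c in (0.3, 0.5, 0.7):
    a, b = error_rates(c)
    print(f"critical value {c}: type I = {a:.3f}, type II = {b:.3f}")
```

Which point on this tradeoff is right depends, exactly as the quotation says, on the consequences of each error; the mathematics only maps the menu of choices.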
Wald went further:
The question as to how the form of the weight
[that is, loss] function . . . should be determined, is not a mathematical or
statistical one. The statistician who wants to test certain hypotheses must
first determine the relative importance of all possible errors, which will
depend on the special purposes of his investigation. (1939, p. 302, italics
supplied)
To date no empirical studies have been undertaken
measuring the use of statistical significance in economics. We here examine the
alarming hypothesis that ordinary usage in economics takes statistical
significance to be the same as economic significance. We compare
statistical best practices against leading textbooks of recent decades and
against the papers using regression analysis in the 1980s in the American
Economic Review.
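The distinction the authors draw can be illustrated numerically: with a large enough sample, an economically negligible effect becomes "statistically significant" at any conventional level. A minimal Python sketch (the 0.01-standard-deviation "effect" is an invented illustration, not data from the paper):

```python
from math import erf, sqrt
import random

def z_test_pvalue(sample, mu0=0.0):
    """One-sided p-value for H0: mean = mu0, using the sample
    standard deviation as a plug-in for sigma (large-n z-test)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    z = (mean - mu0) / sqrt(var / n)
    return 1.0 - 0.5 * (1.0 + erf(z / sqrt(2.0)))

rng = random.Random(42)
# A tiny true effect (one hundredth of a standard deviation)...
tiny_effect = [rng.gauss(0.01, 1.0) for _ in range(1_000_000)]
# ...is "statistically significant" at any conventional level
# once the sample is large enough.
print(z_test_pvalue(tiny_effect))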