Science Forum Index » Statistics - Math Forum » Jarque-Bera test: confidence intervals for normal data
| Luis A. Afonso |
Posted: Sun Mar 11, 2007 2:22 pm |
wwwpub.utdallas.edu/~herve/Abdi-Lillie2007-pretty.pdf
UUUUUUUUUUUUUUUUUUUUUUUUUUUUU
______MY VALUES_______Conover_______Abdi & Molin
size__5%______1%______5%_____1%_____5%______1%
_10___0.264___0.305___.258___.294___.2616___.3037
_15___0.220___0.255___.220___.257___.2196___.2545
_20___0.192___0.224___.190___.231___.1920___.2226
_25___0.174___0.202___.173___.200___.1726___.2010
_30___0.159___0.185___.161___.187___.1590___.1848
_35___0.148___0.172_________________.1478___.1720
_40___0.139___0.161_________________.1386___.1616
_45___0.131___0.152_________________.1309___.1525
_50___0.124___0.145_________________.1246___.1457
(For each sample size, 500,000 samples were simulated in my work, 100,000 by Abdi & Molin.)
JACK TOMSKY is so unlearned and shameless that he deserves to be exposed every time he posts an opinion on hypothesis tests. Less experienced people should take note that HE IS A CLOWN.
Readers, do appreciate what I found on the Web.
*** Lilliefors/Van Soest's test of normality ***
_____Hervé Abdi & Paul Molin
1. OVERVIEW
The normality assumption is at the core of the majority of standard statistical procedures, and it is important to be able to test this assumption. In addition, showing that a sample does not come from a normally distributed population is sometimes of importance per se. Among the many procedures used to test this assumption, one of the most well-known is a modification of the Kolmogorov-Smirnov test of goodness of fit, generally referred to as the Lilliefors test for normality (or Lilliefors test for short). This test was developed independently by Lilliefors (1967) and by Van Soest (1967). The null hypothesis for this test is that the error is normally distributed (i.e. there is no difference between the observed distribution f and the normal distribution). The alternative hypothesis is that the error is not normally distributed.
Like most statistical tests, this test of normality defines a criterion and gives its sampling distribution. When the probability associated with the criterion is smaller than a given [alpha]-level, the alternative hypothesis is accepted (i.e. we conclude that the sample does not come from a normal distribution). An interesting peculiarity of the Lilliefors test is the technique used to derive the sampling distribution of the criterion. In general, mathematical statisticians derive the sampling distribution of the criterion using analytical techniques. However, in this case this approach fails, and consequently Lilliefors decided to calculate an approximation of the sampling distribution by using the Monte Carlo technique.
Essentially the procedure consists of extracting a large number of samples from a Normal Population and computing the value of the criterion for each of these samples. The empirical distribution of the values of the criterion gives an approximation of the sampling distribution of the criterion under the null hypothesis.
Specifically, both Lilliefors and Van Soest used, for each sample size chosen, 1000 random samples derived from a standardized normal distribution to approximate the sampling distribution of a Kolmogorov-Smirnov criterion of goodness of fit. The critical values given by Lilliefors and Van Soest are quite similar, the relative error being of the order of 10^(-2).
According to Lilliefors (1967) this test of normality is more powerful than other procedures for a wide range of nonnormal conditions. Dagnelie (1968) indicated, in addition, that the critical values reported by Lilliefors can be approximated by an analytical formula. Such a formula facilitates writing computer routines because it eliminates the risk of creating errors when keying in the values of the table. Recently, Molin and Abdi (1998) refined the approximation given by Dagnelie and computed new tables using a larger number of runs (i.e. K=100,000) in their simulations. ***
(End of citation).
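The Monte Carlo procedure described in the citation can be sketched in Python as follows. This is only a minimal illustration, not the code used by Lilliefors, Van Soest, or Molin and Abdi: the function names, the seed, the number of runs, and the choice of the n - 1 standard deviation are all assumptions made for the example.

```python
import math
import random
import statistics


def lilliefors_statistic(sample):
    """Kolmogorov-Smirnov distance between the sample and a normal law
    whose mean and standard deviation are estimated from the sample."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # n - 1 denominator (assumption)
    z = sorted((x - mean) / sd for x in sample)
    d = 0.0
    for i, zi in enumerate(z, start=1):
        cdf = 0.5 * (1.0 + math.erf(zi / math.sqrt(2.0)))  # standard normal CDF
        # Largest gap between the empirical step function and the normal CDF.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def lilliefors_critical_value(n, alpha=0.05, runs=10000, seed=12345):
    """Empirical (1 - alpha) quantile of the criterion under normality."""
    rng = random.Random(seed)
    stats = sorted(
        lilliefors_statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(runs)
    )
    index = min(runs - 1, math.ceil((1.0 - alpha) * runs) - 1)
    return stats[index]


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, round(lilliefors_critical_value(n), 4))
```

With more runs the printed values should settle near the 5% columns of the table above; the exact digits depend on the seed and on the number of runs.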
____TOMSKY's ABSOLUTELY K.O.!!!!!
____licas (Luis A. Afonso)
| Jack Tomsky |
Posted: Sun Mar 11, 2007 2:31 pm |
Although there is no evidence that anyone has ever used any of Afonso's faulty statistics, it is important that his errors be corrected, so that no one will ever think that confidence levels and significance levels are synonymous, that null hypotheses are never allowed to be accepted, or that one cannot tell whether 8/13 is greater than 5/13.
Jack
| Luis A. Afonso |
Posted: Mon Mar 12, 2007 9:51 am |
YES, I REPEAT MY STATEMENT
*** ... allowed to be accepted, and that no one can tell if 8/13 is greater than 5/13. ***
In the proper CONTEXT I never denied it.
I adopted the NOTATION *a/b* with a precise meaning: there are *a* successes in *b* trials. I spent my time writing a full thread to make this clear. You and your *boss* Bob Ling unethically and intentionally (in order to attack me) read the notation as plain fractions!!! And you keep repeating it ad nauseam with the same purpose.
When I wrote (as you say) 8/13 > 5/13, only one interpretation is valid: comparing the event of 8 successes in 13 trials with 5/13, we can state that the latter is less favorable (to successes) at the alpha significance level.
(I do not remember exactly, but I think it was 5%.)
IS THIS THE LAST TIME YOU USE THIS *TRASH* TO BULLY ME? IS IT?
_________licas (Luis A. Afonso)
| Luis A. Afonso |
Posted: Tue Mar 13, 2007 4:06 am |
In a DOZEN posts, Jack Tomsky tried to stop my work on finding the critical values of the Jarque-Bera test. He did not succeed.
Meanwhile two points became obvious:
1) Total ignorance of the existence of this test (showing very little effort to keep up to date, even from material on the Web). This test has been known since 1980.
2) The most serious: ignorance of a 40-year-old technique for obtaining confidence intervals (by simulation).
In his *opaque* mind
____a confidence interval can only be obtained from a real sample, and it is unique.
Consequently, for him, the procedure:
a) Simulating a great number of samples (1 million),
b) Evaluating, for each of them, the sample statistic under study,
c) And getting from this empirical distribution the quantiles of interest for the test
is WRONG, ABUSIVE, CONDEMNABLE.
This procedure, in use since 1967 thanks to H. Lilliefors, is the current way to reach this goal.
To ignore it nowadays is ABSOLUTELY NOT ACCEPTABLE for statistically learned people.
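Steps a) to c) can be sketched in Python for the Jarque-Bera statistic. This is a minimal illustration under stated assumptions, not the program used for the tables in this thread: the function names, the seed and the number of runs are hypothetical, and the statistic uses the same biased (divide by n) moments as the QBasic listings posted in this thread.

```python
import math
import random


def jarque_bera(sample):
    """JB = (n / 6) * (S^2 + (K - 3)^2 / 4) with divide-by-n central moments."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / (m2 * m2)
    return (n / 6.0) * (skew * skew + (kurt - 3.0) ** 2 / 4.0)


def jb_critical_value(n, alpha=0.05, runs=20000, seed=2007):
    """a) simulate normal samples, b) evaluate JB on each,
    c) read the (1 - alpha) quantile off the empirical distribution."""
    rng = random.Random(seed)
    stats = sorted(
        jarque_bera([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(runs)
    )
    index = min(runs - 1, math.ceil((1.0 - alpha) * runs) - 1)
    return stats[index]


if __name__ == "__main__":
    for n in (10, 30, 50):
        print(n, round(jb_critical_value(n), 2))
```

The quantile is a simulation estimate, so its last digits move with the seed and the number of runs; increasing `runs` narrows that variation.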
What is the Readers' opinion about this?
_______licas (Luis A. Afonso)
| Luis A. Afonso |
Posted: Tue Mar 13, 2007 5:22 am |
Test J-B: POWER for exponential samples
Conventionally *beta* denotes the probability of making a type II error (i.e. accepting the hypothesis H0 when we should not).
*** The power, 1-beta, is the probability of rejecting H0 when we should. ***
When we are dealing with a GOF (goodness of fit) test, the null hypothesis is H0: the sample was drawn from a population of law W. The power is the probability of rejecting H0 when it is false, i.e., when the population has a law different from W and therefore the alternative hypothesis, Ha, holds.
This time we test random samples from the exponential law of density
_____ f(x) = (1 / L)*exp(-x / L), ___ L real positive, 0 <= x < infinity.
For alpha=5%, exponential samples (L=1):
__N_______Power
__10______0.332
__15______0.496
__20______0.631
__25______0.734
__30______0.821
__35______0.884
__40______0.928
__45______0.957
__50______0.977
(Note: the power does not vary with L.)
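The note on L can be checked directly: the JB statistic is built from the skewness and the kurtosis, and both are unchanged when every observation is multiplied by the same positive constant. A minimal Python sketch, assuming the same divide-by-n moments as the listing in this post (the function names and the seed are illustrative):

```python
import math
import random


def jarque_bera(sample):
    """JB statistic from divide-by-n central moments."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / (m2 * m2)
    return (n / 6.0) * (skew * skew + (kurt - 3.0) ** 2 / 4.0)


def is_scale_invariant(sample, scale):
    """True if JB is numerically the same for the sample and its rescaled copy."""
    rescaled = [scale * x for x in sample]
    return math.isclose(jarque_bera(sample), jarque_bera(rescaled), rel_tol=1e-9)


if __name__ == "__main__":
    rng = random.Random(1)
    # Exponential sample with L = 1; multiplying by L gives a sample with scale L.
    sample = [rng.expovariate(1.0) for _ in range(30)]
    for scale in (0.5, 2.0, 10.0):
        print(scale, is_scale_invariant(sample, scale))
```

Since every rescaled sample gives the same JB value, the rejection frequency, and hence the power, cannot depend on L.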
_______licas (Luis A. Afonso)
REM "JBexp": power of the Jarque-Bera test for exponential samples
CLS
DEFDBL A-Z
PRINT " JB test for exp. distr. "
INPUT " LAMBDA = "; lbd
INPUT " sample size = "; nn
' 5% critical values of JB for sample sizes 10 to 50
DIM w(1, 50)
DATA 2.54,2.71,2.87,3.02,3.16,3.29,3.41,3.52
DATA 3.62,3.72,3.81,3.89,3.96,4.03,4.09,4.15
DATA 4.21,4.26,4.31,4.36,4.40,4.44,4.48,4.52
DATA 4.56,4.59,4.62,4.66,4.69,4.72,4.74,4.77
DATA 4.80,4.82,4.85,4.87,4.89,4.91,4.92,4.94
DATA 4.95
FOR t = 10 TO 50: READ w(1, t): NEXT t
jc = w(1, nn)
PRINT jc
DIM x(nn)
all = 40000
ww = 0
' seed the generator once, before the simulation loop
RANDOMIZE TIMER
FOR k = 1 TO all
LOCATE 4, 50
PRINT USING "##########"; all - k
s = 0
' exponential variates by inversion; lbd acts as a rate, so the mean is 1 / lbd
FOR i = 1 TO nn
x(i) = -1 / lbd * LOG(1 - RND)
s = s + x(i) / nn
NEXT i
' central moments m2, m3, m4 about the sample mean m1
m1 = s: m2 = 0: m3 = 0: m4 = 0
FOR j = 1 TO nn: d = x(j) - m1
m2 = m2 + d * d / nn
m3 = m3 + d * d * d / nn
m4 = m4 + d * d * d * d / nn
NEXT j
SK = m3 / (m2 ^ (1.5))
Ku = m4 / (m2 * m2)
JB = (nn / 6) * (SK * SK + (Ku - 3) * (Ku - 3) / 4)
' count rejections; ww / k is the running estimate of the power
IF JB > jc THEN ww = ww + 1
LOCATE 6, 50
PRINT USING "#.###"; ww / k
NEXT k
| Luis A. Afonso |
Posted: Wed Mar 14, 2007 12:23 am |
J-B test, POWER for Chi-square
From Wikipedia:
*** The power of a statistical test is the probability that the test will reject a false null hypothesis (that it will not make a Type II error). As power increases, the chances of a Type II error decrease, and vice versa. The probability of a Type II error is referred to as *beta*.
Statistical power depends on:
a)__the statistical significance criterion used in the test
b)__the size of the difference or the strength of the similarity (that is, the effect size) in the population ***
____________________________
TABLE
Jarque-Bera normality test, 5% significance level: POWER for Chi-squared distributions with df degrees of freedom.
__N___df=3_____df=5_____df=7_____df=10
__10__0.251____0.176____0.146____0.117
__20__0.492____0.351____0.276____0.216
__30__0.687____0.500____0.396____0.313
__40__0.821____0.635____0.506____0.394
__50__0.913____0.746____0.617____0.480
_____________________________________
For each distribution (column) the power increases from N=10 to N=50, whereas in each row (N constant) it decreases as df increases, because the Chi-squared distributions become progressively more like the normal one. For this reason the Jarque-Bera test seems progressively less able to distinguish them.
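The approach to normality can be made concrete with the population moments of the Chi-squared law: its skewness is sqrt(8/df) and its excess kurtosis is 12/df, and both shrink toward the normal values (0 and 0) as df grows. A small Python sketch follows; the function names are illustrative, and the last quantity, S^2/6 + K^2/24, is only a rough per-observation size of the departure that JB measures, introduced here as an assumption for the illustration.

```python
import math


def chi2_skewness(df):
    """Population skewness of a Chi-squared law with df degrees of freedom."""
    return math.sqrt(8.0 / df)


def chi2_excess_kurtosis(df):
    """Population kurtosis minus 3 for the same law."""
    return 12.0 / df


def jb_per_observation(df):
    """S^2 / 6 + K^2 / 24 evaluated at the population moments (illustrative)."""
    s = chi2_skewness(df)
    k = chi2_excess_kurtosis(df)
    return s * s / 6.0 + k * k / 24.0


if __name__ == "__main__":
    for df in (3, 5, 7, 10):
        print(df,
              round(chi2_skewness(df), 3),
              round(chi2_excess_kurtosis(df), 3),
              round(jb_per_observation(df), 3))
```

All three columns decrease from df=3 to df=10, in line with the drop in power along each row of the table.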
_________licas (Luis A. Afonso)
REM "JBchi": power of the Jarque-Bera test for Chi-squared samples
CLS
DEFDBL A-Z
PRINT " JB test for CHI "
INPUT " sample size = "; nn
INPUT " df = "; df
pi = 4 * ATN(1)
' 5% critical values of JB for sample sizes 10 to 50
DIM w(1, 50)
DATA 2.54,2.71,2.87,3.02,3.16,3.29,3.41,3.52
DATA 3.62,3.72,3.81,3.89,3.96,4.03,4.09,4.15
DATA 4.21,4.26,4.31,4.36,4.40,4.44,4.48,4.52
DATA 4.56,4.59,4.62,4.66,4.69,4.72,4.74,4.77
DATA 4.80,4.82,4.85,4.87,4.89,4.91,4.92,4.94
DATA 4.95
FOR t = 10 TO 50: READ w(1, t): NEXT t
jc = w(1, nn)
DIM x(nn)
all = 40000
ww = 0
' seed the generator once, before the simulation loop
RANDOMIZE TIMER
FOR k = 1 TO all
LOCATE 4, 50
PRINT USING "##########"; all - k
s = 0
' each x(i) is the sum of df squared standard normals (Box-Muller)
FOR i = 1 TO nn: x(i) = 0
FOR dgg = 1 TO df
' 1 - RND lies in (0, 1], which avoids LOG(0)
a = SQR(-2 * LOG(1 - RND))
z = a * COS(2 * pi * RND)
x(i) = x(i) + z * z
NEXT dgg
s = s + x(i)
NEXT i
' central moments m2, m3, m4 about the sample mean m1
m1 = s / nn: m2 = 0: m3 = 0: m4 = 0
FOR j = 1 TO nn: d = x(j) - m1
m2 = m2 + d * d / nn
m3 = m3 + d * d * d / nn
m4 = m4 + d * d * d * d / nn
NEXT j
SK = m3 / (m2 ^ (1.5))
Ku = m4 / (m2 * m2)
JB = (nn / 6) * (SK * SK + (Ku - 3) * (Ku - 3) / 4)
' count rejections; ww / k is the running estimate of the power
IF JB > jc THEN ww = ww + 1
LOCATE 6, 50
PRINT USING "#.###"; ww / k
NEXT k: END
| Luis A. Afonso |
Posted: Thu Mar 15, 2007 12:36 pm |
JB by Bootstrap: reporting a failure
The procedure
From a single normal *source sample* of size N, a set of B Bootstrap samples (of the same size) is simulated and the JB statistic is evaluated for each.
Analysing this set, I count how many of these *pseudo-samples* have a JB greater than the 5% significance level critical value. The frequency of this occurrence is the *Bootstrap* significance level (BSL).
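A minimal Python sketch of this procedure, assuming the ordinary nonparametric bootstrap (N values drawn with replacement from the source sample); the post does not say which variant was used. The function names and the seed are illustrative, and 2.54 is the N=10 entry of the critical-value DATA table in the QBasic listings of this thread.

```python
import random


def jarque_bera(sample):
    """JB statistic from divide-by-n central moments (0.0 if the sample is constant)."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((x - mean) ** 2 for x in sample) / n
    if m2 == 0.0:
        return 0.0  # a resample made of one repeated value has no shape to test
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / (m2 * m2)
    return (n / 6.0) * (skew * skew + (kurt - 3.0) ** 2 / 4.0)


def bootstrap_rejection_rate(source, critical_value, b=4000, rng=None):
    """Fraction of B bootstrap resamples whose JB exceeds the critical value."""
    rng = rng or random.Random()
    n = len(source)
    hits = 0
    for _ in range(b):
        resample = rng.choices(source, k=n)  # draw n values with replacement
        if jarque_bera(resample) > critical_value:
            hits += 1
    return hits / b


if __name__ == "__main__":
    rng = random.Random(2007)
    n, jc = 10, 2.54
    rates = []
    for _ in range(20):  # 20 normal source samples, each bootstrapped 1000 times
        source = [rng.gauss(0.0, 1.0) for _ in range(n)]
        rates.append(bootstrap_rejection_rate(source, jc, b=1000, rng=rng))
    print(min(rates), max(rates))
```

Each source sample yields its own rejection frequency, so the spread between the smallest and largest printed values is the quantity summarised in the results of this post.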
_______________________________________
100 *sources*, each one Bootstrapped 4000 times:
size = 10 ____values from 9% to 81%, mode = 15% with 8 occurrences.
size = 20 ____values from 6% to 84%, mode = 8% with 12 occurrences.
size = 30 ____values from 5% to 73%, mode = 6% with 12 occurrences.
size = 40 ____values from 5% to 100%, mode = 6% with 15 occurrences.
size = 50 ____values from 5% to 99%, mode = 8% with 13 occurrences.
_______________________________________
100 *sources*, each one Bootstrapped 10000 times:
size = 10 ____values from 8% to 79%, mode = 12%-13% with 9 occurrences each.
Conclusion
The Bootstrap doesn't work for the Jarque-Bera test.
________licas (Luis A. Afonso)
| Luis A. Afonso |
Posted: Fri Mar 16, 2007 10:53 am |
Significance level, alpha, by CDF
Acceptance (non-rejection) interval,
right bounded: interval (-infinity, b] such that
________1 - alpha = F(b)
The rejection region is (b, infinity), defined by
________alpha = 1 - F(b) - p(X=b)
This way of defining them is the same whether the density f(X) is continuous, i.e. p(X=b)=0, or has a discontinuity at *b*.
| Jack Tomsky |
Posted: Fri Mar 16, 2007 11:21 am |
What happens if there is no b such that 1- alpha = F(b)? Then your acceptance region is undefined.
Jack |
| Luis A. Afonso |
Posted: Fri Mar 16, 2007 12:08 pm |
Jack Tomsky wrote:
*** What happens if there is no b such that 1- alpha = F(b)? Then your acceptance region is undefined. Jack ***
My response
This looks like *homework* to me, *master* Jack. You must say what you have got so far on this matter. Ask your teacher to point you in the right direction.
________licas (Luis A. Afonso)
| Jack Tomsky |
Posted: Fri Mar 16, 2007 12:15 pm |
You obviously don't know that the answer is that your definition is wrong.
Jack |
| Luis A. Afonso |
Posted: Fri Mar 16, 2007 1:03 pm |
*** Jack wrote:
You obviously don't know that the answer is that your definition is wrong. Jack ***
My response
Let X = Bin(n=4, p=0.5):
___p(X=0) = 1/16
___F(0) = 1/16
___p(X=1) = 4*1/16
___F(1) = 5/16
___p(X=2) = 6*1/16
___1 - F(2) = 1 - 11/16 = 5/16 = 0.3125
___p(X=3) = 4*1/16
___1 - F(3) = 1 - 15/16 = 1/16 = 0.0625
___p(X=4) = 1*1/16
___F(4) = 16/16
If we try to solve for X such that
_________1 - F(X) = alpha = 0.05 _____(I)
you get a value between X=3 and X=4.
THERE IS SUCH A VALUE for X.
________Equation (I) has no solution.
When will you, Jack Tomsky, start to *mettre en marche* your little grey cells (Hercule Poirot)?
_______licas (Luis A. Afonso)
| Jack Tomsky |
Posted: Fri Mar 16, 2007 1:18 pm |
According to your own calculations, there is no x for which 1-F(x) = 0.05.
For 3 <= x < 4, 1-F(x) = 0.0625.
For x = 4, 1-F(x) = 0.
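The tail values above can be reproduced with exact fractions. A minimal Python sketch for X = Bin(n=4, p=0.5); the names are illustrative, and the last step (taking the smallest k whose tail does not exceed alpha) is a common convention for discrete distributions, assumed here rather than taken from either poster.

```python
from fractions import Fraction
from math import comb


def binomial_cdf(k, n, p):
    """F(k) = P(X <= k) for X ~ Bin(n, p), exact when p is a Fraction."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(k + 1))


n, p = 4, Fraction(1, 2)
alpha = Fraction(1, 20)  # 0.05

# 1 - F(k) for k = 0..4; between integers the CDF is flat, so these are
# the only values that 1 - F(x) can take for 0 <= x <= 4.
tails = [1 - binomial_cdf(k, n, p) for k in range(n + 1)]
for k, tail in enumerate(tails):
    print(k, tail)

# No k gives a tail of exactly alpha.
print(alpha in tails)

# Smallest k with P(X > k) <= alpha (assumed convention for discrete laws).
smallest = min(k for k, tail in enumerate(tails) if tail <= alpha)
print(smallest)
```

The attainable tail probabilities are 15/16, 11/16, 5/16, 1/16 and 0, so 0.05 is skipped: the tail jumps from 1/16 directly to 0 at x = 4.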
Jack |
| Luis A. Afonso |
Posted: Fri Mar 16, 2007 2:09 pm |
Jack Tomsky wrote
*** According to your own calculations, there is no x for which 1-F(x) = 0.05.
For 3 <= x < 4, 1-F(x) = 0.0625.
For x = 4, 1-F(x) = 0. Jack ***
My response
*** And according to yours, Tomsky, what is the x value that makes F(x)=0.95? ***
________licas (Luis A. Afonso)
I MUST CONCLUDE that your comment:
*** You obviously don't know that the answer is that your definition is wrong. ***
is one more STUPID and UNLEARNED statement of yours.
The Significance Level, alpha, defined by means of the Cumulative Distribution Function is correct, exactly the opposite of what you claimed.
______licas (Luis A. Afonso)
| Jack Tomsky |
Posted: Fri Mar 16, 2007 2:25 pm |
The reason that you can't produce an x such that 1-F(x) = 0.05 is that the x doesn't exist.
Jack |
All times are GMT - 5 Hours