Science Forum Index  »  Statistics - Math Forum  »  Jarque-Bera test: confidence intervals for normal data
Luis A. Afonso
Posted: Sun Mar 11, 2007 2:22 pm
Guest
wwwpub.utdallas.edu/~herve/Abdi-Lillie2007-pretty.pdf

MY VALUES

Critical values of the Lilliefors statistic:

        this work        Conover        Abdi & Molin
size    5%      1%       5%     1%      5%      1%
10      0.264   0.305    .258   .294    .2616   .3037
15      0.220   0.255    .220   .257    .2196   .2545
20      0.192   0.224    .190   .231    .1920   .2226
25      0.174   0.202    .173   .200    .1726   .2010
30      0.159   0.185    .161   .187    .1590   .1848
35      0.148   0.172                   .1478   .1720
40      0.139   0.161                   .1386   .1616
45      0.131   0.152                   .1309   .1525
50      0.124   0.145                   .1246   .1457

(For each sample size, I simulated 500,000 samples; Abdi & Molin simulated 100,000.)


JACK TOMSKY is so unlearned and shameless that he deserves to be exposed every time he posts an opinion on hypothesis tests. Less experienced readers should take note that HE IS A CLOWN.

Readers, do appreciate what I found on the Web.

*** Lilliefors/Van Soest's test of normality ***
_____Hervé Abdi & Paul Molin

1. OVERVIEW

The normality assumption is at the core of the majority of standard statistical procedures, and it is important to be able to test this assumption. In addition, showing that a sample does not come from a normally distributed population is sometimes of importance per se. Among the many procedures used to test this assumption, one of the most well-known is a modification of the Kolmogorov-Smirnov test of goodness of fit, generally referred to as the Lilliefors test for normality (or Lilliefors test for short). This test was developed independently by Lilliefors (1967) and by Van Soest (1967). The null hypothesis for this test is that the error is normally distributed (i.e. there is no difference between the observed distribution f and the normal distribution). The alternative hypothesis is that the error is not normally distributed.
Like most statistical tests, this test of normality defines a criterion and gives its sampling distribution. When the probability associated with the criterion is smaller than a given [alpha]-level, the alternative hypothesis is accepted (i.e. we conclude that the sample does not come from a normal distribution). An interesting peculiarity of the Lilliefors test is the technique used to derive the sampling distribution of the criterion. In general, mathematical statisticians derive the sampling distribution of the criterion using analytical techniques. However, in this case this approach fails, and consequently Lilliefors decided to calculate an approximation of the sampling distribution by using the Monte Carlo technique.
Essentially, the procedure consists of extracting a large number of samples from a Normal Population and computing the value of the criterion for each of these samples. The empirical distribution of the values of the criterion gives an approximation of the sampling distribution of the criterion under the null hypothesis.
Specifically, both Lilliefors and Van Soest used, for each sample size chosen, 1000 random samples derived from a standardized normal distribution to approximate the sampling distribution of a Kolmogorov-Smirnov criterion of goodness of fit. The critical values given by Lilliefors and Van Soest are quite similar, the relative error being of the order of 10^(-2).
According to Lilliefors (1967), this test of normality is more powerful than other procedures for a wide range of nonnormal conditions. Dagnelie (1968) indicated, in addition, that the critical values reported by Lilliefors can be approximated by an analytical formula. Such a formula facilitates writing computer routines because it eliminates the risk of creating errors when keying in the values of the table. Recently, Molin and Abdi (1998) refined the approximation given by Dagnelie and computed new tables using a larger number of runs (i.e. K=100,000) in their simulations. ***
(End of citation).
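
The Monte Carlo procedure described in the citation can be sketched in a few lines of Python. This is only an illustration under assumptions of mine: the function names, the seed and the number of runs are hypothetical, the statistic is the usual Kolmogorov-Smirnov distance computed after standardizing with the sample mean and the (n-1) standard deviation, and the quantile is read directly from the sorted simulated values. It is not the code used by Lilliefors, by Abdi & Molin, or by anyone in this thread.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used only for its CDF


def lilliefors_statistic(sample):
    """Largest gap between the empirical CDF and the fitted normal CDF."""
    n = len(sample)
    mean = math.fsum(sample) / n
    # mean and standard deviation are estimated from the sample itself
    sd = math.sqrt(math.fsum((x - mean) ** 2 for x in sample) / (n - 1))
    z = sorted((x - mean) / sd for x in sample)
    d = 0.0
    for i, zi in enumerate(z, start=1):
        f = STD_NORMAL.cdf(zi)
        # the empirical CDF jumps from (i-1)/n to i/n at the i-th ordered value
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def simulated_critical_value(n, alpha, runs, seed=0):
    """Empirical (1 - alpha) quantile of the statistic over simulated normal samples."""
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability only
    stats = sorted(
        lilliefors_statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(runs)
    )
    return stats[min(runs - 1, int((1 - alpha) * runs))]


if __name__ == "__main__":
    # far fewer runs than the 100,000 or 500,000 quoted in the thread
    print(round(simulated_critical_value(10, 0.05, 5000), 3))
```

With only a few thousand runs the result should land near the size-10 values in the table above, but the third decimal will still move from seed to seed.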

____TOMSKY´s ABSOLUTELY K.O.!!!!!

____licas (Luis A. Afonso)
Jack Tomsky
Posted: Sun Mar 11, 2007 2:31 pm
Guest
Although there is no evidence that anyone has ever used any of Afonso's faulty statistics, it is important that his errors be corrected so that no one will ever think that confidence levels and significance levels are synonymous, that null hypotheses are never allowed to be accepted, and that no one can tell if 8/13 is greater than 5/13.

Jack





Luis A. Afonso
Posted: Mon Mar 12, 2007 9:51 am
Guest
YES, I REPEAT MY STATEMENT

*** ... allowed to be accepted, and that no one can tell if 8/13 is greater than 5/13. ***

In the proper CONTEXT I never denied it.
I adopted the NOTATION *a/b* with a precise meaning: *a* successes in *b* trials. I spent my time writing a full thread to make this clear. You and your *boss* Bob Ling unethically and intentionally (in order to attack me) read the notation as plain fractions!!! And you keep repeating it ad nauseam with the same purpose.
When I wrote (as you say) 8/13 > 5/13, only one interpretation is valid: comparing the event of 8 successes in 13 trials with 5 successes in 13 trials, we can state that the latter is less favorable (to successes) at significance level alpha.
(I do not remember exactly, but I think it was 5%.)
IS THIS THE LAST TIME YOU USE THIS *TRASH* TO BULLY ME? IS IT?

_________licas (Luis A. Afonso)
Luis A. Afonso
Posted: Tue Mar 13, 2007 4:06 am
Guest
In a DOZEN posts, Jack Tomsky tried to stop my work of finding the critical values of the Jarque-Bera test. He did not succeed.
Meanwhile two points became obvious:
1) His total ignorance of the existence of this test (showing very little effort to stay up to date, even with material on the Web). This test has been known since 1980.
2) The most serious: ignorance of a 40-year-old technique for obtaining confidence intervals (by simulation).
In his *opaque* mind
____a confidence interval can only be obtained from a real sample, and it is unique.
Consequently, for him, the procedure:
a) Simulating a great number of samples (1 million),
b) Evaluating, for each of them, the sample statistic under study,
c) And getting from this empirical distribution the quantiles of interest for the test
is WRONG, ABUSIVE, REPREHENSIBLE.

This procedure has been in current use for this purpose since 1967, starting with H. Lilliefors.
To ignore it nowadays is ABSOLUTELY NOT ACCEPTABLE for statistically learned people.
What is the Readers' opinion?
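
For readers who want to try steps a) to c) themselves, here is a minimal Python sketch for the Jarque-Bera statistic. The function names, the seed and the run count are assumptions chosen for illustration; the statistic is computed from the same moment formulas as the BASIC listings later in this thread (moments divided by n). It is a sketch of the simulation idea, not the program that produced the posted critical values.

```python
import random


def jarque_bera(sample):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with moments divided by n."""
    n = len(sample)
    m1 = sum(sample) / n
    m2 = m3 = m4 = 0.0
    for x in sample:
        d = x - m1
        d2 = d * d
        m2 += d2
        m3 += d2 * d
        m4 += d2 * d2
    m2, m3, m4 = m2 / n, m3 / n, m4 / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / (m2 * m2)
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)


def jb_critical_value(n, alpha, runs, seed=0):
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability only
    # a) simulate many normal samples, b) evaluate JB for each of them
    stats = sorted(
        jarque_bera([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(runs)
    )
    # c) read the quantile of interest from the empirical distribution
    return stats[min(runs - 1, int((1 - alpha) * runs))]


if __name__ == "__main__":
    print(round(jb_critical_value(20, 0.05, 20000), 2))
```

The DATA table in the BASIC programs further down lists 3.81 as the 5% critical value for size 20. A run of this size should come out in that neighbourhood, though the upper tail of JB is long, so the estimate is noticeably noisier than for the Lilliefors statistic.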

_______licas (Luis A. Afonso)
Luis A. Afonso
Posted: Tue Mar 13, 2007 5:22 am
Guest
Test J-B: POWER for exponential samples


Conventionally *beta* denotes the probability of making a type II error (i.e. accepting the hypothesis H0 when we should not).
*** The power, 1-beta, is the probability of rejecting H0 when we should. ***
When we are dealing with a GOF (goodness of fit) test, the null hypothesis is H0: the sample was drawn from a Population with law W. The power is the probability of rejecting H0 when it is false, i.e. when the population has a law different from W, and therefore the alternative hypothesis, Ha, holds.
This time we test random samples from the exponential law with density

     f(x) = (1 / L) * exp(-x / L),    L real positive,    0 <= x < infinity.

For alpha=5%, exponential samples (L=1):

 N     Power
10     0.332
15     0.496
20     0.631
25     0.734
30     0.821
35     0.884
40     0.928
45     0.957
50     0.977

(Note: the power doesn't vary with L, because the JB statistic is unchanged when every sample value is multiplied by the same constant.)

_______licas (Luis A. Afonso)


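REM "JBexp": power of the Jarque-Bera test for exponential samples
CLS
DEFDBL A-Z
PRINT " JB test for exp. distr. "
REM lbd is the rate of the exponential law, so its mean is 1 / lbd
INPUT " LAMBDA = "; lbd
INPUT " sample size = "; nn
REM the table below covers sample sizes 10 to 50 only
IF nn < 10 OR nn > 50 THEN PRINT " size must be 10 to 50 ": END
REM 5% critical values of JB for sample sizes 10, 11, ..., 50
DIM w(1, 50)
DATA 2.54,2.71,2.87,3.02,3.16,3.29,3.41,3.52
DATA 3.62,3.72,3.81,3.89,3.96,4.03,4.09,4.15
DATA 4.21,4.26,4.31,4.36,4.40,4.44,4.48,4.52
DATA 4.56,4.59,4.62,4.66,4.69,4.72,4.74,4.77
DATA 4.80,4.82,4.85,4.87,4.89,4.91,4.92,4.94
DATA 4.95
FOR t = 10 TO 50: READ w(1, t): NEXT t
jc = w(1, nn)
PRINT jc
DIM x(nn)
all = 40000
ww = 0
REM seed the generator once, not on every pass of the loop
RANDOMIZE TIMER
FOR k = 1 TO all
LOCATE 4, 50
PRINT USING "##########"; all - k
REM draw one exponential sample by inversion and accumulate its mean
s = 0
FOR i = 1 TO nn
x(i) = -1 / lbd * LOG(1 - RND)
s = s + x(i) / nn
NEXT i
REM central moments of order 2, 3 and 4
m1 = s: m2 = 0: m3 = 0: m4 = 0
FOR j = 1 TO nn: d = x(j) - m1
m2 = m2 + d * d / nn
m3 = m3 + d * d * d / nn
m4 = m4 + d * d * d * d / nn
NEXT j
REM skewness, kurtosis and the JB statistic
SK = m3 / (m2 ^ (1.5))
Ku = m4 / (m2 * m2)
JB = (nn / 6) * (SK * SK + (Ku - 3) * (Ku - 3) / 4)
REM count rejections; ww / k is the running estimate of the power
IF JB > jc THEN ww = ww + 1
LOCATE 6, 50
PRINT USING "#.###"; ww / k
NEXT k
END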
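
For readers without a BASIC interpreter, the same power simulation can be sketched in Python. Everything named here (function names, seed, run count) is an assumption for illustration; only the critical value 3.81 for size 20 is taken from the DATA table of the listing above, and the JB formula is the one used there.

```python
import random


def jarque_bera(sample):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with moments divided by n."""
    n = len(sample)
    m1 = sum(sample) / n
    m2 = m3 = m4 = 0.0
    for x in sample:
        d = x - m1
        d2 = d * d
        m2 += d2
        m3 += d2 * d
        m4 += d2 * d2
    m2, m3, m4 = m2 / n, m3 / n, m4 / n
    return n / 6 * (m3 ** 2 / m2 ** 3 + (m4 / (m2 * m2) - 3) ** 2 / 4)


def jb_power_exponential(n, critical_value, runs, rate=1.0, seed=0):
    """Fraction of simulated exponential samples whose JB exceeds the critical value."""
    rng = random.Random(seed)  # seeded once, hypothetical fixed seed
    rejections = 0
    for _ in range(runs):
        sample = [rng.expovariate(rate) for _ in range(n)]
        if jarque_bera(sample) > critical_value:
            rejections += 1
    return rejections / runs


if __name__ == "__main__":
    # 3.81 is the size-20 entry of the DATA table in the BASIC listing
    print(round(jb_power_exponential(20, 3.81, 10000), 3))
```

Changing `rate` should leave the estimate unchanged apart from simulation noise, since skewness and kurtosis do not depend on the scale of the data.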
Luis A. Afonso
Posted: Wed Mar 14, 2007 12:23 am
Guest
J-B test, POWER for Chi-square



From Wikipedia:

*** The power of a statistical test is the probability that the test will reject a false null hypothesis (that it will not make a Type II error). As power increases, the chances of a Type II error decrease, and vice versa. The probability of a Type II error is referred to as *beta*.
Statistical power depends on:
a) the statistical significance criterion used in the test
b) the size of the difference or the strength of the similarity (that is, the effect size) in the population ***
____________________________

TABLE
Jarque-Bera normality test, 5% significance level: POWER for Chi-squared distributions with df degrees of freedom.

 N     df=3     df=5     df=7     df=10
10     0.251    0.176    0.146    0.117
20     0.492    0.351    0.276    0.216
30     0.687    0.500    0.396    0.313
40     0.821    0.635    0.506    0.394
50     0.913    0.746    0.617    0.480
_____________________________________


For each distribution (column) the power increases from N=10 to 50, whereas for each row (N constant) it decreases as df increases, because the Chi-squared distributions become progressively closer to the normal one. For this reason the Jarque-Bera test is progressively less able to distinguish them.
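
The drift towards normality can be made concrete with the standard population moments of the Chi-squared law: skewness sqrt(8/df) and excess kurtosis 12/df, both of which shrink towards the normal values of 0 as df grows. The small Python sketch below (the function name is mine, chosen for illustration) tabulates them for the four columns of the table.

```python
import math


def chi_squared_shape(df):
    """Population skewness and excess kurtosis of a Chi-squared law with df degrees of freedom."""
    return math.sqrt(8 / df), 12 / df


if __name__ == "__main__":
    for df in (3, 5, 7, 10):
        skewness, excess_kurtosis = chi_squared_shape(df)
        print(f"df={df:2d}  skewness={skewness:.3f}  excess kurtosis={excess_kurtosis:.3f}")
```

Since JB is built from exactly these two quantities, smaller population values leave the test less to detect at a given N.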

_________licas (Luis A. Afonso)

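REM "JBchi": power of the Jarque-Bera test for Chi-squared samples
CLS
DEFDBL A-Z
PRINT " JB test for CHI "
INPUT " sample size = "; nn
INPUT " df = "; df
REM the table below covers sample sizes 10 to 50 only
IF nn < 10 OR nn > 50 THEN PRINT " size must be 10 to 50 ": END
pi = 4 * ATN(1)
REM 5% critical values of JB for sample sizes 10, 11, ..., 50
DIM w(1, 50)
DATA 2.54,2.71,2.87,3.02,3.16,3.29,3.41,3.52
DATA 3.62,3.72,3.81,3.89,3.96,4.03,4.09,4.15
DATA 4.21,4.26,4.31,4.36,4.40,4.44,4.48,4.52
DATA 4.56,4.59,4.62,4.66,4.69,4.72,4.74,4.77
DATA 4.80,4.82,4.85,4.87,4.89,4.91,4.92,4.94
DATA 4.95
FOR t = 10 TO 50: READ w(1, t): NEXT t
jc = w(1, nn)
DIM x(nn)
all = 40000
ww = 0
REM seed the generator once, not on every pass of the loop
RANDOMIZE TIMER
FOR k = 1 TO all
LOCATE 4, 50
PRINT USING "##########"; all - k
s = 0
FOR i = 1 TO nn: x(i) = 0
REM a Chi-squared value is the sum of df squared standard normals
FOR dgg = 1 TO df
REM Box-Muller; 1 - RND lies in (0, 1], so LOG never receives 0
a = SQR(-2 * LOG(1 - RND))
z = a * COS(2 * pi * RND)
x(i) = x(i) + z * z
NEXT dgg
s = s + x(i)
NEXT i
REM central moments of order 2, 3 and 4
m1 = s / nn: m2 = 0: m3 = 0: m4 = 0
FOR j = 1 TO nn: d = x(j) - m1
m2 = m2 + d * d / nn
m3 = m3 + d * d * d / nn
m4 = m4 + d * d * d * d / nn
NEXT j
REM skewness, kurtosis and the JB statistic
SK = m3 / (m2 ^ (1.5))
Ku = m4 / (m2 * m2)
JB = (nn / 6) * (SK * SK + (Ku - 3) * (Ku - 3) / 4)
REM count rejections; ww / k is the running estimate of the power
IF JB > jc THEN ww = ww + 1
LOCATE 6, 50
PRINT USING "#.###"; ww / k
NEXT k: END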
Luis A. Afonso
Posted: Thu Mar 15, 2007 12:36 pm
Guest
JB by Bootstrap: reporting a failure


The procedure

From a single normal *source sample* of size N, a set of B Bootstrap samples (of the same size) is drawn and the JB statistic is evaluated for each.
Analysing this set, I count how many of these *pseudo-samples* have a JB greater than the 5% significance level critical value. The frequency of this occurrence is the *Bootstrap* significance level (BSL).
_______________________________________
100 *sources*, each one Bootstrapped 4000 times:

size   BSL range      mode   occurrences of the mode
10     9% to 81%      15%    8
20     6% to 84%      8%     12
30     5% to 73%      6%     12
40     5% to 100%     6%     15
50     5% to 99%      8%     13
_______________________________________
100 *sources*, each one Bootstrapped 10000 times:

size   BSL range      mode       occurrences of the mode
10     8% to 79%      12%-13%    9 each

Conclusion
The Bootstrap doesn't work for the Jarque-Bera test.
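
The experiment can be repeated with a short Python sketch. The function names, seed, number of sources and number of resamples are assumptions for illustration, and the critical value 3.81 for size 20 is taken from the DATA table of the BASIC listings earlier in the thread. One plausible reading of the spread, which is not stated in the post, is that each Bootstrap sample is drawn from the source sample's own empirical distribution, so the rejection frequency reflects that particular sample's skewness and kurtosis and not the normal null.

```python
import random


def jarque_bera(sample):
    """JB = n/6 * (S^2 + (K - 3)^2 / 4), with moments divided by n."""
    n = len(sample)
    m1 = sum(sample) / n
    m2 = m3 = m4 = 0.0
    for x in sample:
        d = x - m1
        d2 = d * d
        m2 += d2
        m3 += d2 * d
        m4 += d2 * d2
    m2, m3, m4 = m2 / n, m3 / n, m4 / n
    if m2 == 0.0:
        # degenerate resample (all values equal); counting it as JB = 0
        # is an arbitrary choice made for this sketch
        return 0.0
    return n / 6 * (m3 ** 2 / m2 ** 3 + (m4 / (m2 * m2) - 3) ** 2 / 4)


def bootstrap_significance_level(source, critical_value, resamples, rng):
    """Fraction of resamples (drawn with replacement) whose JB exceeds the critical value."""
    n = len(source)
    hits = sum(
        jarque_bera(rng.choices(source, k=n)) > critical_value
        for _ in range(resamples)
    )
    return hits / resamples


if __name__ == "__main__":
    rng = random.Random(0)  # hypothetical fixed seed
    for _ in range(10):
        source = [rng.gauss(0.0, 1.0) for _ in range(20)]
        print(round(bootstrap_significance_level(source, 3.81, 1000, rng), 3))
```

Each printed value belongs to a different normal source sample, so the list shows directly how much the BSL depends on the source drawn.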

________licas (Luis A. Afonso)
Luis A. Afonso
Posted: Fri Mar 16, 2007 10:53 am
Guest
Significance level, alpha, by CDF



Acceptance (non-rejection) interval,
right bounded: the interval (-infinity, b] such that
________1- alpha = F(b)

The rejection region is (b, infinity), defined by
________ alpha = 1- F(b) - p(X=b)

This way of defining them is the same whether the density f(X) is continuous, i.e. p(X=b)=0, or has a discontinuity at *b*.
Jack Tomsky
Posted: Fri Mar 16, 2007 11:21 am
Guest
What happens if there is no b such that 1- alpha = F(b)? Then your acceptance region is undefined.

Jack
Luis A. Afonso
Posted: Fri Mar 16, 2007 12:08 pm
Guest
Jack Tomsky wrote:

*** What happens if there is no b such that 1- alpha = F(b)? Then your acceptance region is undefined. Jack ***

My response

That looks like *homework* to me, *master* Jack. You must first tell us what you have got so far on this matter. Ask your teacher to point you in the right direction.

________licas (Luis A. Afonso)
Jack Tomsky
Posted: Fri Mar 16, 2007 12:15 pm
Guest
You obviously don't know that the answer is that your definition is wrong.

Jack
Luis A. Afonso
Posted: Fri Mar 16, 2007 1:03 pm
Guest
***Jack wrote:

You obviously don't know that the answer is that your definition is wrong. Jack ***

My response

Let X = Bin(n=4, p=0.5):

x    p(X=x)    F(x)      1-F(x)
0    1/16      1/16      15/16
1    4/16      5/16      11/16
2    6/16      11/16     5/16 = 0.3125
3    4/16      15/16     1/16 = 0.0625
4    1/16      16/16     0

If we try to solve for X such that
_________ 1- F(X) = alpha = 0.05 _____(I)
you get a value between X=3 and X=4.
THERE IS SUCH A VALUE for X.
________Equation (I) has no solution.

When will you, Jack Tomsky, start to *mettre en marche* your little grey cells (Hercule Poirot)?

_______licas (Luis A. Afonso)
Jack Tomsky
Posted: Fri Mar 16, 2007 1:18 pm
Guest
According to your own calculations, there is no x for which 1-F(x) = 0.05.

For 3 <= x < 4, 1-F(x) = 0.0625.
For x = 4, 1-F(x) = 0.
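
These tail values can be checked with exact fractions. The Python sketch below is only an illustration: the function names are hypothetical, and the "smallest cutoff whose upper tail does not exceed alpha" rule in the second function is one common textbook convention for discrete statistics, not something either poster proposed.

```python
from fractions import Fraction
from math import comb


def binomial_upper_tails(n, p):
    """Map each x in 0..n to the exact value of 1 - F(x) = P(X > x)."""
    tails = {}
    cdf = Fraction(0)
    for x in range(n + 1):
        cdf += comb(n, x) * p ** x * (1 - p) ** (n - x)
        tails[x] = 1 - cdf
    return tails


def conservative_cutoff(tails, alpha):
    """Smallest x with P(X > x) <= alpha (assumed convention, see the note above)."""
    return min(x for x, tail in tails.items() if tail <= alpha)


if __name__ == "__main__":
    tails = binomial_upper_tails(4, Fraction(1, 2))
    for x, tail in tails.items():
        print(x, tail, float(tail))
    alpha = Fraction(1, 20)
    print("some x has 1-F(x) = 0.05:", alpha in tails.values())
    print("cutoff under the assumed convention:", conservative_cutoff(tails, alpha))
```

For Bin(4, 0.5) the only attainable upper-tail probabilities are 15/16, 11/16, 5/16, 1/16 and 0, so 0.05 is not among them. Under the assumed convention the cutoff is 4, which leaves an empty rejection region with actual size 0.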

Jack
Luis A. Afonso
Posted: Fri Mar 16, 2007 2:09 pm
Guest
Jack Tomsky wrote

*** According to your own calculations, there is no x for which 1-F(x) = 0.05.

For 3 <= x < 4, 1-F(x) = 0.0625.
For x = 4, 1-F(x) = 0. Jack ***

My response

*** And according to yours, Tomsky, what is the x value that makes F(x)=0.95? ***

________licas (Luis A. Afonso)

I MUST CONCLUDE that your comment:

*** You obviously don't know that the answer is that your definition is wrong.***

is one more STUPID and UNLEARNED statement of yours.

The Significance Level, alpha, defined by means of the Cumulative Distribution Function is correct, exactly the opposite of what you claimed.

______licas (Luis A. Afonso)
Jack Tomsky
Posted: Fri Mar 16, 2007 2:25 pm
Guest
The reason that you can't produce an x such that 1-F(x) = 0.05 is that the x doesn't exist.

Jack
 
All times are GMT - 5 Hours