Science Forum Index  »  Statistics - Math Forum  »  Ask for help on sample size decision for a non-normal distri
Jane
Posted: Sat Dec 30, 2006 5:15 am
Guest
Hello, group.

I have a problem with choosing a sample size. Suppose I have a
symmetric distribution with mean = 0 (but it is non-normal). How can I
decide on a sample size that guarantees that the sample mean = 0
with a certain power?

Thanks very much.

Jane
Herman Rubin
Posted: Sat Dec 30, 2006 9:23 pm
Guest
In article <1167470135.308372.189240@a3g2000cwd.googlegroups.com>,
Jane <jinyinglu@hotmail.com> wrote:
Quote:
Hello, group.

I have a problem with choosing a sample size. Suppose I have a
symmetric distribution with mean = 0 (but it is non-normal). How can I
decide on a sample size that guarantees that the sample mean = 0
with a certain power?

Thanks very much.

Jane

As stated, it is impossible. You would at least need
some bounds on the tails of the distribution. For any
sample size and desired power, one can give a family
of distributions for which that power is not achieved.

--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558
Jane
Posted: Mon Jan 01, 2007 1:44 pm
Guest
Thanks very much for your kind reply.

But my situation is this: first, I need to find a minimum sample size
rather than an infinite one, because of cost and time.

Second, in the case of a non-normal distribution, is the formula
m +- 1.96 sqrt( sigma^2 / n )
still valid?

Thanks again for your kind reply.
Jane

m00es wrote:
Quote:
Jane wrote:
Hello, group.

I have a problem with choosing a sample size. Suppose I have a
symmetric distribution with mean = 0 (but it is non-normal). How can I
decide on a sample size that guarantees that the sample mean = 0
with a certain power?

Easy. The answer is: infinity.

Let m be the sample mean calculated from n observations taken from a
continuous distribution with expected value mu = 0. The probability
that m is exactly 0 for finite n is zero.

However, under the assumption of normality, we know that the interval

m +- 1.96 sqrt( sigma^2 / n )

will include the value mu with probability .95, where sigma^2 is the
variance in the population. Unless you are dealing with some nasty
distribution, we know, based on the central limit theorem, that this
interval will also capture mu with approximately .95 probability as
long as n is not too small. So, if you know sigma^2, you can solve for
n to figure out how large n has to be in order for the interval to be
of a certain desired size.

m00es
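
The "solve for n" step described above can be sketched in a few lines of Python. This is only an illustration, assuming sigma is known and that the normal approximation is acceptable; the function name and the example numbers are made up for this sketch and do not come from the thread.

```python
import math

def sample_size_for_half_width(sigma, half_width, z=1.96):
    """Smallest n with z * sigma / sqrt(n) <= half_width.

    Hypothetical helper: assumes sigma (the population standard
    deviation) is known and the normal approximation holds.
    """
    if sigma <= 0 or half_width <= 0:
        raise ValueError("sigma and half_width must be positive")
    # Rearranging z * sigma / sqrt(n) <= h gives n >= (z * sigma / h)^2.
    return math.ceil((z * sigma / half_width) ** 2)

# Example with assumed values: sigma = 2, desired interval m +- 0.5.
print(sample_size_for_half_width(sigma=2.0, half_width=0.5))  # 62
```

Halving the desired half-width roughly quadruples n, because n grows with the square of 1/half_width.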
Jane
Posted: Mon Jan 01, 2007 1:49 pm
Guest
Thanks very much for your kind reply.

But my case is this:

First, I need to find a minimum sample size rather than an infinite
one, because of cost and time.

Second, is the formula m +- 1.96 sqrt( sigma^2 / n ) still valid
in the case of a non-normal distribution?


Jane
m00es
Posted: Mon Jan 01, 2007 8:07 pm
Guest
Jane wrote:
Quote:
Thanks very much for your kind reply.

But my situation is this: first, I need to find a minimum sample size
rather than an infinite one, because of cost and time.

Of course. However, you asked what sample size you need to guarantee
that the sample mean will be exactly zero. The probability of that is
zero, unless you take a sample of infinite size.

Quote:
Second, in the case of a non-normal distribution, is the formula
m +- 1.96 sqrt( sigma^2 / n )
still valid?

It is approximately valid, based on the central limit theorem. Under
certain regularity conditions, the distribution of the sample mean
approaches normality as n increases. Therefore, as long as n is not too
small, the equation works quite well.

m00es
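
One rough way to check "approximately valid" for a particular symmetric, non-normal distribution is to simulate it. The sketch below uses a Laplace(0, 1) distribution purely as an assumed stand-in (the thread never says what Jane's distribution is) and estimates how often m +- 1.96 sqrt( sigma^2 / n ) covers the true mean 0. The function name, the seed and the number of repetitions are arbitrary choices for this illustration.

```python
import math
import random

def laplace_coverage(n, reps=5000, z=1.96, seed=0):
    """Estimate how often m +- z*sqrt(sigma^2/n) covers mu = 0.

    Assumption for illustration: data are Laplace(0, 1), a symmetric
    non-normal distribution with known variance sigma^2 = 2.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2.0)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        # The difference of two independent Exp(1) draws is Laplace(0, 1).
        total = sum(rng.expovariate(1.0) - rng.expovariate(1.0)
                    for _ in range(n))
        m = total / n
        if abs(m) <= half_width:  # interval contains the true mean 0
            hits += 1
    return hits / reps

for n in (5, 30, 100):
    print(n, laplace_coverage(n))
```

If the estimated coverage is close to .95 at the sample sizes under consideration, the formula is probably adequate for that distribution; a distribution with much heavier tails would need its own check.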
Richard Ulrich
Posted: Tue Jan 02, 2007 12:15 am
Guest
On 1 Jan 2007 09:44:41 -0800, "Jane" <jinyinglu@hotmail.com> wrote:

Quote:
Thanks very much for your kind reply.

But my situation is this: first, I need to find a minimum sample size
rather than an infinite one, because of cost and time.

Second, in the case of a non-normal distribution, is the formula
m +- 1.96 sqrt( sigma^2 / n )
still valid?

You said before that your distribution is symmetric. Most of
the illustrations of the failures of the Central Limit Theorem
use skewed distributions, where the gain of one tail is offset
by the loss in the other. So your results might be more
robust than that, but you are not necessarily safe.

The Cauchy distribution is symmetric, and samples will
look symmetric, but will have bad behavior -- fat tails.
In fact, for a simple Cauchy, the formula is *not* valid, since
its variance is not finite. Its mean is undefined as well, so the
mean is not a useful parameter.
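
The Cauchy problem is easy to see in a small simulation. The sketch below is only an illustration with made-up function names: it compares the spread (interquartile range) of sample means from a standard normal and from a standard Cauchy as n grows. The interquartile range is used because the Cauchy has no finite variance to estimate.

```python
import math
import random
import statistics

def draw_cauchy(rng):
    # Inverse-CDF draw from a standard Cauchy distribution.
    return math.tan(math.pi * (rng.random() - 0.5))

def draw_normal(rng):
    return rng.gauss(0.0, 1.0)

def iqr_of_sample_means(n, draw, reps=2000, seed=0):
    """Interquartile range of `reps` sample means, each from n draws."""
    rng = random.Random(seed)
    means = [sum(draw(rng) for _ in range(n)) / n for _ in range(reps)]
    q1, _median, q3 = statistics.quantiles(means, n=4)
    return q3 - q1

for n in (1, 10, 100):
    print(n,
          iqr_of_sample_means(n, draw_normal),
          iqr_of_sample_means(n, draw_cauchy))
```

For the normal, the spread of the sample mean shrinks like 1/sqrt(n). For the Cauchy it does not shrink, because the mean of n independent standard Cauchy draws is itself standard Cauchy, so collecting more data does not make the sample mean more precise.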

The best information about robustness is probably
contained in the information about how the numbers
are generated. If the range of scores is strictly limited,
most people would probably be happy to use a generous
estimate of the variance, and apply the formula.
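
One possible "generous estimate" when the scores are known to lie in a fixed range [lower, upper] is the largest variance any distribution on that range can have, (upper - lower)^2 / 4 (Popoviciu's inequality). The sketch below plugs that bound into the formula. Choosing this particular bound is an assumption made here for illustration, not something stated in the post, and the function name and example range are hypothetical.

```python
import math

def conservative_sample_size(lower, upper, half_width, z=1.96):
    """Sample size from the worst-case variance on [lower, upper].

    Uses sigma^2 <= (upper - lower)^2 / 4, which holds for any
    distribution confined to [lower, upper]. Hypothetical helper.
    """
    if upper <= lower or half_width <= 0:
        raise ValueError("need upper > lower and half_width > 0")
    sigma_max = (upper - lower) / 2.0
    return math.ceil((z * sigma_max / half_width) ** 2)

# Example with assumed values: scores in [-1, 1], interval m +- 0.1.
print(conservative_sample_size(-1.0, 1.0, 0.1))  # 385
```

The bound is attained only when all the mass sits at the two endpoints, so for most real data this n is larger than necessary; a variance estimate from data already on hand would usually give a smaller number.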

How are the numbers generated, and how much data do you
have already on hand, in order to test the distribution?

--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html
All times are GMT - 5 Hours