Science Forum Index » Statistics - Math Forum » Ask for help on sample size decision for a non-normal distri
| Jane |
Posted: Sat Dec 30, 2006 5:15 am |
Hello, group.
I have a problem with choosing a sample size. Suppose I have a
symmetric distribution with mean = 0 (but it is non-normal). How can I
decide on a sample size that guarantees that the sample mean
= 0 with a certain power?
Thanks very much.
Jane |
| Herman Rubin |
Posted: Sat Dec 30, 2006 9:23 pm |
In article <1167470135.308372.189240@a3g2000cwd.googlegroups.com>,
Jane <jinyinglu@hotmail.com> wrote:
Quote: Hello, group.
I have a problem with choosing a sample size. Suppose I have a
symmetric distribution with mean = 0 (but it is non-normal). How can I
decide on a sample size that guarantees that the sample mean
= 0 with a certain power?
Thanks very much.
Jane
As stated, it is impossible. You would at least need
some bounds on the tail of the distribution. For any
sample size and desired power, one can give a family
of distributions that will not achieve it.
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558 |
| Jane |
Posted: Mon Jan 01, 2007 1:44 pm |
Thanks very much for your kind reply.
But my situation is this: first, I need to find a minimum sample size
rather than infinity, because of cost and time.
Second, in the case of a non-normal distribution, is the formula
m +- 1.96 sqrt( sigma^2 / n )
still valid?
Thanks again for your kind reply.
Jane
m00es wrote:
Quote: Jane wrote:
Hello, group.
I have a problem with choosing a sample size. Suppose I have a
symmetric distribution with mean = 0 (but it is non-normal). How can I
decide on a sample size that guarantees that the sample mean
= 0 with a certain power?
Easy. The answer is: infinity.
Let m be the sample mean calculated from n observations taken from a
continuous distribution with expected value mu = 0. The probability
that m is exactly 0 for finite n is zero.
However, under the assumption of normality, we know that the interval
m +- 1.96 sqrt( sigma^2 / n )
will include the value mu with probability .95, where sigma^2 is the
variance in the population. Unless you are dealing with some nasty
distribution, we know, based on the central limit theorem, that this
interval will also capture mu with approximately .95 probability as
long as n is not too small. So, if you know sigma^2, you can solve for
n to figure out how large n has to be in order for the interval to be
of a certain desired size.
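The last step above, solving the interval for n, can be sketched in a few lines of Python. This is only an illustration under the assumptions stated in the post (sigma known, n large enough for the normal approximation); the function name `required_sample_size` and the example values sigma = 1 and half-width = 0.1 are made up for the sketch and do not come from the thread.

```python
import math
from statistics import NormalDist


def required_sample_size(sigma: float, half_width: float, confidence: float = 0.95) -> int:
    """Smallest n with z * sigma / sqrt(n) <= half_width (sigma assumed known)."""
    if sigma <= 0 or half_width <= 0:
        raise ValueError("sigma and half_width must be positive")
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * sigma / half_width) ** 2)


# Hypothetical inputs: sigma = 1, interval of +- 0.1 around the sample mean.
n = required_sample_size(sigma=1.0, half_width=0.1)
print(n)  # 385
```

Halving the desired half-width roughly quadruples n, which is why cost and time constraints translate directly into how wide an interval one has to accept.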
m00es |
| Jane |
Posted: Mon Jan 01, 2007 1:49 pm |
Thanks very much for your kind reply.
But my case is this:
first, I need to find a minimum sample size rather than an infinite one,
because of cost and time.
Second, is the formula m +- 1.96 sqrt( sigma^2 / n ) still valid
in the case of a non-normal distribution?
Jane
| m00es |
Posted: Mon Jan 01, 2007 8:07 pm |
Jane wrote:
Quote: Thanks very much for your kind reply.
But my situation is this: first, I need to find a minimum sample size
rather than infinity, because of cost and time.
Of course. However, you asked what sample size you need to guarantee
that the sample mean will be exactly zero. The probability of that is
zero, unless you take a sample of infinite size.
Quote: Second, in the case of a non-normal distribution, is the formula
m +- 1.96 sqrt( sigma^2 / n )
still valid?
It is approximately valid, based on the central limit theorem. Under
certain regularity conditions, the distribution of the sample mean
approaches normality as n increases. Therefore, as long as n is not too
small, the equation works quite well.
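One way to see what "approximately valid" means is a small simulation. The sketch below assumes a Laplace (double exponential) distribution as a stand-in for "symmetric but non-normal"; that choice, the function names, the sample sizes, and the seed are all assumptions for illustration, since the thread never says what the actual distribution is. It counts how often the interval m +- 1.96 sqrt( sigma^2 / n ) contains mu = 0 when sigma^2 is known.

```python
import math
import random


def laplace_draw(rng: random.Random) -> float:
    """One draw from a Laplace distribution with mean 0 and variance 2."""
    magnitude = rng.expovariate(1.0)
    return magnitude if rng.random() < 0.5 else -magnitude


def coverage(n: int, reps: int = 2000, seed: int = 0) -> float:
    """Fraction of intervals m +- 1.96 sqrt(sigma^2 / n) that contain mu = 0."""
    rng = random.Random(seed)
    sigma2 = 2.0  # known population variance of the Laplace draw above
    half_width = 1.96 * math.sqrt(sigma2 / n)
    hits = 0
    for _ in range(reps):
        m = sum(laplace_draw(rng) for _ in range(n)) / n
        if abs(m) <= half_width:
            hits += 1
    return hits / reps


for n in (5, 30, 100):
    print(n, coverage(n))
```

For a well-behaved distribution like this one the printed fractions should sit close to .95. The same check can be repeated with any distribution one can simulate, which is a cheap way to judge whether a given n is "not too small" for the case at hand.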
m00es |
| Richard Ulrich |
Posted: Tue Jan 02, 2007 12:15 am |
On 1 Jan 2007 09:44:41 -0800, "Jane" <jinyinglu@hotmail.com> wrote:
Quote: Thanks very much for your kind reply.
But my situation is this: first, I need to find a minimum sample size
rather than infinity, because of cost and time.
Second, in the case of a non-normal distribution, is the formula
m +- 1.96 sqrt( sigma^2 / n )
still valid?
You said before that your distribution is symmetric. Most of
the illustrations of the failures of the Central Limit Theorem
use skewed distributions, where the gain of one tail is offset
by the loss in the other. So your results might be more
robust than that, but you are not necessarily safe.
The Cauchy distribution is symmetric, and samples will
look symmetric, but will have bad behavior -- fat tails.
In fact, for a simple Cauchy, the formula is *not* valid, since
its variance is not finite. Its mean does not exist either, so the
mean is not a useful parameter.
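The Cauchy case can be checked by simulation. The mean of n independent standard Cauchy draws is itself standard Cauchy, so the spread of the sample mean does not shrink as n grows. The sketch below, with hypothetical function names and an arbitrary seed, estimates the interquartile range of the sample mean for two sample sizes; for a standard Cauchy that range is 2 regardless of n.

```python
import math
import random
from statistics import quantiles


def cauchy_draw(rng: random.Random) -> float:
    """One standard Cauchy draw via the inverse CDF."""
    return math.tan(math.pi * (rng.random() - 0.5))


def iqr_of_sample_means(n: int, reps: int = 400, seed: int = 0) -> float:
    """Interquartile range of `reps` simulated sample means, each from n draws."""
    rng = random.Random(seed)
    means = [sum(cauchy_draw(rng) for _ in range(n)) / n for _ in range(reps)]
    q1, _, q3 = quantiles(means, n=4)
    return q3 - q1


for n in (10, 1000):
    print(n, round(iqr_of_sample_means(n), 2))
```

Both printed values should land near 2, up to simulation noise. With a finite-variance distribution the second value would be about ten times smaller than the first, because the spread of the mean falls like 1 / sqrt(n).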
The best information about robustness is probably
contained in the information about how the numbers
are generated. If the range of scores is strictly limited,
most people would probably be happy to use a generous
estimate of the variance, and apply the formula.
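One way to make "a generous estimate of the variance" concrete when the scores are strictly limited to an interval [a, b]: the variance of any distribution on that interval is at most (b - a)^2 / 4 (Popoviciu's inequality), so sigma can be replaced by (b - a) / 2 in the formula. The score range 0 to 10 and the desired half-width E = 0.5 below are assumed values for illustration only; nothing in the thread says what the real range is.

```latex
\sigma^2 \le \frac{(b-a)^2}{4}
\quad\Longrightarrow\quad
n \ge \left(\frac{1.96\,\sigma_{\max}}{E}\right)^2,
\qquad \sigma_{\max} = \frac{b-a}{2}.

% Illustration with assumed values a = 0, b = 10, E = 0.5:
\sigma_{\max} = 5, \qquad
n \ge \left(\frac{1.96 \times 5}{0.5}\right)^2 = 19.6^2 = 384.16,
\quad\text{so } n = 385.
```

This bound is conservative: it is reached only when the scores sit at the two endpoints with equal probability, so a pilot estimate of the variance will usually give a smaller n.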
How are the numbers generated, and how much data do you
have already on hand, in order to test the distribution?
--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html |
All times are GMT - 5 Hours