How to Determine Sample Size with ANOVA
JunoExpress (Guest)
Posted: Mon Mar 24, 2008 5:32 pm
Hi,
I understand how to determine the sample size if you are doing a
simple hypothesis test (like equality of means) with 2 populations.
Recently, however, I had to undertake a one-way ANOVA test where I am
using more than one population. Nothing fancy, no complex design
issues, just good old straightforward ANOVA. I have been able to find
very little in the way of how sample size is determined, and I am
wondering if anyone can either explain the basic concept involved or
point me in the direction of a good text that is not too advanced
(maybe 1st yr grad level) yet explains this problem well.
TIA,
Matt B

JunoExpress (Guest)
Posted: Mon Mar 24, 2008 9:37 pm
Quote: Recently, however, I had to undertake a one-way ANOVA test where I am
using more than one population.
Quote (Richard Ulrich): And how does "more than one" differ from the first paragraph,
where you said you understand with two populations?
Three?
Sorry, I meant to say more than 2 groups.
Quote:
Nothing fancy, no complex design
issues, just good old straightforward ANOVA. I have been able to find
very little in the way of how sample size is determined, and I am
wondering if anyone can either explain the basic concept involved or
point me in the direction of a good text that is not too advanced
(maybe 1st yr grad level) yet explains this problem well.
Quote (Richard Ulrich): The standard social science reference is Cohen's book,
Statistical Power Analysis for the Behavioral Sciences (2nd ed., 1988).
With more than two groups, you have choices on how
to model the differences - one group vs. two at the other
end, or randomly ordered. The "effect size" is based on
computations resembling the F test, i.e., squared differences.
Thank you for the reference: I've leafed through this book before, and
I'll check it out.
What I am after however is very simple: it seems it should be possible
to extend the notion of sample size from the 2 to the "more than 2"
group case, but I can't see how to do it.
I know that for the 2 group case we define an "effect size", which is
essentially the smallest difference in the means that we wish
to detect. So we can write our effect size d as:
d = abs(mu_0 - mu_1)
where mu_0 is the mean of our rv under H0 and mu_1 is the "closest
mean" we wish to be able to detect under H1.
We let alpha be the size of our test and 1-beta be the power of our
test. We suppose the rv X is normally distributed under both H0 and H1,
with the same variance sigma^2 and only a different mean (which is the
assumption of our model in an ANOVA test anyhow). We also suppose that
our test is "two-tailed" in the sense that under H1, the mean of the rv
can be greater or less than mu_0.
Due to the symmetry of the distributions under H0 and H1, we can
determine the sample size by considering only the right-hand tail,
i.e. the case mu_1 = mu_0 + d. Let Xbar be the sample mean of n
observations, and let X_c be the critical value of Xbar that is to
satisfy both our size and power constraints. Then the size constraint
requires:
(eqn 1) (X_c - mu_0)/(sigma/sqrt(n)) = Z_(alpha/2)
where Z_(alpha/2) is defined as the standard normal value such
that
Prob{ (Xbar - mu_0)/(sigma/sqrt(n)) > Z_(alpha/2) | H0 } = alpha/2
The power constraint requires
(eqn 2) (mu_0 + d - X_c)/(sigma/sqrt(n)) = Z_beta
where Z_beta is defined as the standard normal value such that
Prob{ (mu_1 - Xbar)/(sigma/sqrt(n)) > Z_beta | H1 } = beta
Solving for X_c from eqns (1) and (2) and equating the expressions
gives us:
d = (Z_(alpha/2) + Z_beta) * sigma/sqrt(n)
from which we can solve for n:
n = ((Z_(alpha/2) + Z_beta) * sigma/d)^2
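As a rough sketch of that last step, the relation d = (Z_(alpha/2) + Z_beta) * sigma/sqrt(n) can be solved for n numerically. The function name and the example values of d and sigma below are made up for illustration; only the Python standard library is assumed. Note that the formula as written is for one sample mean tested against mu_0. If two independent group means with n observations each are compared, the variance of the difference doubles, so n per group would be twice this value.

```python
from math import ceil
from statistics import NormalDist


def z_test_sample_size(d, sigma, alpha=0.05, power=0.80):
    """Smallest n with d >= (Z_(alpha/2) + Z_beta) * sigma / sqrt(n).

    d     : smallest difference from mu_0 we want to detect
    sigma : common standard deviation (assumed known)
    alpha : two-sided size of the test
    power : 1 - beta
    """
    z = NormalDist()                        # standard normal
    z_alpha_2 = z.inv_cdf(1 - alpha / 2)    # upper alpha/2 point
    z_beta = z.inv_cdf(power)               # upper beta point, beta = 1 - power
    n = ((z_alpha_2 + z_beta) * sigma / d) ** 2
    return ceil(n)                          # round up to a whole observation


# Illustrative values only: detect a shift of half a standard deviation.
print(z_test_sample_size(d=0.5, sigma=1.0))  # 32
```

Halving d roughly quadruples n, since n is proportional to (sigma/d)^2.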
This is my understanding of how to determine sample size in the simple
two-population case. I am wondering if this sort of analysis can be
extended to 3 or more populations (I have a feeling the answer is
no, but I thought I would ask anyhow).
Matt

Richard Ulrich (Guest)
Posted: Mon Mar 24, 2008 11:09 pm
On Mon, 24 Mar 2008 20:32:14 -0700 (PDT), JunoExpress
<MTBrenneman@gmail.com> wrote:
Quote: Hi,
I understand how to determine the sample size if you are doing a
simple hypothesis test (like equality of means) with 2 populations.
Recently, however, I had to undertake a one-way ANOVA test where I am
using more than one population.
And how does "more than one" differ from the first paragraph,
where you said you understand with two populations?
Three?
Quote: Nothing fancy, no complex design
issues, just good old straightforward ANOVA. I have been able to find
very little in the way of how sample size is determined, and I am
wondering if anyone can either explain the basic concept involved or
point me in the direction of a good text that is not too advanced
(maybe 1st yr grad level) yet explains this problem well.
The standard social science reference is Cohen's book,
Statistical Power Analysis for the Behavioral Sciences (2nd ed., 1988).
With more than two groups, you have choices on how
to model the differences - one group vs. two at the other
end, or randomly ordered. The "effect size" is based on
computations resembling the F test, i.e., squared differences.
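To make the "squared differences" idea concrete, here is a minimal sketch of one common effect-size measure for one-way ANOVA, Cohen's f: the standard deviation of the group means around the grand mean, divided by the common within-group standard deviation. It assumes equal group sizes. The function names and the example means are hypothetical, chosen only to show that the same overall range of means gives a different effect size depending on how the groups are arranged.

```python
from math import sqrt


def cohens_f(means, sigma):
    """Cohen's f for equal-sized groups: sd of the group means / within-group sd."""
    k = len(means)
    grand = sum(means) / k
    # Spread of the hypothesized group means around the grand mean
    sigma_m = sqrt(sum((m - grand) ** 2 for m in means) / k)
    return sigma_m / sigma


def eta_squared(f):
    """Proportion of total variance explained by group membership."""
    return f ** 2 / (1 + f ** 2)


# Same range (0 to 1, sigma = 1), two different patterns of means:
evenly_spaced = cohens_f([0.0, 0.5, 1.0], sigma=1.0)
one_vs_two = cohens_f([0.0, 0.0, 1.0], sigma=1.0)
print(round(evenly_spaced, 3), round(one_vs_two, 3))
print(round(eta_squared(evenly_spaced), 3))
```

For two groups this reduces to f = d/(2*sigma), which ties it back to the familiar difference in means.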
--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html

Richard Ulrich (Guest)
Posted: Wed Mar 26, 2008 10:19 pm
[cross-posted to sci.stat.math, where the same question
also was posted.]
On Tue, 25 Mar 2008 00:37:23 -0700 (PDT), JunoExpress
<MTBrenneman@gmail.com> wrote:
Quote:
What I am after however is very simple: it seems it should be possible
to extend the notion of sample size from the 2 to the "more than 2"
group case, but I can't see how to do it.
Expanding on what I posted before -
The t-test uses a difference in means, so the effect
size can be based on that, in standardized units.
The F-test that is needed for 3 or more groups uses
the sum of squares of the group means around the grand
mean, so the effect size has to be based on that -- something
like eta-squared or eta. Similarly, the non-central
F-distribution is used to compute the probabilities
needed for power.
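A minimal sketch of that logic, under stated assumptions: k groups of equal size n, effect size given as Cohen's f (which relates to eta-squared by f^2 = eta^2/(1 - eta^2)), and noncentrality parameter lambda = f^2 * k * n. Power is then the probability that a noncentral F variate exceeds the usual central F critical value. All function names are illustrative, and the noncentral F CDF is hand-rolled from the standard library (a Poisson-weighted sum of incomplete beta terms) only to keep the example self-contained.

```python
import math


def betainc(a, b, x):
    """Regularized incomplete beta I_x(a, b) via a continued fraction."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    if x > (a + 1.0) / (a + b + 2.0):       # symmetry gives faster convergence
        return 1.0 - betainc(b, a, 1.0 - x)
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x)) / a
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, 500):
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for aa in (even, odd):
            d = 1.0 + aa * d
            c = 1.0 + aa / c
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < 1e-12:
            break
    return front * h


def f_cdf(x, dfn, dfd, nc=0.0):
    """CDF of the F distribution; nc > 0 gives the noncentral F."""
    if x <= 0.0:
        return 0.0
    y = dfn * x / (dfn * x + dfd)
    half = nc / 2.0
    weight = math.exp(-half)                # Poisson weight for j = 0
    total, j = 0.0, 0
    while True:
        total += weight * betainc(dfn / 2.0 + j, dfd / 2.0, y)
        j += 1
        weight *= half / j
        if j > half and weight < 1e-12:     # past the mode and negligible
            break
    return total


def f_critical(alpha, dfn, dfd):
    """Upper-alpha critical value of the central F, found by bisection."""
    lo, hi = 0.0, 1.0
    while f_cdf(hi, dfn, dfd) < 1.0 - alpha:
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if f_cdf(mid, dfn, dfd) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def anova_power(f, k, n, alpha=0.05):
    """Power of the one-way ANOVA F-test with k groups of n each."""
    dfn, dfd = k - 1, k * (n - 1)
    nc = f * f * k * n                      # noncentrality: f^2 * total N
    return 1.0 - f_cdf(f_critical(alpha, dfn, dfd), dfn, dfd, nc)


def anova_sample_size(f, k, alpha=0.05, power=0.80):
    """Smallest per-group n reaching the requested power."""
    n = 2
    while anova_power(f, k, n, alpha) < power:
        n += 1
    return n


# Illustrative values only: a "medium" f of 0.25 with three groups.
print(anova_sample_size(f=0.25, k=3))
```

In practice a library routine for the noncentral F distribution (for example scipy.stats.ncf) would normally replace the hand-written CDF. The structure mirrors the two-group case: fix alpha, find the critical value under H0, then raise n until the probability of exceeding it under H1 reaches the desired power.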
Quote:
This is my understanding of how to determine sample size in the simple
two-population case. I am wondering if this sort of analysis can be
extended to 3 or more populations (I have a feeling the answer is
no, but I thought I would ask anyhow).
Well, the extension is not a simple extrapolation.
The general logic is not different.
I didn't study the equations above.
--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html

All times are GMT - 5 Hours