Statistical power; paired data; attrition
Guest
Posted: Sat Jan 06, 2007 8:24 am
Hello,

I have some more questions (related to my previous post
"Effect size with no control group").

Given alpha, the number n of subjects, and the effect size (ES),
it is possible to find the statistical power (1-beta)
of a t-test with that alpha, n, and ES.

Tables with results are reported (for example) at:

http://www.science.mcmaster.ca/Psychology/psych2ra3/pwr1samp.pdf
(in the case of 1 Sample, or 2 Paired samples)

http://www.science.mcmaster.ca/Psychology/psych2ra3/pwr2samp.pdf
(in the case of 2 Independent samples)

Provided that ES = (Mean2 - Mean1) / SD
I understand why the table is different in the 2 cases:
if we have a single sample (to compare with a previous mean),
the t test is calculated with ES / SQR(1/n)
while if we have 2 independent samples to compare (n subjects each),
the t test is calculated with ES / SQR(2/n).
What is different is the SQR(2/n) instead of SQR(1/n) :
for independent samples, if m=n, SQR(1/m + 1/n) becomes SQR(2/n)

However, I am confused about the case of Paired data.
In this case ES should be calculated as usual,
ES = (Mean2 - Mean1) / SD
but the t-test must NOT be calculated with ES, but with the
paired differences, so that the SD is NOT the SD of the sample!
Usually SDdiff << SDsample, so the t-test comes out more
favorable (high statistical power, even with low alpha).

Hence, in the case of paired data, I consider it wrong to
calculate power using the usual ES. The table would be fine
only if I used a "fake ES" calculated as (Mean2 - Mean1) / SDdiff
instead of the classic ES, which is (Mean2 - Mean1) / SDsample.
Question: am I wrong?

Another question, which is completely different, is this:
while reading several studies, I saw that, if there is
attrition, it is not taken into account in the calculation
of ES! An example: pre-test with n=30 (experimental group)
and m=30 (control group). Post-test with n=25 and m=25.
I understand that ES is still calculated as
(MeanPost - MeanPre) / SDpooled
which does not take into account that 5 subjects have dropped out!
(So the ES is overestimated, in my understanding).
Question: is this standard practice?

Probably the 5 subjects dropped out because the treatment
was not effective, so I would include the scores of all 30
subjects in the post-test, using, for the 5 who dropped out,
their pre-test scores, unchanged (I already said this in a
previous post of mine).
This would reduce the calculated ES to a more reliable value.

So, in my understanding, the ESs that I read about are
overestimated, because attrition has not been taken into
account.

Thanks for any explanation or comment.

Fabrizio Coppola
Istituto Scientia
Italy
Richard Ulrich
Posted: Sun Jan 07, 2007 12:27 am
Guest
On 6 Jan 2007 04:24:34 -0800, scientia@ipotesi.net wrote:

Quote:
Hello,

I have some more questions (related to my previous post
"Effect size with no control group").

Given alpha, the number n of subjects, and the effect size (ES),
it is possible to find the statistical power (1-beta)
of a t-test with that alpha, n, and ES.

Tables with results are reported (for example) at:

http://www.science.mcmaster.ca/Psychology/psych2ra3/pwr1samp.pdf
(in the case of 1 Sample, or 2 Paired samples)

http://www.science.mcmaster.ca/Psychology/psych2ra3/pwr2samp.pdf
(in the case of 2 Independent samples)

Provided that ES = (Mean2 - Mean1) / SD
I understand why the table is different in the 2 cases:
if we have a single sample (to compare with a previous mean),
the t test is calculated with ES / SQR(1/n)
while if we have 2 independent samples to compare (n subjects each),
the t test is calculated with ES / SQR(2/n).
What is different is the SQR(2/n) instead of SQR(1/n) :
for independent samples, if m=n, SQR(1/m + 1/n) becomes SQR(2/n)

However, I am confused about the case of Paired data.
In this case ES should be calculated as usual,
ES = (Mean2 - Mean1) / SD

For independent data, the variance of the difference
is var(A)+var(B).
For paired data, the variance of the difference
is var(A)+var(B)- 2*cov(AB).


Quote:
but the t-test must NOT be calculated with ES, but with the
paired differences, so that the SD is NOT the SD of the sample!
Usually SDdiff << SDsample, so the t-test comes out more
favorable (high statistical power, even with low alpha).

It can be convenient to estimate the variance of the difference
from direct measurements. At times, it can be more reliable
to use estimates of the Standard deviation and correlation.
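
A minimal sketch of that second route, assuming only the identity
var(A)+var(B)-2*cov(AB) with cov(AB) = r * SD(A) * SD(B). The function
name and the input values are illustrative and not taken from any study.

```python
import math

def sd_of_difference(sd_a, sd_b, r):
    """SD of the paired difference (B - A), given the two SDs and their correlation r."""
    # var(B - A) = var(A) + var(B) - 2*cov(A, B), where cov(A, B) = r * sd_a * sd_b
    variance = sd_a**2 + sd_b**2 - 2.0 * r * sd_a * sd_b
    return math.sqrt(max(variance, 0.0))  # guard against tiny negative rounding error

# Illustrative values only: two measures with SD 9, uncorrelated vs. strongly correlated
print(round(sd_of_difference(9.0, 9.0, 0.0), 2))  # 12.73
print(round(sd_of_difference(9.0, 9.0, 0.9), 2))  # 4.02
```

With r = 0 this reduces to the independent-samples case; the higher the
pre/post correlation, the smaller the SD of the difference.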

Quote:

Hence, in the case of paired data, I consider it wrong to
calculate power using the usual ES. The table would be fine
only if I used a "fake ES" calculated as (Mean2 - Mean1) / SDdiff
instead of the classic ES, which is (Mean2 - Mean1) / SDsample.
Question: am I wrong?

The difference is compared using its variance.
Nothing fake about it.

Quote:

Another question, which is completely different, is this:
while reading several studies, I saw that, if there is
attrition, it is not taken into account in the calculation
of ES! An example: pre-test with n=30 (experimental group)
and m=30 (control group). Post-test with n=25 and m=25.
I understand that ES is still calculated as
(MeanPost - MeanPre) / SDpooled
which does not take into account that 5 subjects have dropped out!
(So the ES is overestimated, in my understanding).
Question: is this standard practice?

It is standard in clinical medicine to recruit a sample N
that is larger than the one the power analysis calls for.
If you expect 10% or so attrition,
you plan for 15% extra cases.

Quote:

Probably the 5 subjects dropped out because the treatment
was not effective, so I would include the scores of all 30
subjects in the post-test, using, for the 5 who dropped out,
their pre-test scores, unchanged (I already said this in a
previous post of mine).
This would reduce the calculated ES to a more reliable value.

So, in my understanding, the ESs that I read about are
overestimated, because attrition has not been taken into
account.

Thanks for any explanation or comment.

If they don't plan for the attrition, with an extra margin,
it sounds like someone is messing up. Research
review committees should insist on correction.


--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html
Guest
Posted: Sun Jan 07, 2007 1:51 pm
Thanks for your answer.
There is still something that I am not sure about.
Since it is very specific, I am going to explain it
with a real example.

Consider these psychometric paired data (n=12) :

Subject         Pre score   Post score   Difference

1               58          55           -3
2               65          58           -7
3               55          44           -11
4               60          52           -8
5               62          57           -5
6               49          45           -4
7               41          38           -3
8               53          48           -5
9               36          32           -4
10              49          51           +2
11              64          65           +1
12              52          47           -5

Mean            53.67       49.33        -4.33
St.Dev.         8.97        9.07         3.55
St.Dev/sqr(n)   2.59        2.62         1.02
t                           -1.66        -4.25

I calculate Effect Size
ES = (Mean2 - Mean1) / SD
(pooled SD is 9.02) so that
ES = (49.33-53.67)/9.02 = -4.34/9.02 =
ES = -0.481
(the minus is because the score has decreased,
but consider ES = 0.481)
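
The summary rows and this effect size can be reproduced with a short
script. This is only a sketch using the Python standard library; the
variable names are illustrative, and the pooled SD is assumed to be the
square root of the mean of the two sample variances (which agrees with
the 9.02 quoted above).

```python
import math
import statistics

pre  = [58, 65, 55, 60, 62, 49, 41, 53, 36, 49, 64, 52]
post = [55, 58, 44, 52, 57, 45, 38, 48, 32, 51, 65, 47]
diff = [b - a for a, b in zip(pre, post)]  # post minus pre, per subject

mean_pre, mean_post, mean_diff = (statistics.mean(x) for x in (pre, post, diff))
# statistics.stdev uses the n-1 denominator (sample standard deviation)
sd_pre, sd_post, sd_diff = (statistics.stdev(x) for x in (pre, post, diff))

# Pooled SD for two equal-sized samples: root of the average variance
sd_pooled = math.sqrt((sd_pre**2 + sd_post**2) / 2)
es_cross_sectional = (mean_post - mean_pre) / sd_pooled

print(f"means: {mean_pre:.2f} {mean_post:.2f} {mean_diff:.2f}")
print(f"SDs:   {sd_pre:.2f} {sd_post:.2f} {sd_diff:.2f}")
print(f"pooled SD = {sd_pooled:.2f}, ES = {es_cross_sectional:.2f}")
```

Working from the unrounded values, the ES comes out at about -0.48, in
line with the hand calculation.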

I check if this result is significant (one-tailed t test)

Case A. I consider the 2 samples as paired
(as they really are):
t = -4.33/(9.02*sqr(1/12)) = -4.34/1.02 =
t = -4.25 (significant, p<0.001)

Of course, I consider this Case A as correct,
while the following Cases, B and C, are not correct:

Case B. I consider the 2 samples as independent:
t = (49.33-53.67)/(3.55*sqr(1/12+1/12)) = -4.34/3.69 =
t = -1.18 (non-significant)

Case C. I trivially compare the post-sample Mean2
with a fixed value that is equal to Mean1 = 53.67:
t = (49.33-53.67)/(9.02/sqr(12)) = -4.34/2.61 =
t = -1.66 (significant if alpha=.10, non-significant if alpha=.05)

Now, let's consider Statistical Power.
Remember that Effect Size is 0.481.

I calculate Statistical Power (1-beta), considering
alpha = 0.05, and I compare my result with the result
given by the famous Cohen's tables, given alpha, ES, n:
http://www.science.mcmaster.ca/Psychology/psych2ra3/pwr1samp.pdf
(in the case of 1 Sample, or 2 Paired samples)
http://www.science.mcmaster.ca/Psychology/psych2ra3/pwr2samp.pdf
(in the case of 2 Independent samples)

Valid results come out if I consider
the 2 "wrong" Cases, B or C:

Case B
I calculate Power (1-beta) = 0.28
On Cohen's tables for two independent samples,
given alpha=0.05 (one-tail), n=12, ES=0.50,
Power = 0.31
(it's OK, since in the tables I can find ES=0.50,
while in my case ES=0.481)

Case C
I calculate Power (1-beta) = 0.45
But, on Cohen's tables for one sample,
given alpha=0.05 (one-tail), n=12, ES=0.50,
Power = 0.48 (it's OK, compared with 0.45)

However, I do NOT get a correct result
if I use the correct Case A!

Case A
I calculate Power (1-beta) = 0.9989 (very good)
But, on Cohen's tables for paired data,
given alpha=0.05 (one-tail), n=12, ES=0.50
Power = 0.48 only!

Nevertheless, if I use a "fake ES" in Case A,
fakeES = (realES)*sqr(12) = 1.666
then I get the correct value of Power from
Cohen's tables for paired data.
But I don't think that my ES is 1.666.
I still think that ES is 0.481
(this is why I wrote "fake ES" in my previous post,
and I insist here in writing so).

This is a puzzle to me.

I have another puzzle, about attrition,
but I will post it later...

Fabrizio Coppola
Istituto Scientia
Italy

Note. There was also a Pre-pre score,
used as a kind of control:
Mean 54.25; StDev=8.72; it did not differ
significantly from the Mean of the Pre-score, 53.67.
In other words, "nothing happened before the treatment"
(54.25 --> 53.67) and "something happened due to the
treatment" (53.67 --> 49.33). I know that this was
a weak design, as everybody told me in a previous
thread; however, these are data that were already
available. Since there is not a real control group,
feel free to call it a "placebo effect", if you like,
and just consider the math.
Guest
Posted: Sun Jan 07, 2007 5:41 pm
Errors in my previous post:
the numerical results are correct,
but a few numbers that I typed in (inside the calculations)
are wrong. Here are the correct calculations:

Case A. I consider the 2 samples as paired
(as they really are):
t = -4.33/(3.55*sqr(1/12)) = -4.34/1.02 =
t = -4.25 (significant, p<0.001)

Case B. I consider the 2 samples as independent
(even if they are not):
t = (49.33-53.67)/(9.02*sqr(1/12+1/12)) = -4.34/3.69 =
t = -1.18 (non-significant)

Case C. I trivially compare the post-sample Mean2
with a fixed value that is equal to Mean1 = 53.67:
t = (49.33-53.67)/(9.02/sqr(12)) = -4.34/2.61 =
t = -1.66 (significant if alpha=.10, non-significant if alpha=.05)
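
As a cross-check, the three t values can be recomputed from the raw
scores. This is a sketch using only the Python standard library; the
variable names are illustrative, and Case C uses the pooled SD exactly
as in the hand calculation above.

```python
import math
import statistics

pre  = [58, 65, 55, 60, 62, 49, 41, 53, 36, 49, 64, 52]
post = [55, 58, 44, 52, 57, 45, 38, 48, 32, 51, 65, 47]
diff = [b - a for a, b in zip(pre, post)]
n = len(diff)

mean_diff = statistics.mean(diff)
sd_diff = statistics.stdev(diff)  # SD of the paired differences
sd_pooled = math.sqrt((statistics.variance(pre) + statistics.variance(post)) / 2)

# Case A: paired test, standard error from the differences (df = n - 1)
t_paired = mean_diff / (sd_diff * math.sqrt(1 / n))
# Case B: samples treated as independent (df = 2n - 2)
t_independent = mean_diff / (sd_pooled * math.sqrt(1 / n + 1 / n))
# Case C: post mean compared with a fixed value equal to the pre mean
t_fixed_value = mean_diff / (sd_pooled / math.sqrt(n))

print(f"A (paired):      t = {t_paired:.2f}")
print(f"B (independent): t = {t_independent:.2f}")
print(f"C (fixed value): t = {t_fixed_value:.2f}")
```

Cases B and C reproduce -1.18 and -1.66. With unrounded inputs Case A
comes out near -4.23 rather than -4.25; the small gap is rounding in the
hand calculation.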

Effect size is ES=0.481.

What I find strange is this: since t=-4.25 and n=12
if I calculate statistical power (1-beta) with alpha=.05 (one-tail)
I find 0.9989 (very good).

But from Cohen's tables, with alpha=.05 (one-tail), n=12
and Effect Size =0.50, for paired data, I read that statistical
power should be 0.48!

Fabrizio Coppola
Richard Ulrich
Posted: Sun Jan 07, 2007 6:11 pm
Guest
On 7 Jan 2007 09:51:03 -0800, scientia@ipotesi.net wrote:

Quote:
Thanks for your answer.
There is still something that I am not sure about.
Since it is very specific, I am going to explain it
with a real example.

Consider these psychometric paired data (n=12) :

Subject         Pre score   Post score   Difference

1               58          55           -3
2               65          58           -7
3               55          44           -11
4               60          52           -8
5               62          57           -5
6               49          45           -4
7               41          38           -3
8               53          48           -5
9               36          32           -4
10              49          51           +2
11              64          65           +1
12              52          47           -5

Mean            53.67       49.33        -4.33
St.Dev.         8.97        9.07         3.55
St.Dev/sqr(n)   2.59        2.62         1.02
t                           -1.66        -4.25

I calculate Effect Size
ES = (Mean2 - Mean1) / SD
(pooled SD is 9.02) so that
ES = (49.33-53.67)/9.02 = -4.34/9.02 =
ES = -0.481
(the minus is because the score has decreased,
but consider ES = 0.481)

That is what you should call, "The effect size, in
terms of cross-sectional variation." It is *not*
the effect size in terms of change-scores.
-- That effect size is 4.33/3.55, which you have
not computed anywhere, so far as I notice.

Quote:

I check if this result is significant (one-tailed t test)

Case A. I consider the 2 samples as paired
(as they really are):
t = -4.33/(9.02*sqr(1/12)) = -4.34/1.02 =
t = -4.25 (significant, p<0.001)

Of course, I consider this Case A as correct,
while the following Cases, B and C, are not correct:

Case B. I consider the 2 samples as independent:
t = (49.33-53.67)/(3.55*sqr(1/12+1/12)) = -4.34/3.69 =
t = -1.18 (non-significant)

Case C. I trivially compare the post-sample Mean2
with a fixed value that is equal to Mean1 = 53.67:
t = (49.33-53.67)/(9.02/sqr(12)) = -4.34/2.61 =
t = -1.66 (significant if alpha=.10, non-significant if alpha=.05)

- Naturally this t is larger in magnitude than in (B), since it ignores the
variation of the number being compared to.

Quote:

Now, let's consider Statistical Power.
Remember that Effect Size is 0.481.

Well, not for purposes of statistical power analysis.

Quote:

I calculate Statistical Power (1-beta), considering
alpha = 0.05, and I compare my result with the result
given by the famous Cohen's tables, given alpha, ES, n:
http://www.science.mcmaster.ca/Psychology/psych2ra3/pwr1samp.pdf
(in the case of 1 Sample, or 2 Paired samples)
http://www.science.mcmaster.ca/Psychology/psych2ra3/pwr2samp.pdf
(in the case of 2 Independent samples)

Valid results come out if I consider
the 2 "wrong" Cases, B or C:

I don't know where you get the term 'valid', since
you are using wrong numbers.

Quote:

Case B
I calculate Power (1-beta) = 0.28
On Cohen's tables for two independent samples,
given alpha=0.05 (one-tail), n=12, ES=0.50,
Power = 0.31
(it's OK, since in the tables I can find ES=0.50,
while in my case ES=0.481)

Case C
I calculate Power (1-beta) = 0.45
But, on Cohen's tables for one sample,
given alpha=0.05 (one-tail), n=12, ES=0.50,
Power = 0.48 (it's OK, compared with 0.45)

However, I do NOT get a correct result
if I use the correct Case A!

Case A
I calculate Power (1-beta) = 0.9989 (very good)
But, on Cohen's tables for paired data,
given alpha=0.05 (one-tail), n=12, ES=0.50
Power = 0.48 only!

Nevertheless, if I use a "fake ES" in Case A,
fakeES = (realES)*sqr(12) = 1.666

I have no idea where you get that one.
For the one-sample test of the difference, the observed
value of Cohen's d for the ES is 4.33/3.55 .
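
To see how much the choice of effect size matters, here is a rough
sketch that feeds both candidates into a one-tailed, one-sample power
calculation. It assumes a normal approximation, power = Phi(d*sqrt(n) - z),
whereas printed tables are based on the noncentral t distribution, so
for n=12 it runs somewhat higher than the tables. The function name is
illustrative.

```python
import math
from statistics import NormalDist

def approx_power_one_sample(d, n, alpha=0.05):
    """Approximate one-tailed power of a one-sample (or paired) t-test.

    Normal approximation only: Phi(d * sqrt(n) - z_(1 - alpha)).
    """
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(d * math.sqrt(n) - z_crit)

d_change = 4.33 / 3.55  # effect size in change-score units
d_cross = 4.33 / 9.02   # effect size in cross-sectional units

print(f"d = {d_change:.2f}: power ~ {approx_power_one_sample(d_change, 12):.3f}")
print(f"d = {d_cross:.2f}: power ~ {approx_power_one_sample(d_cross, 12):.3f}")
```

The change-score effect size gives power close to 1, while the
cross-sectional one gives roughly one half, which is the gap described
in the quoted post.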

Quote:
then I get the correct value of Power from
Cohen's tables for paired data.
But I don't think that my ES is 1.666.
I still think that ES is 0.481
(this is why I wrote "fake ES" in my previous post,
and I insist here in writing so).

This is a puzzle to me.

The big hazard in using computer programs for
doing power analyses is that they seldom provide
(for the hard cases -- especially, anything with repeated
measures) enough information about the parameters
you need to provide. That is one reason why you
should try to frame the question in terms of a contrast
and the SD of the contrast.


--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html
Guest
Posted: Mon Jan 08, 2007 10:42 am
scientia@ipotesi.net wrote:
Quote:
In other words: I can be confident that the treatment
works, from the analysis of the existing data (alpha=.05,
beta=.02), since they are paired.

Perhaps a small point over language choice, but I'd say you can feel
confident that the change you observed was probably not due to sampling
error. You still really have no evidence of whether it was due to the
treatment unless you do the stronger randomized design against a control.
Guest
Posted: Mon Jan 08, 2007 2:27 pm
Rich,

thanks for your help. I was confused, but
a few of your answers made everything clear.

So, this is the useful information that I can
get from the existing data (n=12, paired):

                Pre score   Post score   Difference
Mean            53.67       49.33        -4.33
St.Dev.         8.97        9.07         3.55

Difference Mean2 - Mean1 = -4.33

The data are paired, so:
Effect Size = 4.33/3.55 = 1.220
(this is the value that I was trying to calculate)

If I consider the data as independent
Effect Size = 4.33/9.02 = 0.481
(but they are not independent).

If I want to plan a new experiment
with independent groups, I should use
0.481 instead of 1.220 (this is why I called
the latter value a "fake ES", even though it is not fake,
since in this case the data were actually paired).

I will probably get an Effect Size around
0.45 or 0.50. So if I want power>=.90, with
alpha=.05, one-tail, from Cohen's table for
independent data, I find n=80.
Adjusting for attrition, n>=100 (for each group).
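
A rough cross-check of these planning numbers, as a sketch only: it
uses the normal approximation n = 2*((z_alpha + z_beta)/d)^2 per group,
which comes out slightly below t-based tables. The function names are
illustrative, and the 20% dropout rate is an assumed figure.

```python
import math
from statistics import NormalDist

def n_per_group(d, power=0.90, alpha=0.05):
    """Approximate n per group, one-tailed two-sample t-test (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

def inflate_for_attrition(n, expected_dropout):
    """Enrol enough subjects that about n remain after the expected dropout fraction."""
    return math.ceil(n / (1 - expected_dropout))

for d in (0.45, 0.48, 0.50):
    print(f"d = {d:.2f}: n per group ~ {n_per_group(d)}")

# ASSUMPTION: 20% dropout, chosen only for illustration
print("enrol per group:", inflate_for_attrition(80, 0.20))
```

For d between 0.45 and 0.50 this gives roughly 70 to 85 per group, so
n=80 is in the right range, and an assumed 20% dropout brings 80 up to
about 100.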

With n=12 (as in the existing pilot study)
I would get a power around 0.30 (insufficient).
However, the data in the pilot study (n=12)
were paired, which enhanced power (I "felt" this
but I was not sure how to find the actual power).
So, with n=12, paired data, and ES = 1.22
(corrected because the data are paired - it's
what I called "fake-ES"), alpha=.05, one-tail,
I have got power=.98 (very good!).

In other words: I can be confident that the treatment
works, from the analysis of the existing data (alpha=.05,
beta=.02), since they are paired. And I have got an estimate
of ES for planning a stronger test (with independent groups):
ES = 0.48. But I must have a lot of subjects (100+100)
for a "strong" design.
Is this all right?

About the attrition.

My concern is not about the problem of increasing n to
compensate for attrition while planning an experiment.
My surprise is this: if there is attrition (as there
is in most experiments), when Effect Size is calculated,
attrition is not taken into account!
This is what I understand from analyzing several cases.
It is very surprising to me, so I wonder if this
is really standard practice. (This might be obvious to you
but it is not to me).

Suppose I have n=100, but 30 subjects drop out.
Effect Size is calculated from the 70 remaining subjects:
say it is ES = 0.80 (for example).
I think this should be considered an overestimated value.
In fact, in a voluntary psychological treatment,
probably the subjects dropped out because the treatment
had *no* effect. In other words, Effect size = 0.00 for
those 30 subjects. So, I would calculate a "real"
Effect Size: ES = (70*0.80 + 30*0.00) / 100
that is ES = 0.56 (much less than the accepted 0.80).
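
The arithmetic behind that figure, as a small sketch. The function name
is illustrative, and the effect size of 0.00 for the dropouts is my
assumption, not something measured.

```python
def diluted_effect_size(es_completers, n_completers, n_dropouts, es_dropouts=0.0):
    """Weighted mean of the completers' ES and an ASSUMED ES for the dropouts."""
    n_total = n_completers + n_dropouts
    return (n_completers * es_completers + n_dropouts * es_dropouts) / n_total

# 70 completers with ES = 0.80, 30 dropouts assumed to have ES = 0.00
print(round(diluted_effect_size(0.80, 70, 30), 2))  # 0.56
```

This is only a weighted average of two effect sizes. Recomputing ES on
all 100 subjects would also change the SD, so the result could differ
somewhat from 0.56.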

But I see that nobody does anything like that.
So, all the ES I read about, in psychological
research, seem overestimated to me.

Finally: in the case I reported, with 12 paired data,
there is a problem that I did not mention before.
There was actually an attrition: 2 subjects did not
finish the treatment and, when solicited, did not give
an explanation. So, the real sample is n=14, with 2
missing post-test scores.

How should I take this attrition into account? Will I
lose the high statistical power (and confidence) that
comes from the paired data? In other words, shouldn't
I be confident that "something has happened" during
the treatment? Or can I do what is usually done in
experiments with independent data, that is, ignore
those 2 subjects?
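
One way to probe this is a simple sensitivity check, sketched below
with the Python standard library. It assumes the convention I proposed
earlier: each of the 2 dropouts is scored as "no change" (difference of
0). That is a conservative guess about data that do not exist, not an
established correction, and the function name is illustrative.

```python
import math
import statistics

completer_diffs = [-3, -7, -11, -8, -5, -4, -3, -5, -4, 2, 1, -5]

def paired_t(diffs):
    """One-sample t statistic for the mean of paired differences (df = n - 1)."""
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))

# ASSUMPTION: both dropouts are scored as post = pre, i.e. a difference of 0
with_dropouts = completer_diffs + [0, 0]

print(f"completers only (n=12):         t = {paired_t(completer_diffs):.2f}")
print(f"dropouts as zero change (n=14): t = {paired_t(with_dropouts):.2f}")
```

Under this assumption t shrinks from about -4.2 to about -3.8, which is
still well beyond the one-tailed 5% critical value of about 1.77 for
13 degrees of freedom.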

Thanks again,

Fabrizio Coppola
 
All times are GMT - 5 Hours