2-stage study with lots of tests -- calculation of...
Josef Frank...
Posted: Mon Jul 07, 2008 8:40 am
Guest
Dear all,
suppose an experiment has a large number of independent
variables, so that a large number of simple association tests
(assume n=550,000) are carried out, each testing one of these
factors against a single dependent variable (a case-control study
in epidemiology with disease status as outcome).
Let 0.1% of them (n_sig1=550) be significant at the
0.001 level (p<0.001), as expected under the null hypothesis.
These significant factors are then carried
forward to a 2nd stage of the study in an independent
sample, where they are finally tested at the 5% level.
Now I'd like to calculate the expected number of significant
tests (n_sig2) under the null, and a critical value (upper 95%
confidence bound), in order to get some kind of "global p-value".
(This is before the final test results of stage 2 are available;
at this time we have only the results of stage 1.)
The question now: how can this be done?
Should I:
a) just use a binomial test like:
'binom.test(n_sig2,n_sig1,0.05,"greater")' in R
disregarding the total number of tests conducted
in the 1st stage, as we are still assuming the null?
b) do a kind of 2-dimensional Chi² (or Fisher's test)
using a table like the following:
                | significant  | not significant
                | at stage one | at stage one
----------------+--------------+-------------------
significant     | a            | b
at stage two    | (=n_sig2)    |
----------------+--------------+-------------------
not significant | c            | d
at stage two    |              |
----------------+--------------+-------------------
total           | (=n_sig1)    |
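# again R code, with some sample numbers:
# p1<0.001; p2<0.05
a <- 37                  # significant at both stages (hypothetical n_sig2)
b <- 27500 - a           # significant at stage two only (expected total: 550000*0.05)
c <- 550 - a             # significant at stage one only (n_sig1 - a)
d <- 550000 - a - b - c  # significant at neither stage
tab <- matrix(c(a, b, c, d), 2, byrow = TRUE)  # named tab so that base R's t() is not masked
tab
fisher.test(tab, alternative = "greater")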
Problem here: b and d are actually 0, as those tests are
not done in the 2nd stage. One could only use expected
values from the 1st and 2nd stage; but that's exactly
what I want to do: compare some hypothetical observed
counts against expected values under the null.
And it takes into account the total number of
tests from the 1st stage, in contrast to option a)
above, which neglects that these factors have
been tested with a significant p-value twice
(in *two* *independent* samples).
So I would prefer that.
c) go for something completely different?
If so, what exactly?
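For option a), the expected count and a critical value under the global null can be worked out before any stage-2 results exist. The following is only a minimal sketch, assuming the 550 stage-2 tests are independent and each has a 5% chance of being significant under the null. The variable names (alpha2, expected_sig2, crit_sig2, global_p) are illustrative, and the count of 37 is the hypothetical value from the sample code above.

```r
n_sig1 <- 550     # tests carried forward from stage 1
alpha2 <- 0.05    # significance level used in stage 2

# Expected number of stage-2 significant tests under the null
expected_sig2 <- n_sig1 * alpha2

# Smallest count k with P(X >= k) <= 0.05 for X ~ Binomial(n_sig1, alpha2)
crit_sig2 <- qbinom(0.95, n_sig1, alpha2) + 1

# One-sided binomial test for a hypothetical observed count
n_sig2 <- 37
global_p <- binom.test(n_sig2, n_sig1, alpha2,
                       alternative = "greater")$p.value

print(c(expected = expected_sig2, critical = crit_sig2, p = global_p))
```

Because the binomial distribution is discrete, the actual tail probability at crit_sig2 will generally be somewhat below 5%.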
And what should I do in the end (once all test results are available)
to correct the p-values for multiple testing
over these _two stages_?
Maybe someone could even point me to some
reference in the literature?
Thanks a lot
best wishes
Josef Frank
RichUlrich...
Posted: Sat Jul 12, 2008 8:24 pm
Guest
On Mon, 07 Jul 2008 15:40:22 +0200, Josef Frank <josef.frank at (no spam) gmx.li>
wrote:
Quote:
Dear all,
suppose an experiment has a large number of independent
variables, so that a large number of simple association tests
(assume n=550,000) are carried out, each testing one of these
factors against a single dependent variable (a case-control study
in epidemiology with disease status as outcome).
Let 0.1% of them (n_sig1=550) be significant at the
0.001 level (p<0.001), as expected under the null hypothesis.
Let's see -- there is exactly the same number of
nominally significant results as one would expect by
chance (550,000 x 0.001 = 550). That's a pretty horrible
outcome. Once you correct for chance, using those results,
there would apparently be nothing there at all.
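The arithmetic behind this remark can be checked directly. This is a rough sketch, assuming the 550,000 stage-1 tests are independent (tests on real data are often correlated, so it is only an approximation). The variable names are illustrative.

```r
n_tests <- 550000  # stage-1 tests
alpha1  <- 0.001   # stage-1 significance level
n_sig1  <- 550     # observed number of significant stage-1 tests

# Number of significant tests expected by chance alone
expected_sig1 <- n_tests * alpha1

# Probability of at least n_sig1 significant tests under the global null
p_excess <- pbinom(n_sig1 - 1, n_tests, alpha1, lower.tail = FALSE)

print(c(expected = expected_sig1, p_excess = p_excess))
```

A p_excess near one half would mean the observed count shows no excess over chance.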
Quote:
These significant factors are then carried
forward to a 2nd stage of the study in an independent
sample, where they are finally tested at the 5% level.
Now I'd like to calculate the expected number of significant
tests (n_sig2) under the null, and a critical value (upper 95%
confidence bound), in order to get some kind of "global p-value".
(This is before the final test results of stage 2 are available;
at this time we have only the results of stage 1.)
Given that Stage 1 resulted in exactly what you would
expect by chance, in terms of nominally significant results,
Stage 2 starts with a winnowing that gives essentially
no information. (If Stage 1 gave twice the number
as expected, you would have a slightly better
prospect. Modelling that might be possible, but I
wonder if it would require some specific
assumptions.)
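The point that a pure-chance winnowing carries no information into Stage 2 can be illustrated with a small simulation. This is only a sketch under strong assumptions: every null hypothesis is true, the tests are independent, and p-values are exactly uniform (continuous test statistics). The seed and variable names are arbitrary.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

n_tests <- 550000
alpha1  <- 0.001
alpha2  <- 0.05

# Under the global null, stage-1 p-values are uniform on [0, 1]
p1 <- runif(n_tests)
carried <- which(p1 < alpha1)

# Stage 2 uses an independent sample, so its null p-values are again
# uniform, whatever happened in stage 1
p2 <- runif(length(carried))
n_sig2 <- sum(p2 < alpha2)

print(c(carried = length(carried), n_sig2 = n_sig2,
        expected = length(carried) * alpha2))
```

Under these assumptions the stage-2 count depends only on how many tests were carried forward, not on the 550,000 tests run in stage 1.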
Quote:
And what should I do in the end (once all test results are available)
to correct the p-values for multiple testing
over these _two stages_?
Maybe someone could even point me to some
reference in the literature?
Thanks a lot
best wishes
Josef Frank
If you really have such "null" results from stage 1,
you only need to worry about the tests in stage 2,
starting as a new candidate set of tests.
If you only want to correct for experiment-wise error
and preserve a 5% alpha for a test, you need to use
Bonferroni correction (a per-test threshold of 0.05/550,
about 0.00009)... which is a somewhat doubtful
proposition, I think, for N=550 tests. Depending on the
tests, tinier p-values can be very difficult to obtain, and,
when obtained, can be increasingly less accurate.
You might want to look up FDR (False Discovery Rate)
(Benjamini-Hochberg for the original) for other possibilities
of dealing with large numbers of tests. I think that the
genetics folks using micro-arrays are moving in that direction.
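Both corrections mentioned here are available in base R through p.adjust(). The sketch below uses simulated p-values as a stand-in, since no stage-2 results exist yet; p_stage2 and the seed are assumptions made purely for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Hypothetical stage-2 p-values: 550 tests, all simulated under the null
p_stage2 <- runif(550)

# Bonferroni controls the experiment-wise (family-wise) error rate
p_bonf <- p.adjust(p_stage2, method = "bonferroni")

# Benjamini-Hochberg controls the false discovery rate
p_bh <- p.adjust(p_stage2, method = "BH")

# Number of tests still significant at the 5% level after each correction
print(c(bonferroni = sum(p_bonf < 0.05), BH = sum(p_bh < 0.05)))
```

The Benjamini-Hochberg adjusted p-values are never larger than the Bonferroni ones, so FDR control is the less conservative of the two.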
--
Rich Ulrich