Effect size with no control group?
Robin Edwards (Guest)
Posted: Mon Dec 04, 2006 4:09 pm
In article <1164812190.128791.281220@80g2000cwy.googlegroups.com>, Old
Mac User <chendrixstats@yahoo.com> wrote:
Quote: Oh, yes, then there's the icon of the Global Warmers... the "Hockey
Stick". This saga had its roots in a large data file (a spreadsheet)
which, on inspection, has some numbers that simply make no sense at
all. These errors are visible to anyone who just looks at them. But
apparently the authors of the "Hockey Stick curve" didn't bother to
look... just bulled their way ahead with exotic analyses of "the
data". That is, until someone finally got their attention. I believe
I still have links to this stuff if you'd care to join the fray.
I've snipped a huge amount of pertinent stuff to concentrate on this
lovely contribution from OMU.
He is /exactly/ right. I have all the spreadsheet data and have
examined them in the greatest detail.
It is blatantly obvious that the "owners" of the data had never thought
to do some plots. Had they done so, the hockey stick would never have
surfaced, and all the world's politicians would have had to think of
something else to spread alarm and despondency amongst the populaces of
their respective countries. Of course, the media would have been
deprived also, poor things! As it is they are having a prolonged field
day, aided and abetted by Sir Nicholas Stern's report, which owes more
to wishful thinking than independent science.
It is very simple indeed to show that the famous diagram has only a hazy
resemblance to what the data columns actually contain.
What still surprises (indeed amazes) me is that the people who reviewed
the papers that were based on the data that OMU refers to clearly also
committed the same gross sins of omission that were perpetrated by the
papers' authors. Some commentators have read into this situation the
existence of some kind of "cooperative clique" in the world of
climatology. I can't possibly come to an informed conclusion in this
respect, but I do know that papers submitted to some leading journals in
the field that do not follow slavishly the "establishment" line seldom
pass the reviewers.
Life continues to be very strange in the "unbiased" world of science.
If you are interested in the "technical" details please contact me
privately.
Robin
--
Robin Edwards
- still supplying "1st" - the only real statistical software
for RISC OS

Old Mac User (Guest)
Posted: Mon Dec 04, 2006 6:53 pm
Robin...
Thanks for your interesting post. I have some notes and emails that
were exchanged with a person who made a "big deal" out of flaws in the
"hockey stick" data and the failure to bother to look at the data. He's
the "good guy" in this episode, and he has been threatened and cursed
by the fathers of the Hockey Stick. I'll see if I can find some of this
in my computer. I would not be comfortable posting it here because
some of it came to me as personal notes. But I believe I can find a
way to get some of it to you. I thought I had your e-mail address, but
can't find it at this moment. I may find it in the next hour. But
please send me an e-mail to give me a link to you just in case.
I profess that... with the advent of spreadsheets (including those
embedded in statistical software)... there has been a rapid growth in
the "failure to just look at the data." I have a substantial folder of
examples of this... data that came to me as XL attachments to emails...
real data (and often very expensive data) that came from my clients.
This "failure to look at the data" sometimes leads to strange
"analyses".... really strange, indeed.
One of my realizations of this happened about 8 years ago after I had
done extensive work on certain data that had come to me from diverse
sources. The sources were several companies in the U.S.A. and also
from the U.S. Environmental Protection Agency (EPA). All of these data
had been "blessed" by a person who is an expert in acquiring this
particular kind of data. Otherwise I would not have touched it. The
expert flagged certain data as questionable (that was helpful). Part
of my work was concerned with "pulling it all together" into a
comprehensive story.
In the course of events I noticed something odd about the EPA data.
Describing this "oddity" would require several pages of details and
would confuse more than it would help. I was annoyed and frustrated by
what I saw, and my concerns grew exponentially as I continued with my
work and analysis and preparation for a "show and tell" in Washington,
DC.
There was basically nothing I could do... I was "stuck" with those
data. Trying to explain what I saw was going to be messy and
difficult. So I proceeded with what I had. Besides, there were other,
larger, issues that beset this project including a "personality" that
was giving me grief.
The big day arrived. In my presentation I casually said "I'm sure that
all of you (reps from companies and from the EPA) noticed this. I just
want you to know that I saw it also."
I mentioned that flaw in the EPA data. Even as I spoke I realized (1)
this flaw was even more urgent and critical than I'd realized and (2)
as heads turned and the EPA people whispered among themselves... uh
uh... no one had ever noticed this. Foolish me. I'd revealed that the
king was butt naked right in front of managers and worker bees... some
from "industry" and some from the EPA. Later, someone told me my voice
sort of trailed off and became weaker as the full impact sank into me.
I honestly thought this flaw was sort of obvious. After all, I saw it
just by looking at the spreadsheet data. Well, nobody had ever really
looked at the data. They had "analyzed" the heck out of it, but never
really looked at it. Some quietly got up and left the room and never
returned. One EPA manager got up and walked around the conference
table in circles without saying a word. We were on the 4th floor... I
hoped he wasn't a "jumper."
To make matters worse I had not only revealed the problem in the data
but had also explained what had probably gone wrong in the equipment
and the process used to generate the data. The engineers who collected
the data should have noticed this even while they were running the
experiments. But they didn't. And the folks who compiled the data and
who "blessed" it didn't see it either.
This was very expensive "one of a kind -- never to be done again" data.
I suspected this was the end of my consulting work with those nice
folks. I had an urgent "call of nature" and called a "time out".
We moved forward from there and the matter was never mentioned again.
The EPA data was several very large files. Most of my examples of
"failing to just look at the data" are rather small volumes of data.
But for some reason problems of this sort can appear even in small
files. I have one nasty example in which a company hired several
statisticians (one after the other) to "analyze" a set of expensive
experimental data and none could make sense of it. They invoked all
sorts of exotic methods, including principal components analysis (PCA)
and regression on PCA. When it finally came to me I found a simple
copy/paste error. Why nobody noticed this is beyond me. After fixing
this the analysis was simple and easy to explain.
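A copy/paste slip of this kind often leaves a visible signature: a run of values that reappears, in the same order, further down a column. The Python sketch below is one hypothetical way to flag such runs. The function name `find_repeated_blocks`, the `min_len` threshold, and the sample numbers are assumptions made for illustration; nothing here is taken from the data described above.

```python
from typing import List, Sequence, Tuple


def find_repeated_blocks(
    values: Sequence[float], min_len: int = 3
) -> List[Tuple[int, int, int]]:
    """Return (first_start, second_start, length) for each repeated run.

    A run is reported when at least `min_len` consecutive values reappear,
    in the same order, later in the column without overlapping the first copy.
    """
    n = len(values)
    hits = []
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] != values[j]:
                continue
            # Skip matches that merely continue a longer run found earlier.
            if i > 0 and values[i - 1] == values[j - 1]:
                continue
            k = 0
            while i + k < j and j + k < n and values[i + k] == values[j + k]:
                k += 1
            if k >= min_len:
                hits.append((i, j, k))
    return hits


# Hypothetical column in which rows 1-3 were pasted again at rows 5-7.
column = [1.2, 3.4, 5.6, 7.8, 9.9, 3.4, 5.6, 7.8, 2.1]
print(find_repeated_blocks(column))
```

Constant or heavily rounded columns will trigger many false alarms, so a hit is only a prompt to go and look at the rows in question.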
The Hockey Stick matter is, of course, rooted in "the need for
funding" at a major university. The "funding" (shall we call it the
loot?) was shared among a prof. or two and several hungry graduate
students. This is a common practice, but only in a few instances are
the consequences so wide and so deep. The ones that really scare me
are the analyses of medical data. You know... the ones that end up in
the NY Times on Sunday morning.
Thanks for your post. It made my day. Send that link to me and I'll
find some things you may or may not have found. OMU
Robin Edwards (Guest)
Posted: Tue Dec 05, 2006 3:41 pm
In article <1165272789.274506.205020@16g2000cwy.googlegroups.com>,
Old Mac User <chendrixstats@yahoo.com> wrote:
Quote: Thanks for your interesting post. I have some notes and emails that
were exchanged with a person who made a "big deal" out of flaws in the
"hockey stick" data and the failure to bother to look at the data. He's
the "good guy" in this episode, and he has been threatened and cursed
by the fathers of the Hockey Stick. I'll see if I can find some of this
in my computer. I would not be comfortable posting it here because
some of it came to me as personal notes. But I believe I can find a
way to get some of it to you. I thought I had your e-mail address, but
can't find it at this moment. I may find it in the next hour. But
please send me an e-mail to give me a link to you just in case.
Guess you've found it, Charles, because I got an email :-)
The "Good guys" are the two Ms, Steve McIntyre and Ross McKitrick, who
first spotted the data problems and then divined where the fatal flaw in
the "statistical" analysis lay. I have had quite a lot of contact with
them, I'm pleased to say.
Quote: I profess that... with the advent of spreadsheets (including those
embedded in statistical software)... there has been a rapid growth in
the "failure to just look at the data." I have a substantial folder of
examples of this... data that came to me as XL attachments to emails...
real data (and often very expensive data) that came from my clients.
This "failure to look at the data" sometimes leads to strange
"analyses".... really strange, indeed.
Major snip of a very amusing tale! I've filed this one away for further
delectation.
Strange how history repeats itself in other places and times. I
experienced much the same in 1976 and 1977, though did not expose the
gaffes in an important meeting, thank goodness. The blunders, which
were not accepted by the perpetrators, eventually cost my company many
millions of GBP in wasted research time alone, and loads more in lost
commercial time. The head of the group whose people caused the losses
was promoted to a very senior grade because of the huge costs that were
invested in his department to try (in vain) to rescue the situation :-(
Hope to hear privately from you in due course.
Robin
--
Robin Edwards
- still supplying "1st" - the only real statistical software
for RISC OS

Guest
Posted: Mon Dec 18, 2006 11:54 am
I am continuing this thread because I am not able
to understand why a control group which does
nothing is considered somewhat useful.
I understand that a control group doing something
is useful (see Bruce Weaver's post, quoted below).
But several sources consider even a control
group doing nothing to be useful (which sounds like
nonsense to me).
Summary: our intervention is the instruction in a
meditation technique (to be practiced twice a day)
that is supposed to lower trait anxiety, with fast
results. It is not possible to run a double-blind
study, and it is difficult even to run a single-blind
study. Furthermore, we have only a few
subjects (around 20) at this time.
So we decided to make a "time-series" study,
by measuring trait anxiety at least 3 times:
1) Two weeks before the intervention;
2) Immediately before the intervention;
3) Two weeks after the intervention.
It's obvious that, by comparing 2 and 3, we can
calculate the effect size, while the 1 and 2
comparison is supposed to act as a control:
the same treatment group, before the intervention,
would be the control group.
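The comparison described above can be sketched in a few lines of Python. This is only an illustration: the scores are invented, the helper name `paired_effect_size` is hypothetical, and the effect size shown (mean change divided by the standard deviation of the changes) is one common convention for paired data, not necessarily the one the study would adopt.

```python
from statistics import mean, stdev
from typing import Sequence


def paired_effect_size(before: Sequence[float], after: Sequence[float]) -> float:
    """Standardized mean change: mean(after - before) / sd(after - before).

    Raises ZeroDivisionError if every subject changes by the same amount.
    """
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / stdev(diffs)


# Invented trait-anxiety scores for six subjects (assumption, not real data).
t1 = [52, 47, 60, 55, 49, 58]  # two weeks before the intervention
t2 = [51, 48, 59, 56, 47, 58]  # immediately before the intervention
t3 = [45, 44, 52, 50, 43, 51]  # two weeks after the intervention

baseline_change = paired_effect_size(t1, t2)   # the "control" comparison
treatment_change = paired_effect_size(t2, t3)  # the effect of interest

print(f"change during the waiting period: {baseline_change:.2f}")
print(f"change after the intervention:    {treatment_change:.2f}")
```

A negative value means the scores dropped. The waiting-period figure also shows how much the scores drift when nothing is being done.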
However, I have read in several sources that such
a design is considered "weak", while a design
with a control group doing nothing is stronger.
I can't figure out why. This seems so illogical
to me.
It is 100% natural for me to assume that in the
control group doing nothing, there will be no
change, unless there are magical or astrological
events.
The only reason such a control group could be
useful is to measure the standard deviation of the
obvious zero change. But that standard deviation
can be better measured in our "weak" design,
during the first two weeks, while the treatment
group does nothing.
This seems "stronger" to me and I can't see
why it is considered a "weaker design"
(except by those who believe in astrology and want
to verify that no supernatural event has created
the effect).
Thanks for any explanation.
Fabrizio Coppola
Istituto Scientia
Italy
Bruce Weaver wrote on Wed 22 Nov 2006:
Quote: [snip]
Given the subject matter, you definitely need at least one control group
IMO. The control group must have pre and post scores too, but without
the TM intervention.
The reason I said you need *at least one* control group is that doing
nothing between pre and post may not be adequate. Ideally, you should
have another control group that does "something" between pre and post
under conditions that are similar to what the TM group experiences
[snip]

Reef Fish (Guest)
Posted: Mon Dec 18, 2006 12:36 pm
scientia@ipotesi.net wrote:
Quote: I am continuing this thread because I am not able
to understand why a control group which does
nothing is considered somewhat useful.
I understand that a control group doing something
is useful (see Bruce Weaver's post, quoted below).
But several sources consider even a control
group doing nothing to be useful (which sounds like
nonsense to me).
Such a group would be the control for the "placebo effect".
Suppose you want to test the claim "eating fish brains will
cure your headache".
In a set of randomly assigned groups, the control
group that "does nothing" would be useful to see
how it compares with the fish-brain eaters.
-- Reef Fish Bob.

Marc Schwartz (Guest)
Posted: Mon Dec 18, 2006 12:49 pm
Fabrizio,
The problem is your assumption that a "do nothing" group, known as a
placebo group, will experience no change in the outcome of interest.
As I noted in a prior reply here, it is not uncommon that the placebo
group can experience clinically meaningful changes. Thus, the question
that has to be addressed is:
"Is the residual effect size attributable to the active treatment,
after considering the effect in the placebo group, still meaningful?"
If not, then your active treatment is largely useless, at least in the
sample in the study. In fact, your active treatment may be worse than
no treatment, if it has a safety profile that results in an
unacceptable increase in risk over the placebo. That is why BOTH
efficacy and safety need to be assessed as compared to a control.
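One way to read the "residual effect size" question is as a difference between change scores: the mean change in the treated group minus the mean change in the control group, optionally divided by the pooled standard deviation of the changes. The sketch below assumes that reading. The function names and all numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def change_scores(pre: Sequence[float], post: Sequence[float]) -> List[float]:
    """Per-subject change, post minus pre."""
    return [b - a for a, b in zip(pre, post)]


def residual_effect(
    treat_pre: Sequence[float],
    treat_post: Sequence[float],
    ctrl_pre: Sequence[float],
    ctrl_post: Sequence[float],
) -> Tuple[float, float]:
    """Return (raw, standardized) difference between the two mean changes."""
    t = change_scores(treat_pre, treat_post)
    c = change_scores(ctrl_pre, ctrl_post)
    raw = mean(t) - mean(c)
    # Pooled standard deviation of the change scores in the two groups.
    pooled_var = ((len(t) - 1) * stdev(t) ** 2 + (len(c) - 1) * stdev(c) ** 2) / (
        len(t) + len(c) - 2
    )
    return raw, raw / sqrt(pooled_var)


# Invented scores: both groups improve, the treated group improves more.
raw, standardized = residual_effect(
    treat_pre=[50, 52, 54, 56], treat_post=[44, 47, 47, 50],
    ctrl_pre=[50, 52, 54, 56], ctrl_post=[48, 51, 51, 54],
)
print(f"raw residual change: {raw}, standardized: {standardized:.2f}")
```

If the control group improves almost as much as the treated group, the residual shrinks toward zero even though the treated group's own before/after change looks large.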
The design that you now propose is essentially a cross-over design,
where each subject acts as their own control. It is certainly an
acceptable approach and is common in drug trials. The other common
trial model is the parallel design with two separate groups of subjects
and has already been proposed.
However, even with cross-over trials, the subjects are typically
randomized to the sequence of interventions. In your case, this means
that half of your subjects would be randomized to "non-treatment then
treatment", while the other half would get "treatment then
non-treatment". Given the issues you appear to have with blinding and
the likelihood of some carryover effect in the second case, such an
approach would be problematic.
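Randomizing subjects to the order of the two periods can be sketched as below. The function name `randomize_sequences`, the sequence labels, and the subject IDs are hypothetical; a real trial would normally use a documented and concealed allocation procedure in place of an ad hoc script.

```python
import random
from typing import Dict, Iterable, List, Optional


def randomize_sequences(
    subject_ids: Iterable[int], seed: Optional[int] = None
) -> Dict[str, List[int]]:
    """Shuffle the subjects and split them evenly between the two orders."""
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    ids = list(subject_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {
        "non-treatment then treatment": sorted(ids[:half]),
        "treatment then non-treatment": sorted(ids[half:]),
    }


# Twenty hypothetical subjects numbered 1 to 20.
allocation = randomize_sequences(range(1, 21), seed=2006)
for sequence, members in allocation.items():
    print(sequence, members)
```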
Such a confounding issue would typically be dealt with by using a "wash
out" period, where there is a time period imposed after the first
treatment and before the second treatment in both groups, to allow for
the subject to return to a "baseline" setting. Again, that might be
problematic here and I don't know enough about your domain to offer
suggestions.
There are pros and cons to each trial model. Let me point you to a
document that might be helpful. This is known as ICH E10: CHOICE OF
CONTROL GROUP AND RELATED ISSUES IN CLINICAL TRIALS. A link to a PDF
version is:
http://www.ich.org/LOB/media/MEDIA486.pdf
In addition, the following would also be helpful:
ICH E9: STATISTICAL PRINCIPLES FOR CLINICAL TRIALS
available here:
http://www.ich.org/LOB/media/MEDIA485.pdf
Part of the problem here will also be your hypothesized effect size and
whether or not 20 subjects will be sufficient to both achieve that
effect size and have a statistically significant result. A priori
power/sample size calculations are typically done to have a reasonable
level of confidence that this outcome will occur.
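A rough a priori calculation of this kind can be sketched with a normal approximation for a paired (within-subject) comparison: power is approximately Φ(|d|·√n − z), where z is the two-sided critical value and d is the hypothesized standardized effect. The function names below are hypothetical, and the approximation is slightly optimistic for small samples, where a t-based calculation would ask for a few more subjects.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def approx_power(d: float, n: int, alpha: float = 0.05) -> float:
    """Approximate two-sided power for a paired comparison with n subjects."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return STD_NORMAL.cdf(abs(d) * sqrt(n) - z_crit)


def approx_sample_size(d: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Approximate number of subjects needed to reach the requested power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(((z_crit + z_power) / abs(d)) ** 2)


# Hypothesized effect sizes are assumptions chosen only for illustration.
for d in (0.3, 0.5, 0.8):
    print(
        f"d={d}: power with n=20 is about {approx_power(d, 20):.2f}, "
        f"n for 80% power is about {approx_sample_size(d)}"
    )
```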
You will end up wasting a lot of time and money if you proceed with the
study, find yourself with a clinically meaningful result, but with
insufficient power and a non-statistically significant result. Unless
of course the purpose of your study is to establish some baseline
parameters for a larger study at a later date (i.e., this is a pilot
study), which is fine if that is the intended purpose here.
I hope that the above is helpful. There are also many books available
on clinical trials and study design. However, I would recommend that
you solicit the advice and guidance of someone who has hands-on
experience, preferably in areas related to psychometrics and related
matters, as this is a domain all its own.
Marc Schwartz

Richard Ulrich (Guest)
Posted: Tue Dec 19, 2006 11:54 pm
On 18 Dec 2006 07:54:25 -0800, scientia@ipotesi.net wrote:
Quote: I am continuing this thread because I am not able
to understand why a control group which does
nothing is considered somewhat useful.
I understand that a control group doing something
is useful (see Bruce Weaver's post, quoted below).
But several sources consider even a control
group doing nothing to be useful (which sounds like
nonsense to me).
Summary: our intervention is the instruction in a
meditation technique (to be practiced twice a day)
that is supposed to lower trait anxiety, with fast
results. It is not possible to run a double-blind
study, and it is difficult even to run a single-blind
study. Furthermore, we have only a few
subjects (around 20) at this time.
What do you hope to conclude?
With *no* control group, you might be able to come
to one conclusion, "There was *no* change in scores (benefit)
that seems interesting; forget about this one."
If there is some change in score, you have to explain away
all the possible reasons, before you try to claim you really
have much. Were subjects wanting to please the experimenter?
Placebo effect, or justification for wasting one's time?
A change in attitude when taking a test for the second or
third time?
Having a do-nothing group controls for a *little* bit, but
it might not be worth much. "Pleasing the experimenter"
might be a problem for any non-blind study. So, how is
anxiety measured? A physiological measure - outside
of conscious control - would be far better than any self-report,
if you want to convince cynical readers.
The best 'meditation' study that I remember used a comparison
of Transcendental Meditation to attempting "short naps" during
the workday (eyes closed, etc.). Both seemed to be beneficial,
to about the same extent, on the same measures.
I figured that this was the state of the art - as good as napping,
and not better. That was neither surprising nor hard to believe.
Are you adding anything to that proposition?
[snip]
--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Guest
Posted: Tue Jan 02, 2007 1:10 pm
Thanks for the new answers.
I appreciate the help that all of you are trying
to give me.
I now understand that the problem is the following:
most of the studies that have been done on
Transcendental Meditation (since 1970) are based
on "weak" designs, so "my" design seems quite
"strong" to me. So, even if all of you are saying
that "my" design is weak, it's hard for me to
accept it...
Many existing studies, even those published
in peer-reviewed psychology journals, actually
do have a control group, but the groups are very
small: often n<10 (and sometimes as low as 5)
for both the experimental and the control group.
And the control group does absolutely nothing.
Certain studies are really trivial
(for example: a post-treatment test only, compared
with a control group that is supposed to be similar
to how the experimental group was before the
treatment; both groups with n<20).
So it's hard for me to understand why my "time series"
design is weak, with n>20 and with the experimental
group also acting as its own control group (because it's
measured twice before the treatment).
However, I will try to accept that my design will
be considered as a "pilot study" only.
Fabrizio Coppola
Istituto Scientia
Italy