Science Forum Index » Statistics - Education Forum » collinearity and LR
Guest
Posted: Thu Nov 30, 2006 3:59 pm
I am running LR. There are high correlations between some of my
continuous predictors. The tests for collinearity are positive. My
question may be a silly one: what is the difference between
collinearity and confounding? It seems like they would be the same
thing. However, one is encouraged to keep confounders in the model and
encouraged to drop collinear variables. What am I missing? I have
mostly handled my analysis by dropping one where I see evidence.
Mcap

Paige Miller
Posted: Sat Dec 02, 2006 9:29 am
On 11/30/2006 2:59 PM, mcam54@hotmail.com wrote:
Quote: I am running LR. There are high correlations between some of my
continuous predictors. The tests for collinearity are positive. My
question may be a silly one: what is the difference between
collinearity and confounding? It seems like they would be the same
thing. However, one is encouraged to keep confounders in the model and
encouraged to drop collinear variables. What am I missing? I have
mostly handled my analysis by dropping one where I see evidence.
Mcap
Confounding, usually used in reference to a designed experiment, means
that two or more columns of the X matrix are linearly dependent.
"Exact collinearity" means that two variables of the X matrix are
linearly dependent. Usually, we talk of high collinearity to mean that
two variables have a high correlation with one another (it is judgment
what "high correlation" means).
With regard to dropping a variable when you see collinearity, how do
you know which one to drop? Suppose the one you choose to drop is the
actual cause of changes in Y? How would you know?
I am not aware of advice that says to keep confounders in the model
(in fact, in a designed experiment, you have to do the opposite -- you
cannot estimate all effects when things are confounded), and I am
opposed to dropping one or more collinear variables simply because they
have high collinearity, without knowing which of the collinear
variables is the cause.
Since in much of my work, we do NOT know a priori causal variables,
collinearity presents a problem, and the solution I most often adopt
is to use Partial Least Squares (PLS) regression instead of Ordinary
Least Squares regression. In PLS, you do not drop regressors in the
presence of collinearity, and according to a paper by Frank and
Friedman in Technometrics (1993), PLS performs better than OLS in the
presence of collinearity. "Performs better" means that PLS has a lower
MSE of predicted values and a lower MSE of regression coefficients
than does OLS (including stepwise methods).
--
Paige Miller
pmiller5@rochester.rr.com
It's nothing until I call it -- Bill Klem, NL Umpire
If you get the choice to sit it out or dance,
I hope you dance -- Lee Ann Womack

Reef Fish
Posted: Sat Dec 02, 2006 7:18 pm
Paige Miller wrote:
Quote: On 11/30/2006 2:59 PM, mcam54@hotmail.com wrote:
I am running LR. There are high correlations between some of my
continuous predictors. The tests for collinearity are positive. My
question may be a silly one: what is the difference between
collinearity and confounding? It seems like they would be the same
thing. However, one is encouraged to keep confounders in the model and
encouraged to drop collinear variables. What am I missing? I have
mostly handled my analysis by dropping one where I see evidence.
Mcap
Confounding, usually used in reference to a designed experiment, means
that two or more columns of the X matrix are linearly dependent.
That is the definition that requires the "independent variables" in
a regression to be "linearly independent", a concept in LINEAR
ALGEBRA, not statistics.
Quote: "Exact collinearity" means that two variables of the X matrix are
linearly dependent.
That is NOT correct. Linear independence is a MATHEMATICAL
concept. A set of vectors in a vector space is either linearly
independent or it is not. The definition is EXACT dependence.
In statistics, because the computer cannot tell exact linear
dependence from near linear dependence, the latter is called
the "multicollinearity" condition.
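A minimal sketch of that distinction, assuming NumPy and made-up data (the variable names and the 1e-6 noise scale are illustrative, not from the thread). An exactly dependent column makes the matrix rank-deficient, while a nearly dependent column leaves it full rank but badly conditioned. In floating point the boundary between the two is set by a tolerance, which may be what the remark about the computer alludes to.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Exact linear dependence: the third column is exactly x1 + x2.
X_exact = np.column_stack([x1, x2, x1 + x2])

# Near dependence: same construction plus a tiny amount of noise.
X_near = np.column_stack([x1, x2, x1 + x2 + 1e-6 * rng.normal(size=n)])

# matrix_rank counts singular values above a default tolerance.
rank_exact = np.linalg.matrix_rank(X_exact)
rank_near = np.linalg.matrix_rank(X_near)

# The condition number (largest / smallest singular value) flags
# near dependence even though the rank is formally full.
cond_near = np.linalg.cond(X_near)

print("rank, exact dependence:", rank_exact)
print("rank, near dependence: ", rank_near)
print("condition number, near dependence:", cond_near)
```

The rank answers the yes-or-no question of exact dependence; the condition number gives a graded measure of how close the columns are to dependence.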
But both linear dependence and multicollinearity are based on
ALL of the variables, though not necessarily requiring all to
be linearly dependent. That is why your statement should read
"Exact collinearity" means that two (OR MORE) variables of the
X matrix are linearly dependent.
Quote: Usually, we talk of high collinearity to mean that
two variables have a high correlation with one another (it is judgment
what "high correlation" means).
This is ALSO the misconception (Greg Heath spent months dwelling
on his error about that misconception) that linear dependence or
multicollinearity can be characterized by pairwise correlations.
There are MANY threads in sci.stat.math on the keywords
"linear independence" and "multicollinearity" that had extended
discussions about what they mean and how they are detected
and remedied.
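The point about pairwise correlations can be illustrated with a small simulation. This is a sketch under assumed, made-up data (nine independent predictors plus a tenth that is almost exactly their sum; NumPy is assumed): no single pairwise correlation looks alarming, yet the tenth column is almost perfectly predicted by the other nine taken together.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 2000, 9

# Nine independent predictors, plus a tenth that is nearly their sum.
Z = rng.normal(size=(n, k))
last = Z.sum(axis=1) + 0.01 * rng.normal(size=n)
X = np.column_stack([Z, last])

# Largest absolute pairwise correlation among the ten columns.
corr = np.corrcoef(X, rowvar=False)
off_diag = corr[~np.eye(k + 1, dtype=bool)]
max_abs_corr = np.abs(off_diag).max()

# R^2 from regressing the tenth column on the other nine (with intercept).
A = np.column_stack([np.ones(n), Z])
coef, *_ = np.linalg.lstsq(A, last, rcond=None)
resid = last - A @ coef
r2 = 1.0 - resid.var() / last.var()

print("largest |pairwise correlation|:", max_abs_corr)
print("R^2 of last column on the rest:", r2)
```

In this construction each pairwise correlation with the tenth column is about 1/3 in expectation, so screening on the correlation matrix alone would miss the near dependence.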
The remainder of your comments indicates that YOU (Paige Miller)
need to read those threads carefully.
In particular, one CANNOT draw causal inference from correlations
or regression without well-controlled experiments.
-- Reef Fish Bob,

Richard Ulrich
Posted: Sat Dec 02, 2006 10:45 pm
On 2 Dec 2006 15:18:52 -0800, "Reef Fish"
<large_nassua_grouper@yahoo.com> wrote:
Quote:
Paige Miller wrote:
[snip]
RF >
Quote: There are MANY threads in sci.stat.math on the keywords
"linear independence" and "multicollinearity" that had extended
discussions about what they mean and how they are detected
and remedied.
Keep in mind, when you read, that Reef Fish Bob
preserves the strict definition of 'multicollinearity' that is
not of much use any more, and distorts any questions that
entail the more general definition.
[snip, slur]
Quote:
In particular, one CANNOT draw causal inference from correlations
or regression without well-controlled experiments.
Well, you do need to do multiple studies, and know a lot
about your variables. That's something Bob rejects.
When I cited Fred Mosteller's role in major observational
studies, Reef Fish Bob pretended to concede more, but
he seems to be back to his old research nihilism. He does
not like that term, but it seems to represent him better than
his own hedging does.
--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Reef Fish
Posted: Sun Dec 03, 2006 4:32 pm
Richard Ulrich wrote:
Quote: On 2 Dec 2006 15:18:52 -0800, "Reef Fish"
<large_nassua_grouper@yahoo.com> wrote:
Paige Miller wrote:
[snip]
RF
There are MANY threads in sci.stat.math on the keywords
"linear independence" and "multicollinearity" that had extended
discussions about what they mean and how they are detected
and remedied.
Keep in mind, when you read, that Reef Fish Bob
preserves the strict definition of 'multicollinearity' that is
not of much use any more, and distorts any questions that
entail the more general definition.
Strict definition of multicollinearity? The trouble with Richard
Ulrich is that he always MISREPRESENTS without citing anything I said.
Multicollinearity in statistical usage is an ILL-DEFINED term
meaning "almost linearly dependent". In mathematics that is
NOT a well defined term, whereas "linear independence" and
"linear dependence" are.
Paige Miller, for whatever misunderstanding he might have had,
certainly would be a FOOL to listen to Richard Ulrich who is
still completely muddled about these definitions and concepts!
Quote:
[snip, slur]
In particular, one CANNOT draw causal inference from correlations
or regression without well-controlled experiments.
Well, you do need to do multiple studies, and know a lot
about your variables. That's something Bob rejects.
At least you QUOTED me, and that is a correct statement
ANY TIME. It did NOT say anything about multiple
experiments -- sometimes one well designed experiment
will do. It also did not say they always succeed in the
inference -- that's why there is no absolute certainty in
any result of statistical hypothesis testing, but there are
two types of errors.
Quote:
When I cited Fred Mosteller's role in major observational
studies, Reef Fish Bob pretended to concede more, but
he seems to be back to his old research nihilism.
That was the experiment you cited many times that I merely
ignore, for the reason cited in the PRECEDING paragraph.
I wasn't familiar with the experiment so I keep my mouth
SHUT (unlike Ulrich who opens and inserts foot in mouth
whether he knows what he is talking about or not).
Suppose I take your word that it was ONE of the "well-designed
experiments that failed" -- first of all I don't have
anything to judge if the failure was due to a fault in the
design or just pure chance (as in Type II error).
Quote: He does
not like that term, but it seems to represent him better than
his own hedging does.
That's the typical Richard Ulrich ad hominem phrase without
any statistical substance.
-- Reef Fish Bob.

Herman Rubin
Posted: Sun Dec 03, 2006 9:29 pm
In article <1164916781.034385.60350@16g2000cwy.googlegroups.com>,
<mcam54@hotmail.com> wrote:
Quote: I am running LR. There are high correlations between some of my
continuous predictors. The tests for collinearity are positive. My
question may be a silly one: what is the difference between
collinearity and confounding? It seems like they would be the same
thing. However, one is encouraged to keep confounders in the model and
encouraged to drop collinear variables. What am I missing? I have
mostly handled my analysis by dropping one where I see evidence.
Mcap
Collinearity in the independent variables is a real
condition, not due to sample size if, for example, the
sample size is twice the number of variables. It should
be dealt with in an appropriate manner, such as by ridge
regression, preferably with a reasonable prior.
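For readers unfamiliar with ridge regression, here is a minimal sketch, assuming NumPy and simulated data; the helper name `ridge_fit` and the penalty value 1.0 are illustrative choices, not anything specified in this post. The ridge estimate solves (X'X + lam*I) b = X'y on centered data, which is also the posterior mode under a zero-mean Gaussian prior on the coefficients, one common reading of "a reasonable prior".

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge coefficients on centered data: solve (X'X + lam*I) b = X'y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)   # nearly a copy of x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(size=n)      # true coefficients are (1, 1)

beta_ols = ridge_fit(X, y, lam=0.0)   # lam = 0 reduces to least squares
beta_ridge = ridge_fit(X, y, lam=1.0) # illustrative penalty, not tuned

print("OLS coefficients:  ", beta_ols)
print("ridge coefficients:", beta_ridge)
```

With two nearly identical predictors, least squares can split the shared effect between them almost arbitrarily; the penalty shrinks the poorly determined direction while leaving the well determined sum of the coefficients nearly unchanged. In practice the penalty would be chosen by cross-validation or from prior information.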
From the standpoint of prediction under unchanged conditions,
it may be similar to confounding, but not always in other
situations.
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@stat.purdue.edu Phone: (765)494-6054 FAX: (765)494-0558

Richard Ulrich
Posted: Mon Dec 04, 2006 12:05 am
[cross-posted from sci.stat.edu, since Bob has been slurring
me everywhere lately.]
On 3 Dec 2006 12:32:48 -0800, "Reef Fish"
<large_nassua_grouper@yahoo.com> wrote:
RU > >
Quote: Keep in mind, when you read, that Reef Fish Bob
preserves the strict definition of 'multicollinearity' that is
not of much use any more, and distorts any questions that
entail the more general definition.
RF
Strict definition of multicollinearity? The trouble with Richard
Ulrich is that he always MISREPRESENTS without citing anything I said.
Multicollinearity in statistical usage is an ILL-DEFINED term
meaning "almost linearly dependent". In mathematics that is
NOT a well defined term, whereas "linear independence" and
"linear dependence" are.
It sounds to me like RF does "preserve the strict definition",
exactly as I said. The part about 'distorting the questions
that arise' is judgmental, but not too harsh, I think.
In statistical usage, Multicollinearity has come to mean that
aspect indicated by (for instance) a high Variance Inflation
Factor (VIF). Many people have written about that. It is
useful for finding variables that are redundant, and, down the
road, important in working toward equations that are going
to replicate. It is not perfectly "well-defined" but it is usable.
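As a concrete illustration of the Variance Inflation Factor, here is a minimal sketch assuming NumPy and simulated data (the helper name `vif` and the four predictors are made up for the example). For predictor j, VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the others; equivalently, the VIFs are the diagonal of the inverse of the predictors' correlation matrix.

```python
import numpy as np

def vif(X):
    """Variance inflation factors: diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

rng = np.random.default_rng(3)
n = 500
a = rng.normal(size=n)
b = rng.normal(size=n)
c = a + b + 0.1 * rng.normal(size=n)   # nearly redundant given a and b
d = rng.normal(size=n)                 # unrelated to the others

vifs = vif(np.column_stack([a, b, c, d]))
for name, value in zip("abcd", vifs):
    print(f"VIF({name}) = {value:.1f}")
```

A VIF near 1 means a predictor is nearly unrelated to the rest; a commonly cited rule of thumb treats values above about 10 as a warning sign, though any cutoff is a judgment call.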
Bob is sort of alone in disparaging those things. I don't quote
him here, but I think I don't misrepresent him. If he *does*
want to endorse those things, I will be surprised, and pleased.
[...]
RF > > >
Quote: In particular, one CANNOT draw causal inference from correlations
or regression without well-controlled experiments.
RU
Well, you do need to do multiple studies, and know a lot
about your variables. That's something Bob rejects.
RF
At least you QUOTED me, and that is a correct statement
ANY TIME. It did NOT say anything about multiple
experiments -- sometimes one well designed experiment
will do. It also did not say they always succeed in the
inference -- that's why there is no absolute certainty in
any result of statistical hypothesis testing, but there are
two types of errors.
Aha! Here is one source of Bob's errors. He does not say
anything about multiple studies. Everyone talking about
drawing any sort of firm conclusions will talk about multiple
studies, or, as I do, about knowing a lot about the variables.
At other times, I have mentioned supporting biological
knowledge (studies?) and so on.
How can Bob miss this, through dozens of posts on the subject?
- He does not read well.
The conclusions from one single statistical analysis are limited.
Social science parodies frequently close with, "More study is needed."
But a single, serendipitous, uncontrolled study *can* answer a
single open question, if that is the state of present knowledge.
RU> >
Quote: When I cited Fred Mosteller's role in major observational
studies, Reef Fish Bob pretended to concede more, but
he seems to be back to his old research nihilism.
RF
That was the experiment you cited many times that I merely
ignore, for the reason cited in the PRECEDING Paragraph.
I wasn't familiar with the experiment so I keep my mouth
SHUT (unlike Ulrich who opens and inserts foot in mouth
whether he knows what he is talking about or not).
Multiple studies in Mosteller's career.
September 15, since RF's memory fails him -
=====start quote from post -
Here are some of the contributions of Reef Fish's buddy,
C. Frederick Mosteller, as cited in the NY Times, in its July 27
obituary. He seems to have been a scientist making extensive use
of observational data, with several notable citations below.
These studies used multiple variables, and probably OLS regression.
"In the late 1950's Dr. Mosteller assisted in analyzing data from a
large clinical study looking at the anesthetic halothane, which was
suspected of causing fatal liver damage in some patients. The
analysis showed no evidence that halothane was more dangerous than
other forms of anesthesia.
"He worked with Daniel Patrick Moynihan... on studies looking at the
impact of home life on children's performance in school. They
argued that raising families out of poverty would have a greater
educational impact than pouring money directly into schools.
[ ... snip, analysis of prose, Madison writing disputed Federalist
Papers.]
"In the 1970's, Dr. Mosteller worked on studies that questioned
whether the benefits of some surgical procedures were worth their
costs. ...
[snip, more detail.]
=====end quote
In the middle of a long reply to that post, also on Sept. 15,
Reef Fish commented -
http://groups.google.com/group/sci.stat.math/msg/8f91f8a002357c3a
RF >
Quote: "So? ALL statisticians make PLENTY of studies on OBSERVATIONAL
data -- the condition is NOT to draw unwarranted conclusions from them."
The problem here, and why I suspected that he was hedging, is
that Bob has never accepted anything concrete as an indicator
of "warranted conclusions." I describe those things as "knowing
a lot about the variables", including other results. Bob has
regularly rejected the details of such modeling as being
irrelevant to his pursuit of regression.
"Inessential" was one word he applied, I think. I argued that
"inessential" for one purpose (like, doing one regression) was
not "inessential" for doing sound inference.
RF >
Quote: Suppose I take your word that it was ONE of the "well-
designed experiment that failed" -- first of all I don't have
anything to judge if the failure was due to a fault in the
design or just pure chance (as in Type II error).
? I don't know what Bob is talking about. I never mentioned
any well-designed experiment. Another mis-reading by Bob?
RU > > [referring to nihilism]
Quote: He does
not like that term, but it seems to represent him better than
his own hedging does.
RF
That's the typical Richard Ulrich ad hominem phrase without
any statistical substance.
[snip, end]
But isn't that the SUBSTANCE, or at least the ROOT, of our
statistical disagreements? I support the use of various, careful
methods for observational studies; Bob rejects the studies and
the methods.
Bob is nihilistic on observational studies, or he is not. He
rejects the *best* of proofs, so why should he accept lesser ones?
Bob does not accept that the studies on cigarette smoking
can contribute to the conclusion that smoking causes lung
cancer. He has made no concession to the possibility that
observational studies can be well-designed, or that multiple
studies may be considered as a totality.
I call that nihilism. Bob may correct my view if I have
misrepresented him.
--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html

Reef Fish
Posted: Mon Dec 04, 2006 2:14 am
Richard Ulrich wrote:
Quote: [cross-posted from sci.stat.edu, since Bob has been slurring
me everywhere lately.]
Very interesting statement, coming from Richard Ulrich. I can
predict, with 100% certainty, that Richard will be repeating whatever
error I had pointed out, which he called a "slur"?
Quote:
[snip]
RU
Keep in mind, when you read, that Reef Fish Bob
preserves the strict definition of 'multicollinearity' that is
not of much use any more, and distorts any questions that
entail the more general definition.
That's an unsubstantiated statement and misrepresentation of
everything I've ever said about "multicollinearity" (which are
plenty in archives).
Quote: RF
Strict definition of multicollinearity? The trouble with Richard
Ulrich is that he always MISREPRESENTS without citing anything I said.
Multicollinearity in statistical usage is an ILL-DEFINED term
meaning "almost linearly dependent". In mathematics that is
NOT a well defined term, whereas "linear independence" and
"linear dependence" are.
It sounds to me like RF does "preserve the strict definition",
exactly as I said. The part about 'distorting the questions
that arise' is judgmental, but not too harsh, I think.
What is the "strict definition"? There is no other meaning in
statistics on the usage of multicollinearity other than "almost
linearly dependent" and that's NOT a definition. Multicollinearity
cannot be "defined", only described as "almost linearly
dependent".
Quote:
In statistical usage, Multicollinearity has come to mean that
aspect indicated by (for instance) a high Variance Inflation
Factor (VIF).
That is one of the POSSIBLE EFFECTS of multicollinearity, not
the meaning of multicollinearity. High VIF may be present
with or without the usual defects caused by multicollinearity.
Quote: Many people have written about that. It is
useful for finding variables that are redundant, and, down the
road, important in working toward equations that are going
to replicate. It is not perfectly "well-defined" but it is usable.
That is the CURE for the problem of redundancy that CAUSED
the multicollinearity condition.
See, Richard Ulrich, you're SO deficient in your understanding
of regression concepts that you don't know the difference between
"definition", "effect" and "cure" in a problem that commonly
occurs in a multiple regression!
Quote:
Bob is sort of alone in disparaging those things. I don't quote
him here, but I think I don't misrepresent him. If he *does*
want to endorse those things, I will be surprised, and pleased.
My correction of your error, and the paragraphs above explaining
what your errors are -- is that what you call "slur"?
The best thing in this case for Richard Ulrich is that he actually
MADE some statement of statistical content so that I could point
out specifically what his errors ARE. When he makes ad
hominem attacks, either misrepresenting my position or without
quoting me, there is nothing of SPECIFIC statistical substance I
can say in reply other than Richard Ulrich has been a proven
Quack in all regression matters (that's a statement of FACT that
had been widely documented in the archives).
Quote:
[...]
RF
In particular, one CANNOT draw causal inference from correlations
or regression without well-controlled experiments.
RU
Well, you do need to do multiple studies, and know a lot
about your variables. That's something Bob rejects.
RF
At least you QUOTED me, and that is a correct statement
ANY TIME. It did NOT say anything about multiple
experiments -- sometimes one well designed experiment
will do. It also did not say they always succeed in the
inference -- that's why there is no absolute certainty in
any result of statistical hypothesis testing, but there are
two types of errors.
Aha! Here is one source of Bob's errors. He does not say
anything about multiple studies.
You want me to say EVERYTHING in one sentence or one
paragraph? I not only said it's "not necessary" in that post
but also posted a separate post with the subject
"A case study of causal inference" that specifically explained,
even with an example, that multiple studies are NEITHER
NECESSARY NOR SUFFICIENT. That post was specifically
elaborating on Richard Ulrich's ERRORS in the post in
which he is posting:
You made this post at 11:05 pm (Dec 3).
My post appeared on Date: Sun, Dec 3 2006 4:37 pm
more than 6 hours before. You should have replied to
THAT post instead of wasting bandwidth making
unsubstantiated allegations (which are proven LIES)
that "he does not say anything about multiple studies".
I said it in the post in question. I elaborated it in the
"case study" post. What more do you NEED?
Quote: Everyone talking about
drawing any sort of firm conclusions will talk about multiple
studies, or, as I do, about knowing a lot about the variables.
At other times, I have mentioned supporting biological
knowledge (studies?) and so on.
How can Bob miss this, through dozens of posts on the subject?
- He does not read well.
Knowing a lot about the variables is the Quackery in causal inference
of drawing "causal diagrams" (to reflect what the sociological quacks
called "knowing about the variables"). I pointed out THAT fallacy in
my JASA 1982 review of Kenny's book, and I pointed it out in the
"case study" post with the same example illustrating that fallacy.
I missed NOTHING in Richard Ulrich's dozens of posts. You made
the SAME error in those dozens of posts, including this one.
Quote:
The conclusions from one single statistical analysis are limited.
Social science parodies frequently close with, "More study is needed."
That is true. That is also the standard disclaimer for most Ph.D
theses because they have only scratched the surface of the
problem.
That does NOT imply that it is ALWAYS NECESSARY to repeat
a conclusive study. In the social sciences, there are so many
Quacks doing "studies", most of which draw wrong conclusions
because of misapplication of statistics, that multiple studies are
used as an excuse, and multiple studies usually do NOT
correct anything in the Quackery.
The malaria example is a good illustration of that phenomenon.
A multiple study of "bad air" as the cause of malaria will always
draw the wrong conclusion and cure. A well designed and
controlled study (it takes only ONE) establishes what we know
today: that the CAUSE of malaria is NOT "mal" air (which is
a remote cause, but not the immediate and TRUE CAUSE
that is sought) but what is carried by the mosquitoes that
transmit the substance that causes the victim to have malaria.
That is on the subject of "remote cause" vs "proximate cause"
and the various other meanings of "cause". They are all
about LOGIC, clearly defined and illustrated in Copi's book
"Introduction to Logic", knowledge that many statisticians lack
and that ALL statistical Quacks in social sciences
lack when they use correlations to make causal inference.
I am SURE I've mentioned Copi's book and the matter of
"cause" to Richard in one of the many corrections of his
errors in using regression to make causal inference.
Richard Ulrich still has not understood the meaning of CAUSE
that is sought in statistical studies and the ONLY way to
design a controlled study to establish the cause.
Quote: But a single, serendipitous, uncontrolled study *can* answer a
single open question, if that is the state of present knowledge.
I said that in the post that you question in this one. I explained
it above, repeating what's in the "case study" post, that
multiple studies are NEITHER necessary (meaning a single
study may be sufficient) NOR sufficient (meaning multiple
studies by social scientists with faulty causal methodology
can, and often do, ALL lead to the wrong conclusion).
Quote:
RU
When I cited Fred Mosteller's role in major observational
studies, Reef Fish Bob pretended to concede more, but
he seems to be back to his old research nihilism.
See my detailed elaboration of my reply to that in the "case
study" posted 6 hours prior to Richard's present post.
Quote: Multiple studies in Mosteller's career.
September 15, since RF's memory fails him -
=====start quote from post -
Here are some of the contributions of Reef Fish's buddy,
C. Frederick Mosteller, as cited in the NY Times, in its July 27
obituary. He seems to have been a scientist making extensive use
of observational data, with several notable citations below.
This evidence is useless. I'll defend the dead person you are
attacking.
The data from a well-designed and well-controlled study is
ALSO "observational data".
It is the "observational data" in the ABSENCE of a well designed
study (those in the studies by social scientists on whatever
observations they can get, without a designed study) -- THOSE
are the ones that are invalid.
Quote: These studies used multiple variables, and probably OLS regression.
You don't even know what kind of regression was done? In short,
what Richard gave was a useless (and MIS-DIRECTED) attack on
Fred Mosteller.
I had the foresight, in my "case study" post, to have covered ALL
possible cases of Mosteller's studies mentioned by the New
York Times post (which was neither necessary nor sufficient)
for the discussion of CAUSAL inference.
Richard Ulrich is just name-dropping in committing yet ANOTHER
of his logical fallacies, known in this case as Argumentum ad
Verecundiam (to the authority of the New York Times), while
mis-representing the substance or implication of the cited
paragraph.
Quote: "In the late 1950's Dr. Mosteller assisted in analyzing data from a
large clinical study looking at the anesthetic halothane, which was
suspected of causing fatal liver damage in some patients. The
analysis showed no evidence that halothane was more dangerous than
other forms of anesthesia.
Just one of the FOUR scenarios I covered.
Quote:
"He worked with Daniel Patrick Moynihan... on studies looking at the
impact of home life on children's performance in school. They
argued that raising families out of poverty would have a greater
educational impact than pouring money directly into schools.
[ ... snip, analysis of prose, Madison writing disputed Federalist
Papers.]
What do the "Federalist Papers" have to do with anything about
causal inference?
Quote:
"In the 1970's, Dr. Mosteller worked on studies that questioned
whether the benefits of some surgical procedures were worth their
costs. ...
[snip, more detail.]
=====end quote
ALL of those case studies mentioned about Mosteller had already
been covered (without even knowing the specific studies and
whether they were successes or failures) in the FOUR scenarios in
From: "Reef Fish" <large_nassua_grou...@yahoo.com>
Newsgroups: sci.stat.edu,sci.stat.math,sci.stat.consult
Subject: A Case Study of Quackery in Causal Inference
Date: 3 Dec 2006 13:37:21 -0800
which Richard Ulrich is late in reading (at least 6 hours) and
late in learning (at least 50 years), and still hasn't learned.
which is consistent with what I said in the "case study" post above.
Quote: The problem here, and why I suspected that he was hedging, is
that Bob has never accepted anything concrete as an indicator
of "warranted conclusions."
Is it possible that Richard Ulrich never CITED any well designed
study that drew a correct, warranted conclusion?
That was not 'hedging'.
It was fully covered in my post BEFORE I saw THIS one.
Subject: A Case Study of Quackery in Causal Inference
Quote: RF
Suppose I take your word that it was ONE of the "well-
designed experiment that failed" -- first of all I don't have
anything to judge if the failure was due to a fault in the
design or just pure chance (as in Type II error).
? I don't know what Bob is talking about. I never mentioned
any well-designed experiment. Another mis-reading by Bob?
Covered in
Subject: A Case Study of Quackery in Causal Inference
and also in this reply.
Quote: RU > > [referring to nihilism]
He does
not like that term, but it seems to represent him better than
his own hedging does.
RF
That's the typical Richard Ulrich ad hominem phrase without
any statistical substance.
I stand on THAT statement of mine.
Quote:
[snip, end]
I call that nihilism. Bob may correct my view if I have
misrepresented him
I have corrected your view many times, including all your ERRORS
and misrepresentations NOW.
It's all covered in
Subject: A Case Study of Quackery in Causal Inference
My conclusion there stands too:
$> In particular, one CANNOT draw causal inference from
$> correlations or regression without well-controlled experiments
That ALWAYS stands, one that Ulrich challenged.
RF> Whatever Richard Ulrich tried to infer or imply is Quackery.
That is NOW proven with specific reply to Richard's statements.
What have I "slurred"?
-- Reef Fish Bob.

David Winsemius
Posted: Mon Dec 04, 2006 10:13 am
Richard Ulrich <Rich.Ulrich@comcast.net> wrote in news:9k27n2htstqqg1120m9qsg8if6anklmi1i@4ax.com:
Quote: quoting RF
That was the experiment you cited many times that I merely
ignore, for the reason cited in the PRECEDING Paragraph.
I wasn't familiar with the experiment so I keep my mouth
SHUT (unlike Ulrich who opens and inserts foot in mouth
whether he knows what he is talking about or not).
Multiple studies in Mosteller's career.
September 15, since RF's memory fails him -
=====start quote from post -
Here are some of the contributions of Reef Fish's buddy,
C. Frederick Mosteller, as cited in the NY Times, in its July 27
obituary. He seems to have been a scientist making extensive use
of observational data, with several notable citations below.
These studies used multiple variables, and probably OLS regression.
"In the late 1950's Dr. Mosteller assisted in analyzing data from a
large clinical study looking at the anesthetic halothane, which was
suspected of causing fatal liver damage in some patients. The
analysis showed no evidence that halothane was more dangerous than
other forms of anesthesia.
<http://links.jstor.org/sici?sici=0162-1459(197009)65%3A331%3C1392%3ATNHS%3E2.0.CO%3B2-J>
Have not been able to find a full copy of the National Halothane
Study but this first page from the JASA review suggests that the
committee concluded that the poorly designed analyses that had
preceded the study, coupled with the failure of their larger
"retrospective" study to show a differential association of Halothane
administration with fatal side-effects, left the case for doing a
controlled trial rather weak. The authors did propose an experimental
design which was not carried out.
If your institution has JSTOR access, you can get the rest of the
review.
--
David Winsemius |
|
|
|
| Richard Ulrich |
Posted: Tue Dec 05, 2006 3:39 am |
|
|
|
Guest
|
I think I responded sufficiently to most of this 343 line post
in my long reply in the "Case Study" thread. Bob is
unnecessarily verbose.
On 3 Dec 2006 22:14:31 -0800, "Reef Fish"
<large_nassua_grouper@yahoo.com> wrote:
Quote:
Richard Ulrich wrote:
[cross-posted from sci.stat.edu, since Bob has been slurring
me everywhere lately.]
RF
Very interesting statement, coming from Richard Ulrich. I can
predict, with 100% certainty that Richard will be repeating whatever
error I had pointed out, which he called a "slur"?
I was referring to those snide asides, Bob, that you were
slipping into so many threads. Pure noise. Pure ad-hominem.
Those never mentioned an error.
Bob's simpler errors, by type -
1) Bob does not know the meaning of a word, and
objects to its use. Appreciate. Schizophrenia.
2) Bob has an idiosyncratic meaning for a word, and
defends it, and attacks the mainstream usage. Multi-
collinear. Multivariate.
3) Bob does not know of some technique at all,
and objects to its use. Statistical power analysis.
Computing R^2 with and without adjustment for the
mean, for non-OLS regression.
4) Bob understands a technique but opposes the widespread
use, for reasons that he cannot articulate. Probably the
present subject of Inference, and data-checking that supports it.
[snip; including 'multicollinearity']
RF >
Quote: The best thing in this case for Richard Ulrich is that he actually
MADE some statement of statistical content so that I could point
out specifically what his errors ARE. When he makes ad
hominem attacks, either misrepresenting my position or without
quoting me, there is nothing of SPECIFIC statistical substance I
can say in reply other than Richard Ulrich has been a proven
Quack in all regression matters (that's a statement of FACT that
had been widely documented in the archives).
My 'discussions' with Bob on regression covered almost
exactly the same territory as Jerry Dallal's later discussions.
Similar content, differently expressed. In both cases, Bob
lost on all salient points. By my accounting, reading as
a reasoning person. Now Bob comes up with the hedge
of 'well-controlled studies' -- but he still won't concede
on smoking studies, so it is still a hedge.
[snip. Including more straw-man argument on inference.
And stuff answered in Case Study. ]
RF >
Quote: My post appeared on Date: Sun, Dec 3 2006 4:37 pm
more than 6 hours before. You should have replied to
THAT post instead of wasting bandwidth making
This sort of stuff is noise, by definition. My definition.
Who posted what, when? Who cares?
I've learned that I post better when I take time to think.
[snip, causal diagrams. More on 'bad studies', including
'malaria.' Quality of studies is a topic that Bob has
previously refused to discuss, and I doubt that he will, yet.]
[snip]
RF >
Quote: That is on the subject of "remote cause" vs "proximity cause"
and the various other meanings of "cause". They are all
about LOGIC, clearly defined and illustrated in Copi's book
on "Introduction to Logic" which many statisticians lack the
knowledge and ALL statistical Quacks in social sciences
lack when they use correlations to make causal inference.
I am SURE I've mentioned Copi's book and the matter of
"cause" to Richard in one of the many corrections of his
errors in using regression to make causal inference.
Yes, Cobi from 1955 or so. And we replied with the more
recent references, the ones that put epidemiology on firmer
ground in the 1960s. Which Bob had never heard of, despite
his professed interest in the topic of inference.
RU> >
Quote: But a single, serendipitous, uncontrolled study *can* answer a
single open question, if that is the state of present knowledge.
RF
I said that in the post that you question in this one. I explained
it above, repeating what's in the "case study" post, that
multiple studies is NEITHER necessary (meaning a single
study may be sufficient) NOR sufficient (meaning multiple
studies by social scientists with faulty causal methodology
can, and often do, ALL lead to the wrong conclusion).
My first thought was --
Bob either did not read my sentence, or Bob is displaying
more reasoning of the atrocious sort.
But I think that Bob is insisting on his use of the word
"uncontrolled" to mean "lousy, with bad variables and no
standards of any kind" -- instead of the standard use of it, which
is almost a synonym for "observational" studies. More trouble
with words.
Shortly Bob says,
[snip, down to ... ]
RF>
Quote: The data from a well-designed and well-controlled study is
ALSO "observational data".
It is the "observational data" in the ABSENCE of a well designed
study (those in the studies by social scientists on whatever
observation they can get, without a design study) -- THOSE
are the ones that are invalid.
I don't know what Bob means by "a design study." Most
published research in epidemiology, etc. -- Bob has previously
rejected most of epidemiology -- is financially supported by
someone who insists on planning and design. It might not
always be great, but it is always done. Even for tracking down
food poisoning at a picnic (one example).
[snip]
RU >
Quote: The problem here, and why I suspected that he was hedging, is
that Bob has never accepted anything concrete as an indicator
of "warranted conclusions."
RF
Is it possible that Richard Ulrich never CITED any well designed
study that drew a correct, warranted conclusion?
That was not 'hedging'.
Cigarette smoking causes lung cancer. Bob?
Bob has drawn sweeping conclusions, and Bob has painted
sweeping condemnations, *without* reading any studies.
Why should any studies help today?
Long ago, I presented details of other epidemiological
work, with never a smile of approval from Bob, and never
any mention from him that they were "well-controlled". Or not.
[snip, stuff covered]
--
Rich Ulrich, wpilib@Pitt.edu
http://www.pitt.edu/~wpilib/index.html |
|
|
|
| Reef Fish |
Posted: Tue Dec 05, 2006 8:39 am |
|
|
|
Guest
|
Richard Ulrich wrote:
Quote: I think I responded sufficiently to most of this 343 line post
in my long reply in the "Case Study" thread. Bob is
unnecessarily verbose.
What happened to your kill-file that you wanted to set up so
that we won't hear your repetitive ad hominem posts without
any change to your ERRORS, only more noise?
Quote: Richard Ulrich wrote:
[cross-posted from sci.stat.edu, since Bob has been slurring
me everywhere lately.]
RF
Very interesting statement, coming from Richard Ulrich. I can
predict, with 100% certainty that Richard will be repeating whatever
error I had pointed out, which he called a "slur"?
My prediction was 100% true, in my response. It would be a waste
of time to respond to Richard Ulrich's new NOISE, old errors.
Quote:
[snip; including 'multicollinearity']
My 'discussions' with Bob on regression covered almost
exactly the same territory as Jerry Dallal's later discussions.
Similar content, differently expressed. In both cases, Bob
lost on all salient points.
Quit aligning yourself with Jerry. You FLATTER yourself.
Jerry made ONE error, and that's NOT in regression. It was
his error on his webpage on using the UNPOOLED variance
in Z in testing Ho: p1 = p2.
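For readers following the pooled-versus-unpooled point, here is a minimal Python sketch of the two z statistics for Ho: p1 = p2. The function name two_proportion_z and the counts (30/100 vs 20/100) are made up for illustration; they do not come from Jerry's webpage or from this thread. Under Ho the two groups share a single p, which is the usual textbook argument for the pooled standard error in the test; the unpooled form is the one normally used for a confidence interval on p1 - p2.

```python
import math


def two_proportion_z(x1, n1, x2, n2, pooled=True):
    """z statistic for Ho: p1 = p2, given success counts x1, x2 out of n1, n2."""
    p1, p2 = x1 / n1, x2 / n2
    if pooled:
        # Under Ho both groups share one p, so estimate it from the combined data.
        p = (x1 + x2) / (n1 + n2)
        se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    else:
        # Unpooled standard error: each group keeps its own estimated variance.
        se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se


# Hypothetical counts, chosen only to show that the two versions differ.
z_pooled = two_proportion_z(30, 100, 20, 100, pooled=True)
z_unpooled = two_proportion_z(30, 100, 20, 100, pooled=False)
print(round(z_pooled, 3), round(z_unpooled, 3))
```

With equal group sizes, as here, the two statistics are close; the gap can be larger when the group sizes are very unequal.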
The regression errors are all YOURS, Richard Ulrich!
You are shameless in your misrepresentation of EVERYONE, and
in your completely unproductive NOISE, instead of spending your
time learning some statistical facts and methodology.
Quote: Who posted what, when? Who cares?
I've learned that I post better when I take time to think.
The first line speaks the truth for your reason for your IGNORANCE.
The second line reflects only the Goodyear-blimp size of your
head when you look into your own mirror.
Quote:
[snip, causal diagrams.
[snip]
RF
I am SURE I've mentioned Copi's book and the matter of
"cause" to Richard in one of the many corrections of his
errors in using regression to make causal inference.
Yes, Cobi from 1955 or so.
And you can't even remember the simple name of the author.
You've NEVER read or understood anything that I pointed to
in that book, which had numerous editions since I took a
course in Logic from it in 1959.
Quote: The data from a well-designed and well-controlled study is
ALSO "observational data".
It is the "observational data" in the ABSENCE of a well designed
study (those in the studies by social scientists on whatever
observation they can get, without a design study) -- THOSE
are the ones that are invalid.
I don't know what Bob means by "a design study."
Of course you don't and are just playing DUMB on my use of
designed studies to establish Causal Inference to require CONTROL
rather than purely observational data without control.
You are just wasting more bandwidth on your rhetoric on words.
Your Quackery and malpractice is getting VERY VERY old, tedious,
and boring.
-- Reef Fish Bob. |
|
|
|
| Richard Ulrich |
Posted: Fri Dec 08, 2006 3:31 am |
|
|
|
Guest
|
On 5 Dec 2006 04:39:50 -0800, "Reef Fish"
<large_nassua_grouper@yahoo.com> wrote:
Quote:
Richard Ulrich wrote:
[snip]
RU > >
Quote: My 'discussions' with Bob on regression covered almost
exactly the same territory as Jerry Dallal's later discussions.
Similar content, differently expressed. In both cases, Bob
lost on all salient points.
RF
Quit aligning yourself with Jerry. You FLATTER yourself.
Jerry made ONE error, and that's NOT in regression. It was
his error on his webpage on using the UNPOOLED variance
in Z in testing Ho: p1 = p2.
Forgotten already? I had in mind the long discussion
where Jerry always did allow that the Banker, in the example
on his website, *might* have good reason to expect the
direction of a regression coefficient. June, 2005, for part of it --
http://groups.google.com/group/sci.stat.math/msg/58fcfbb9f1788c19
Jerry was always mild, but never gave up that point.
Or others.
I will "align myself with Jerry" because it was several of his
threads with Bob that I used for comparative study. They
demonstrated to me that Bob's failures of reasoning were
frequent and obvious, and were *not* an artifact of my
perceptions of my own threads with Bob. Moreover, his
discussion of regression covered the same ground.
[snip]
Quote:
RF
The data from a well-designed and well-controlled study is
ALSO "observational data".
It is the "observational data" in the ABSENCE of a well designed
study (those in the studies by social scientists on whatever
observation they can get, without a design study) -- THOSE
are the ones that are invalid.
RU
I don't know what Bob means by "a design study."
RF
Of course you don't and are just playing DUMB on my use of
designed studies to establish Causal Inference to require CONTROL
rather than purely observational data without control.
No. I am questioning the use of *words* because Bob
sounds so far off base. He is using words that have a
conventional meaning in research, but he seems to be
using them otherwise. Studies have a design. Just about
all of them, so that says very little. The phrase "without a
design study" could be meaningful, but it is odd. Did he
mean "designED study"? ... as he says in his next sentence,
so I was confused by a typo?
However, practically, they all are "designed", good
ones and bad ones, prospective or retrospective.
- I think Bob is assuming some definitions that
I would be careful with, but I'm trying to follow, here.
Now, he adds 'control' to explain what seemed
confusing in the phrase "design study." Okay.
A "designed study" must have some sort of "control",
and that might distinguish it from an "exploratory study".
Is that it?
Does Bob intend to say that what he is 'nihilistic' about is
any study that does not have an explicit control *group*
with new data? (That did seem to be the tenor of his
comments months ago, about cigarette smoking and
lung cancer, where the lack of *randomized* control
seemed fatal. Do I mis-recall?)
The word 'control' seems to be a problem.
In reply to a need for a group -
I'll return to my long-ago example of epidemiologists
trying to determine the cause of food poisoning at a
picnic. No control, by our definition. But, assuredly,
there is design, and there are multiple variables, and
there are inferences. The inferences work. Surely,
this is "purely observational data without control", unless
you accept the internal statistical controls. Isn't it?
--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html |
|
|
|
| Reef Fish |
Posted: Fri Dec 08, 2006 12:39 pm |
|
|
|
Guest
|
Richard Ulrich wrote:
Quote: On 5 Dec 2006 04:39:50 -0800, "Reef Fish"
<large_nassua_grouper@yahoo.com> wrote:
Richard Ulrich wrote:
[snip]
RU
My 'discussions' with Bob on regression covered almost
exactly the same territory as Jerry Dallal's later discussions.
Similar content, differently expressed. In both cases, Bob
lost on all salient points.
RF
Quit aligning yourself with Jerry. You FLATTER yourself.
Jerry made ONE error, and that's NOT in regression. It was
his error on his webpage on using the UNPOOLED variance
in Z in testing Ho: p1 = p2.
Forgotten already? I had in mind the long discussion
where Jerry always did allow that the Banker, in the example
on his website, *might* have good reason to expect the
direction of a regression coefficient. June, 2005, for part of it --
http://groups.google.com/group/sci.stat.math/msg/58fcfbb9f1788c19
You forgot that I have photographic memory on the substance of every
STATISTICAL discussion in these groups?
We were discussing the MEANING of the regression coefficient as the
PARTIAL correlation information, in the presence of all OTHER
variables in the equation.
Jerry pulled out that example, WITHOUT giving any reason HOW Roy
Welsh or the banker reasoned the "expected sign" other than the usual
misinterpretation of the correlation coefficient.
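The point argued here, that a coefficient in a multiple regression is a partial quantity whose sign can differ from the simple-regression slope, can be illustrated with a small Python sketch. The data are invented (noise-free, with x1 and x2 highly correlated) and the helper names center and dot are hypothetical; nothing here reconstructs the banker example.

```python
def center(v):
    """Subtract the mean so the intercept drops out of the normal equations."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Invented data: x2 tracks x1 closely, and y = -1*x1 + 3*x2 exactly.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [a + d for a, d in zip(x1, [0.5, -0.5, 0.5, -0.5, 0.5, -0.5])]
y = [-1 * a + 3 * b for a, b in zip(x1, x2)]

c1, c2, cy = center(x1), center(x2), center(y)

# Simple regression of y on x1 alone.
simple_slope = dot(c1, cy) / dot(c1, c1)

# Two-predictor least squares: solve the 2x2 normal equations by Cramer's rule.
s11, s22, s12 = dot(c1, c1), dot(c2, c2), dot(c1, c2)
s1y, s2y = dot(c1, cy), dot(c2, cy)
det = s11 * s22 - s12 ** 2
b1 = (s22 * s1y - s12 * s2y) / det  # coefficient of x1 with x2 in the model
b2 = (s11 * s2y - s12 * s1y) / det  # coefficient of x2 with x1 in the model

print(simple_slope, b1, b2)
```

Here the simple slope of y on x1 is positive while the coefficient of x1 in the two-predictor fit is negative, so an "expected sign" taken from the simple correlation would be wrong for the multiple regression.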
Quote:
Jerry was always mild, but never gave up that point.
Or others.
I was being easy on Jerry for his INCOMPLETE information when he
told the tale -- if you insist to drag Jerry down the gutter with YOU,
chalk that up as Jerry's SECOND error. His THIRD was on p-values.
That's about all the errors I can think of that Jerry made.
In the banker case, he had NO REASON other than thinking it was
a SIMPLE regression coefficient and interpreted it that way. If the
sign was "correct" as it was found later, it would be the RIGHT result
for the WRONG reason -- the same kind of result as when those
drawing causal inference from uncontrolled and undesigned studies
once in a while come to the correct result by LUCK, not because
of correct methodology or statistical reason.
Quote:
I will "align myself with Jerry" because it was several of his
threads with Bob that I used for comparative study. They
demonstrated to me that Bob's failures of reasoning were
frequent and obvious, and were *not* an artifact of my
perceptions of my own threads with Bob. Moreover, his
discussion of regression covered the same ground.
[snip]
I have NOW given my definitive answer to the banker example
Jerry brought out. You dragged him down the gutter with you.
I can't say in this case how much fault belonged to Roy Welsh,
because it was second hand from Jerry.
But I know THIS FOR SURE: Belsley and Kuh wrote a book
in which they unmistakably made the MISTAKE of misinterpreting
the correlation coefficient. No question about it. So much so
that I know FIRST HAND (I cannot reveal the identity of the
reviewer of their manuscript to Wiley which recommended
Wiley not to publish it because of those ERRORS) that they are
NOT statisticians and they make plenty of statistical errors.
Roy co-authored the book Regression Diagnostics with Belsley
and Kuh, so it is POSSIBLE that Roy made the same error they
did, but I don't want to draw any such conclusion without my
first hand knowledge of that banker-Welsh case.
Quote: RU
I don't know what Bob means by "a design study."
RF
Of course you don't and are just playing DUMB on my use of
designed studies to establish Causal Inference to require CONTROL
rather than purely observational data without control.
No. I am questioning the use of *words* because Bob
sounds so far off base.
You are repeating your rhetorical NOISE.
Quote: Now, he adds 'control' to explain what seemed
confusing in the phrase "design study."
For drawing CAUSAL inference, "control" is ALWAYS part
of the design. For other designs, it may or may not.
Quote:
The word 'control' seems to be a problem.
Only for Quacks like Richard Ulrich.
What happened to the kill-file you threatened to put me in
and I URGED you and STRONGLY URGED you to do so?
You're just polluting this group too much with your continued
noise of your ERRORS, now dragging Jerry Dallal down with
you without helping YOUR CASE the slightest.
-- Reef Fish Bob. |
|
|
|
| Richard Ulrich |
Posted: Sat Dec 09, 2006 3:07 am |
|
|
|
Guest
|
- I will add little here -
On 8 Dec 2006 08:39:59 -0800, "Reef Fish"
<large_nassua_grouper@yahoo.com> wrote:
RU > >
Quote: Forgotten already? I had in mind the long discussion
where Jerry always did allow that the Banker, in the example
on his website, *might* have good reason to expect the
direction of a regression coefficient. June, 2005, for part of it --
http://groups.google.com/group/sci.stat.math/msg/58fcfbb9f1788c19
RF
You forgot that I have photographic memory on the substance of every
STATISTICAL discussion in these groups?
These "photos" in the memory were apparently badly maintained
or never very accurate.
Anyone interested can check and see that Jerry went on
to defend the notion of inference, etc., in the same post.
And Bob didn't remember it, and still has trouble seeing it.
[snip, a bunch. Insistence on 'control' without defining it.]
RU >
Quote: Now, he adds 'control' to explain what seemed
confusing in the phrase "design study."
RF
For drawing CAUSAL inference, "control" is ALWAYS part
of the design. For other designs, it may or may not.
RU
The word 'control' seems to be a problem.
RF
Only for Quacks like Richard Ulrich.
I suppose "control" might deserve to be added to the
list of words that Bob can't use like researchers use it.
Historical control. General population statistics as
control. Statistical control.
I think I'll agree that every study doing inference has
"control". Bob's nihilism in inference from observational
studies remains unexplained.
[snip]
RF >
Quote: What happened to the kill-file you threatened to put me in
and I URGED you and STRONGLY URGED you to do so?
You're just polluting this group too much with your continued
noise of your ERRORS, now dragging Jerry Dallal down with
you without helping YOUR CASE the slightest.
I suppose someone who is a responsible citizen with sound
judgment has to stay around to ameliorate Bob's verbal assaults
on folks, and to point out Bob's errors, whether they are strictly
statistical, or are a more general misuse of language (while
abusing other posters).
As I listed a few days ago, reformatted -
Bob's simpler errors, by type -
1) Bob does not know the meaning of a word, and
objects to its use.
Appreciate.
Schizophrenia.
2) Bob has an idiosyncratic meaning for a word, and
defends it, and attacks the mainstream usage.
Multi-collinear.
Multivariate.
3) Bob does not know of some technique at all,
and objects to its use.
Statistical power analysis.
Computing R^2 with and without adjustment for the
mean, for non-OLS regression.
4) Bob understands a technique but opposes the widespread
use, for reasons that he cannot articulate.
Probably the present subject of Inference, and data-checking that
supports it.
--
Rich Ulrich, wpilib@pitt.edu
http://www.pitt.edu/~wpilib/index.html |
|
|
|
| Reef Fish |
Posted: Sat Dec 09, 2006 9:00 am |
|
|
|
Guest
|
Richard Ulrich wrote:
Quote: - I will add little here -
No need.
We've heard the same noise already, in THIS thread, and in
many other threads in which you made the errors.
You didn't even belong to the thread in the first place when you
rudely butted in!
I was LAUDING Jack Tomsky, to him and to the readers, for his
manner of acknowledging his error when pointed out:
Quote: Ray, you're right. Thanks for the correction.
Jack
RF> Kudos for Jack Tomsky who is almost always right in the two
RF> years I've been in this group! Of the the RARE time of two when
RF> he was corrected (once or twice by me), the response is ALWAYS
RF> as direct and succinct, on obvious minor careless slips.
I also lamented the behavior of a few in the group who waste
THOUSANDS of lines making noise trying to defend their obvious
errors and concluded with:
RF> This is my Mini Editorial on "Simple admission and acknowledgment
RF> of errors vs lengthy and repetitive NOISE bullying those who
RF> pointed out and corrected their errors."
That was when Richard Ulrich bounced in, rehashing his ad hominem
attack.
When pointed out he didn't belong to that thread, he started THIS
thread and made FOUR posts continuing his NOISE and ad hominem
attacks.
Now he is back to the original thread where he didn't belong again
(I haven't read it yet).
Richard, your Quackery and malpractice are FULLY documented
and discussed dozens of times. It's ALL in the archives. No amount
of ad hominem rehash from you is going to change ANYTHING.
-- Reef Fish Bob. |
|
|
|
| |
All times are GMT - 5 Hours
|
|