Statistical Conventions in Social Science papers?

cody
Posted: Thu Dec 28, 2006 12:32 pm

My girlfriend showed me a social science paper that she was required to
read for class. Without going into too many details that would turn
this into a personal attack, it was written by a European professor
whose English was utterly illiterate. The paper gave a table that
summarized the scores from surveys, with respondents split into four
categories.
- The table talked about the "F test with 4 df", apparently referring
to the test for overall differences. Is it allowed to just drop the
denominator df like that, or is the writer just illiterate in
statistics as well?
- The table just gave F-statistics without saying their p-values. I'm
guessing again that that's just poor statistics?
- The paper also said stuff like "2 < 1,3,4" in comparing the group
means. I don't recall seeing that sort of notation in my categorical
data analysis course; it would seem to bring up a multiple testing
issue as well. Is this an accepted convention, or more illiteracy?

Guest
Posted: Thu Dec 28, 2006 3:15 pm

Perhaps these conventions are understood in social science circles.
"cody" <dorpus@hotmail.com> wrote in message
news:1167323570.981381.135460@n51g2000cwc.googlegroups.com...
Quote: My girlfriend showed me a social science paper that she was required to
read for class. Without going into too many details that would turn
this into a personal attack, it was written by a European professor
whose English was utterly illiterate. The paper gave a table that
summarized the scores from surveys, with respondents split into four
categories.
- The table talked about the "F test with 4 df", apparently referring
to the test for overall differences. Is it allowed to just drop the
denominator df like that, or is the writer just illiterate in
statistics as well?
- The table just gave F-statistics without saying their p-values. I'm
guessing again that that's just poor statistics?
- The paper also said stuff like "2 < 1,3,4" in comparing the group
means. I don't recall seeing that sort of notation in my categorical
data analysis course; it would seem to bring up a multiple testing
issue as well. Is this an accepted convention, or more illiteracy?

Reef Fish
Posted: Thu Dec 28, 2006 4:21 pm

cody wrote:
Quote: My girlfriend showed me a social science paper that she was
required to read for class. it was written by a European professor
whose English was utterly illiterate.
Is that professor Portuguese? :-)
Quote: - The table talked about the "F test with 4 df",
Yes, that is a major blunder since an F-distribution is given
by two degrees-of-freedom numbers.
Quote: - The table just gave F-statistics without saying their p-values.
I'm guessing again that that's just poor statistics?
That's more excusable. There are many in THIS group, from
Portuguese to American posters who do not know the meaning
of p-values. They lean on computer manuals that make errors
and assume that a "canned program sold for money" must be
correct, or something to that fallacious effect.
Quote: - The paper also said stuff like "2 < 1,3,4" in comparing the group
means. I don't recall seeing that sort of notation in my categorical
data analysis course; it would seem to bring up a multiple testing
issue as well. Is this an accepted convention, or more illiteracy?
That's actually even more excusable on commonsense grounds.
The quoted statement is senseless as written, and I have never
seen such usage or "notation", but the meaning can be guessed
correctly, especially if other descriptions accompany
that isolated statement.
But, given the overall impression you got and the errors therein,
there is not much hope that the "paper" can teach anyone
anything other than malpractice Quackery. :-)
-- Reef Fish Bob.
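
To illustrate why the denominator df cannot simply be dropped, the sketch below evaluates the upper-tail probability of one F statistic under several denominator df. It rests on assumptions that are not taken from the paper under discussion: the numerator df is fixed at 2 only because that case has a simple closed form, and the F value of 3.5 and the denominator df values are made-up numbers. The function name is hypothetical.

```python
def f_upper_tail_df1_2(f_stat: float, df2: int) -> float:
    """Pr(F >= f_stat) for an F(2, df2) distribution.

    Uses the closed form (1 + 2*f/df2) ** (-df2/2), which is valid
    only when the numerator df equals 2.
    """
    return (1.0 + 2.0 * f_stat / df2) ** (-df2 / 2.0)


f_stat = 3.5  # hypothetical observed F statistic
for df2 in (5, 20, 200):  # hypothetical denominator df
    p = f_upper_tail_df1_2(f_stat, df2)
    print(f"F(2, {df2}): Pr(F >= {f_stat}) = {p:.4f}")
```

With these made-up inputs the tail probability runs from roughly 0.11 at 5 denominator df to roughly 0.03 at 200, so the same F value can fall on either side of 0.05 depending on the df that was left out.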

David Winsemius
Posted: Thu Dec 28, 2006 9:04 pm

"Reef Fish" <large_nassua_grouper@yahoo.com> wrote in
news:1167337286.578317.202960@42g2000cwt.googlegroups.com:
Quote: There are many in THIS group, from
Portuguese to American posters who do not know the meaning
of p-values. They lean on computer manuals that make errors
and assume that a "canned program sold for money" must be
correct, or something to that fallacious effect.
There are many in this group who disagree with Reef Fish's posting from
September. Every textbook consulted disagreed. Reef Fish even disagreed
with himself when he posted recently.
<news:1166657003.755651.226940@48g2000cwx.googlegroups.com>
One person cited SPSS, I cited a function in R (not sold for money). I
also cited textbooks:
copying from
<news:Xns9832DF4EC07dwtttttt@216.196.97.136>
--------definitions from math stats texts------------
Cox and Hinkley say a "level of significance" p_obs is defined as:
p_obs = Pr(T >= t_obs; H0)
Kalbfleisch uses significance level as
SL == Pr(D >= D_obs | H0)
DeGroot defines the p-value as sup(Pr(T >= t | theta))
As I read the three authorities above, they agree with RFB's disputants
who would use the language "equal to or more extreme than".
Freund says that the [left] critical region of size alpha/2 is
X <= K_alpha, where K_alpha is the largest integer for which
sum(Pr(Bin(y;n,theta))) <= alpha/2
As I read Freund, he also disagrees with RFB, because the first integer
for which sum Pr(...) is greater than 0.025 in the series above is X = 3,
so X=2 is in the critical region (where p-value <0.025).
-----------------------------------------------------------
Reef Fish is just trolling again from his reef in the Sea of Zero
Probability. What's the p-value for a pair of dice that comes up snake
eyes, RF_Bob? P-value equal to zero? What's the sum of the p-values over all
possible outcomes of throws times their expected values in that
formulation? A probability less than one? How interesting!
I invite anyone to submit a problem involving a discrete outcome and see
what Bob says. If your problem is well formed, Reef Fish will
"disqualify" you.
--
David Winsemius
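
The snake-eyes question can be worked out by enumeration. The sketch below is only an illustration under assumptions the post does not state: the test statistic is taken to be the sum of two fair dice, and small sums are treated as the extreme direction. It compares the "as or more extreme" tail with the strictly "more extreme" tail.

```python
from fractions import Fraction
from itertools import product

# Null distribution: all 36 equally likely sums of two fair six-sided dice.
outcomes = [a + b for a, b in product(range(1, 7), repeat=2)]


def lower_tail(observed: int, strict: bool) -> Fraction:
    """Pr(S <= observed), or Pr(S < observed) when strict is True."""
    hits = sum(1 for s in outcomes if (s < observed if strict else s <= observed))
    return Fraction(hits, len(outcomes))


snake_eyes = 2  # both dice show one
p_inclusive = lower_tail(snake_eyes, strict=False)  # "as or more extreme"
p_strict = lower_tail(snake_eyes, strict=True)      # strictly "more extreme"
print(p_inclusive, p_strict)
```

Under these assumptions the inclusive tail gives 1/36, while the strict tail gives 0 for the most extreme outcome, which is the oddity the post is pointing at.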

Reef Fish
Posted: Thu Dec 28, 2006 10:59 pm

David Winsemius wrote:
Quote: "Reef Fish" <large_nassua_grouper@yahoo.com> wrote in
news:1167337286.578317.202960@42g2000cwt.googlegroups.com:
There are many in THIS group, from
Portuguese to American posters who do not know the meaning
of p-values. They lean on computer manuals that make errors
and assume that a "canned program sold for money" must be
correct, or something to that fallacious effect.
There are many in this group who disagree with Reef Fish's posting from
September. Every textbook consulted disagreed.
Misrepresentation and BOLD FACE LIE. Deja vu. That's why David
Winsemius has earned his Disqualification Status in view of
o Your FREE lesson days are over.
o Your HISTORY of past behavior of frequent frivolous nitpicks
disqualified you from a reply from Reef Fish Bob
Here's a HINT for the other readers why what's copied is NOT
the definition NOR the standard application of a "p-value of a test"!
Quote: One person cited SPSS, I cited a function in R (not sold for money). I
also cited textbooks:
copying from
news:Xns9832DF4EC07dwtttttt@216.196.97.136
--------definitions from math stats texts------------
Cox and Hinkley say a "level of significance" p_obs is defined as:
p_obs = Pr(T >= t_obs; H0)
Where is the Alternative Hypothesis? Not the p-VALUE of a test!!!
Quote:
Kalbfleisch uses significance level as
SL == Pr(D >= D_obs | H0)
DeGroot defines the p-value as sup(Pr(T >= t | theta))
Not the p-VALUE either. Where is the Alternative Hypothesis?
-- Reef Fish Bob.

Reef Fish
Posted: Sat Dec 30, 2006 7:24 am

David A. Heiser wrote:
Quote: "Reef Fish" <large_nassua_grouper@yahoo.com> wrote in message
news:1167361171.466224.211680@k21g2000cwa.googlegroups.com...
David Winsemius wrote:
"Reef Fish" <large_nassua_grouper@yahoo.com> wrote in
news:1167337286.578317.202960@42g2000cwt.googlegroups.com:
There are many in THIS group, from
Portuguese to American posters who do not know the meaning
of p-values. They lean on computer manuals that make errors
and assume that a "canned program sold for money" must be
correct, or something to that fallacious effect.
There are many in this group who disagree with Reef Fish's posting from
September. Every textbook consulted disagreed.
+++++++++++++++++++++++++++++++++++++++++++
First of all I will get chewed out for entering this discussion.
I agree with Bob on his interpretation.
Every textbook you consulted, as Bob said, made an error of omission.
The error of omission was in NOT stating the Alternative Hypothesis,
which is critical in the DEFINITION of the p-value.
But the examples carefully selected by David Winsemius were much
worse than that! Below are the reasons for my statement, spelled out:
RF> > Not the p-VALUE either. Where is the Alternative Hypothesis?
DW> Cox and Hinkley say a "level of significance" p_obs is defined as:
DW> p_obs = Pr(T >= t_obs; H0)
"p_obs" is clearly not the p-value; but they called their p_obs a
"level of significance" which is patently absurd!
The significance level of a test = Pr( rejecting H0 | H0 is true).
There is NO OTHER definition.
DW is quoting out-of-context so badly that he had Cox and Hinkley
saying that ALL Alternative Hypotheses are one-tailed, the "greater
than" tail.
Those are just some of the reasons that earned DW and JD their
Disqualification Status.
What you said is correct, but in a much less EXPLICIT way of pointing
out the errors.
-- Reef Fish Bob.
Quote:
The central issue is properly defining the complete domain of test results.
Just making a statement only about the hypothesis defines only a part of
this domain. The reader then has to assume what is the rest of the domain
envisioned by the proposer.
Most textbooks assume that the reader knows what the domain is, so just
giving H0 they assume that the reader knows what Ha is. Bob has however
pointed out in other messages, that this is not necessarily true. To argue
why textbooks do this is beyond this sequence.
DAH
++++++++++++++++++++++++++++++++++++++++
Not the p-VALUE either. Where is the Alternative Hypothesis?
-- Reef Fish Bob.

Kevin E. Thorpe
Posted: Sat Dec 30, 2006 8:54 am

On Dec 30, 6:24 am, "Reef Fish" <large_nassua_grou...@yahoo.com> wrote:
Quote: The error of omission was in NOT stating the Alternative Hypothesis,
which is critical in the DEFINITION of the p-value.
As I explained earlier this year, the difference is in whether
or not you approach this in a Fisherian way or a pure NP way.
In the case of continuous data, the strict inequality does not
affect the computation of a p-value, but for discrete data it
does, which is why NP suggests randomization on the
boundary.
Quote: But the examples carefully selected by David Winsemius were much
worse than that! Below are the reasons for my statement, spelled out:
David's examples (which I recognize) are both from
descriptions of Fisher-type significance testing, not NP
hypothesis testing. As I pointed out, Fisher did not
require a formal alternative hypothesis.
I know you disagree, that is fine. In reality, what is now
taught is often a hybrid of NP and Fisher theory. You
yourself have rejected it in favour of Bayesian approaches
anyway, suggesting there are multiple approaches to the
problem.
Kevin
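
Randomization on the boundary can be sketched with a small binomial test. Everything specific here is a made-up illustration rather than something from the thread: 10 trials, a null success probability of 1/2, an upper-tailed alternative, and a test size of 0.05. The test always rejects at or above a cutoff and rejects with probability gamma on the boundary value just below it, so that the size is exactly alpha.

```python
from fractions import Fraction
from math import comb

n, alpha = 10, Fraction(1, 20)  # hypothetical sample size and test size


def pmf(k: int) -> Fraction:
    """Pr(X = k) for X ~ Binomial(n, 1/2), the null distribution."""
    return Fraction(comb(n, k), 2 ** n)


def upper_tail(k: int) -> Fraction:
    """Pr(X >= k) under the null."""
    return sum((pmf(j) for j in range(k, n + 1)), Fraction(0))


# Smallest cutoff c with Pr(X >= c) <= alpha: always reject when X >= c.
c = next(k for k in range(n + 1) if upper_tail(k) <= alpha)

# On the boundary X = c - 1, reject with probability gamma so that the
# overall rejection probability under the null is exactly alpha.
gamma = (alpha - upper_tail(c)) / pmf(c - 1)
size = upper_tail(c) + gamma * pmf(c - 1)
print(c, gamma, size)
```

Without the randomization step, a discrete test statistic can only attain the sizes that its tail probabilities happen to take, which is why the choice between a strict and a non-strict inequality changes the answer for discrete data but not for continuous data.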

Reef Fish
Posted: Sat Dec 30, 2006 10:41 am

Kevin E. Thorpe wrote:
Quote: On Dec 30, 6:24 am, "Reef Fish" <large_nassua_grou...@yahoo.com> wrote:
The error of omission was in NOT stating the Alternative Hypothesis,
which is critical in the DEFINITION of the p-value.
As I explained earlier this year, the difference is in whether
or not you approach this in a Fisherian way or a pure NP way.
I consider my statistical education about as "old fashioned" as one
can be without being labeled as a dinosaur. Fisher's way was
already out-of-style long before I started statistics. It is definitely
out of style NOW.
Quote:
In the case of continuous data, the strict inequality does not
affect the computation of a p-value, but for discrete data it
does, which is why NP suggests randomization on the
boundary.
That is the nonessential part of the error. To take ALWAYS the
one-sided alternative and call it a "significance level" is an ERROR,
a major BLUNDER, no matter how you argue it.
Quote:
But the examples carefully selected by David Winsemius were much
worse than that! Below are the reasons for my statement, spelled out:
David's examples (which I recognize) are both from
descriptions of Fisher-type significance testing, not NP
hypothesis testing. As I pointed out, Fisher did not
require a formal alternative hypothesis.
That is why he is THE dinosaur. What about the one-tailed
vs two-tailed tests?
I don't mind the discussion of the historical perspective of
how hypothesis testing evolved, but to pull out those antique
ideas and quote them as if they are APPLICABLE in
today's statistics environment is at best a deliberate distortion,
and at worst a peddling of ERRONEOUS concepts as if they were
correctly practiced by all reputable authors today.
Quote:
I know you disagree, that is fine. In reality, what is now
taught is often a hybrid of NP and Fisher theory. You
yourself have rejected it in favour of Bayesian approaches
anyway, suggesting there are multiple approaches to the
problem.
But that's a different issue because BOTH theories are
conceptually DEFECTIVE, but that's not the same as saying
I accept the Fisherian version as VALID when it can't even
distinguish a one-tail from a two-tailed test, and it disregards
the Alternative Hypothesis.
In the end, the Fisherian way NEVER had a DEFINITION of
a p-value.
It is a gross error on the part of all who think that what Cox
and Hinkley and others call p_obs is a p-value.
Read this VERY carefully,
A p-value does NOT exist in the Fisherian context
It exists, and is meaningful ONLY when it works in conjunction
with an Alternative Hypothesis to be able to tell what "more
extreme" means.
In the end, what you are pointing out is that David Winsemius,
and possibly yourself, are using the INAPPROPRIATE concepts
and definition AS IF what is NOT a p-value in the Fisher sense
is the current definition of a p-value.
That is WRONG - no matter how you cut it. A p-value does
NOT exist in the Fisherian framework. Significance testing,
yes, but in a very restricted and narrow way.
That is the bottom line.
-- Reef Fish Bob.
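
How the choice of alternative changes what "more extreme" means can be seen in a small binomial sketch. The numbers are hypothetical and not from the thread: 8 successes in 10 trials with H0: theta = 1/2. The tails use the inclusive "as or more extreme" convention, and the two-sided version uses one common rule (summing outcomes no more probable than the observed one), which is an assumption rather than the only possible definition.

```python
from fractions import Fraction
from math import comb

n, x_obs = 10, 8  # hypothetical data: 8 successes in 10 trials

# Null distribution under H0: theta = 1/2.
pmf = [Fraction(comb(n, k), 2 ** n) for k in range(n + 1)]

# Ha: theta > 1/2, so "extreme" means large counts.
p_upper = sum(pmf[x_obs:])
# Ha: theta < 1/2, so "extreme" means small counts.
p_lower = sum(pmf[: x_obs + 1])
# Ha: theta != 1/2, here taken as outcomes no more probable than the observed one.
p_two_sided = sum(p for p in pmf if p <= pmf[x_obs])

print(p_upper, p_lower, p_two_sided)
```

The same observation yields 7/128, 1013/1024, or 7/64 depending on the alternative, so the data and H0 alone do not pin down a single number.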

Kevin E. Thorpe
Posted: Sat Dec 30, 2006 5:58 pm

On Dec 30, 9:41 am, "Reef Fish" <large_nassua_grou...@yahoo.com> wrote:
Quote: Kevin E. Thorpe wrote:
On Dec 30, 6:24 am, "Reef Fish" <large_nassua_grou...@yahoo.com> wrote:
The error of omission was in NOT stating the Alternative Hypothesis,
which is critical in the DEFINITION of the p-value.
As I explained earlier this year, the difference is in whether
or not you approach this in a Fisherian way or a pure NP way.
I consider my statistical education about as "old fashioned" as one
can be without being labeled as a dinosaur. Fisher's way was
already out-of-style long before I started statistics. It is definitely
out of style NOW.
In the case of continuous data, the strict inequality does not
affect the computation of a p-value, but for discrete data it
does, which is why NP suggests randomization on the
boundary.
That is the nonessential part of the error. To take ALWAYS the
one-sided alternative and call it a "significance level" is an ERROR,
a major BLUNDER, no matter how you argue it.
Let me begin this reply by reminding you that when I first
ventured into this discussion earlier this year, I did agree
that in the NP framework, extreme is clearly a strict
inequality. The critical region is a strict inequality and
by extension so is the calculation of a p-value from a
test statistic.
Second, I am not advocating for one approach or another
in these threads, just trying to present an opposing view,
hopefully, for the benefit of others. I know I have learned
things by thinking about these discussions.
My purpose in replying in this thread at all is to point out
that the context of the cited references from David Winsemius
is essential to the correct understanding of what the authors
were talking about.
For the benefit of others, I will briefly outline my understanding
of Fisher significance testing. There is a NULL hypothesis.
There is a test statistic defined so that large values are
inconsistent with the NULL in an "appropriate" way. Note:
it is this that took a great deal of heat from NP. Then,
with some data in hand, you wish to know, "are my data
inconsistent with the NULL." The SIGNIFICANCE LEVEL
is then the probability of observing a test statistic as big as
or bigger than the one obtained from my data, given the NULL
is true.
This may look like it is always one sided, but it's not.
If your test statistic is a likelihood ratio test compared with
a chi-square distribution, it would be two-tailed.
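
One standard case where this holds can be checked numerically. The sketch assumes a test of a normal mean with known variance, an example that is not taken from the thread, where the likelihood ratio statistic -2 log(LR) equals z squared. The upper tail of the chi-square distribution with 1 df at z squared is then the same number as the two-sided normal tail at z. The z values are made up.

```python
from math import erfc, sqrt


def chi2_1df_upper_tail(x: float) -> float:
    """Pr(chi-square with 1 df >= x), via the complementary error function."""
    return erfc(sqrt(x / 2.0))


def normal_two_sided(z: float) -> float:
    """Pr(|Z| >= |z|) for a standard normal Z."""
    return erfc(abs(z) / sqrt(2.0))


for z in (-2.5, 1.0, 1.96):  # hypothetical standardized sample means
    lrt = z * z  # -2 log likelihood ratio for H0: mu = mu0, known variance
    print(z, chi2_1df_upper_tail(lrt), normal_two_sided(z))
```

Large values of the statistic arise from departures in either direction, so a single upper tail on the chi-square scale already covers both tails on the original scale.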
An obvious problem is how to define more extreme in the
absence of an explicit alternative. Many would argue
that you can't, while others would say, "it is obvious
what is intended from the context."
Quote: Read this VERY carefully,
A p-value does NOT exist in the Fisherian context
It exists, and is meaningful ONLY when it works in conjunction
with an Alternative Hypothesis to be able to tell what "more
extreme" means.
I think I would agree a significance level is not the same thing
as a p-value from NP testing.
I will close with a question. When you compute Fisher's
Exact Test on a 2X2 table, how do you treat the observed table
in your calculation?
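
For readers who want to see what the question is getting at, here is a minimal sketch of a one-sided Fisher's Exact Test with the observed table either included or excluded. The table [[3, 1], [1, 3]] is a made-up example, and the function name is hypothetical. With the margins held fixed, the top-left cell follows a hypergeometric distribution.

```python
from fractions import Fraction
from math import comb


def table_prob(a: int, row1: int, row2: int, col1: int) -> Fraction:
    """Hypergeometric probability of top-left cell a, given fixed margins."""
    return Fraction(comb(row1, a) * comb(row2, col1 - a), comb(row1 + row2, col1))


# Hypothetical observed table [[3, 1], [1, 3]]:
# row totals 4 and 4, first column total 4, observed top-left cell 3.
a_obs, row1, row2, col1 = 3, 4, 4, 4
a_max = min(row1, col1)  # largest possible top-left cell

# Observed table plus all more extreme tables in the upper direction.
p_including = sum(table_prob(a, row1, row2, col1) for a in range(a_obs, a_max + 1))
# Only the strictly more extreme tables.
p_excluding = sum(table_prob(a, row1, row2, col1) for a in range(a_obs + 1, a_max + 1))

print(p_including, p_excluding)
```

The usual calculation includes the observed table, giving 17/70 here, whereas a strict inequality would give 1/70.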

Reef Fish
Posted: Sat Dec 30, 2006 6:39 pm

Kevin E. Thorpe wrote:
Quote: On Dec 30, 9:41 am, "Reef Fish" <large_nassua_grou...@yahoo.com> wrote:
Kevin E. Thorpe wrote:
On Dec 30, 6:24 am, "Reef Fish" <large_nassua_grou...@yahoo.com> wrote:
The error of omission was in NOT stating the Alternative Hypothesis,
which is critical in the DEFINITION of the p-value.
As I explained earlier this year, the difference is in whether
or not you approach this in a Fisherian way or a pure NP way.
I consider my statistical education about as "old fashioned" as one
can be without being labeled as a dinosaur. Fisher's way was
already out-of-style long before I started statistics. It is definitely
out of style NOW.
In the case of continuous data, the strict inequality does not
affect the computation of a p-value, but for discrete data it
does, which is why NP suggests randomization on the
boundary.
That is the nonessential part of the error. To take ALWAYS the
one-sided alternative and call it a "significance level" is an ERROR,
a major BLUNDER, no matter how you argue it.
Let me begin this reply by reminding you that when I first
ventured in to this discussion earlier this year, I did agree
that in the NP framework, extreme is clearly a strict
inequality.
That is already putting the cart before the horse! The definition
of p-value depends on the notion of "(as) or MORE extreme" --
the equality is only a secondary issue relative to what is the
Alternative Hypothesis!
But you CANNOT have a p-value WITHOUT an explicit
notion of the Alternative, to know what "more extreme" means!
Quote: The critical region is a strict inequality
Not necessarily!
Quote: and by extension so is the calculation of a p-value from a
test statistic.
The crux of your fallacy is that you kept DENYING that an
Alternative Hypothesis is a NECESSARY and indispensable
ingredient in the definition of p-value.
The rest of the equality vs inequality are just red-herrings
introduced in the discussion which did not become clear to
me until your most recent posts. They are RED HERRINGS.
The notion of a p-value DOES NOT EXIST in Fisherian
Hypothesis Testing. Only in the Neyman-Pearson framework,
taking into explicit consideration of the one-tail or two-tailed
nature of the Alternative, and the DIRECTION of the one-tail.
Quote: Second, I am not advocating for one approach or another
in these threads, just trying to present an opposing view,
hopefully, for the benefit of others. I know I have learned
things by thinking about these discussions.
That's beside the point. I am not advocating the use of N-P
hypothesis testing ideas either, but between the N-P and
Fisherian versions, the latter is UNACCEPTABLE (in the same
manner as fiducial intervals, which are at best gibberish). The
notion of a p-value is clearly and explicitly defined (as quoted
in my post in reply to Bruce Weaver). It was 100% self-contained,
for continuous OR discrete test statistics.
Quote:
My purpose in replying in this thread at all is to point out
that the context of the cited references from David Winsemius
is essential to the correct understanding of what the authors
were talking about.
What you helped to make clear is that it was Winsemius's MISTAKEN
notion of a p-value, which he thought was the meaning of p-value,
and on which he based his argument.
Your post helped ME see, unequivocally, that whatever Winsemius
quoted is out of context, and Fisherian type of gibberish, that
he ill-acquired in his statistical education.
Quote:
For the benefit of others, I will briefly outline my understanding
of Fisher significance testing.
I leave that podium to you. I had already said it quite clearly that
the Fisherian approach has NO p-values, and is as old and out
of date as a monkey when it first learned to stand erect, to take
a swing at some statistical fruit.
It has NO PLACE in Modern Statistics, other than its historical
value of how he blundered.
-- Reef Fish Bob.
Quote: There is a NULL hypothesis.
There is a test statistic defined so that large values are
inconsistent with the NULL in an "appropriate" way. Note:
it is this that took a great deal of heat from NP. Then,
with some data in hand, you wish to know, "are my data
inconsistent with the NULL." The SIGNIFICANCE LEVEL
is then the probability of observing a test statistic as big as
or bigger than the one obtained from my data, given the NULL
is true.
This may look like it is always one sided, but it's not.
If your test statistic is a likelihood ratio test compared with
a chi-square distribution, it would be two-tailed.
An obvious problem is how to define more extreme in the
absence of an explicit alternative. Many would argue
that you can't, while others would say, "it is obvious
what is intended from the context."
Read this VERY carefully,
A p-value does NOT exist in the Fisherian context
It exists, and is meaningful ONLY when it works in conjunction
with an Alternative Hypothesis to be able to tell what "more
extreme" means.
I think I would agree a significance level is not the same thing
as a p-value from NP testing.
I will close with a question. When you compute Fisher's
Exact Test on a 2X2 table, how do you treat the observed table
in your calculation?
All times are GMT - 5 Hours