ANOVA and measurement repeatability study...
| ... |
Posted: Sat Oct 24, 2009 8:42 pm |
|
|
|
Guest
|
Greetings:
I am running an experiment to test the repeatability of a parts
measuring system.
The experiment involves taking a part from the assembly line, placing
it in its fixture, and having two inspectors take measurements at
pre-selected points -- each point measured twice -- in random order. The
part is then removed from the fixture, replaced back into it....and the
process repeated 4 more times.
Ho: There is no difference between measurements taken after repeated
mountings in the fixture
My assumption was that by comparing the p-value for the mountings, I
could determine if this null assumption should be rejected.
However, I just read a paper which stated that the repeatability
should be calculated as
REPEAT = 5.15 * SQRT(Mean Square of Error term)
Why is this? How do you use this resulting value to determine YES/NO
of system repeatability?
Is it incorrect to use the p-value?
Explanations appreciated!
Thanx |
|
|
|
|
|
| Ray Koopman... |
Posted: Sat Oct 24, 2009 9:25 pm |
|
|
|
Guest
|
On Oct 24, 11:42 pm, voice_of_rea... at (no spam) australia.edu wrote:
[quote]Greetings:
I am running an experiment to test the repeatability of a parts
measuring system.
The experiment involves taking a part from the assembly line,
placing it in its fixture, having two inspectors take measurements at
pre-selected points -- each point measured twice -- in random order.
The part is then removed from the fixture, replaced back into it....
and the process repeated 4 more times.
Ho: There is no difference between measurements taken after repeated
mountings in the fixture
My assumption was that by comparing the p-value for the mountings,
I could determine if this null assumption should be rejected.
However, I just read a paper which stated that the repeatability
should be calculated as
REPEAT = 5.15 * SQRT(Mean Square of Error term)
Why is this? How do you use this resulting value to determine YES/NO
of system repeatability?
Is it incorrect to use the p-value?
Explanations appreciated!
Thanx
[/quote]
It is not clear exactly what your design is.
How many parts were tested? Were they nominally the same?
Did the same 2 inspectors do all 4 replications on all n parts?
That is, what were the nesting/crossing relations
among inspectors, parts, and replications?
How many preselected points were there?
Do they represent different dependent variables,
or different levels of the same variable? |
|
|
|
|
|
| Rich Ulrich... |
Posted: Sun Oct 25, 2009 3:50 pm |
|
|
|
Guest
|
On Sat, 24 Oct 2009 23:42:26 -0700 (PDT),
voice_of_reason at (no spam) australia.edu wrote:
[quote]Greetings:
I am running an experiment to test the repeatability of a parts
measuring system.
The experiment involves taking a part from the assembly line, placing
it in its fixture, having two inspectors take measurements at
pre-selected points -- each point measured twice -- in random order. The
part is then removed from the fixture, replaced back into it....and the
process repeated 4 more times.
Ho: There is no difference between measurements taken after repeated
mountings in the fixture
My assumption was that by comparing the p-value for the mountings, I
could determine if this null assumption should be rejected.
[/quote]
Well, you have not specified what you are testing, or
what excess error might cause you to decide that
there is *too* much irregularity/unreliability.
From what you describe, I think you could perform tests
to see if one rater is systematically different from the other;
or whether the selected points differ systematically; or
whether *order* systematically matters.
However --
a) all of those could differ "significantly" by the tests,
and your measurements *could* still be accurate enough
for your purposes; or,
b) it could happen that none of the tests "reject" while
the inherent accuracy of measurement is too poor to be
acceptable.
Your potential tests will tell you if the means are different.
You have to examine the means in order to decide, on
your own, what to do about it.
If the mean differences are all tiny enough to ignore,
despite being "significant" by a test, then you might be
best served to ignore them. - As a student, I participated
in an experiment that used experienced, paired nurses to
measure blood pressure. One nurse measured systematically
4 mm higher than the others, and that was later determined
to be blamed on her head-cold (muffling her hearing) on that
particular day. In any case, the 4 mm was a trivial difference.
- You can ignore them, or you can act on what you can learn
from them, in order to improve precision in the future.
If the mean differences are *large* enough to be a
serious concern, despite being "not significant", then
you have a problem -- since you therefore have no
confidence that any number given is a good estimate.
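To make cases (a) and (b) concrete, here is a minimal Python sketch. The numbers are invented for illustration, TOLERANCE is a hypothetical acceptance limit, and the "mean plus two standard deviations" check is only one rough way to judge size, not a standard rule. The critical values are the usual two-sided 5% entries from a t table.

```python
from statistics import mean, stdev

def t_statistic(diffs):
    """One-sample t statistic for H0: mean rater-to-rater difference = 0."""
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / n ** 0.5)

TOLERANCE = 10.0  # hypothetical: largest disagreement the process can live with

# Case (a): a steady 1-unit offset between raters, very little scatter.
steady = [0.9, 1.0, 1.1, 1.0, 0.9, 1.1, 1.0, 1.0]
# Case (b): no steady offset, but large scatter.
noisy = [12.0, -15.0, 9.0, -11.0]

# Two-sided 5% critical values from standard t tables, keyed by df = n - 1.
T_CRIT = {7: 2.365, 3: 3.182}

for name, diffs in (("steady", steady), ("noisy", noisy)):
    t = t_statistic(diffs)
    significant = abs(t) > T_CRIT[len(diffs) - 1]
    # Rough size check (an assumption, not a rule): offset plus 2 SD vs. tolerance.
    acceptable = abs(mean(diffs)) + 2 * stdev(diffs) < TOLERANCE
    print(name, "significant:", significant, "acceptable:", acceptable)
```

The steady offset comes out "significant" yet well inside the tolerance, while the noisy set is "not significant" yet far too scattered to trust.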
[quote]
However, I just read a paper which stated that the repeatability
should be calculated as
REPEAT = 5.15 * SQRT(Mean Square of Error term)
[/quote]
This is 5.15 times the "error of measurement" for your
whole system. It gives a number to use as a range (or
maybe a half-range) for any given measurement. It does
not say anything about systematic differences that
might have been detected by the tests.
Using plus-or-minus (2* sqrt(MSE) ) would give approximately
a 95% confidence interval for a single measurement. Is this
small enough? The author of your paper was being more
strict than 95% CI, and was probably accounting for the
N (degrees of freedom) for his particular study... since
I don't recognize the relevance of 5.15. The usual CI's
are derived from tables of the t-test.
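For what it is worth, 5.15 matches a common gauge-study convention: it is the width, in standard deviations, of the band that holds the central 99% of a normal distribution (plus or minus about 2.576). Whether the paper meant exactly that is an assumption, since the paper is not quoted here. A short Python check, with `repeatability` as a hypothetical helper name:

```python
from statistics import NormalDist

# Two-sided z multiplier that brackets the central 99% of a normal distribution.
z_99 = NormalDist().inv_cdf(0.995)

# Full width of that band in units of sigma: from -z to +z.
spread_99 = 2 * z_99

def repeatability(mse, coverage=0.99):
    """Width of the band expected to hold `coverage` of repeat readings,
    assuming normally distributed measurement error with variance `mse`."""
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return 2 * z * mse ** 0.5

print(round(spread_99, 2))  # 5.15
```

With the default coverage, `repeatability(mse)` reproduces 5.15 * SQRT(MSE); passing `coverage=0.95` gives the narrower band of roughly plus or minus 2 * SQRT(MSE).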
*You* have to make the decision of whether a measurement
is precise enough to be useful. What decision is being made?
Do you want a vague warning when a number seems off-target?
Do you want to be *sure* that something is wrong when
a number seems off? -- The smaller the range, the better
the decision is apt to be.
[quote]
Why is this? How do you use this resulting value to determine YES/NO
of system repeatability?
Is it incorrect to use the p-value?
[/quote]
What you learn from the p-values can tell you about
sources of error that do exist, and which *might* be
decreased; but it does not tell you whether those errors
have to matter to you.
--
Rich Ulrich |
|
|
|
|
|
| ... |
Posted: Sun Oct 25, 2009 9:58 pm |
|
|
|
Guest
|
Hi, thanks for your response.....
On Oct 25, 3:25 pm, Ray Koopman <koop... at (no spam) sfu.ca> wrote:
[quote]
How many parts were tested?
On Oct 24, 11:42 pm, voice_of_rea... at (no spam) australia.edu wrote:
The experiment involves taking a part from the assembly line....(i.e. one part)
Did the same 2 inspectors do all 4 replications on all n parts?
.....having THE SAME two inspectors take measurements at pre-selected points -- each point measured twice....
[/quote]
[quote]How many preselected points were there?
[/quote]
How does this affect whether or not I can use the p-value in the
manner I originally assumed?
Again, my original approach was....
[quote]Ho: There is no difference between measurements taken after repeated
mountings in the fixture
My assumption was that by comparing the p-value for the mountings,
I could determine if this null assumption should be rejected.
[/quote]
To put it another way:
One part was measured repeatedly. If the mean difference in the
readings obtained is significant, then it seems to me the measurement
system is unstable (non-repeatable). Doesn't the p-value indicate the
probability that the results are explained by the null hypothesis? As
such, doesn't a p-value less than the selected alpha indicate that the
null hypothesis should be rejected? And as such doesn't that imply
that the measuring system is NOT producing repeatable results? |
|
|
|
|
|
| Ray Koopman... |
Posted: Mon Oct 26, 2009 8:39 am |
|
|
|
Guest
|
On Oct 26, 12:58 am, voice_of_rea... at (no spam) australia.edu wrote:
[quote]Hi, thanks for your response.....
On Oct 25, 3:25 pm, Ray Koopman <koop... at (no spam) sfu.ca> wrote:
How many parts were tested?
On Oct 24, 11:42 pm, voice_of_rea... at (no spam) australia.edu wrote:
The experiment involves taking a part from the assembly line....(i.e. one part)
Did the same 2 inspectors do all 4 replications on all n parts?
.....having THE SAME two inspectors take measurements at pre-selected points -- each point measured twice....
How many preselected points were there?
How does this affect whether or not I can use the p-value in the
manner I originally assumed?
[/quote]
I asked about the procedure because it was not clear what you had done
or what statistical analyses would be justified. "One should aim not
at being possible to understand, but at being impossible to
misunderstand." [Quintilian]
[quote]
Again, my original approach was....
Ho: There is no difference between measurements taken after repeated
mountings in the fixture
[/quote]
That's a trivial hypothesis. The existence of a single observed
difference, of any magnitude, would suffice to reject it.
[quote]
My assumption was that by comparing the p-value for the mountings,
I could determine if this null assumption should be rejected.
To put it another way:
One part was measured repeatedly. If the mean difference in the
readings obtained is significant, then it seems to me the measurement
system is unstable (non-repeatable). Doesn't the p-value indicate the
probability that the results are explained by the null hypothesis? As
such, doesn't a p-value less than the selected alpha indicate that the
null hypothesis should be rejected? And as such doesn't that imply
that the measuring system is NOT producing repeatable results?
[/quote]
You don't care about the mean difference (which almost certainly is
not exactly zero) as much as you care about the tails of the
distribution of measurements. How much disagreement among measurements
can your process tolerate? What proportion of the measurements are
acceptably close to one another? Is the 99% confidence interval for a
single observation narrow enough? |
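One rough way to put a number on "acceptably close" is to count how many pairs of repeat readings agree to within a stated tolerance. A minimal Python sketch, assuming made-up readings and a hypothetical tolerance (both are placeholders, not values from this thread):

```python
from itertools import combinations

# Hypothetical repeat readings of one feature, in micrometres (integers
# avoid floating-point surprises at the tolerance boundary).
readings_um = [10020, 10050, 9980, 10010, 10040, 9970]
TOLERANCE_UM = 60  # hypothetical: largest disagreement the process tolerates

def fraction_within(readings, tolerance):
    """Share of all pairs of readings that differ by no more than tolerance."""
    pairs = list(combinations(readings, 2))
    close = sum(1 for a, b in pairs if abs(a - b) <= tolerance)
    return close / len(pairs)

print(fraction_within(readings_um, TOLERANCE_UM))
```

This looks at the tails directly rather than at a mean difference, which is closer to the question being asked of the measuring system.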
|
|
|
|
|
| Rich Ulrich... |
Posted: Mon Oct 26, 2009 2:27 pm |
|
|
|
Guest
|
On Mon, 26 Oct 2009 00:58:52 -0700 (PDT),
voice_of_reason at (no spam) australia.edu wrote:
[quote]Hi, thanks for your response.....
On Oct 25, 3:25 pm, Ray Koopman <koop... at (no spam) sfu.ca> wrote:
How many parts were tested?
On Oct 24, 11:42 pm, voice_of_rea... at (no spam) australia.edu wrote:
The experiment involves taking a part from the assembly line....(i.e. one part)
Did the same 2 inspectors do all 4 replications on all n parts?
.....having THE SAME two inspectors take measurements at pre-selected points -- each point measured twice....
How many preselected points were there?
How does this affect whether or not I can use the p-value in the
manner I originally assumed?
[/quote]
Did my post of Oct 25 fail to appear on your server?
I think I covered your questions fairly thoroughly.
[quote]
Again, my original approach was....
Ho: There is no difference between measurements taken after repeated
mountings in the fixture
My assumption was that by comparing the p-value for the mountings,
I could determine if this null assumption should be rejected.
To put it another way:
One part was measured repeatedly. If the mean difference in the
readings obtained is significant, then it seems to me the measurement
system is unstable (non-repeatable). Doesn't the p-value indicate the
probability that the results are explained by the null hypothesis? As
such, doesn't a p-value less than the selected alpha indicate that the
null hypothesis should be rejected? And as such doesn't that imply
that the measuring system is NOT producing repeatable results?
[/quote]
--
Rich Ulrich |
|
|
|
|
|
| ... |
Posted: Mon Oct 26, 2009 7:05 pm |
|
|
|
Guest
|
Thank you again for your response....
On Oct 27, 2:39 am, Ray Koopman <koop... at (no spam) sfu.ca> wrote:
[quote]That's a trivial hypothesis. The existence of a single observed
difference, of any magnitude, would suffice to reject it.
[/quote]
Ok fine....
Ho: There is no SIGNIFICANT difference between measurements taken
after repeated
mountings in the fixture
[quote]You don't care about the mean difference (which almost certainly is
not exactly zero) as much as you care about the tails of the
distribution of measurements.
[/quote]
Ok...let me ask the question this way...and hopefully there is a
simple answer...
For the experiment as outlined above (previous posts), what does a
p-value < alpha in the "parts" row signify?
[Rem: ordinarily I would think this would signify variation BETWEEN
parts....but since I am only using ONE PART in this experiment...such
observed variation must be coming from the measurement system itself.] |
|
|
|
|
|
| ... |
Posted: Mon Oct 26, 2009 7:07 pm |
|
|
|
Guest
|
On Oct 27, 4:27 am, Rich Ulrich <rich.ulr... at (no spam) comcast.net> wrote:
[quote]Did my post of Oct 25 fail to appear on your server?
I think I covered your questions fairly thoroughly.
[/quote]
It just showed up. Sorry.
Can you address the question I just asked (above post)? |
|
|
|
|
|
| Rich Ulrich... |
Posted: Tue Oct 27, 2009 2:53 pm |
|
|
|
Guest
|
On Mon, 26 Oct 2009 22:07:08 -0700 (PDT),
voice_of_reason at (no spam) australia.edu wrote:
[quote]On Oct 27, 4:27 am, Rich Ulrich <rich.ulr... at (no spam) comcast.net> wrote:
Did my post of Oct 25 fail to appear on your server?
I think I covered your questions fairly thoroughly.
It just showed up. Sorry.
Can you address the question I just asked (above post)?
[/quote]
Yes, I did address it. "Significant" means systematic
difference; which may or may not be large enough
to matter to you.
"Non-significant" means that whatever differences
exist are not systematic. However, it is again true
that the apparent size of the differences may or may
not matter to you. "What matters" is how well you
can depend on a given measurement, and how much
(and what) that tells you.
--
Rich Ulrich |
|
|
|
|
|
| ... |
Posted: Tue Oct 27, 2009 5:24 pm |
|
|
|
Guest
|
On Oct 28, 4:53 am, Rich Ulrich <rich.ulr... at (no spam) comcast.net> wrote:
[quote]"Significant" means systematic
[/quote]
Ok, so that seems to say that my methodology is correct.
A p-value < alpha means there is a systematic difference in the
measurements being made. The "between parts" differences are
significant....and since I am in fact only using ONE part, this means
the measuring system is producing significantly different results ->
not repeatable.
Thank you! |
|
|
|
|
|
| Rich Ulrich... |
Posted: Wed Oct 28, 2009 1:19 pm |
|
|
|
Guest
|
On Tue, 27 Oct 2009 20:24:13 -0700 (PDT),
voice_of_reason at (no spam) australia.edu wrote:
[quote]On Oct 28, 4:53 am, Rich Ulrich <rich.ulr... at (no spam) comcast.net> wrote:
"Significant" means systematic
Ok, so that seems to say that my methodology is correct.
A p-value < alpha means there is a systematic difference in the
measurements being made. The "between parts" differences are
significant....and since I am in fact only using ONE part, this means
the measuring system is producing significantly different results -
not repeatable.
Thank you!
[/quote]
Uh-oh. It *seems* to me that you are missing the point, in
a couple of ways.
You have "one part". But - in the paradigm that I described -
you have two raters, and you have two locations. Or more.
THAT is what can be tested. "Different" does NOT imply
"not-repeatable"; in fact, one implication may be the opposite
of that.
"Systematic" requires a good degree of being "repeatable",
compared to the other sources of error and variation.
That is why I have said, several times, that the p-value does
not answer the question of whether the SIZE of the effect matters.
The SIZE of the variation, as described by the article that
you cited, is closer to the point -- although, you did not
mention what they may have said further about tested
differences. Re-read what I said.
If one rater is regularly 1 point smaller than the other,
- and thus, significant -
you can be in very good shape, if the relevant criterion
for actually *using* your measurements is something that
involves a difference of 10 points or 20 points or 50 points.
--
Rich Ulrich |
|
|
|
|
|
| ... |
Posted: Wed Oct 28, 2009 8:25 pm |
|
|
|
Guest
|
Thank you again
On Oct 29, 3:19 am, Rich Ulrich <rich.ulr... at (no spam) comcast.net> wrote
[quote]If one rater is regularly 1 point smaller than the other.....
[/quote]
.....then that would show up in ANOVA table as a p-value < alpha in the
RATERS row. I am discussing the p-value in the PARTS row.
[quote]"Different" does NOT imply "not-repeatable".....
[/quote]
If I in fact only have ONE part...yet my measurement system is showing
me significant differences with the measurements of that part...but
consistencies BETWEEN raters, then it seems to me that the difference
is in fact pointing out a lack of repeatability. |
|
|
|
|
|
| Ray Koopman... |
Posted: Thu Oct 29, 2009 8:25 am |
|
|
|
Guest
|
On Oct 26, 10:05 pm, voice_of_rea... at (no spam) australia.edu wrote:
[quote][...]
Ok...let me ask the question this way...and hopefully there is a
simple answer...
For the experiment as outlined above(previous posts), what does a
p-value < alpha in the "parts" row signify?
[Rem: ordinarily I would think this would signify variation BETWEEN
parts....but since I am only using ONE PART in this experiment...such
observed variation must be coming from the measurement system itself.]
[/quote]
If there is only one part, but the program nevertheless gave you a
p-value for the significance of the difference between parts, then
the program did not do the analysis that you think you told it to do. |
|
|
|
|
|
| Rich Ulrich... |
Posted: Thu Oct 29, 2009 12:52 pm |
|
|
|
Guest
|
On Wed, 28 Oct 2009 23:25:49 -0700 (PDT),
voice_of_reason at (no spam) australia.edu wrote:
[quote]Thank you again
On Oct 29, 3:19 am, Rich Ulrich <rich.ulr... at (no spam) comcast.net> wrote
If one rater is regularly 1 point smaller than the other.....
....then that would show up in ANOVA table as a p-value < alpha in the
RATERS row. I am discussing the p-value in the PARTS row.
"Different" does NOT imply "not-repeatable".....
If I in fact only have ONE part...yet my measurement system is showing
me significant differences with the measurements of that part...but
consistencies BETWEEN raters, then it seems to me that the difference
is in fact pointing out a lack of repeatability.
[/quote]
Is Ray right, that you are totally misreading the output?
What I "guessed" about the design on Oct 25 was that you
might have 3 factors: rater, location of measurement, and
order of measuring. You never confirmed that, so I don't
know. What do you mean here by "differences with the
measurements of that part"?
Whatever it is, what I emphasized about small differences
between raters - that they may be trivial (or can be adjusted
for) - is equally true about differences between locations.
Differences in order would need further exploration,
perhaps, to figure out what is going on - but those, too,
must be "systematic" in order to have a significant p-level.
And the actual size of the differences, in the context of
whatever you are using the measures for, is what matters --
NOT the p-level alone.
--
Rich Ulrich |
|
|
|
|
|
| ... |
Posted: Thu Oct 29, 2009 4:11 pm |
|
|
|
Guest
|
On Oct 30, 2:25 am, Ray Koopman <koop... at (no spam) sfu.ca> wrote:
[quote]If there is only one part, but the program nevertheless gave you a
p-value for the significance of the difference between parts, then
the program did not do the analysis that you think you told it to do.
[/quote]
There is one part that is repeatedly mounted and unmounted from its
measuring fixture. Each mounting is entered into the program as a
"new" part.
Since it is in fact the SAME part, in theory the "between parts"
variation should be insignificant (I can verify the assumption that
mounting and unmounting does not affect the dimensions of the part).
If however there is significant difference, this implies that there is
something going on in the measuring system that is causing this same
part to appear to be different...to be producing significantly
different dimensions.
In other words, repeatedly using this fixture does NOT produce
repeatable results. The system is not repeatable. |
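A minimal sketch of that analysis in Python, treating each mounting as a group and the four readings within it (two inspectors, two repeats) as replicates at one measurement point. The readings are made up, and lumping inspectors into the error term is a simplifying assumption; a fuller study would model inspectors and points as separate factors. The critical value is the usual 5% entry from an F table for 4 and 15 degrees of freedom.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (ms_between, ms_within, F) for a one-way layout."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    k, n_total = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical data: 5 mountings of the SAME part, 4 readings per mounting.
mountings = [
    [10.01, 10.02, 10.00, 10.01],
    [10.05, 10.06, 10.04, 10.05],
    [9.98, 9.99, 9.97, 9.98],
    [10.02, 10.03, 10.01, 10.02],
    [10.00, 10.01, 9.99, 10.00],
]
F_CRIT = 3.06  # 5% critical value of F with (4, 15) degrees of freedom

ms_between, ms_within, f_stat = one_way_anova(mountings)
repeat = 5.15 * ms_within ** 0.5  # spread of readings WITHIN a mounting

print("F =", f_stat, "exceeds critical value:", f_stat > F_CRIT)
print("REPEAT =", repeat)
```

A large F only says the mounting-to-mounting shifts are detectable against the within-mounting noise. Whether those shifts, or the REPEAT value, are too big is a separate comparison against whatever tolerance the part has to meet.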
|
|
|
|
|
|
|
All times are GMT - 5 Hours
|
|