CI for the mean of a discrete variable
Ray Koopman
Posted: Sun Nov 01, 2009 1:33 pm
In another newsgroup recently, I suggested that the usual procedure,
m +- t*s/sqrt[n], be used to get a confidence interval for the mean
when the data were on a discrete 1,...,k scale. Now, there is nothing
to prevent the lower limit from being < 1 or the upper limit from
being > k, so a reasonable person would truncate the interval
accordingly to keep it in [1,k].
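As a minimal sketch of the procedure just described (not code from the original post), the following Python computes m +- t*s/sqrt[n] and clips the result to [1,k]. The function name and the sample data are hypothetical. The t critical value is passed in by the caller because the Python standard library has no t quantile function; 2.262 is the usual two-sided 95% table value for 9 degrees of freedom.

```python
import math
import statistics


def truncated_t_interval(data, k, t_crit):
    """Return (lower, upper): m +- t*s/sqrt(n), clipped to [1, k]."""
    n = len(data)
    m = statistics.fmean(data)
    s = statistics.stdev(data)  # sample SD with n - 1 in the denominator
    half_width = t_crit * s / math.sqrt(n)
    return max(1.0, m - half_width), min(float(k), m + half_width)


# Hypothetical responses on a 1..5 scale, piled up at the low end.
sample = [1, 1, 1, 1, 2, 1, 1, 1, 1, 3]
lower, upper = truncated_t_interval(sample, k=5, t_crit=2.262)
print(lower, upper)
```

With these data the untruncated lower limit is about 0.82, so the clipping step moves it up to 1.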
Then it occurred to me that the logic of the Wilson interval for a
binomial proportion, which is naturally in [0,1], could be extended
to cover this problem and give an interval that is naturally in
[1,k] with no need to truncate. That is, we should solve for the
probabilities p1,...,pk that maximize (for an upper bound) or
minimize (for a lower bound) the corresponding mean, sum j*pj,
subject to 1: all pj >= 0; 2: sum pj = 1; and
3: sum ((fj - n*pj)^2 /(n*pj)) <= chi-square[k-1,alpha], where
f1,...,fk are the observed frequencies and n = sum fj. How those
optimizations are done is a purely numerical problem that (so far)
has yielded nicely to current numerical-analytic methods.
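One way the optimization could be carried out is sketched below. It rests on assumptions that are not in the post: every fj is positive, and the chi-square critical value is supplied by the caller (3.841, 5.991 and 7.815 are the usual 95% table values for 1, 2 and 3 degrees of freedom). Because the mean is linear in the pj and constraint 3 is convex, the Lagrange conditions give pj proportional to fj/sqrt(c - j) for the upper bound (some c > k) and to fj/sqrt(j - c) for the lower bound (some c < 1). Bisection on c then makes constraint 3 hold with equality. All function names are hypothetical.

```python
import math


def chi_square_stat(freqs, probs):
    """Constraint 3: sum over j of (fj - n*pj)^2 / (n*pj)."""
    n = sum(freqs)
    return sum((f - n * p) ** 2 / (n * p) for f, p in zip(freqs, probs))


def tilted_probs(freqs, shift, upper):
    """Probabilities proportional to fj/sqrt(c - j) or fj/sqrt(j - c)."""
    k = len(freqs)
    if upper:   # c = k + shift lies just above the scale
        weights = [f / math.sqrt(k + shift - j) for j, f in enumerate(freqs, 1)]
    else:       # c = 1 - shift lies just below the scale
        weights = [f / math.sqrt(j - 1 + shift) for j, f in enumerate(freqs, 1)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_bound(freqs, crit, upper):
    """Largest (upper=True) or smallest mean allowed by constraint 3."""
    if min(freqs) <= 0:
        raise ValueError("this sketch assumes every frequency is positive")

    def stat(shift):
        return chi_square_stat(freqs, tilted_probs(freqs, shift, upper))

    lo, hi = 1e-12, 1.0
    while stat(hi) > crit:   # widen until the constraint is satisfied
        hi *= 2.0
    for _ in range(200):     # the statistic decreases as shift grows
        mid = 0.5 * (lo + hi)
        if stat(mid) > crit:
            lo = mid
        else:
            hi = mid
    probs = tilted_probs(freqs, hi, upper)
    return sum(j * p for j, p in enumerate(probs, 1))


def score_type_interval(freqs, crit):
    """Return (lower, upper) bounds on the mean of a 1..k variable."""
    return mean_bound(freqs, crit, upper=False), mean_bound(freqs, crit, upper=True)


# Hypothetical frequencies on a 1..4 scale; 7.815 is the 95% value for 3 df.
print(score_type_interval([1, 2, 3, 10], 7.815))
```

For k = 2 the result is 1 plus the Wilson interval for the proportion of 2s, which is a convenient check on the sketch.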
My question is whether anyone has seen anything comparing this method
to the traditional t-based method in terms of coverage probability
and small-sample behavior.
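A coverage comparison of this kind can be set up by simulation. The sketch below estimates the coverage of the truncated t interval only, for one hypothetical set of cell probabilities; the function name, the probabilities, the replicate count and the seed are all assumptions, and the t critical value is again supplied by the caller (2.262 for 9 degrees of freedom at 95%).

```python
import math
import random
import statistics


def t_interval_coverage(probs, n, t_crit, reps=20000, seed=1):
    """Estimate how often the truncated t interval covers the true mean."""
    k = len(probs)
    values = list(range(1, k + 1))
    true_mean = sum(j * p for j, p in zip(values, probs))
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        data = rng.choices(values, weights=probs, k=n)
        m = statistics.fmean(data)
        s = statistics.stdev(data)
        half_width = t_crit * s / math.sqrt(n)
        lower = max(1.0, m - half_width)
        upper = min(float(k), m + half_width)
        if lower <= true_mean <= upper:
            hits += 1
    return hits / reps


# Hypothetical skewed distribution on a 1..3 scale, samples of size 10.
print(t_interval_coverage([0.7, 0.2, 0.1], n=10, t_crit=2.262))
```

For this skewed distribution the estimate falls short of the nominal 0.95. One reason is that a sample of all 1s (probability 0.7^10, about 0.028) gives a zero-width interval at 1, which cannot cover the true mean of 1.4.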

Luis A. Afonso
Posted: Mon Nov 02, 2009 11:59 am
R. Koopman is dreadfully wrong
*************************************
Date: Nov 1, 2009 6:33 PM
Author: Ray Koopman
Subject: CI for the mean of a discrete variable
In another newsgroup recently, I suggested that the usual procedure, m +- t*s/sqrt[n], be used to get a confidence interval for the mean when the data were on a discrete 1,...,k scale. Now, there is nothing to prevent the lower limit from being < 1 or the upper limit from being > k, so a reasonable person would truncate the interval accordingly to keep it in [1,k].
*********************************************
A dreadful nightmare.
For a discrete uniform distribution, the mean m of any sample, whatever its size n, is never outside [1, k].
Values:
  k=10, n=10: Pr(m <= 7.0) = 0.956
  k=80, n=100: Pr(m <= 44.29) = 0.950
Luís A. Afonso
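Probabilities like these can be checked exactly rather than by simulation. As a sketch for the uniform case only (the function names are hypothetical, and this is not Afonso's program), the distribution of the sum of n draws is built by repeated convolution and then accumulated up to n times the bound on the mean.

```python
def sum_distribution(k, n):
    """Exact pmf of the sum of n independent uniform draws on 1..k.

    Returns a list pmf with pmf[s] = Pr(sum == s) for s = 0..k*n.
    """
    pmf = [1.0]  # the empty sum equals 0 with probability 1
    for _ in range(n):
        new = [0.0] * (len(pmf) + k)
        for s, p in enumerate(pmf):
            if p:
                for j in range(1, k + 1):
                    new[s + j] += p / k
        pmf = new
    return pmf


def prob_mean_at_most(k, n, bound):
    """Pr(sample mean <= bound) for n uniform draws on 1..k."""
    pmf = sum_distribution(k, n)
    # Small tolerance so a sum exactly equal to bound*n is included.
    return sum(p for s, p in enumerate(pmf) if s <= bound * n + 1e-9)


print(prob_mean_at_most(10, 10, 7.0))
```

Every sum below n has probability zero, which is the sense in which the sample mean itself can never fall below 1.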

Luis A. Afonso
Posted: Wed Nov 04, 2009 12:14 am
My response
It's absolutely ridiculous and misleading to keep using wrong procedures, as Koopman advises, whatever their historical standing, when better ones are available. The table below shows what the CIs really are for some values of k and n. The critical values are NEVER outside [1, k].
Table: Confidence intervals for the mean of random samples of size n
from the discrete uniform distribution on {1, …, k}
(printed cumulative probability in parentheses)

   k    n   lower limit    upper limit
  40   20   15.45 (.025)   24.75 (.977)
  40   15   14.67 (.025)   25.40 (.977)
  40   10   13.30 (.025)   26.50 (.976)
  40    5   10.40 (.025)   29.00 (.977)
  50   25   19.84 (.025)   30.24 (.975)
  50   20   19.20 (.026)   30.80 (.976)
  50   15   18.20 (.025)   31.60 (.975)
  50   10   16.60 (.026)   33.00 (.976)
  50    5   13.00 (.026)   36.00 (.976)
Luís A. Afonso
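REM "0-Koopman"
REM Monte Carlo quantiles of the mean of n draws from the
REM discrete uniform distribution on {1, ..., k}
CLS
PRINT " k*n <=8000 "
INPUT " K = "; k
INPUT " n = "; n
INPUT " all = "; all
DIM w(8001)
DIM target(2)
REM seed the generator once, not on every replicate
RANDOMIZE TIMER
FOR rpt = 1 TO all
  LOCATE 14, 50: PRINT USING "########"; all - rpt
  REM s = sum of n uniform draws on 1..k
  s = 0
  FOR i = 1 TO n
    ji = INT(k * RND) + 1
    s = s + ji
  NEXT i
  REM w(s) counts how often the sum s occurred
  w(s) = w(s) + 1
NEXT rpt
REM target cumulative probabilities, kept apart from the counts in w()
target(1) = .025: target(2) = .975
FOR u = 1 TO 2
  REM restart the cumulative sum for each limit; if it carried over
  REM from the first pass, the second scan would stop near .95
  wr = 0
  FOR t = 0 TO 8000
    wr = wr + w(t) / all
    IF wr > target(u) THEN EXIT FOR
  NEXT t
  REM print the limit on the mean scale and the probability reached
  PRINT USING "#####.### .### "; t / n; wr
NEXT u
END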

Luis A. Afonso
Posted: Wed Nov 04, 2009 1:23 pm
Luis,
Ray did not specify that the concern was for *uniform*
discrete values. Please re-read the original post.
You use only the 95% CI. Ray did not rule out a 99.9% CI,
or one even more extreme.
--
Rich Ulrich
***********************************
My response
Unlike a notorious reader, I'm not STUPID, so I do not disdain to work out (after carefully reading the post) what the OP's main concern is. This case is as clear as they come: confidence intervals for means, when calculated from approximate models (see the note* below), can extend beyond the support of the discrete random variable, {1, …, k} in the case under study. I simply showed that for discrete uniform data the anomaly never occurs if an exact model is chosen.
OR ARE YOU, Ulrich, persuaded that it is acceptable in general for the C.I. to extend below 1 and/or above k? Never, ever!
*Note: Koopman's solution approximates the EXACT distribution (whatever it is) by a normal one, which, to make matters worse, is CONTINUOUS. Hence the treatment uses Student's t distribution.
Luis A. Afonso

Rich Ulrich
Posted: Wed Nov 04, 2009 3:28 pm
On Wed, 04 Nov 2009 05:14:14 EST, "Luis A. Afonso"
<licas_ at (no spam) hotmail.com> wrote:
[quote]*************************************
Date: Nov 1, 2009 6:33 PM
Author: Ray Koopman
Subject: CI for the mean of a discrete variable
[/quote]
[snip, previous]
[quote]
It's absolutely ridiculous and misleading to keep using wrong procedures, as Koopman advises, whatever their historical standing, when better ones are available. The table below shows what the CIs really are for some values of k and n. The critical values are NEVER outside [1, k].
Table: Confidence intervals for the mean of random samples of size n
from the discrete uniform distribution on {1, …, k}
[snip, rest][/quote]
Luis,
Ray did not specify that the concern was for *uniform*
discrete values. Please re-read the original post.
You use only the 95% CI. Ray did not rule out a 99.9% CI,
or one even more extreme.
--
Rich Ulrich

All times are GMT - 5 Hours