Tetrachoric correlation: Is this formula still used?...
Bruce Weaver (Guest)
Posted: Thu Jun 26, 2008 7:33 am
The following formula for the tetrachoric correlation is from Glass &
Hopkins (1996, p. 137), full reference given below. They just give
the formula without citing any other works. I'm wondering if this
formula is still in use. Thanks.

r_tet = (YY*NN - YN*NY) / (ord(1)*ord(2)*N^2)

where
  YY     = count in the Yes-Yes cell
  NN     = count in the No-No cell
  NY     = count in the No-Yes cell
  YN     = count in the Yes-No cell
  p(1)   = p(YES) for variable 1
  p(2)   = p(YES) for variable 2
  z(1)   = z that cuts off p(1) of the area in tail of standard normal dist.
  z(2)   = z that cuts off p(2) of the area in tail of standard normal dist.
  ord(1) = ordinate of the standard normal distribution at z(1)
  ord(2) = ordinate of the standard normal distribution at z(2)

EXAMPLE

              V2
V1       YES    NO   TOTAL
YES      110    10     120
NO        90   190     280
TOTAL    200   200     400

YY = 110, NN = 190, YN = 10, NY = 90, N = 400
p(1) = 120/400 = 0.3
p(2) = 200/400 = 0.5
z(1) = + or - 0.5244
z(2) = 0
ord(1) = 0.3477
ord(2) = 0.3989

r_tet = (110*190 - 10*90) / (0.3477*0.3989*400^2) = 0.901

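As a quick check of the arithmetic in the example, here is a minimal Python sketch that evaluates the right-hand side of the formula from the four cell counts. The function name `glass_hopkins_value` is made up for illustration, and the sketch assumes z(1) and z(2) are taken as upper-tail cut points (the ordinate is the same for either sign of z). It uses only the standard library's `statistics.NormalDist`.

```python
from statistics import NormalDist


def glass_hopkins_value(yy, yn, ny, nn):
    """Evaluate (YY*NN - YN*NY) / (ord(1)*ord(2)*N^2) for a 2x2 table of counts.

    Hypothetical helper for illustration; raises statistics.StatisticsError
    if a margin is 0 or N, because the z cut point is then undefined.
    """
    n = yy + yn + ny + nn
    std_normal = NormalDist()            # mean 0, standard deviation 1
    p1 = (yy + yn) / n                   # p(YES) for variable 1 (row margin)
    p2 = (yy + ny) / n                   # p(YES) for variable 2 (column margin)
    z1 = std_normal.inv_cdf(1 - p1)      # z with upper-tail area p(1)
    z2 = std_normal.inv_cdf(1 - p2)      # z with upper-tail area p(2)
    ord1 = std_normal.pdf(z1)            # ordinate at z(1)
    ord2 = std_normal.pdf(z2)            # ordinate at z(2)
    return (yy * nn - yn * ny) / (ord1 * ord2 * n ** 2)


value = glass_hopkins_value(yy=110, yn=10, ny=90, nn=190)
print(round(value, 3))  # 0.901, matching the worked example
```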
Reference
Glass GV, Hopkins KD (1996). Statistical Methods in Education and
Psychology (3rd Ed.). Boston, MA: Allyn and Bacon. ISBN: 0-205-14212-5
--
Bruce Weaver
bweaver at (no spam) lakeheadu.ca
www.angelfire.com/wv/bwhomedir
"When all else fails, RTFM."

Ray Koopman (Guest)
Posted: Thu Jun 26, 2008 12:13 pm
I've never seen that particular approximation before. It's not one of
the six that Castellan (Psychometrika, 1966) reports on, and I thought
that since then just about everyone had abandoned approximations in
favor of exact answers (which in this case is .820).

Bruce Weaver (Guest)
Posted: Fri Jun 27, 2008 4:16 am
Ray, the Glass & Hopkins equation I cited is identical to equation 6
in Castellan's paper, but Castellan calls the result "epsilon". And
if I follow, he then plugs that value of epsilon into equation 7, and
solves for the tetrachoric correlation.
How did you obtain the "exact" solution?
Thanks,
Bruce

Ray Koopman (Guest)
Posted: Fri Jun 27, 2008 6:18 am
On Jun 27, 7:16 am, Bruce Weaver <bwea... at (no spam) lakeheadu.ca> wrote:
Quote:
Ray, the Glass & Hopkins equation I cited is identical to equation 6
in Castellan's paper, but Castellan calls the result "epsilon". And
if I follow, he then plugs that value of epsilon into equation 7, and
solves for the tetrachoric correlation.
But Glass & Hopkins didn't solve Castellan's (7) for rtet.
They just gave epsilon and called it rtet!
Quote:
How did you obtain the "exact" solution?
The major (IMHO) contribution of Pearson's 1901 paper was to
show that

  pxy - px*py = integral[0...rtet: f(x,y,r) dr],

where
  pxy = the observed joint proportion of successes,
  px & py are the observed marginal proportions,
  x = F^-1(px), y = F^-1(py),
  F^-1(p) is the univariate standard normal inverse cdf, and
  f(x,y,r) is the bivariate standard normal density.

I solve that equation numerically (using Mathematica), with the
upper bound of integration as the unknown. There are some special
cases that simplify a little, and it turns out to be better to
change r -> sin(t) and integrate with respect to t to eliminate
the division by sqrt(1-r^2) in f, but otherwise it's relatively
straightforward number crunching.
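
A minimal Python sketch of the numerical solution described above; it is not Ray's Mathematica code. It assumes the usual bivariate standard normal density, f(x,y,r) = exp(-(x^2 - 2*r*x*y + y^2) / (2*(1 - r^2))) / (2*pi*sqrt(1 - r^2)). With r = sin(t), dr = cos(t) dt cancels the sqrt(1 - r^2) = cos(t) in the denominator, so the integrand in t is exp(-(x^2 - 2*x*y*sin(t) + y^2) / (2*cos(t)^2)) / (2*pi). The function name `tetrachoric`, the Simpson's-rule integrator, and the bisection search are illustrative choices, not anything specified in the thread.

```python
import math
from statistics import NormalDist


def tetrachoric(yy, yn, ny, nn, steps=200, tol=1e-10):
    """Solve pxy - px*py = integral[0...r: f(x,y,rho) d rho] for r.

    Hypothetical helper for illustration. `steps` must be even (Simpson's
    rule). Raises statistics.StatisticsError if a margin is 0 or N.
    """
    n = yy + yn + ny + nn
    pxy = yy / n                          # observed joint proportion
    px = (yy + yn) / n                    # observed marginal proportions
    py = (yy + ny) / n
    inv_cdf = NormalDist().inv_cdf
    x, y = inv_cdf(px), inv_cdf(py)
    target = pxy - px * py

    def integrand(t):
        # Bivariate normal density at (x, y) with r = sin(t), times dr/dt.
        cos_sq = math.cos(t) ** 2
        quad = x * x - 2.0 * x * y * math.sin(t) + y * y
        return math.exp(-quad / (2.0 * cos_sq)) / (2.0 * math.pi)

    def integral(t_upper):
        # Composite Simpson's rule on [0, t_upper]; also valid for t_upper < 0.
        h = t_upper / steps
        total = integrand(0.0) + integrand(t_upper)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * integrand(i * h)
        return total * h / 3.0

    # The integrand is positive, so the integral increases with its upper
    # bound and bisection on t in (-pi/2, pi/2) brackets the solution.
    lo, hi = -math.pi / 2 + 1e-9, math.pi / 2 - 1e-9
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if integral(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sin((lo + hi) / 2.0)


r = tetrachoric(yy=110, yn=10, ny=90, nn=190)
print(round(r, 3))
```

For the table in the first post this should land close to the .820 quoted earlier in the thread, as against 0.901 from the one-step formula.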

Bruce Weaver (Guest)
Posted: Fri Jun 27, 2008 7:33 am
On Jun 27, 12:18 pm, Ray Koopman <koop... at (no spam) sfu.ca> wrote:
Quote:
But Glass & Hopkins didn't solve Castellan's (7) for rtet.
They just gave epsilon and called it rtet!
Yes, absolutely. That's what I was trying to say above.
Thanks Ray.

All times are GMT - 5 Hours