repeated-measures correlations?...
Esther...
Posted: Tue May 06, 2008 4:29 am
Guest
I have a data set with some 700 observations. It includes (besides
ID, etc.) 20 subjective (opinion) variables, each measured for 4
different companies, for a total of 80 subjective variables.

I am interested in measuring correlations between pairs of these
variables. For example, between 1. Do you think the company's website
is easy to use? and 2. Would you consider working for this company?

I am not interested in differences between companies but in the
overall relationship between the variables; i.e., are respondents more
likely to consider working for companies whose website is easy (for
them) to use?

So one thing to do would be to break up the data set by company and
then line it up vertically so that there are 2800 observations. The
problem with this is that the observations are not independent because
each of the 700 subjects answered for all 4 companies. Is there some
way to address this?
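
The stacking described above can be sketched as follows. This is only an illustration: the column names (web_c1, work_c1, and so on) and the second respondent's values are made up, and the real file has 20 variables per company rather than 2. The point is that the respondent id is carried onto every stacked row, so the dependence among a person's 4 rows can still be handled later.

```python
# Hypothetical wide layout: one record per respondent, one column per
# (variable, company) pair. Column names are assumptions for illustration.
wide_rows = [
    {"id": 1, "web_c1": 5, "web_c2": 4, "web_c3": 2, "web_c4": 1,
     "work_c1": 4, "work_c2": 3, "work_c3": 1, "work_c4": 2},
    {"id": 2, "web_c1": 3, "web_c2": 3, "web_c3": 5, "web_c4": 2,
     "work_c1": 2, "work_c2": 4, "work_c3": 5, "work_c4": 1},
]

N_COMPANIES = 4

# Long layout: one record per (respondent, company), keeping the id so
# that rows from the same person can be grouped again.
long_rows = [
    {
        "id": row["id"],
        "company": c,
        "web": row[f"web_c{c}"],
        "work": row[f"work_c{c}"],
    }
    for row in wide_rows
    for c in range(1, N_COMPANIES + 1)
]

print(len(long_rows))  # 8 rows here; 700 respondents would give 2800
```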

Someone suggested dummy variables. Would I need one per company
(i.e., 4 dummy vars), one per respondent (i.e. 700 dummy vars), some
sort of interactions/compromise between the two? 700 dummy variables
seems a bit unrealistic!

Is there some way of using repeated measures analysis? Is this truly
repeated measures? My impression is that repeated measures is across
time, which is not what I have, and also that repeated measures does
not seem to be used with correlations. Is this true?

Are there other (better, more realistic) options?

Feedback would be appreciated.

Thank you in advance.

Esther
Ray Koopman...
Posted: Tue May 06, 2008 10:59 pm
Guest
On May 6, 7:29 am, Esther <aliza... at (no spam) gmail.com> wrote:
Quote:
I have a data set with some 700 observations. It includes (besides
ID, etc.) 20 subjective (opinion) variables, each measured for 4
different companies, for a total of 80 subjective variables.

I am interested in measuring correlations between pairs of these
variables. For example, between 1. Do you think the company's website
is easy to use? and 2. Would you consider working for this company?

I am not interested in differences between companies but in the
overall relationship between the variables; i.e., are respondents more
likely to consider working for companies whose website is easy (for
them) to use?

That question would be most directly addressed by the pooled
within-person correlation between the two variables. Let Xpc & Ypc
denote the ratings of company c by person p on variables X (website)
& Y (work).
For each person, get Xp and Yp, the means of that person's ratings
on variables X & Y, and subtract the means from the ratings.
Then correlate Xpc-Xp with Ypc-Yp over all 2800 X,Y pairs.
The df for testing the correlation will be 700*(4-1)-1.
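
A minimal sketch of the recipe just described, in plain Python. The ratings are simulated at random purely so the code runs (they are not the poster's data), and the function names pearson and pooled_within_person_r are made up for this illustration.

```python
import math
import random


def pearson(x, y):
    """Ordinary Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pooled_within_person_r(x_by_person, y_by_person):
    """x_by_person[p][c] is person p's rating of company c (same for y).

    Each person's own mean is subtracted from that person's ratings,
    then all deviation pairs are correlated together.
    """
    dx, dy = [], []
    for xs, ys in zip(x_by_person, y_by_person):
        mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
        dx.extend(v - mx for v in xs)
        dy.extend(v - my for v in ys)
    return pearson(dx, dy)


# Simulated 1-5 ratings, only to exercise the function.
random.seed(0)
n_people, n_companies = 700, 4
X = [[random.randint(1, 5) for _ in range(n_companies)] for _ in range(n_people)]
Y = [[random.randint(1, 5) for _ in range(n_companies)] for _ in range(n_people)]

r = pooled_within_person_r(X, Y)
df = n_people * (n_companies - 1) - 1   # one df per person is lost to the means
t = r * math.sqrt(df / (1 - r * r))     # usual t statistic for testing r = 0
print(df)  # 2099
```

The t statistic would be referred to a t distribution with df degrees of freedom.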

In general, different questions will require different analyses.
There is not one "right" analysis that will be appropriate for all
questions.
Ryan...
Posted: Wed May 07, 2008 3:57 am
Guest
Interesting post...If you have the time, I go through a simple example
below, and would appreciate your feedback on the accuracy of my
interpretation.

On May 7, 4:59 am, Ray Koopman <koop... at (no spam) sfu.ca> wrote:
Quote:
On May 6, 7:29 am, Esther <aliza... at (no spam) gmail.com> wrote:

I have a data set with some 700 observations. It includes (besides
ID, etc.) 20 subjective (opinion) variables, each measured for 4
different companies, for a total of 80 subjective variables.

I am interested in measuring correlations between pairs of these
variables. For example, between 1. Do you think the company's website
is easy to use? and 2. Would you consider working for this company?

I am not interested in differences between companies but in the
overall relationship between the variables; i.e., are respondents more
likely to consider working for companies whose website is easy (for
them) to use?

That question would be most directly addressed by the pooled
within-person correlation between the two variables. Let Xpc & Ypc
denote the ratings of company c by person p on variables X (website)
& Y (work).

So I assume the dataset (for the first person) would look something
like this:

Person (P)  Company (C)  Ease of Website Use (Xpc)  Consider Working for Company (Ypc)
1           1            5                          4
1           2            4                          3
1           3            2                          1
1           4            1                          2

Quote:
For each person, get Xp and Yp, the means of that person's ratings
on variables X & Y

Xp = (5+4+2+1)/4 = 3
Yp = (4+3+1+2)/4 = 2.5

Quote:
, and subtract the means from the ratings.

P  C  Xpc  Ypc  Xpc-Xp    Ypc-Yp
1  1  5    4    5-3=2     4-2.5=1.5
1  2  4    3    4-3=1     3-2.5=0.5
1  3  2    1    2-3=-1    1-2.5=-1.5
1  4  1    2    1-3=-2    2-2.5=-0.5

The purpose of the last two steps is to be able to correlate values
based on average ratings of the four companies per person.

Quote:
Then correlate Xpc-Xp with Ypc-Yp over all 2800 X,Y pairs.

rxy = .849

where x = Xpc-Xp and y = Ypc-Yp
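
The value above can be checked with a few lines of Python, using only the four rating pairs from the one-person example. The variable names are chosen just for this sketch.

```python
import math

# Ratings of the four companies by the single example person.
x_raw = [5, 4, 2, 1]   # ease of website use (Xpc)
y_raw = [4, 3, 1, 2]   # consider working for company (Ypc)

x_mean = sum(x_raw) / len(x_raw)   # 3.0
y_mean = sum(y_raw) / len(y_raw)   # 2.5

# Deviations from the person's own means.
x = [v - x_mean for v in x_raw]    # [2.0, 1.0, -1.0, -2.0]
y = [v - y_mean for v in y_raw]    # [1.5, 0.5, -1.5, -0.5]

# The deviations already sum to zero, so the correlation reduces to
# sum(xy) / sqrt(sum(x^2) * sum(y^2)) = 6 / sqrt(10 * 5).
sxy = sum(a * b for a, b in zip(x, y))
sxx = sum(a * a for a in x)
syy = sum(b * b for b in y)
r = sxy / math.sqrt(sxx * syy)

print(round(r, 3))  # 0.849
```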

Quote:
The df for testing the correlation will be 700*(4-1)-1.

so df = total number of participants*(number of companies - 1) - 1

For this example, df = 1*(4-1)-1 = 2

With df=2, r_cv (at alpha = .05, two tailed)= .950

Therefore, we fail to reject the null hypothesis that there is no
correlation in the population between ratings of website ease of use
and ratings of willingness to work for a company.
Ray Koopman...
Posted: Wed May 07, 2008 7:54 am
Guest
On May 7, 6:57 am, Ryan <Ryan.Andrew.Bl... at (no spam) gmail.com> wrote:
Quote:
Interesting post...If you have the time, I go through a simple example
below, and would appreciate your feedback on the accuracy of my
interpretation.

So I assume the dataset (for the first person) would look
something like this:

Person (P)  Company (C)  Ease of Website Use (Xpc)  Consider Working for Company (Ypc)
1           1            5                          4
1           2            4                          3
1           3            2                          1
1           4            1                          2

For each person, get Xp and Yp, the means of that person's
ratings on variables X & Y

Xp = (5+4+2+1)/4 = 3
Yp = (4+3+1+2)/4 = 2.5

, and subtract the means from the ratings.

P  C  Xpc  Ypc  Xpc-Xp    Ypc-Yp
1  1  5    4    5-3=2     4-2.5=1.5
1  2  4    3    4-3=1     3-2.5=0.5
1  3  2    1    2-3=-1    1-2.5=-1.5
1  4  1    2    1-3=-2    2-2.5=-0.5

So far, so good.

Quote:

The purpose of the last two steps is to be able to correlate values
based on average ratings of the four companies per person.

Not exactly. The question implies a within-person effect:
if I like company 1's website more than company 2's, will I
feel more inclined to work for company 1 than for company 2?
The person means are subtracted so that the correlation over all
people and companies will not be affected by person main effects.
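
A small illustration of that last point, with invented ratings for three people. Within every person the two ratings move in exactly opposite directions, but the people differ in how high they rate everything. The helper names are assumptions made for this sketch.

```python
import math


def pearson(x, y):
    """Ordinary Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def center(rows):
    """Subtract each person's own mean from that person's ratings."""
    return [[v - sum(r) / len(r) for v in r] for r in rows]


def flatten(rows):
    return [v for r in rows for v in r]


# Invented data: person means are 2, 3 and 4 on both variables
# (a person main effect), with opposite within-person patterns.
X = [[3, 1, 3, 1], [4, 2, 4, 2], [5, 3, 5, 3]]
Y = [[1, 3, 1, 3], [2, 4, 2, 4], [3, 5, 3, 5]]

raw_r = pearson(flatten(X), flatten(Y))
within_r = pearson(flatten(center(X)), flatten(center(Y)))

print(round(raw_r, 3), round(within_r, 3))  # -0.2 -1.0
```

The stacked correlation without centering is pulled toward zero by the differences between people, while the centered version reflects only how the two ratings vary together across companies within a person.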

Quote:

Then correlate Xpc-Xp with Ypc-Yp over all 2800 X,Y pairs.

rxy = .849

where x = Xpc-Xp and y = Ypc-Yp

The df for testing the correlation will be 700*(4-1)-1.

so df = total number of participants*(number of companies - 1) - 1

For this example, df = 1*(4-1)-1 = 2

With df=2, r_cv (at alpha = .05, two tailed)= .950

Therefore, we fail to reject the null hypothesis that there is
no correlation in the population between ratings of website ease
of use and ratings of willingness to work for a company.

Right. When n = 4 the null distribution of r is Uniform(-1,1),
so the two-tailed p = 1 - |r|.
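
The identity p = 1 - |r| can be checked against the usual t test. With n = 4 the test has 2 df, and the t distribution with 2 df has the closed-form CDF F(t) = 1/2 + t / (2*sqrt(t^2 + 2)), so no statistics library is needed. The function name below is made up for this sketch.

```python
import math


def two_tailed_p_df2(r):
    """Two-tailed p for testing r = 0 when df = 2 (n = 4 pairs)."""
    t = r * math.sqrt(2 / (1 - r * r))          # t = r * sqrt(df / (1 - r^2))
    # From F(t) above: 2 * (1 - F(|t|)) = 1 - |t| / sqrt(t^2 + 2)
    return 1 - abs(t) / math.sqrt(t * t + 2)


r = 6 / math.sqrt(50)        # the .849 from the one-person example
p = two_tailed_p_df2(r)

print(round(p, 3), round(1 - abs(r), 3))  # 0.151 0.151

# Under Uniform(-1, 1), the two-tailed critical value at alpha = .05
# is 1 - alpha, which matches the tabled r_cv of .950 for df = 2.
r_crit = 1 - 0.05
```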

Ryan...
Posted: Wed May 07, 2008 11:58 am
Guest
Great. Thank you.
 