logistic regression question
amorphia
Posted: Fri Feb 01, 2008 6:03 am
Guest
Hi all,

I have an experimental design where subjects make a sequence of simple
binary choices A or B. I would like to test the hypothesis that
initially in the sequence subjects tend to choose A, but this bias
degrades to random (or perhaps a bias to B) as the sequence
progresses.

Initially I thought that maybe I could do a simple binary logistic
regression, with sequence position as the only covariate. But now I
think that this is probably invalid, because it would assume that
choices at sequence position t+1 are independent of choices at
sequence position t. This assumption is plainly false because the
choices are made by the same individuals, who may make runs of the
same choice.

Can I solve this problem by including individual as a factor in the
model perhaps? Or is a more complicated solution necessary?

Thanks!

Ben
mcap
Posted: Fri Feb 01, 2008 6:17 am
Guest
You are looking at how the choices change as the sequence progresses.
In binary logistic regression, you are actually specifying one choice
in the sequence (perhaps the last) as your outcome. Is that what you
want? It seems that you want to see the influence of early choices on
several later choices.

Marc
Bruce Weaver
Posted: Fri Feb 01, 2008 8:51 am
Guest
How many choices are subjects making? If it is a large enough number,
a relatively easy way to see graphically if your hypothesis is
supported would be to create convenient sized bins, and plot the
proportion of A-responses in each bin. Depending on how that looks,
maybe something as simple as repeated measures ANOVA (with trend
analysis) would work.
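
A minimal sketch of the binning idea, assuming each subject's choices are stored as a string of 'A' and 'B' characters (that encoding and the function name are assumptions for illustration, not something from the thread). It returns the proportion of A-responses in consecutive bins for one subject; the per-bin values could then be averaged over subjects and plotted.

```python
def binned_proportions(seq, bin_size):
    """Proportion of 'A' choices in consecutive bins of one subject's sequence.

    seq      -- string (or list) of 'A'/'B' choices in trial order (assumed encoding)
    bin_size -- number of trials per bin; a shorter final bin is kept as-is
    """
    if bin_size < 1:
        raise ValueError("bin_size must be at least 1")
    props = []
    for start in range(0, len(seq), bin_size):
        chunk = seq[start:start + bin_size]
        # count() works for both strings and lists
        props.append(chunk.count("A") / len(chunk))
    return props
```

With sequences of different lengths, later bins are based on fewer subjects, so the number of subjects contributing to each bin is worth reporting alongside the plot.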

--
Bruce Weaver
bweaver@lakeheadu.ca
www.angelfire.com/wv/bwhomedir
"When all else fails, RTFM."
Ray Koopman
Posted: Fri Feb 01, 2008 9:14 am
Guest
On Feb 1, 8:03 am, amorphia <spam.onto...@gmail.com> wrote:
Quote:
Can I solve this problem by including individual as a factor in the
model perhaps? Or is a more complicated solution necessary?

Yes, making the additive constant in the logistic equation
person-specific, so that it becomes 'a_i' instead of just 'a',
would be one way to attack the problem.
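
Written out, the person-specific version might look like the following. The slope b, the position index t, and the outcome symbol y_{it} are notation assumed here for illustration; only a_i comes from the post.

```latex
\operatorname{logit} \Pr(y_{it} = \mathrm{A}) = a_i + b\,t,
\qquad
\Pr(y_{it} = \mathrm{A}) = \frac{1}{1 + e^{-(a_i + b\,t)}}
```

In this notation, the hypothesis that the initial bias toward A fades along the sequence corresponds to b < 0, with each a_i absorbing that person's overall tendency to choose A.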

Another approach would be to use Cochran's Q test -- don't omit
the df-adjustment for non-sphericity -- with pairwise McNemar
tests on the positions.
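
For reference, a rough sketch of the basic Cochran's Q statistic, assuming a complete subjects-by-positions table coded 1 for A and 0 for B (so every subject must have the same number of positions). The function name is hypothetical, and the df-adjustment mentioned above is not implemented here; this computes only the unadjusted statistic, which is conventionally referred to a chi-square distribution with k-1 degrees of freedom.

```python
def cochran_q(rows):
    """Unadjusted Cochran's Q for a table of 0/1 values (1 = chose A).

    rows -- list of per-subject lists, one 0/1 entry per sequence position
    """
    k = len(rows[0])  # number of positions
    if any(len(r) != k for r in rows):
        raise ValueError("every subject needs the same number of positions")
    col_totals = [sum(r[j] for r in rows) for j in range(k)]
    row_totals = [sum(r) for r in rows]
    grand = sum(row_totals)
    # Denominator equals the sum over subjects of R_i * (k - R_i)
    denom = k * grand - sum(t * t for t in row_totals)
    if denom == 0:
        raise ValueError("no within-subject variation; Q is undefined")
    numer = (k - 1) * (k * sum(c * c for c in col_totals) - grand * grand)
    return numer / denom
```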
amorphia
Posted: Mon Feb 04, 2008 6:00 am
Guest
Thanks for the ideas, folks.

What I have ended up doing is dividing each individual's sequence into
a first half and a second half, calculating the proportion of A
responses in each half, and then just doing a paired comparison test
over all the individuals (you can do a paired t-test but I prefer
Fisher's paired comparison randomisation test).
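
A sketch of how this half-split comparison could be coded, assuming each sequence is a string of 'A'/'B' with at least two choices (the encoding and function names are assumptions for illustration). The randomisation test here enumerates every sign flip of the paired differences, which is only practical for a modest number of individuals; a random sample of sign flips would be the usual substitute for larger groups.

```python
from itertools import product


def half_proportions(seq):
    """Proportion of 'A' in the first and second half of one sequence.

    For odd lengths the middle choice goes to the second half (an arbitrary choice).
    """
    mid = len(seq) // 2
    first, second = seq[:mid], seq[mid:]
    return first.count("A") / len(first), second.count("A") / len(second)


def paired_randomisation_p(diffs):
    """Exact two-sided p-value for paired differences under random sign flips."""
    observed = abs(sum(diffs))
    n_extreme = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total = sum(s * d for s, d in zip(signs, diffs))
        # small tolerance so floating-point ties count as "at least as extreme"
        if abs(total) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / 2 ** len(diffs)
```

The differences passed to `paired_randomisation_p` would be first-half minus second-half proportions, one per individual.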

The only thing I don't like about this is that one can claim it is
somewhat arbitrary to cut the sequences in half, rather than divide
them in any other way. But the result is so strongly significant that
it is difficult to argue I am fiddling anything.

One thing I didn't mention before, which complicates things and made
some of the suggestions (like the Cochran test) impossible, is that
not all individuals perform the same number of actions, so the
sequences are of different lengths. Actually, while experimenting with
logistic regression, I found that variable sequence lengths had a
rather undesirable effect:

If you generate random sequences of varying lengths for individuals,
with a different but constant p(A) = 1 - p(B) for each individual, and
then fit the logistic regression model outcome = individual + sequence
position, you get a significant effect of sequence position (i.e.
p < 0.05) far more often than 5% of the time. Something is clearly
wrong! It seems to be because the ends of long sequences have too much
influence on the model. Say p(A) for the longest sequence is 0.1 but
the mean p(A) earlier on is higher: the mean p(A) will then drop over
the sequence, but only because higher-p(A) individuals are dropping
out. This shouldn't result in a significant effect of sequence
position, but it does!
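
The dropout mechanism described above can be illustrated without any random numbers. The sketch below does not fit a logistic regression and so does not reproduce the inflated error rate itself; it only shows the pooled mean p(A) at each position falling even though every individual's p(A) is constant. The (p(A), length) pairs in the usage note are invented for illustration.

```python
def pooled_p_by_position(subjects):
    """Mean p(A) over the individuals still active at each sequence position.

    subjects -- list of (p_a, length) pairs; p_a is that individual's constant
                probability of choosing A, length is how many choices they make
    """
    longest = max(length for _, length in subjects)
    pooled = []
    for pos in range(1, longest + 1):
        # only individuals whose sequence reaches this position contribute
        active = [p for p, length in subjects if length >= pos]
        pooled.append(sum(active) / len(active))
    return pooled
```

For example, `pooled_p_by_position([(0.9, 2), (0.5, 4), (0.1, 6)])` starts near 0.5 and ends at 0.1, purely because the individuals with higher p(A) stop earlier.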

No doubt I have misunderstood something badly!

Cheers,

Ben
Ray Koopman
Posted: Mon Feb 04, 2008 9:21 am
Guest
On Feb 4, 8:00 am, amorphia <spam.onto...@gmail.com> wrote:
Quote:
[...]
One thing I didn't mention before, which complicates things and
made some of the suggestions (like the Cochran test) impossible,
is that not all individuals perform the same number of actions,
so the sequences are of different lengths.
[...]

What was the termination rule for each subject?
Bruce Weaver
Posted: Mon Feb 04, 2008 10:34 am
Guest
On Feb 1, 1:51 pm, Bruce Weaver <bwea...@lakeheadu.ca> wrote:

Quote:
How many choices are subjects making?

The OP has now said that the number of choices varies by subject.

Quote:
If it is a large enough number,
a relatively easy way to see graphically if your hypothesis is
supported would be to create convenient sized bins, and plot the
proportion of A-responses in each bin. Depending on how that looks,
maybe something as simple as repeated measures ANOVA (with trend
analysis) would work.


Here's another way to plot the data that does not require binning.
Let X = trial number, and let Y = the cumulative number of times A has
been chosen. If the OP's hypothesis is correct, the slope should be
close to 1 early on, and less steep (or even flat) later on.
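
A small sketch of the Y values for this plot, assuming a subject's choices are stored as a string of 'A' and 'B' (an assumed encoding; the function name is hypothetical). X is simply the trial number 1, 2, 3, ...

```python
from itertools import accumulate


def cumulative_a(seq):
    """Running count of 'A' choices after each trial, in trial order."""
    return list(accumulate(1 if choice == "A" else 0 for choice in seq))
```

Because each subject gets their own curve, sequences of different lengths need no special handling.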

--
Bruce Weaver
bweaver@lakeheadu.ca
www.angelfire.com/wv/bwhomedir
"When all else fails, RTFM."
 
All times are GMT - 5 Hours