Jerry Dallal wrote:
mcam54@hotmail.com wrote:
pbrewster@hotmail.com wrote:
If I have an interaction term in my regression model that is
statistically significant, but one or both of the original
attributes are not statistically significant, should I remove
original attributes that are not statistically significant from
the model?
For example, I create an interaction C=A*B. In my regression
model, C is statistically significant but A and B are not. Should
I leave them in the model or remove them?
Everyone is giving very statistical answers (which I guess makes
sense for a stats newsgroup).
My question is: can you explain this interaction clinically? Is it
plausible? Can you avoid overfit? If not, you may have grounds
for dropping the individual predictor terms and the interaction
term.
Marc
Some would argue that interpretation is the ONLY thing that matters!
But interpretation demands knowing the context. I haven't seen
anything that says that the problem is not one where all factors are
of the simple 2-level type, where a significant interaction term with
non-significant individual factors (given the inclusion of the
interaction term) has the interpretation that there is no effect
unless both "treatments" are present.
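To make the 2-level case concrete, here is a minimal sketch. It assumes 0/1 (dummy) coding of the two factors, and the coefficient values are made up purely for illustration; nothing here comes from the OP's data. With other codings (e.g. -1/+1) the individual coefficients mean something different.

```python
def predict(a, b, b0=10.0, b_a=0.0, b_b=0.0, b_ab=5.0):
    """Prediction from y = b0 + b_a*A + b_b*B + b_ab*A*B.

    A and B are assumed to be coded 0 (absent) or 1 (present).
    The default coefficients are hypothetical: zero main effects
    and a non-zero interaction.
    """
    return b0 + b_a * a + b_b * b + b_ab * a * b


# Predicted mean for each of the four treatment combinations.
cells = {(a, b): predict(a, b) for a in (0, 1) for b in (0, 1)}

for (a, b), mean in sorted(cells.items()):
    print(f"A={a} B={b} -> predicted mean {mean}")
```

Under these assumptions three of the four cells share the baseline mean, and only the cell with both treatments present differs from it.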
If the context is regression with continuous explanatory variables,
then the OP may gain some understanding by drawing contours of the
model-prediction as a function of the explanatory variables. That is,
look at the shapes of the functions being allowed into the prediction.
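As a rough sketch of that idea, the snippet below tabulates the prediction surface on a small grid, which is the raw material for a contour plot. The coefficients and the grid are hypothetical, chosen only to show the shape that an interaction-only model allows.

```python
def predict(x, y, b0=0.0, b_x=0.0, b_y=0.0, b_xy=1.0):
    """Prediction from y_hat = b0 + b_x*x + b_y*y + b_xy*x*y.

    The default coefficients are illustrative assumptions:
    no main effects, interaction term only.
    """
    return b0 + b_x * x + b_y * y + b_xy * x * y


xs = [-2, -1, 0, 1, 2]
ys = [2, 1, 0, -1, -2]  # listed top-down so the table reads like a plot

# grid[i][j] is the prediction at (xs[j], ys[i]).
grid = [[predict(x, y) for x in xs] for y in ys]

for y, row in zip(ys, grid):
    print(f"y={y:+d} | " + " ".join(f"{v:+5.1f}" for v in row))
```

With only the X*Y term, the surface is a saddle: the prediction is constant along both axes and the contours are the curves x*y = constant. Changing b_x and b_y in the sketch shows how the main-effect terms shift and tilt that shape.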
An interesting question, looking in the opposite direction to the one
discussed on other threads, is: if an interaction term X*Y is included
in the model, should all the second-order terms (i.e. X^2 and Y^2) be
included also?
To try to answer the other parts of the later question: "Can you
avoid overfit? If not, you may have grounds for dropping the
individual predictor terms and the interaction term." This depends a
lot on what the OP is trying to use the regression for. There are (at
least) two possibilities:
(a) the final model is to be treated as deciding whether certain
effects really should be treated as present, in comparison to a
simpler model where the effects are absent. In this case "avoiding
overfit" is broadly equivalent to the significance tests already being
done.
(b) the final model is to be treated as a mechanism for creating
"predicted values" for future instances. For this, the presence or
absence of certain terms in the regression is irrelevant except in so
far as this affects the likely error in the prediction. In this case
"avoiding overfit" might be undertaken by using a criterion such as
FPE (Final Prediction Error), AIC, etc. These criteria don't
necessarily avoid the question of whether to force in individual terms
if higher-order terms seem necessary.
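For case (b), a minimal sketch of such a comparison follows. Everything in it is an assumption for illustration: the data are simulated, the helper names are made up, k counts regression coefficients only (one common convention), and the AIC shown is the Gaussian-error form up to an additive constant.

```python
import math
import random


def ols_rss(rows, y):
    """Fit by least squares (normal equations); return the residual sum of squares."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(k):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    return sum(
        (t - sum(b * v for b, v in zip(beta, r))) ** 2 for r, t in zip(rows, y)
    )


def aic(rss, n, k):
    """AIC for a Gaussian linear model, up to an additive constant."""
    return n * math.log(rss / n) + 2 * k


def fpe(rss, n, k):
    """Akaike's Final Prediction Error."""
    return (rss / n) * (n + k) / (n - k)


# Simulated data: only the interaction matters (hypothetical setup).
rng = random.Random(0)
points = [(a, b) for a in range(5) for b in range(5)]
y = [1.0 + 2.0 * a * b + rng.gauss(0.0, 1.0) for a, b in points]
n = len(y)

small = [[1.0, a * b] for a, b in points]        # intercept + A*B
full = [[1.0, a, b, a * b] for a, b in points]   # intercept + A + B + A*B

rss_small = ols_rss(small, y)
rss_full = ols_rss(full, y)

for name, rss, k in (("A*B only", rss_small, 2), ("A + B + A*B", rss_full, 4)):
    print(f"{name:12s} AIC={aic(rss, n, k):8.3f} FPE={fpe(rss, n, k):6.3f}")
```

The larger model always has the smaller residual sum of squares, so the criteria only prefer it when the reduction outweighs the penalty for the two extra coefficients. Whether A and B should be forced in regardless remains a separate judgement.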
David Jones