Science Forum Index » Statistics - Math Forum » multivariate linear regression

Mike
Posted: Fri Mar 28, 2008 5:35 pm
Hi
I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.
I use
s = load('All.dat');
% fit y (the 7th column) as a function of the first five columns
[b,bint,r,rint,stats] = regress(s(:,7), [ones(size(s,1),1) s(:,1:5)], 0.05)
The five variables are on different scales.
One may range from 1 to 2.
Another may only range from 0 to 0.1.
Can the b coefficient represent the sensitivity of y to x?
Do I need to normalize x?
Mike

Richard Ulrich
Posted: Sat Mar 29, 2008 9:08 pm
On Fri, 28 Mar 2008 20:35:56 -0700 (PDT), Mike <SulfateIon@gmail.com>
wrote:
Quote: Hi
I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.
- Not very readily.
If the predictors are not correlated with each other,
you can look at the univariate numbers, e.g., correlations.
If they are correlated, then you have all the potential
logical problems of trying to ignore or credit whatever
variance is shared.
Quote: [snip code]
The five variables are on different scales.
One may range from 1 to 2.
Another may only range from 0 to 0.1.
Can the b coefficient represent the sensitivity of y to x?
Do I need to normalize x?
The b coefficient definitely reflects the scale of the
variable. The (standardized) beta coefficient, provided
by every computer package I've ever used, incorporates
a correction for standard deviations.
One equation of little actual use relates R^2 to the products of
the zero-order correlations r_i and the standardized betas beta_i,
R^2 = sum(r_i * beta_i) .
The main drawback of this equation is that terms
are not always positive in sign. So, it measures
some aspect of sensitivity, but it does not tell you
everything you want to know about confounding --
a negative term proves that confounding does exist.
If the r's are near in value to the betas, then the
variables are largely independent.
Do keep in mind that any assessment of correlations
is (strongly) conditioned by the sample that was used.
--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html
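
The identity R^2 = sum(r_i * beta_i) mentioned in the post above can be checked numerically. The following is only a minimal sketch on synthetic data, not anything the posters ran: the variable names, the seed and the coefficients are all assumptions made for the illustration, and it relies on regress, zscore and corr from the Statistics Toolbox.

```matlab
% Sketch only: synthetic data, hypothetical names and numbers.
rng(1);                               % fixed seed for repeatability
n  = 200;
x1 = randn(n,1);
x2 = 0.6*x1 + 0.8*randn(n,1);         % x2 is deliberately correlated with x1
y  = 2*x1 - 0.5*x2 + randn(n,1);
X  = [x1 x2];

% R^2 from the ordinary (raw) fit with an intercept column.
[b,bint,res,rint,stats] = regress(y, [ones(n,1) X]);
R2 = stats(1);

% Standardized coefficients: fit on z-scored variables and drop the
% intercept, which is zero up to rounding.
bz   = regress(zscore(y), [ones(n,1) zscore(X)]);
beta = bz(2:end);

% Zero-order correlation of each predictor with y (p-by-1).
r = corr(X, y);

terms = r .* beta;                    % one term per predictor
fprintf('R^2 = %.4f, sum(r.*beta) = %.4f\n', R2, sum(terms));
disp(terms');                         % individual terms; any of them may be negative
```

If the predictors were uncorrelated, each beta_i would equal r_i and every term would simply be r_i^2. With correlated predictors the terms still sum to R^2, but an individual term can be negative, which is the drawback described above.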

Mike
Posted: Sun Mar 30, 2008 2:03 pm
On Mar 30, 10:08 am, Richard Ulrich <Rich.Ulr...@comcast.net> wrote:
Quote: [snip]
The b coefficient definitely reflects the scale of the
variable. The (standardized) beta coefficient, provided
by every computer package I've ever used, incorporates
a correction for standard deviations.
I have never come across beta. What is it?
I checked some statistics books. I looked in the index for
"standardized xxx", and didn't find the standardized beta coefficient
you mention. What is its formal name?
Thank you for your reply.
Mike

Richard Ulrich
Posted: Sun Mar 30, 2008 9:13 pm
On Sun, 30 Mar 2008 17:03:33 -0700 (PDT), Mike <SulfateIon@gmail.com>
wrote:
Quote: [snip]
RU:
The b coefficient definitely reflects the scale of the
variable. The (standardized) beta coefficient, provided
by every computer package I've ever used, incorporates
a correction for standard deviations.
Mike:
I have never come across beta. What is it?
I checked some statistics books. I looked in the index for
"standardized xxx", and didn't find the standardized beta coefficient
you mention. What is its formal name?
I've never noticed it missing from textbooks
discussing regression. The usual terminology uses
b for the raw coefficient, and beta for the standardized.
As I said, it has been provided by all the packages
I've used.
Anyway -- Google gives me more than 17,000 hits
for < "standardized regression coefficient" beta > .
Google-groups has 15 hits, most of them in the
sci.stat.* groups, and some of them from me.
[snip, rest]
--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html

Mike
Posted: Mon Mar 31, 2008 8:13 pm
I've read the MATLAB help for the "regress" function I used.
It says the following:
[b,bint,r,rint,stats] = regress(y,X) returns an estimate of beta in b, a
95% confidence interval for beta in the p-by-2 vector bint. The residuals
are returned in r and a 95% confidence interval for each residual is
returned in the n-by-2 vector rint. The vector stats contains the R^2
statistic along with the F and p values for the regression.
[b,bint,r,rint,stats] = regress(y,X,alpha) gives 100(1 - alpha)%
confidence intervals for bint and rint. For example, alpha = 0.2 gives
80% confidence intervals.
So the b printed above actually is beta.
But back to the question I asked at first.
Quote: I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.
You have suggested using correlation.
Why can't I use the regression coefficient as the sensitivity to some
variable?
If the regression equation is
Y = b0 + b1*x1 + b2*x2
then partialY/partialx1 gives the sensitivity of Y to x1, which
equals b1, doesn't it?
Mike

Paul Rubin
Posted: Tue Apr 01, 2008 9:05 am
Quote: [snip]
The b printed in the above actually is beta.
If Matlab will not produce the standardized regression coefficients
directly, you can get them by first standardizing each of the variables
(including the dependent variable), then regressing the standardized
dependent variable on the standardized predictors. The coefficient
estimates from that regression will be the standardized betas. The
interpretation of a standardized beta is that a one standard deviation
change in the predictor relates to a ___ standard deviation change in
the conditional mean of the dependent variable, other predictors held
constant.
/Paul
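
The standardize-then-regress recipe in the post above can be sketched in MATLAB as follows. This is an illustration under stated assumptions rather than the original poster's analysis: the data are a synthetic stand-in for All.dat, all names and numbers are hypothetical, and zscore and regress come from the Statistics Toolbox.

```matlab
% Sketch only: synthetic stand-in for All.dat (five predictors on
% different scales). All names and numbers are hypothetical.
rng(2);
n = 100;
X = [1 + rand(n,1), 0.1*rand(n,1), randn(n,3)];
y = 3*X(:,1) + 20*X(:,2) + X(:,3) + 0.5*randn(n,1);

% Raw coefficients, as in the original regress call (b(1) is the intercept).
b = regress(y, [ones(n,1) X]);

% Standardize every variable, then regress. The intercept column is kept
% so regress sees a constant term; its estimate is zero up to rounding.
bz   = regress(zscore(y), [ones(n,1) zscore(X)]);
beta = bz(2:end);

% Equivalent shortcut: rescale each raw slope by sd(x_i)/sd(y).
beta_rescaled = b(2:end) .* std(X)' ./ std(y);

% Columns: raw b, standardized beta, rescaled b (should match beta).
disp([b(2:end) beta beta_rescaled]);
```

The last two columns should agree to rounding error, which shows that the standardized beta is just the raw b expressed in standard-deviation units instead of the original units of x and y.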

Guest
Posted: Tue Apr 01, 2008 9:33 am
On Mon, 31 Mar 2008 23:13:25 -0700 (PDT), Mike <SulfateIon@gmail.com>
wrote:
Quote: [snip]
I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.
You have suggested me using correlation.
Why cannot I use regression coefficient as the sensitivity of some
variable?
If the regression equation is
Y=b0+b1x1+b2x2
Then partialY/partialx1 goes to the sensitivity of Y to x1, which
equals to b1, doesn't it?
Mike
If y and x have meaningful cardinal measures, then using the (raw)
regression coefficients is exactly the right thing to do if you have a
good understanding of what are meaningful changes in the x variables.
Suppose y is yield of wheat per acre and one x is irrigation in
gallons per acre and another is fertilizer in pounds per acre. Then
the regression coefficients tell you the effect of using water versus
fertilizer. But if you want to compare "sensitivity," you have to take
a stand on how much would be a big change in water versus a big change
in fertilizer.
-Dick Startz
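
To make the wheat example above concrete, here is a small sketch of "taking a stand" on what counts as a big change in each input. Every number in it is invented for illustration; none of the coefficients or changes comes from the thread or from real data.

```matlab
% All numbers are made up for illustration.
b_water = 0.002;     % bushels per acre per gallon of irrigation (hypothetical)
b_fert  = 0.15;      % bushels per acre per pound of fertilizer (hypothetical)

% "Taking a stand": what counts as a big change in each input?
d_water = 5000;      % gallons per acre (assumed)
d_fert  = 40;        % pounds per acre (assumed)

% Expected change in yield for each "big" change, in bushels per acre.
effect_water = b_water * d_water;
effect_fert  = b_fert  * d_fert;

fprintf('water: %.1f, fertilizer: %.1f bushels per acre\n', ...
        effect_water, effect_fert);
```

Choosing one standard deviation of each x as the "big change", and dividing the result by the standard deviation of y, reproduces the standardized beta. The raw-coefficient approach differs only in letting subject knowledge pick the change instead of the sample spread.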

Richard Ulrich
Posted: Tue Apr 01, 2008 8:16 pm
On Mon, 31 Mar 2008 23:13:25 -0700 (PDT), Mike <SulfateIon@gmail.com>
wrote:
Quote: [snip, a bunch]
But back to the question I ask at first.
I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.
Oh, you can present it if you want to, but you should
expect to be justly criticized for it. See my first
comments. Multiple regression will do fine if the variables
are suitably uncorrelated; and in that case you can use the
zero-order correlation or regression to see the same numbers.
If they are correlated, interpretations are problematic,
for various reasons, both numeric and logical.
Quote: You have suggested me using correlation.
Why cannot I use regression coefficient as the sensitivity of some
variable?
- see above -
Quote: If the regression equation is
Y=b0+b1x1+b2x2
Then partialY/partialx1 goes to the sensitivity of Y to x1, which
equals to b1, doesn't it?
I don't follow this.
--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html

Guest
Posted: Tue Apr 01, 2008 9:03 pm
On Tue, 01 Apr 2008 21:16:33 -0400, Richard Ulrich
<Rich.Ulrich@comcast.net> wrote:
Quote: [snip]
Mike:
If the regression equation is
Y=b0+b1x1+b2x2
Then partialY/partialx1 goes to the sensitivity of Y to x1, which
equals to b1, doesn't it?
RU:
I don't follow this.
Rich:
The OP is saying that in a linear regression the coefficients can be
interpreted as estimates of the partial derivatives. In a situation
where the regression gives you unbiased estimates of the data
generating process, this is exactly the right interpretation. What's
more, it's true whether or not the right-hand side variables are
correlated.
Of course, if the rhs variables were correlated when the data was
generated it's wise to ask how one variable is going to change later
while the others don't.
-Dick Startz
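
The claim above, that the coefficients estimate the partial derivatives even when the right-hand side variables are correlated, can be illustrated with a simulation in which the data generating process is known. This is a sketch under assumed values (the seed, sample size and true coefficients are all hypothetical), and it presumes the model is correctly specified.

```matlab
% Sketch only: synthetic data with a known data-generating process.
rng(3);
n  = 5000;
x1 = randn(n,1);
x2 = 0.8*x1 + 0.6*randn(n,1);          % correlation with x1 is about 0.8
y  = 1 + 2*x1 + 3*x2 + randn(n,1);     % true partials: dY/dx1 = 2, dY/dx2 = 3

% Fit with an intercept column; b(2) and b(3) estimate the two partials.
b = regress(y, [ones(n,1) x1 x2]);
fprintf('b1 = %.3f (true 2), b2 = %.3f (true 3)\n', b(2), b(3));
```

Correlation between x1 and x2 widens the confidence intervals in bint but does not bias the estimates, provided both variables are in the model. Whether x1 can realistically move while x2 stays fixed is the separate question raised in the last paragraph of the post.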

Mike
Posted: Wed Apr 02, 2008 2:04 pm
I'd like to ask in a new post about sensitivity, which I am working
on right now. Thank you all for your suggestions.
Mike

Greg Heath
Posted: Tue Apr 08, 2008 4:20 pm
On Mar 28, 11:35 pm, Mike <Sulfate...@gmail.com> wrote:
Quote: Hi
I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.
I think you mean multiple regression. Multivariate implies multiple
outputs as well as multiple inputs.
The sensitivity of y to x1 depends on what other predictors
that are correlated with x1 are used in the model. I don't find
it useful unless either NONE of the others are in the model or
ALL of the other predictors are in the model.
Hope this helps.
Greg
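
The point above, that the sensitivity of y to x1 depends on which correlated predictors are in the model, can be seen by fitting the same synthetic data with and without x2. This is a hedged sketch: the data, seed and coefficients are assumptions made for the example.

```matlab
% Sketch only: synthetic data, hypothetical names and numbers.
rng(4);
n  = 500;
x1 = randn(n,1);
x2 = 0.7*x1 + randn(n,1);              % correlated with x1
y  = 1 + 2*x1 + 3*x2 + randn(n,1);

b_full  = regress(y,  [ones(n,1) x1 x2]);   % x1 and x2 both in the model
b_short = regress(y,  [ones(n,1) x1]);      % x2 left out
d       = regress(x2, [ones(n,1) x1]);      % d(2) is the slope of x2 on x1

% When x2 is omitted, the x1 coefficient absorbs part of the x2 effect:
% b_short(2) = b_full(2) + b_full(3)*d(2), exactly, in the same sample.
fprintf('x1 coefficient: full model %.3f, x1 only %.3f\n', ...
        b_full(2), b_short(2));
```

The two x1 coefficients answer different questions. The full-model value is the change in y per unit of x1 with x2 held fixed, while the x1-only value also includes whatever change in x2 tends to accompany a change in x1.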
All times are GMT - 5 Hours