multivariate linear regression
Mike
Posted: Fri Mar 28, 2008 5:35 pm
Guest
Hi

I am wondering whether one can use multivariate linear regression to
study the sensitivity of y to several x variables.
I use
s = load('All.dat');
% fit y (the 7th column) as a function of the first five columns
[b,bint,r,rint,stats] = regress(s(:,7), [ones(size(s,1),1) s(:,1:5)], 0.05)

The five variables are on different scales.
One may range from 1 to 2.
Another may only range from 0 to 0.1.
Can the b coefficients represent the sensitivity of y to x?
Do I need to normalize x?

Mike
Richard Ulrich
Posted: Sat Mar 29, 2008 9:08 pm
Guest
On Fri, 28 Mar 2008 20:35:56 -0700 (PDT), Mike <SulfateIon@gmail.com>
wrote:

Quote:
Hi

I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.

- Not very readily.

If the predictors are not correlated with each other,
you can look at the univariate numbers, e.g., correlations.

If they are correlated, then you have all the potential
logical problems of trying to ignore or credit whatever
variance is shared.

Quote:
The five variables are different scale.
One may be from 1 to 2.
The other may only range from 0. to 0.1.
Can the b coefficient represent the sensitivity of y to x?
Do I need to normalize x?

The b coefficient definitely reflects the scale of the
variable. The (standardized) beta coefficient, provided
by every computer package I've ever used, incorporates
a correction for standard deviations.

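One equation, of little practical use, sums the products of
the r_i and the beta_i,
R^2 = sum(r_i * beta_i),
where r_i is the zero-order correlation of predictor i with y
and beta_i is its standardized coefficient.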

The main drawback of this equation is that terms
are not always positive in sign. So, it measures
some aspect of sensitivity, but it does not tell you
everything you want to know about confounding --
a negative term proves that confounding does exist.
If the r's are near in value to the betas, then the
variables are largely independent.

Do keep in mind that any assessment of correlations
is (strongly) conditioned by the sample that was used.

--
Rich Ulrich

http://www.pitt.edu/~wpilib/index.html
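
The identity R^2 = sum(r_i * beta_i) mentioned in the post above can be checked numerically. What follows is only a minimal sketch on made-up data: the variables x1, x2 and y and the helper zs are hypothetical, and the backslash operator is used in place of regress so that it runs without the Statistics Toolbox.

```matlab
% Synthetic data (hypothetical, for illustration only): two correlated
% predictors on different scales and a response with a small wiggle.
n  = 20;
t  = (1:n)';
x1 = 1 + t/n;                        % roughly 1 to 2
x2 = 0.05 + 0.04*sin(t) + 0.01*x1;   % roughly 0 to 0.1
y  = 2 + 3*x1 - 10*x2 + 0.1*cos(3*t);

% Standardize each variable to zero mean and unit standard deviation.
zs = @(v) (v - mean(v)) / std(v);
Z  = [zs(x1) zs(x2)];
zy = zs(y);

% Standardized betas: least squares on the standardized variables.
% No intercept column is needed because every variable has mean zero.
beta = Z \ zy;

% Zero-order correlations of each predictor with y.
r = Z' * zy / (n - 1);

% R^2 of the fit, computed from the residuals.
resid = zy - Z*beta;
R2 = 1 - sum(resid.^2) / sum(zy.^2);

% The two quantities should agree up to rounding error.
fprintf('R^2 = %.6f, sum(r .* beta) = %.6f\n', R2, sum(r .* beta));
```

Printing r .* beta term by term shows whether any single term is negative, which is the sign of confounding discussed above.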
Mike
Posted: Sun Mar 30, 2008 2:03 pm
Guest
On Mar 30, 10:08 am, Richard Ulrich <Rich.Ulr...@comcast.net> wrote:
Quote:
The b coefficient definitely reflects the scale of the
variable.  The (standardized) beta coefficient, provided
by every computer package I've ever used, incorporates
a correction for standard deviations.


I have never come across beta. What is it?
I checked some statistics books. I looked in the index for
"standardized ..." and didn't find the standardized beta coefficient
you mention. What is its formal name?
Thank you for your reply.

Mike

Richard Ulrich
Posted: Sun Mar 30, 2008 9:13 pm
Guest
On Sun, 30 Mar 2008 17:03:33 -0700 (PDT), Mike <SulfateIon@gmail.com>
wrote:

Quote:
[snip]

RU:
The b coefficient definitely reflects the scale of the
variable. The (standardized) beta coefficient, provided
by every computer package I've ever used, incorporates
a correction for standard deviations.

Mike:
I never know beta. What is that?
I check some statistics book. I look at the index and search
standardized xxx, and didn't find standardized beta coefficient like
you mention. What is its formal name?

I've never noticed it missing from textbooks
discussing regression. The usual terminology uses
b for the raw coefficient, and beta for the standardized.
As I said, it has been provided by all the packages
I've used.

Anyway -- Google gives me more than 17,000 hits
for < "standardized regression coefficient" beta > .

Google-groups has 15 hits, most of them in the
sci.stat.* groups, and some of them from me.

[snip, rest]

--
Rich Ulrich

http://www.pitt.edu/~wpilib/index.html
Mike
Posted: Mon Mar 31, 2008 8:13 pm
Guest
I've read the MATLAB help for the "regress" function I used.
Here is what it says:

[b,bint,r,rint,stats] = regress(y,X) returns an estimate of β in b, a
95% confidence interval for β in the p-by-2 vector bint. The residuals
are returned in r and a 95% confidence interval for each residual is
returned in the n-by-2 vector rint. The vector stats contains the R^2
statistic along with the F and p values for the regression.

[b,bint,r,rint,stats] = regress(y,X,alpha) gives 100(1 - alpha)%
confidence intervals for bint and rint. For example, alpha = 0.2 gives
80% confidence intervals.

So the b printed above actually is beta.

But back to the question I asked at first.
Quote:
I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.
You suggested using correlation.

Why can't I use a regression coefficient as the sensitivity of y to a
variable?
If the regression equation is
Y = b0 + b1*x1 + b2*x2
then partialY/partialx1 gives the sensitivity of Y to x1, which
equals b1, doesn't it?

Mike
Paul Rubin
Posted: Tue Apr 01, 2008 9:05 am
Guest
Quote:

The b printed in the above actually is beta.


If Matlab will not produce the standardized regression coefficients
directly, you can get them by first standardizing each of the variables
(including the dependent variable), then regressing the standardized
dependent variable on the standardized predictors. The coefficient
estimates from that regression will be the standardized betas. The
interpretation of a standardized beta is that a one standard deviation
change in the predictor relates to a ___ standard deviation change in
the conditional mean of the dependent variable, other predictors held
constant.

/Paul
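
The standardize-then-regress recipe in the post above might look like the sketch below. It is not taken from the thread: the data are a deterministic, hypothetical stand-in for All.dat, and the backslash operator replaces regress, so the sketch computes ordinary least-squares coefficients without needing the Statistics Toolbox. It also shows the equivalent shortcut of rescaling each raw slope by sd(x_j)/sd(y).

```matlab
% Hypothetical stand-in for the data in All.dat: five predictors on
% different scales and one response, built without random numbers.
n = 30;
t = (1:n)';
X = [1 + t/n, 0.1*abs(sin(t)), cos(2*t), 5*mod(t,7), sqrt(t)];
y = 1 + 2*X(:,1) - 8*X(:,2) + 0.5*X(:,3) + 0.1*X(:,4) ...
      + 0.3*X(:,5) + 0.05*sin(5*t);

% Raw coefficients b, intercept first, as in the original regress call.
b = [ones(n,1) X] \ y;

% Route 1: standardize every variable, then regress. No intercept is
% needed because all standardized variables have mean zero.
% (Implicit expansion requires MATLAB R2016b or later; bsxfun is the
% older equivalent.)
Zx = (X - mean(X)) ./ std(X);
zy = (y - mean(y)) / std(y);
beta_std = Zx \ zy;

% Route 2: rescale each raw slope by sd(x_j) / sd(y).
beta_rescaled = b(2:end) .* std(X)' / std(y);

% Both routes should give the same standardized betas.
disp([beta_std beta_rescaled]);
```

With real data, replacing X and y by s(:,1:5) and s(:,7) should be the only change needed.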
Guest
Posted: Tue Apr 01, 2008 9:33 am
On Mon, 31 Mar 2008 23:13:25 -0700 (PDT), Mike <SulfateIon@gmail.com>
wrote:

Quote:
But back to the question I ask at first.
I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.
You have suggested me using correlation.
Why cannot I use regression coefficient as the sensitivity of some
variable?
If the regression equation is
Y=b0+b1x1+b2x2
Then partialY/partialx1 goes to the sensitivity of Y to x1, which
equals to b1, doesn't it?

Mike

If y and x have meaningful cardinal measures, then using the (raw)
regression coefficients is exactly the right thing to do, provided you
have a good understanding of what counts as a meaningful change in each
x variable.

Suppose y is yield of wheat per acre and one x is irrigation in
gallons per acre and another is fertilizer in pounds per acre. Then
the regression coefficients tell you the effect of using water versus
fertilizer. But if you want to compare "sensitivity," you have to take
a stand on how much would be a big change in water versus a big change
in fertilizer.
-Dick Startz
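
To make the wheat example above concrete, here is a small sketch in which each raw coefficient is multiplied by the change the analyst considers "big". Every number is hypothetical and chosen only for illustration; none comes from a fitted model or from the thread.

```matlab
% Hypothetical raw coefficients of a yield model (illustrative only).
b_water      = 0.002;  % yield change per extra gallon of water per acre
b_fertilizer = 0.15;   % yield change per extra pound of fertilizer per acre

% The analyst's own judgment of a "big" change in each input (assumed).
delta_water      = 1000;  % gallons per acre
delta_fertilizer = 20;    % pounds per acre

% Effect on yield of a big change in each input: coefficient times change.
effect_water      = b_water * delta_water;
effect_fertilizer = b_fertilizer * delta_fertilizer;

fprintf('water: %.2f, fertilizer: %.2f\n', effect_water, effect_fertilizer);
```

Using one standard deviation of each predictor as the "big" change gives a ranking equivalent to that of the standardized betas, so the standardized coefficient is one particular way of taking that stand.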
Richard Ulrich
Posted: Tue Apr 01, 2008 8:16 pm
Guest
On Mon, 31 Mar 2008 23:13:25 -0700 (PDT), Mike <SulfateIon@gmail.com>
wrote:

Quote:
[snip, a bunch]

But back to the question I ask at first.
I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.

Oh, you can present it if you want to, but you should
expect to be justly criticized for it. See my first
comments. Multiple regression will do fine if the variables
are suitably uncorrelated; and in that case you can use the
zero-order correlation or regression to see the same numbers.

If they are correlated, interpretations are problematic,
for various reasons, both numeric and logical.


Quote:
You have suggested me using correlation.
Why cannot I use regression coefficient as the sensitivity of some
variable?

- see above -

Quote:
If the regression equation is
Y=b0+b1x1+b2x2
Then partialY/partialx1 goes to the sensitivity of Y to x1, which
equals to b1, doesn't it?

I don't follow this.

--
Rich Ulrich

http://www.pitt.edu/~wpilib/index.html
Guest
Posted: Tue Apr 01, 2008 9:03 pm
On Tue, 01 Apr 2008 21:16:33 -0400, Richard Ulrich
<Rich.Ulrich@comcast.net> wrote:

Quote:
If the regression equation is
Y=b0+b1x1+b2x2
Then partialY/partialx1 goes to the sensitivity of Y to x1, which
equals to b1, doesn't it?

I don't follow this.

Rich:

The OP is saying that in a linear regression the coefficients can be
interpreted as estimates of the partial derivatives. In a situation
where the regression gives you unbiased estimates of the data
generating process, this is exactly the right interpretation. What's
more, it's true whether or not the right-hand-side variables are
correlated.

Of course, if the right-hand-side variables were correlated when the
data were generated, it's wise to ask how one variable is going to
change later while the others don't.

-Dick Startz
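
The point in the post above, that the coefficients estimate the partial derivatives even when the predictors are correlated, can be illustrated with a small sketch. The data-generating process is hypothetical and noise-free, chosen so the true partial derivatives (2 and 3) are known, and the backslash operator stands in for regress.

```matlab
% Hypothetical process: y = 1 + 2*x1 + 3*x2 exactly, with x2 strongly
% correlated with x1.
n  = 25;
t  = (1:n)';
x1 = t / n;
x2 = 0.8*x1 + 0.1*sin(t);
y  = 1 + 2*x1 + 3*x2;

% Multiple regression on both predictors: the slopes recover the
% partial derivatives dY/dx1 = 2 and dY/dx2 = 3.
b_full = [ones(n,1) x1 x2] \ y;

% Regression on x1 alone: the slope also absorbs the part of x2 that
% moves with x1, so it lands roughly near 2 + 3*0.8 rather than at 2.
b_x1_only = [ones(n,1) x1] \ y;

disp(b_full');
disp(b_x1_only');
```

The second fit shows why the answer depends on which correlated predictors are in the model: the slope on x1 alone answers "what happens to y when x1 changes and x2 follows along as it did in the sample".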
Mike
Posted: Wed Apr 02, 2008 2:04 pm
Guest
I'd like to ask about sensitivity, which I am working on right now,
in a new post. Thank you all for your suggestions.

Mike
Greg Heath
Posted: Tue Apr 08, 2008 4:20 pm
Guest
On Mar 28, 11:35 pm, Mike <Sulfate...@gmail.com> wrote:
Quote:
Hi

I am wondering if one can use multivariate linear regression to
study the sensitivity of y to several x variables.

I think you mean multiple regression. Multivariate implies multiple
outputs as well as multiple inputs.

The sensitivity of y to x1 depends on which other predictors
correlated with x1 are in the model. I don't find
it useful unless either NONE of the others are in the model or
ALL of the other predictors are in the model.

Hope this helps.

Greg
 