Constraints for D-Optimal design...
jiana via MathKB.com...
Posted: Fri Jul 11, 2008 3:17 pm
Hi,
I want to create a d-optimal design with the following characteristics:
factor A - numeric 2 levels
factor B - numeric 3 levels
factor C - numeric 2 levels
factor D - categorical 2 levels
Combinations of A, B, C exist for only one of the two levels of factor D.
How do I construct the constraint (Multiple Linear Constraint) to describe this?
--
Message posted via http://www.mathkb.com

Old Mac User...
Posted: Fri Jul 11, 2008 3:17 pm
With no insight into the nature of your data (is it physics, chemistry,
agriculture, etc.?), I suggest the following. Run a complete factorial
design (2 x 3 x 2) in A, B, and C for the one value of D that is
meaningful. That's 12 trials, which is near the minimum number of data
points I would even entertain, regardless of issues about power, etc.
Then what to do with the other level of D? I have no idea what you'd do
about A, B, and C at the other level of D.
If you can tell us more, that may be helpful.
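To make the run count concrete, here is a minimal Python sketch of the 2 x 3 x 2 factorial in A, B, and C at the one usable level of D. The coded level values (-1, 0, 1) and the `full_factorial` helper are assumptions for illustration; the thread gives only the number of levels per factor.

```python
import itertools

# Assumed coded levels: the thread states only how many levels each factor has.
levels = {"A": [-1, 1], "B": [-1, 0, 1], "C": [-1, 1]}


def full_factorial(levels):
    """Return every combination of the factor levels as a list of dicts."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in itertools.product(*levels.values())]


runs = full_factorial(levels)
print(len(runs))  # 2 * 3 * 2 = 12 runs
for run in runs:
    print(run)
```

Each dict is one trial, so the list can be randomized and handed to whoever prepares the samples.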

jiana via MathKB.com...
Posted: Fri Jul 11, 2008 9:48 pm
Quote:
If you can tell us more, that may be helpful.
I want to understand effects (main effects and interactions) of ingredient
level, processing temperature, manufacturing plant and storage time on
product quality.
The product is a fruit beverage, formulated with 3 levels of concentrate
(factor A), using two processing temperatures (factor C), stored at three
temperatures (factor B), at two different plant locations (factor D). One
plant can process at both temperatures while the other plant can only process
at one temperature. A trained panel of tasters will taste and evaluate
samples.
--
Message posted via MathKB.com
http://www.mathkb.com/Uwe/Forums.aspx/math-statistics/200807/1

GoodStats...
Posted: Sat Jul 12, 2008 7:07 am
1) Create the full factorial in all of the factors regardless of the
constraints using any appropriate software.
2) Edit the factorial experiment and delete all runs that do not meet
the constraints specified above.
3) Use the resultant basis set as the set of allowable runs.
4) Use any piece of d-optimal generating software that will allow one
to generate a design from a basis set of allowable runs (Design Expert
does this very well).
5) Evaluate the resultant design very carefully to make sure it meets
your needs. Evaluate the state of multicollinearity in particular by
examining the variance inflation factor (VIF) values and also the
condition number of the design matrix (the ratio of the largest to
smallest eigenvalue).
Disclaimer - I am not affiliated with Stat Ease, the makers of Design
Expert. This is just a good piece of software that can execute the above
algorithm.
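The five steps above can also be sketched in plain Python with NumPy. This is a rough illustration under stated assumptions, not what Design Expert does internally. The coded levels, the constraint (plant D=+1 can only process at temperature C=-1), the model terms, and the 12-run budget are all assumptions chosen to match the description in this thread. The search is a simple point-exchange heuristic, so it may stop at a local optimum.

```python
import itertools
import numpy as np

# Assumed coded levels: A, C two-level numeric; B three-level numeric;
# D a two-level categorical plant indicator.
LEVELS = {"A": [-1, 1], "B": [-1, 0, 1], "C": [-1, 1], "D": [-1, 1]}


def feasible_candidates():
    """Steps 1-3: build the full factorial, then drop infeasible runs."""
    full = itertools.product(*LEVELS.values())
    # Assumed constraint: plant D=+1 can only process at temperature C=-1.
    keep = [r for r in full if not (r[3] == 1 and r[2] == 1)]
    return np.array(keep, dtype=float)


def model_matrix(runs):
    """Intercept, main effects, and the two-factor interactions of A, B, C."""
    a, b, c, d = runs.T
    return np.column_stack([np.ones(len(runs)), a, b, c, d, a * b, a * c, b * c])


def log_det(runs):
    """log det(X'X); returns -inf when the design is singular."""
    x = model_matrix(runs)
    sign, value = np.linalg.slogdet(x.T @ x)
    return value if sign > 0 else -np.inf


def d_optimal_exchange(candidates, n_runs, n_starts=5, seed=0):
    """Step 4: point exchange from several random starts, no repeated runs."""
    rng = np.random.default_rng(seed)
    best_idx, best_val = None, -np.inf
    for _ in range(n_starts):
        picks = rng.choice(len(candidates), size=n_runs, replace=False)
        idx = [int(i) for i in picks]
        current = log_det(candidates[idx])
        improved = True
        while improved:
            improved = False
            for pos in range(n_runs):
                for cand in range(len(candidates)):
                    if cand in idx:
                        continue
                    trial = idx.copy()
                    trial[pos] = cand
                    value = log_det(candidates[trial])
                    if value > current + 1e-9:  # accept only real gains
                        idx, current, improved = trial, value, True
        if current > best_val:
            best_idx, best_val = sorted(idx), current
    return best_idx, best_val


def diagnostics(runs):
    """Step 5: VIFs of the non-intercept terms and the condition number of X'X."""
    x = model_matrix(runs)
    corr = np.corrcoef(x[:, 1:], rowvar=False)
    vif = np.diag(np.linalg.inv(corr))  # VIF_j is the j-th diagonal of inv(R)
    eig = np.linalg.eigvalsh(x.T @ x)
    return vif, eig.max() / eig.min()


candidates = feasible_candidates()  # 18 feasible runs out of 24
chosen, value = d_optimal_exchange(candidates, n_runs=12)
vif, cond = diagnostics(candidates[chosen])
print(candidates[chosen])
print(vif.round(2), round(cond, 2))
```

The C x D interaction is left out of the model on purpose: with one plant limited to a single processing temperature, only three of the four C-by-D cells exist, so that term cannot be estimated from any feasible design.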

Old Mac User...
Posted: Sat Jul 12, 2008 8:00 am
I agree with this to the extent of laying out all combinations... then
eliminate all that violate the constraints.
However, at that point I suggest running all of the remaining trials
(experiments). If, on analyzing the data, there is a detectable difference
due to plant locations, then you may need to present the data as two
separate sets.
I see no need for searching for the D-optimal subset of all feasible
trials in this instance. IMHO, all feasible trials should be run... and
probably need to be replicated. He's running low on experimental combinations
already.
In addition, I assume that the taste testing will be done in such a manner
as to properly average down variation among testers, etc. Taste testing is
at best a notorious source of variation... probably much larger than any
other source of uncontrolled variation in this plan.
I do agree, however, that in general we can do as you suggested. That is,
lay out all combinations, eliminate those which are outside the constraints,
then select a subset from this. This also works nicely when we already
have some experimental data and want to add more experiments to the set...
and do that with an attempt at minimizing correlations among experimental
factors. I wrote my own software for this in 1961, and have used it in
teaching many courses in applied DOE... and for "real" situations as well.
There's yet another point I'd like to make. The choice of D-optimal plans
(no matter which software you use) is sensitive to the metrics of the
"independent" variables. If, for instance, one variable is temperature
and if you express the temperature in deg. C... that's not the same as
expressing it in deg. F. This may not be the best example since the metrics
are similar. But if distance is a variable and if you do the D-optimal
calculations in feet... then do it again in miles... the D-optimal
"solutions" are likely to be very different. I don't like to deal with
"solutions" that are sensitive to the metrics. So I normally do "D-optimal"
calculations not from the X'X matrix but from the correlation matrix.
In other words, I go for the cleanest correlation matrix I can get.
That is not the same as "optimizing" the X'X matrix. OMU
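A small numerical check of the units point, as a sketch only. The distance and temperature values below are made up, and `det_xtx` and `det_corr` are hypothetical helper names. It shows that det(X'X) changes by a large factor when feet become miles, while the determinant of the correlation matrix does not move. One caveat worth keeping in mind: a pure change of units multiplies det(X'X) by the same constant for every candidate design, so in exact arithmetic the ranking of designs by det(X'X) is preserved. What changes is the magnitude and the numerical conditioning, whereas the correlation-based criterion is unit-free.

```python
import numpy as np


def det_xtx(x):
    """Determinant of X'X for the raw factor columns."""
    return np.linalg.det(x.T @ x)


def det_corr(x):
    """Determinant of the correlation matrix of the factor columns."""
    return np.linalg.det(np.corrcoef(x, rowvar=False))


# Hypothetical six-run plan: a distance factor and a temperature factor.
feet = np.array([1000.0, 2000.0, 3000.0, 4000.0, 5000.0, 6000.0])
temp_c = np.array([20.0, 30.0, 20.0, 30.0, 25.0, 25.0])

x_feet = np.column_stack([feet, temp_c])
x_miles = np.column_stack([feet / 5280.0, temp_c])  # same plan, other units

print(det_xtx(x_feet) / det_xtx(x_miles))    # ratio is 5280 squared
print(det_corr(x_feet), det_corr(x_miles))   # identical up to rounding
```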

Old Mac User...
Posted: Sat Jul 12, 2008 11:17 am
On Jul 12, 4:01 pm, "jiana via MathKB.com" <u44208 at (no spam) uwe> wrote:
Quote: I see no need for searching for the D-optimal subset of all feasible
trials in this instance. IMHO, all feasible trials should be run... and
probably need to be replicated. He's running low on experimental combinations
already.
I was considering a D-Optimal approach hoping to minimize the number of
samples because the full factorial requires more samples than the panel of
tasters can evaluate each time interval. The panel is well calibrated and
the ratings are done in triplicate.
I'm happy to hear that you have a trained taste panel. I assume they
do their work "blind".
Concerning the number of samples...
For one case of "D" there are 2 x 3 x 2 or 12 "experiments" and hence
I imagine 12 samples.
It's not clear to me how you will set A, B, and C for the other case
of "D". Select one setting of A and of B and of C... perhaps?
I'm concerned that, if you try to reduce those to a subset of them, you
will not be able to detect the effects you are looking for. There are at
least two concerns here. One is "the power of the test". The other is if
you do not run some of those 18 then you will most likely not be able to
detect and measure interactions among A, B, and C... if such interactions
are present. Now it is true that you could reduce A, B, and C to a
2 x 2 x 2... a classic two level factorial in just 8 experiments... a
reduction from 12 to 8. And then run a "center-point" with B at its
intermediate level in that layout. That's nine experiments... a "savings"
of three experiments. But if curvature is actually present in B you'll
have a feeble estimate of that curvature... and may ultimately need to add
two more experiments to resolve it... a "savings" of just one experiment.
Worse, that would lead to a sequential set of experiments... starting with
nine and then asking permission to run two more. This is why I said "he's
running low on experimental combinations."
If you go with a D-optimal and a subset of less than the full factorial,
how many experimental combinations will you propose to run? Run too few and
you will lose the ability to detect and measure interactions. As the other
commenter suggested, be sure to check to see whether the proposed D-optimal
plan will actually allow you to estimate interactions before you start
experimenting. Otherwise, the data may not be analyzable. OMU
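One way to do that check before experimenting is to build the model matrix for the terms you care about and confirm it has full column rank. The sketch below is only an illustration: the model (main effects, two-factor interactions, and a quadratic term in B), the coded levels, and the placement of the "center-point" at A=-1, C=-1 are assumptions, and `is_estimable` is a hypothetical helper name.

```python
import itertools
import numpy as np


def model_matrix(runs):
    """Columns: intercept, A, B, C, AB, AC, BC, and B squared."""
    a, b, c = np.asarray(runs, dtype=float).T
    return np.column_stack([np.ones_like(a), a, b, c,
                            a * b, a * c, b * c, b ** 2])


def is_estimable(runs):
    """True when every model term can be estimated (full column rank)."""
    x = model_matrix(runs)
    return bool(np.linalg.matrix_rank(x) == x.shape[1])


corners = list(itertools.product([-1, 1], repeat=3))  # the 2 x 2 x 2 layout
center = (-1, 0, -1)  # assumed: B at its middle level, A and C held low
nine_run = corners + [center]
six_run = corners[:6]

print(is_estimable(corners))   # False: B squared is aliased with the intercept
print(is_estimable(nine_run))  # True: the extra run separates the curvature
print(is_estimable(six_run))   # False: fewer runs than model terms
```

The nine-run plan passes the rank check, but the curvature in B rests on a single run and only one degree of freedom is left for error, which is the "feeble estimate" described above.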

Old Mac User...
Posted: Sat Jul 12, 2008 11:19 am
Correction...
Not: The other is if you do not run some of those 18 then you will
But: The other is if you do not run some of those 12 then you will
OMU

jiana via MathKB.com...
Posted: Sat Jul 12, 2008 2:49 pm
Quote: 4) Use any piece of d-optimal generating software that will allow one
to generate a design from a basis set of allowable runs (Design Expert
does this very well).
What is the format of the file of allowable runs? How do you force Design
Expert to choose from these candidates?

jiana via MathKB.com...
Posted: Sat Jul 12, 2008 3:01 pm
Quote: I see no need for searching for the D-optimal subset of all feasible
trials in this instance. IMHO, all feasible trials should be run... and
probably need to be replicated. He's running low on experimental combinations
already.
I was considering a D-Optimal approach hoping to minimize the number of
samples because the full factorial requires more samples than the panel of
tasters can evaluate each time interval. The panel is well calibrated and
the ratings are done in triplicate.

jiana via MathKB.com...
Posted: Tue Jul 22, 2008 9:22 pm
Thank you for helping. I am trying to obtain resources to run the
factorial model.
All times are GMT - 5 Hours