Science Forum Index  »  Statistics - Math Forum  »  On paired samples spontaneously arising in...
Hanspeter...
Posted: Sun May 04, 2008 4:24 am
Guest
In my opinion an experiment should be designed so as to yield paired
samples (i.e.: before-treatment and after-treatment measurements) only
in the case one wants to understand whether the treatment results in
some effect or not.
Such an experiment biases things in such a way as to reveal even the
least effect of the treatment. That is, the experiment does not
simulate reality; it only tries to detect an effect, however
small it may be. An experiment that yields paired samples is therefore
not suited to revealing whether the treatment is worth giving;
that is, whether the population of after-treatment measurements is
significantly different from the one generated by before-treatment
measurements.
In my computer-simulated experiments, two paired samples (= before-
and after-treatment measurements) arose spontaneously, even
though I had no intention of testing whether the treatment
had any effect. I was perfectly aware that the treatment was somehow
effective!
Instead, I ran the computer simulations in order to understand whether
the treatment was worth giving, since implementing the
treatment was not free of charge.
To be more precise: the object under test was a network whose
configuration was changing randomly. The treatment consisted in the
implementation of a deterministic criterion for assigning transmission
frequencies, instead of letting the frequencies be randomly assigned.
The treatment avoids the transmission frequency interference resulting
from random assignment of near frequencies to adjacent transmission
paths. This explains why it was obvious to me that the treatment had
some effect in any case.
I judged the network performance by means of a noise figure that
accounted for transmission frequency interference. The cost of the
treatment consisted in larger energy consumption due to the
implementation of the deterministic criterion for assigning
transmission frequencies. Energy consumption was a critical
factor for the network design, since the network nodes (= elements) were
battery operated and the batteries cannot be easily replaced.
I therefore decided that I would implement the treatment if, and
only if, the population of noise figures generated by the
treatment was significantly different from the one
resulting from random transmission frequency assignment.
For this reason, I used the two-sample Kolmogorov-Smirnov test as
if the two samples were independent (actually the two samples were not
independent, since they were obtained as repeated measurements, that
is, before- and after-treatment measurements).
The two-sample Kolmogorov-Smirnov test was not significant, so
I could not reject the null hypothesis that the samples came from the
same population. I therefore did not implement the treatment.
I would like to know whether my reasoning is correct, or whether I should
instead have used a one-sample test applied to the differences between
the two samples.
Richard Ulrich...
Posted: Sun May 04, 2008 8:28 pm
Guest
On Sun, 4 May 2008 07:24:55 -0700 (PDT), Hanspeter <gjerbu at (no spam) yahoo.de>
wrote:

Quote:
In my opinion an experiment should be designed so as to yield paired
samples (i.e.: before-treatment and after-treatment measurements) only
in the case one wants to understand whether the treatment results in
some effect or not.

Why do you say, " ... only in the case"?
If you are not interested in looking at an effect,
why do you do any experiment? -- The paired
test, when the data are paired, gives the *correct*
measurement of error. Usually, the correlation is
positive, and the result is greater statistical power,
or, in other words, a sample that acts as if it were larger.


Quote:
Such experiment biases things in such a way so as to reveal even the
least effect of the treatment.

And that is a *good* thing. After you know how
big the effect is, *then* you decide whether it is big
enough to matter.



Quote:
That is, the experiment does not
simulate the reality, but it tries only to detect an effect, whatever
little it may be.

The experiment *measures* the effect. It is desirable to
have measurement that is as precise as possible, and it
is desirable to have an *accurate* assessment of the error.

Quote:
Then the experiment that yields paired samples is
not good in order to reveal whether the treatment is worth giving;
that is, whether the population of after-treatment measurements is
significantly different from the one generated by before-treatment
measurements.

Well, knowing the size is better than not-knowing the
size of the effect. If some effect is "too small to be worth
knowing about", then state that at the start.

Quote:
In my computer-simulated experiments, I have obtained spontaneously
two paired samples (= before- and after-treatment measurements), even
if I had no intention to test for understanding whether the treatment
had any effect. I was perfectly aware the treatment was somehow
effective!
Instead, I made computer-simulations in order to understand whether
the treatment was worth giving, since the implementation of the
treatment was not free of charge.
To be more precise: the object under test was a network whose
configuration was changing randomly. The treatment consisted in the
implementation of a deterministic criterion for assigning transmission
frequencies, instead of letting the frequencies be randomly assigned.
The treatment avoids the transmission frequency interference resulting
from random assignment of near frequencies to adjacent transmission
paths. This explains why it was obvious to me that the treatment had
some effect in any case.
I judged the network performance by means of a figure of noise that
accounted for transmission frequencies interference. The cost of the
treatment consisted in larger energy consumption due to the
implementation of the deterministic criterion for assigning
transmission frequencies. The consumption of energy was a critical
factor for the network design since network nodes (= elements) were
battery operated and the batteries can not be easily replaced.
Then I decided that I would have implemented the treatment if - and
only if - the population of the noise figures generated by the
treatment would have been significantly different from the one
resulting from the random transmission frequency assignment.
For this reason, I have used the 2 sample Kolmogorov-Smirnov test as
if the two samples were independent (actually the two samples were not
independent since they were obtained as repeated measurements, that
is, before- and after-treatment measurements).
The 2 sample Kolmogorov-Smirnov test resulted not significant, so that
I could not reject the null hypothesis that the samples came from the
same population. Then I did not implement the treatment.
I would like to know if my reasoning is correct, or rather if I should
have used a 1 sampled test applied to the differences between the two
samples.

Your reasoning seems to be, "I can apply a test that
I figure is inappropriate and weak, and I want to do it
because it gives me the results that I like." No, I do
not like that logic. It seems to me that I answered the
same statistical question a few days ago, too.

If you want a weaker version of the paired test
(or, one-sample t), you can use a smaller N.

--
Rich Ulrich

http://www.pitt.edu/~wpilib/index.html
Hanspeter...
Posted: Mon May 05, 2008 6:14 am
Guest
I set two students the task of designing a transmission frequency
assignment criterion, to be implemented in place of the random
number generator. They devised a deterministic criterion of
frequency assignment that was not able to generate a population of
noise figures better than the one generated by the random number
generator. The two-sample Kolmogorov-Smirnov test was not significant
even at alpha=10%! A plot of the two empirical cumulative
distribution functions showed two overlapping (= imbricated) "curves",
in accordance with the test result.
An earnest student should have recognized that he had failed the task.
In fact, the deterministic criterion was no better than the random number
generator in terms of average noise figures, while it was worse in
terms of energy consumption.
My students, however, observed that every single ratio
"before-treatment/after-treatment measurement" (measurement = noise figure)
was >1, and so they said their criterion was an exceptional one (they
wondered how nobody had discovered this criterion before them).
For this reason I analyzed the samples as paired samples
(i.e.: before-treatment and after-treatment noise figures measured on
the same fixed random configuration of the network) and
found that the differences between pairs of noise figures have a mean
significantly different from zero. This is in accordance with my
students' result. The above-mentioned differences were "log
differences", that is, logarithms of the before/after ratios.
I am not able to understand how the mean of the pair differences can be
significantly different from zero, while the hypothesis that both samples
come from the same population is not rejected even at the 10%
significance level.

Hanspeter
Richard Ulrich...
Posted: Mon May 05, 2008 8:29 pm
Guest
On Mon, 5 May 2008 09:14:59 -0700 (PDT), Hanspeter <gjerbu at (no spam) yahoo.de>
wrote:

Quote:
On 5 May, 03:28, Richard Ulrich <Rich.Ulr... at (no spam) comcast.net> wrote:
On Sun, 4 May 2008 07:24:55 -0700 (PDT), Hanspeter <gje... at (no spam) yahoo.de
wrote:

[ snip, a bunch, preceding discussion of using
the paired t-test when the data are paired.]


Quote:
Conversely, my students observed that each single ratio "before-
treatment/after-treatment measurements" (measurement = noise figure)
was >1 and then they said their criterion was an exceptional one (they
wondered how nobody discovered this criterion before them).

This is the simplest test for paired data - the "sign" test.
Is the score from one particular sample usually the higher
score? This can be tested as the binomial probability.
The p is easy when *all* the higher scores are in the same
sample.
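
To make the binomial calculation concrete, here is a minimal sketch of an exact two-sided sign test. The function name and the counts in the example are illustrative assumptions, not figures from the thread, and ties are assumed to have been dropped beforehand.

```python
from math import comb


def sign_test_p_value(n_positive: int, n_negative: int) -> float:
    """Exact two-sided sign test p-value (hypothetical helper).

    n_positive / n_negative are the numbers of pairs in which the
    difference is above / below zero; ties are assumed to be excluded.
    """
    n = n_positive + n_negative
    k = min(n_positive, n_negative)
    # Lower tail P(X <= k) for X ~ Binomial(n, 1/2).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-sided test, capped at 1.
    return min(1.0, 2.0 * tail)


# Illustrative case: all 10 pairs favour the same sample.
print(sign_test_p_value(10, 0))
```

When every one of n pairs favours the same sample, the sum collapses to a single term and the p-value is simply 2 * (1/2)**n, which is why the p is "easy" in that case.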


Quote:
For this reason I have analyzed the samples in terms of paired samples
(i.e.: before-treatment and after-treatment noise figure measured on
the same fixed random configuration of the network) and have
discovered the differences between pairs of noise figures have average
significantly different from zero. This is in accordance with my
students' result. The above-mentioned differences were "log
differences", that is, ratios of "log arguments".
I am not able to understand how the average of pair differences may be
significantly different from zero, while both samples come from the
same population even at the significance level of 10%.

It seems to me that it ought to be equally baffling to
the casual observer, to note that the test does *not*
call them "unequal" when the same one of the pair
is always higher -- That is the sort of result that even
passes the "Grandma" test. (If you showed the paired
data to your Grandma, would she agree that B is greater
on the average than A, when, for every pair, B *is*
greater than A.)
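
A small simulation can reproduce this kind of result. The sketch below is only an illustration under made-up assumptions: the "before" noise figures are drawn from a normal distribution with a large spread, and the treatment lowers each figure by a small positive amount. The numbers, the sample size of 500, and the helper names are all hypothetical; the constant 1.22 is the usual large-sample approximation for the 10% critical value of the two-sample Kolmogorov-Smirnov statistic.

```python
import math
import random


def ks_statistic(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    two empirical cumulative distribution functions."""
    xs, ys = sorted(x), sorted(y)
    i = j = 0
    d = 0.0
    while i < len(xs) and j < len(ys):
        v = min(xs[i], ys[j])
        # Advance both pointers past every observation <= v.
        while i < len(xs) and xs[i] <= v:
            i += 1
        while j < len(ys) and ys[j] <= v:
            j += 1
        d = max(d, abs(i / len(xs) - j / len(ys)))
    return d


def paired_t(x, y):
    """One-sample t statistic computed on the pairwise differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 500
# Hypothetical noise figures: wide spread between network configurations.
before = [rng.gauss(10.0, 2.0) for _ in range(n)]
# Hypothetical treatment: every figure improves by a small positive amount.
after = [b - rng.uniform(0.025, 0.075) for b in before]

d = ks_statistic(before, after)
d_crit = 1.22 * math.sqrt(2.0 / n)  # approximate 10% critical value, equal n
t = paired_t(before, after)

print("every pair improved:", all(a < b for a, b in zip(after, before)))
print("KS statistic:", d, "critical value at 10%:", d_crit)
print("paired t statistic:", t)
```

In this setup the shift is tiny compared with the spread between configurations, so the two empirical distribution functions nearly coincide, while the pairwise differences are all of one sign and have very little variation.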

What sort of explanation do you want? Above, I gave a
casual one. Here is a more technical one.

The variance (square of the standard deviation) of the
difference between A and B is expressed, in general, as

var(A-B) = var(A) + var(B) - 2*cov(A,B) .

For independent samples, the covariance (and correlation)
is zero, so the last term drops out in the Student's t-test.

For paired samples, the covariance term is subtracted. When
the covariance is positive (usually the case), the variance
becomes smaller for the paired test of a difference, in
comparison to the test that does not use the term.

As the correlation approaches 1.0 (with var(A) and var(B) about
equal), var(A-B) approaches zero. So you can get a *lot* more
sensitivity, or power, or precision -- whatever you want to call
it -- when you use the variation of the differences, for data that
are closely paired.
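
The identity for the variance of a difference can be checked numerically. The following is a rough sketch on simulated data; the distributions, the sample size, and the helper names are assumptions chosen only to produce strongly correlated pairs.

```python
import random


def mean(v):
    return sum(v) / len(v)


def variance(v):
    """Sample variance with the n - 1 denominator."""
    m = mean(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)


def covariance(a, b):
    """Sample covariance with the n - 1 denominator."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


rng = random.Random(1)  # fixed seed so the run is repeatable
# Hypothetical paired data: b is a plus a little extra noise,
# so the two samples are strongly positively correlated.
a = [rng.gauss(0.0, 1.0) for _ in range(1000)]
b = [x + rng.gauss(0.0, 0.3) for x in a]

diff = [x - y for x, y in zip(a, b)]
lhs = variance(diff)                                    # var(A-B)
rhs = variance(a) + variance(b) - 2 * covariance(a, b)  # right-hand side
unpaired = variance(a) + variance(b)                    # covariance ignored

print("var(A-B):", lhs)
print("var(A) + var(B) - 2*cov(A,B):", rhs)
print("var(A) + var(B):", unpaired)
```

The first two printed values agree up to rounding, and both are far smaller than the third, which is the variance a test for independent samples would effectively work with.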

- Getting back to your original argument: Often, we construct
our samples so that any difference that is "significant" will
also be large enough to be interesting. And vice-versa.

When folks see that situation often enough, they may tend
to overgeneralize, to think that "significant" equals "interesting".

That is a fundamental misconception. A tiny difference,
however small it needs to be to be "uninteresting,"
will be "significant" if the N is made large enough, and if
you use the same t-test.

A large difference will be "not significant" if the N is small
enough, or if the noise is otherwise greater than expected.
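
The dependence on N is easy to see from the formula for the one-sample (or paired) t statistic, t = mean / (sd / sqrt(N)). The sketch below uses made-up numbers: a mean difference of 0.01 with a standard deviation of 1.0, evaluated at three hypothetical sample sizes.

```python
import math


def one_sample_t(mean_diff: float, sd_diff: float, n: int) -> float:
    """t statistic for testing that the mean difference is zero."""
    return mean_diff / (sd_diff / math.sqrt(n))


# Same tiny effect, increasingly large samples.
for n in (100, 10_000, 1_000_000):
    print(n, one_sample_t(0.01, 1.0, n))
```

With these numbers the statistic grows from about 0.1 to about 1 to about 10 as N goes from a hundred to a million, although the effect itself has not changed at all.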


--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html
Hanspeter...
Posted: Thu May 08, 2008 12:01 am
Guest
The original problem has been misunderstood. Maybe I
was not clear.
1) I thought the significance of frequency interference was obvious.
Since this is a basic concept for understanding why it is not worth
statistically testing the superiority of the deterministic criterion
over the random one, I'll be more precise about frequency
interference. Two signals at nearby frequencies flowing along adjacent
paths always interfere with each other, and hence much of the signals'
information is lost. The noise figure is a measurement of frequency
interference. In order to save battery power, a random transmission
frequency assignment algorithm (= a random number generator) was
implemented, at the expense of poor performance of the whole network.
2) I decided to probe the possibility of trading better network
interference performance for a limited increase in energy
consumption. In fact, it is not desirable to have a network working
excellently for only 5 minutes and then stopping because of battery
exhaustion. My problem is therefore an optimization problem with 2
conflicting objectives: reducing frequency interference and
limiting energy consumption.
3) Random assignment of transmission frequencies generally produces
interference, of course! Conversely, by its very construction,
a deterministic frequency assignment criterion keeps frequency
interference to a minimum. In order to determine the proper frequency,
however, the deterministic criterion queries several network nodes, so
it uses a lot of battery energy.
4) Clearly, the deterministic criterion produces better noise figures
than the ones produced by the random number generator. Only by chance
can the randomly generated noise figure turn out better
than the one generated by means of the deterministic criterion. It is
obvious that any investigation in this sense is a waste of time.
Nobody can have any doubt that the deterministic criterion works
better than the random number generator as far as frequency
interference is concerned.
5) Instead, the problem consists in testing the new criterion in terms
of the trade-off between better frequency interference and worse
power consumption.
6) I thought I had solved the problem correctly in the following
terms: if the use of the new criterion produced a population of noise
figures significantly better than the one produced by the random
number generator, then I would implement the new criterion and
test it for optimization.
7) The deterministic criterion devised by my students did not pass the
two-sample Kolmogorov-Smirnov test, so I dismissed the idea of
implementing their criterion. Only at this point did the students decide
to "prove the evidence" in terms of paired (enormous) samples.
8) My students ran 5,000 simulations, so they gathered 5,000
before-treatment noise figures and 5,000 after-treatment noise
figures. If they had run 10,000 simulations (5,000 simulations to
obtain 5,000 before-treatment noise figures and 5,000 simulations
to obtain 5,000 after-treatment noise figures), then the samples
would have been independent and there would be no doubt about my
conjecture (see no. 6).
9) I believe, however, that the samples are so large (5,000 elements each)
that they should be almost perfect representations of their respective
populations. This means that the sample of 5,000 after-treatment noise
figures that would be obtained from the above-mentioned experiment of
10,000 simulations (yielding 2 independent samples of 5,000 noise figures
each) would have a mean and variance almost identical to those of the
5,000 after-treatment noise figures actually obtained as paired samples.
10) If that is true, is my conjecture correct even in the
actual case of paired samples, or is my solution to the
optimization problem intrinsically absurd, so that I should solve it by
explicitly taking into account both noise figures and power
consumption?

Hanspeter
Aniko...
Posted: Fri May 09, 2008 9:52 am
Guest
On May 8, 5:01 am, Hanspeter <gje... at (no spam) yahoo.de> wrote:
Quote:
On 6 Mag, 03:29, Richard Ulrich <Rich.Ulr... at (no spam) comcast.net> wrote:





On Mon, 5 May 2008 09:14:59 -0700 (PDT), Hanspeter <gje... at (no spam) yahoo.de
wrote:

On 5 Mag, 03:28, Richard Ulrich <Rich.Ulr... at (no spam) comcast.net> wrote:
On Sun, 4 May 2008 07:24:55 -0700 (PDT), Hanspeter <gje... at (no spam) yahoo.de
wrote:

[ snip, a bunch, preceding discussion of using
the paired t-test when the data are paired.]

Conversely, my students observed that each single ratio "before-
treatment/after-treatment measurements" (measurement = noise figure)
was >1 and then they said their criterion was an exceptional one (they
wandered how nobody discovered this criterion before them).

This is the simplest test for paired data - the "sign" test.
Is the score from one particular sample usually the higher
score?  This can be tested as the binomial probability.
The p is easy when *all*  the higher scores are in the same
sample.

For this reason I have analyzed the samples in terms of paired samples
(i.e.: before-treatment and after-treatment noise figure measured on
the same fixed random configuration of the network) and have
discovered the differences between pairs of noise figures have average
significantly different from zero. This is in accordance with my
students' result. The above-mentioned differences were "log
differences", that is, ratios of "log arguments".
I am not able to understand how the average of pair differences may be
significantly different from zero, while both samples come from the
same population even at the significance level of 10%.

It seems to me that it ought to be equally baffling to
the casual observer, to note that the test does *not*
call them "unequal"  when the same one of the pair
is always higher -- That is the sort of result that even
passes the "Grandma" test.  (If you showed the paired
data to your Grandma, would she agree that B is greater
on the average than A, when, for every pair,  B *is*  
greater than A.)  

What sort of explanation do you want?  Above, I gave a
casual one.  Here is a more technical one.

The variance (square of the standard deviation) of the
difference between A and B is expressed, in general, as

  var(A+B) =  var(A) +  var(B) - 2* cov(A,B) .

For independent samples, the covariance (and correlation)
is zero, so the last term drops out in the Student's t-test.

For paired samples, the covariance term is subtracted.  When
the covariance is positive (usually the case), the variance
becomes smaller for the paired test of a difference, in
comparison to the test that does not use the term.

As the correlation approaches 1.0,  the var(A+B) approaches
zero.  So you can get a *lot*  more sensitivity, or power,
or precision -- whatever you want to call it -- when you
use the variation of the differences, for data that are closely
paired.

 - Getting back to your original argument:  Often, we construct
our samples so that any difference that is "significant" will
also be large enough to be interesting.   And vice-versa.

When folks see that situation often enough, they may tend
to overgeneralize, to think that "significant" equals "interesting".  

That is a fundamental misconception.  A tiny difference,
however small it needs to be to be  "uninteresting,"  
will be "significant" if the N is made large enough, and if
you use the same t-test.

A large difference will be "not significant" if the N is small
enough, or if the noise is otherwise greater than expected.

--
Rich Ulrichhttp://www.pitt.edu/~wpilib/index.html

There is a lack of understanding about the original problem. May be I
was not clear.
1) I thought the significance of frequency interference was obvious.
Since this is a basic concept for understanding why it is not worth
statistically testing the supremacy of the deterministic criterion
over the random one, I'll be more precise about frequency
interference. Two near frequencies signals flowing into adjacent paths
always interfere each other and hence much of the signals information
is lost. Noise figure is a measurement of frequency interference. In
order to save battery power, a random transmission frequencies
assignment algorithm (= a random number generator) was implemented at
the expense of poor performance of the whole network.
2) I decided to probe the possibility of trading better network
interference performance with a restrained increase in energy
consumption. In fact, it is not desirable to have a network working
excellently for 5 minutes only and then stopping because of battery
exhaustion. Then my problem is an optimization one dealing with 2
conflicting objectives: reduction of frequency interference and
restraint on energy consumption.
3) Random assignment of transmission frequencies generally produces
interferences, of course! Conversely, because of its own construction,
a deterministic frequencies assignment criterion keeps frequencies
interference to a minimum. In order to assess the proper frequency,
however, the deterministic criterion queries several network nodes, so
that it uses much battery energy.
4) Clearly, the deterministic criterion produces better noise figures
than the ones produced by the random number generator. Only by chance
it may happen that the randomly generated noise figure results better
than the one generated by means of the deterministic criterion. It is
obvious that any investigation in this sense is a waste of time.
Nobody can have any doubt that the deterministic criterion works
better than the random number generator as far as frequencies
interference is concerned.
5) Instead, the problem consists in testing the new criterion in terms
of optimization between better frequencies interference and worse
power consumption.
6) I thought I had solved the problem correctly in the following
terms: if the use of the new criterion produced a population of noise
figures significantly better than the one produced by the random
number generator, then I would have implemented the new criterion and
tested it for optimization.
7) The deterministic criterion devised by my students did not pass the
Kolmogorov-Smirnov 2 sample test, so that I dismissed the idea of
implementing their criterion. Only at this point the students decided
to "prove the evidence" in terms of paired (enormous) samples.
Cool My students made 5,000 simulations so that they gathered 5,000
before-treatment noise figures and 5,000 after-treatment noise
figures. If they made 10,000 simulations: 5,000 simulations for
obtaining 5,000 before-treatment noise figures and 5,000 simulations
for obtaining 5,000 after-treatment noise figures, then the samples
would have been independent and there should not be any doubt about my
conjecture (see no. 6).
9) I believe, however, that the samples are so large (5,000 elements
each) that they should be almost perfect representations of their
respective populations. This means that the sample of 5,000
after-treatment noise figures that would be obtained from the
above-mentioned experiment of 10,000 simulations (yielding 2
independent samples of 5,000 noise figures each) would have mean and
variance almost identical to those of the 5,000 after-treatment noise
figures actually obtained as paired samples.
10) If this is true, is my conjecture correct even in the actual case
of paired samples? Or is my solution to the optimization problem
intrinsically absurd, so that I should solve it by explicitly taking
into account both noise figures and power consumption?
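
A minimal sketch of the situation described in points 7) to 10), under
loudly stated assumptions: the noise model, the numbers (mean 100,
spread 20, average gain 0.5) and all function names are hypothetical
and are not taken from the actual simulations. It only illustrates how
a small but consistent gain can stay below the two-sample
Kolmogorov-Smirnov threshold while a paired statistic on the same
5,000 runs is enormous. Note that the KS critical value presumes
independent samples, so applying it to the paired data is shown for
comparison only.

```python
import math
import random
import statistics


def simulate_paired_runs(n, seed=0, effect=0.5):
    """Hypothetical model: each run draws one random network configuration
    and records its noise figure without and with the criterion."""
    rng = random.Random(seed)
    before, after = [], []
    for _ in range(n):
        base = rng.gauss(100.0, 20.0)        # spread between configurations
        gain = effect + rng.gauss(0.0, 0.5)  # small gain from the criterion
        before.append(base)
        after.append(base - gain)            # lower noise figure is better
    return before, after


def ks_two_sample_statistic(a, b):
    """Largest gap between the two empirical CDFs (the KS statistic D)."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample approximation of the critical value for D."""
    c = math.sqrt(-0.5 * math.log(alpha / 2))
    return c * math.sqrt((n + m) / (n * m))


def paired_t_statistic(before, after):
    """t statistic of the per-run differences (before - after)."""
    diffs = [x - y for x, y in zip(before, after)]
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(len(diffs)))


before, after = simulate_paired_runs(5000, seed=1)
print("KS D:", round(ks_two_sample_statistic(before, after), 4))
print("KS critical value (5%):", round(ks_critical_value(5000, 5000), 4))
print("paired t:", round(paired_t_statistic(before, after), 1))
```

In this toy model the pairing removes the configuration-to-configuration
spread, which is why the paired statistic reacts to a gain that is tiny
compared with the spread of either population.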

Hanspeter

Looking at the previous responses, I think that the original problem
was understood. You hit the nail on the head in your point 10: your
approach is "intrinsically absurd". It is like hiding your head in the
sand: you refuse to see the difference when it is there. You can't
blame statistics or your students for showing what you already knew to
be true, that the intervention has to reduce the noise somewhat. Of
course, since testing for this is not interesting, you and your
students should not be doing it. You have to decide what would
actually count as a practically interesting improvement, and test for
that.

I can see two approaches: either you say that you are not interested
in noise reduction unless it reaches a certain threshold (say, 10
units, or 5%, or...) and test for that; or you look at a combined
outcome of noise and power consumption. For the latter you could say
that each unit of increase in power costs as much as C units of
noise. Your combined outcome could be Noise+C*Power. It will decrease
only if noise decreases more than C times as fast as power increases.
I am sure you can come up with some other combination that is more
meaningful for the given problem.
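
Both approaches can be sketched in a few lines. This is only an
illustration under assumptions: the function names, the threshold, the
value of C and all simulated numbers are hypothetical, and the lower
confidence bound uses a plain normal approximation (reasonable for
thousands of paired runs, not for small samples).

```python
import math
import random
import statistics


def lower_bound_of_mean(values, z=1.645):
    """One-sided lower confidence bound for the mean
    (normal approximation; z=1.645 is roughly the 95% level)."""
    se = statistics.stdev(values) / math.sqrt(len(values))
    return statistics.mean(values) - z * se


def reduction_exceeds_threshold(before, after, threshold):
    """Approach 1: is the mean paired noise reduction credibly
    larger than a practically interesting threshold?"""
    reductions = [b - a for b, a in zip(before, after)]
    return lower_bound_of_mean(reductions) > threshold


def combined_outcome(noise, power, c):
    """Approach 2: the combined outcome Noise + C*Power for each run."""
    return [n + c * p for n, p in zip(noise, power)]


def mean_change_in_combined_outcome(noise_before, power_before,
                                    noise_after, power_after, c):
    """Mean per-run change (after - before); negative means the
    criterion pays off at the chosen exchange rate C."""
    old = combined_outcome(noise_before, power_before, c)
    new = combined_outcome(noise_after, power_after, c)
    return statistics.mean([n - o for n, o in zip(new, old)])


# Made-up data: a small noise gain bought with a clear power increase.
rng = random.Random(2)
noise_before = [rng.gauss(100.0, 20.0) for _ in range(5000)]
noise_after = [x - rng.gauss(0.5, 0.5) for x in noise_before]
power_before = [rng.gauss(10.0, 1.0) for _ in range(5000)]
power_after = [p + rng.gauss(2.0, 0.3) for p in power_before]

print("reduction above 5 units?",
      reduction_exceeds_threshold(noise_before, noise_after, 5.0))
print("mean change in Noise + 1.0*Power:",
      round(mean_change_in_combined_outcome(
          noise_before, power_before, noise_after, power_after, 1.0), 2))
```

The threshold and C have to come from the engineering problem itself;
the statistics only tell you whether the data clear the bar you set.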

Hope it helps,
Aniko
Richard Ulrich...
Posted: Fri May 09, 2008 4:25 pm
Guest
On Fri, 9 May 2008 12:52:15 -0700 (PDT), Aniko <aniko123_57 at (no spam) yahoo.com>
wrote:

Quote:
On May 8, 5:01 am, Hanspeter <gje... at (no spam) yahoo.de> wrote:
On 6 Mag, 03:29, Richard Ulrich <Rich.Ulr... at (no spam) comcast.net> wrote:


On Mon, 5 May 2008 09:14:59 -0700 (PDT), Hanspeter <gje... at (no spam) yahoo.de
wrote:

On 5 Mag, 03:28, Richard Ulrich <Rich.Ulr... at (no spam) comcast.net> wrote:
On Sun, 4 May 2008 07:24:55 -0700 (PDT), Hanspeter <gje... at (no spam) yahoo.de
wrote:


[snip, much]

Quote:
10) If this is true, is my conjecture correct even in the actual case
of paired samples? Or is my solution to the optimization problem
intrinsically absurd, so that I should solve it by explicitly taking
into account both noise figures and power consumption?

Hanspeter
Aniko
Looking at the previous responses, I think that the original problem
was understood. You hit the nail on the head in your point 10: your
approach is "intrinsically absurd". It is like hiding your head in the
sand: you refuse to see the difference when it is there. You can't
blame statistics or your students for showing what you already knew to
be true, that the intervention has to reduce the noise somewhat. Of
course, since testing for this is not interesting, you and your
students should not be doing it. You have to decide what would
actually count as a practically interesting improvement, and test for
that.

I can see two approaches: either you say that you are not interested
in noise reduction unless it reaches a certain threshold (say, 10
units, or 5%, or...) and test for that; or you look at a combined
outcome of noise and power consumption. For the latter you could say
that each unit of increase in power costs as much as C units of
noise. Your combined outcome could be Noise+C*Power. It will decrease
only if noise decreases more than C times as fast as power increases.
I am sure you can come up with some other combination that is more
meaningful for the given problem.

Hope it helps,
Aniko


Aniko, thanks. That covers it.

--
Rich Ulrich

http://www.pitt.edu/~wpilib/index.html
 
All times are GMT - 5 Hours