On 6 May, 03:29, Richard Ulrich <Rich.Ulr... at (no spam) comcast.net> wrote:
On Mon, 5 May 2008 09:14:59 -0700 (PDT), Hanspeter <gje... at (no spam) yahoo.de>
wrote:
On 5 May, 03:28, Richard Ulrich <Rich.Ulr... at (no spam) comcast.net> wrote:
On Sun, 4 May 2008 07:24:55 -0700 (PDT), Hanspeter <gje... at (no spam) yahoo.de>
wrote:
[ snip, a bunch, preceding discussion of using
the paired t-test when the data are paired.]
Conversely, my students observed that every single ratio of
"before-treatment/after-treatment" measurements (measurement = noise figure)
was >1, and they concluded that their criterion was an exceptional one (they
wondered how nobody had discovered this criterion before them).
This is the simplest test for paired data - the "sign" test.
Is the score from one particular sample usually the higher
score? This can be tested as the binomial probability.
The p is easy when *all* the higher scores are in the same
sample.
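
A minimal sketch of that binomial calculation, in Python. The function name and the example counts are made up for illustration, and ties are assumed to have been dropped beforehand. When every pair points the same way, the two-sided p is just 2 * 0.5^n.

```python
from math import comb

def sign_test_p(n_higher: int, n_pairs: int) -> float:
    """Two-sided sign-test p-value under H0: each sign is equally likely.

    Ties are assumed to have been dropped before counting.
    """
    k = max(n_higher, n_pairs - n_higher)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n_pairs, i) for i in range(k, n_pairs + 1)) / 2 ** n_pairs
    return min(1.0, 2 * tail)

# Hypothetical counts: 10 of 10 pairs higher, then 7 of 10.
print(sign_test_p(10, 10))
print(sign_test_p(7, 10))
```

With thousands of pairs all in one direction the p-value is far too small to be worth printing; the small counts above are only there to show the mechanics.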
For this reason I analyzed the samples as paired samples
(i.e.: before-treatment and after-treatment noise figure measured on
the same fixed random configuration of the network) and
found that the differences between pairs of noise figures have an average
significantly different from zero. This is in accordance with my
students' result. The above-mentioned differences were "log
differences", that is, ratios of "log arguments".
I am not able to understand how the average of the pair differences can be
significantly different from zero, while the two samples cannot be told
apart from draws from the same population even at the 10% significance level.
It seems to me that it ought to be equally baffling to
the casual observer, to note that the test does *not*
call them "unequal" when the same one of the pair
is always higher -- That is the sort of result that even
passes the "Grandma" test. (If you showed the paired
data to your Grandma, would she agree that B is greater
on the average than A, when, for every pair, B *is*
greater than A.)
What sort of explanation do you want? Above, I gave a
casual one. Here is a more technical one.
The variance (square of the standard deviation) of the
difference between A and B is expressed, in general, as
var(A-B) = var(A) + var(B) - 2*cov(A,B) .
For independent samples, the covariance (and correlation)
is zero, so the last term drops out in the Student's t-test.
For paired samples, the covariance term is subtracted. When
the covariance is positive (usually the case), the variance
becomes smaller for the paired test of a difference, in
comparison to the test that does not use the term.
As the correlation approaches 1.0, var(A-B) approaches
(sd(A) - sd(B))^2, which is zero when A and B have equal
variances. So you can get a *lot* more sensitivity, or power,
or precision -- whatever you want to call it -- when you
use the variation of the differences, for data that are closely
paired.
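
To see the covariance term at work, here is a small numeric check in Python. The data values are invented purely for illustration (they are not from the thread), and `covariance` is a helper written for this sketch.

```python
from statistics import mean, variance

def covariance(xs, ys):
    """Sample covariance (n - 1 denominator), matching statistics.variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical paired measurements; the values are made up for illustration.
a = [10.0, 12.0, 9.0, 15.0, 11.0]
b = [9.5, 11.0, 8.8, 14.1, 10.2]

diffs = [x - y for x, y in zip(a, b)]
var_paired = variance(diffs)                # variance of A - B, pairing used
var_unpaired = variance(a) + variance(b)    # covariance term ignored
identity_rhs = var_unpaired - 2 * covariance(a, b)

# var_paired and identity_rhs agree; var_unpaired is much larger.
print(var_paired, identity_rhs, var_unpaired)
```

Because A and B move together from pair to pair, nearly all of the spread cancels in the differences, which is what gives the paired test its extra sensitivity.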
- Getting back to your original argument: Often, we construct
our samples so that any difference that is "significant" will
also be large enough to be interesting. And vice-versa.
When folks see that situation often enough, they may tend
to overgeneralize, to think that "significant" equals "interesting".
That is a fundamental misconception. A tiny difference,
however small it needs to be to be "uninteresting,"
will be "significant" if the N is made large enough, and if
you use the same t-test.
A large difference will be "not significant" if the N is small
enough, or if the noise is otherwise greater than expected.
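
A quick way to convince yourself of the N effect, sketched in Python from summary statistics. The effect size and standard deviation are hypothetical numbers chosen for illustration.

```python
from math import sqrt

def paired_t(mean_diff: float, sd_diff: float, n: int) -> float:
    """t statistic for H0: mean difference = 0, from summary statistics."""
    return mean_diff / (sd_diff / sqrt(n))

# Hypothetical tiny effect: mean difference 0.01 with standard deviation 1.0.
for n in (100, 10_000, 1_000_000):
    print(n, paired_t(0.01, 1.0, n))
```

The difference itself never changes; only the standard error shrinks as sqrt(N) grows, so the same tiny effect goes from invisible to overwhelmingly "significant".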
--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html
There seems to be a misunderstanding about the original problem. Maybe I
was not clear.
1) I thought the significance of frequency interference was obvious.
Since this is a basic concept for understanding why it is not worth
statistically testing the superiority of the deterministic criterion
over the random one, I'll be more precise about frequency
interference. Two signals at nearby frequencies flowing along adjacent paths
always interfere with each other, and hence much of the signals' information
is lost. Noise figure is a measure of frequency interference. In
order to save battery power, a random transmission-frequency
assignment algorithm (= a random number generator) was implemented, at
the expense of poor performance of the whole network.
2) I decided to probe the possibility of trading a restrained increase
in energy consumption for better network interference
performance. In fact, it is not desirable to have a network working
excellently for only 5 minutes and then stopping because of battery
exhaustion. So my problem is an optimization one with 2
conflicting objectives: reduction of frequency interference and
restraint on energy consumption.
3) Random assignment of transmission frequencies generally produces
interference, of course! Conversely, by construction,
a deterministic frequency assignment criterion keeps frequency
interference to a minimum. In order to determine the proper frequency,
however, the deterministic criterion queries several network nodes, so
it uses a lot of battery energy.
4) Clearly, the deterministic criterion produces better noise figures
than the ones produced by the random number generator. Only by chance
can the randomly generated noise figure turn out better
than the one generated by means of the deterministic criterion. It is
obvious that any investigation in this sense is a waste of time.
Nobody can have any doubt that the deterministic criterion works
better than the random number generator as far as frequency
interference is concerned.
5) Instead, the problem consists in testing the new criterion in terms
of the trade-off between better frequency interference and worse
power consumption.
6) I thought I had solved the problem correctly in the following
terms: if the use of the new criterion produced a population of noise
figures significantly better than the one produced by the random
number generator, then I would implement the new criterion and
test it for optimization.
7) The deterministic criterion devised by my students did not pass the
Kolmogorov-Smirnov 2 sample test, so I dismissed the idea of
implementing their criterion. Only at this point did the students decide
to "prove the evidence" in terms of paired (enormous) samples.
8) My students ran 5,000 simulations, so they gathered 5,000
before-treatment noise figures and 5,000 after-treatment noise
figures. If they had run 10,000 simulations (5,000 simulations for
obtaining 5,000 before-treatment noise figures and 5,000 simulations
for obtaining 5,000 after-treatment noise figures), then the samples
would have been independent and there would not be any doubt about my
conjecture (see no. 6).
9) I believe, however, that the samples are so large (5,000 elements each)
that they should be almost perfect representations of the respective
populations. This means that the sample of 5,000 after-treatment noise
figures that would be obtained from the above-mentioned experiment of 10,000
simulations (yielding 2 independent samples of 5,000 noise figures
each) would have mean and variance almost identical to those of the 5,000
after-treatment noise figures actually obtained as paired samples.
10) If that is true, is my conjecture correct even in the
actual case of paired samples, or is my solution to the
optimization problem intrinsically absurd, so that I should solve it by
explicitly taking into account both noise figures and power
consumption?
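
The scenario in points 8) and 9) can be tried out with a toy simulation. What follows is only a sketch in Python under assumptions that are mine, not measured: each network configuration contributes a large random component to the (log) noise figure, the treatment lowers it by a small fixed amount, and the values of `effect`, `config_sd` and `noise_sd` are arbitrary.

```python
import random
from math import sqrt
from statistics import mean, variance

def simulate(n=5000, effect=0.02, config_sd=1.0, noise_sd=0.003, seed=1):
    """Return (before, after_paired, after_indep) noise figures (log scale).

    All parameter values are illustrative assumptions, not taken from the thread.
    """
    rng = random.Random(seed)
    config = [rng.gauss(0.0, config_sd) for _ in range(n)]
    before = [c + rng.gauss(0.0, noise_sd) for c in config]
    # Paired design: treatment applied to the same configuration.
    after_paired = [c - effect + rng.gauss(0.0, noise_sd) for c in config]
    # Independent design: treatment applied to freshly drawn configurations.
    after_indep = [rng.gauss(0.0, config_sd) - effect + rng.gauss(0.0, noise_sd)
                   for _ in range(n)]
    return before, after_paired, after_indep

before, after_paired, after_indep = simulate()
n = len(before)

# Marginal summaries of the two "after" samples are nearly the same ...
print(mean(after_paired), variance(after_paired))
print(mean(after_indep), variance(after_indep))

# ... but the standard error of the before/after comparison is not.
diffs = [b - a for b, a in zip(before, after_paired)]
se_paired = sqrt(variance(diffs) / n)
se_indep = sqrt(variance(before) / n + variance(after_indep) / n)
t_paired = mean(diffs) / se_paired
t_indep = (mean(before) - mean(after_indep)) / se_indep
print(t_paired, t_indep)
```

Under these assumptions the two "after" samples do have nearly the same mean and variance, as point 9) says. The difference lies in the comparison: the independent design carries the full configuration-to-configuration spread in its standard error, while the paired differences cancel it.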
Hanspeter