Sum of Bernoulli random variables...

Posted: Mon May 26, 2008 10:58 pm
Hello,
Suppose S = X_1 + X_2 + ... + X_n where X_i is a 0-1 random variable
with Pr(X_i = 1) = p_i and Pr(X_i = 0) = 1-p_i.
If all the p_i are the same, it is just a binomial random variable.
However, can we say anything when the p_i are not the same?
Furthermore, what will it be if p_i are also random variables with
some distribution?
I did some simulation and it seems that the sum of the Bernoulli
random variables with different p_i looks like a normal distribution.
Thanks,
Peter

Richard Ulrich...
Posted: Tue May 27, 2008 6:10 pm
If you add a number of items *of the same magnitude*,
the result tends toward Normal. That's from the Central Limit
Theorem.
You won't get that if most of the p_i are below .0001 and
there are just a few values of 0.5 thrown in. The items have to be
of the same magnitude.
--
Rich Ulrich
http://www.pitt.edu/~wpilib/index.html
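
The "same magnitude" point can be checked numerically. The following is a minimal sketch, not anything posted in the thread: it computes the exact distribution of S for fixed, known p_i by adding one variable at a time, then measures the largest gap between the exact CDF of S and the CDF of a normal with the same mean and variance. The function names and the two example lists of p_i are assumptions chosen for illustration.

```python
import math

def poisson_binomial_pmf(probs):
    """Exact pmf of S = X_1 + ... + X_n for independent Bernoulli(p_i)."""
    pmf = [1.0]  # the empty sum equals 0 with probability 1
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # X_i = 0 leaves the sum at k
            nxt[k + 1] += mass * p      # X_i = 1 moves the sum to k + 1
        pmf = nxt
    return pmf

def normal_cdf(x, mean, sd):
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def kolmogorov_distance(probs):
    """sup over x of |P(S <= x) - Phi(x)| for a moment-matched normal."""
    mean = sum(probs)
    sd = math.sqrt(sum(p * (1.0 - p) for p in probs))
    below, worst = 0.0, 0.0
    for k, mass in enumerate(poisson_binomial_pmf(probs)):
        target = normal_cdf(k, mean, sd)
        # The exact CDF is a step function, so the largest gap occurs
        # just before or just after one of its jumps.
        worst = max(worst, abs(below - target), abs(below + mass - target))
        below += mass
    return worst

n = 200
# Hypothetical inputs: p_i spread evenly over [0.3, 0.7] ...
similar = [0.3 + 0.4 * i / (n - 1) for i in range(n)]
# ... versus almost all p_i = 0.0001 with three values of 0.5 thrown in.
skewed = [0.0001] * (n - 3) + [0.5] * 3

print("similar p_i:", kolmogorov_distance(similar))
print("skewed p_i: ", kolmogorov_distance(skewed))
```

On these inputs the first distance should come out several times smaller than the second. What matters is the total variance, the sum of p_i (1 - p_i): it is large in the first list and below 1 in the second, where S takes essentially four values.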

illywhacker...
Posted: Tue May 27, 2008 10:30 pm
There are several cases:
1) As Rich says, if all the variables have the same value of p, the
distribution will 'converge in distribution' to a normal
distribution as n gets large, if you subtract from S its mean
(i.e. np) and divide the result by \sqrt{n}. The limit has mean 0
and variance p(1 - p). What this fancy-sounding
phrase means is the following. Obviously the probability will still be
concentrated on a discrete set; after all, S is an integer. So if you
plot the probability as a function at fine enough resolution, you will
see a series of spikes that look nothing like a Gaussian. On the other
hand, if you compute the probability that lies in an interval, then as
n gets large, you can make this interval as small as you like and the
probability in the interval will be given by the normal distribution.
Equivalently, if you plot the probability at a coarse resolution, it
will look like a Gaussian.
You have to subtract the mean and divide by \sqrt{n}, or you will
end up with a normal distribution that either has a mean that heads
off to infinity (if you do not subtract the mean) or that narrows or
spreads more and more around the mean (if you normalize differently).
2) If the p_{i} are not all the same, you need some knowledge of what
they are (prior distributions) in order to predict what will happen:
P({x_{i}} | n) = \int [\prod_{i} dp_{i}] [\prod_{i} b(x_{i} | p_{i})] P({p_{i}}) ,
where b is a Bernoulli distribution.
3) If the p_{i}'s are independent:
P({p_{i}}) = \prod_{i} Q_{i}(p_{i}) ,
then the x_{i} are independent Bernoulli variables with
P(x_{i} = 1) = E[p_{i}], the mean of Q_{i}. In particular, if the
prior distributions satisfy Q_{i}(p) = Q_{i}(1 - p), then
P(x_{i} = 1) = 1/2, so you are back to case (1).
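4) If the p_{i} = p are all the same, but unknown, then the
distribution for the {x_{i}} is a function only of their sum. The
details depend on the prior distribution for p. Taking a uniform
prior on [0, 1] as the simplest example, the probability of one
particular sequence {x_{i}} with sum S is
P({x_{i}} | n) = \int_{0}^{1} dp p^{S} (1 - p)^{n - S}
which is a function known as a beta function:
P({x_{i}} | n) = B(S + 1, n - S + 1) = S! (n - S)! / (n + 1)!
As you can see, the probability of an individual sequence is largest
when S is 0 or n. There are n! / (S! (n - S)!) sequences with sum S,
so under this prior the sum itself is uniformly distributed:
P(S | n) = 1 / (n + 1). For a general prior, S/n is distributed
approximately like p when n is large, so the prior does not wash out.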
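
The beta-function identity in case (4) is easy to check exactly. This is a small sketch assuming a uniform prior on p, which is what the integral without a weight function corresponds to. The function names are made up for illustration. It compares the closed form with a numerical integral, and also multiplies by the number of 0-1 sequences with sum S to get the distribution of the sum itself.

```python
from fractions import Fraction
from math import comb, factorial

def sequence_probability(n, s):
    """B(s + 1, n - s + 1) = s! (n - s)! / (n + 1)!, as an exact fraction."""
    return Fraction(factorial(s) * factorial(n - s), factorial(n + 1))

def sequence_probability_numeric(n, s, steps=20000):
    """Midpoint-rule estimate of the integral of p^s (1 - p)^(n - s) on [0, 1]."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * h
        total += p**s * (1.0 - p) ** (n - s)
    return total * h

def sum_probability(n, s):
    """P(S = s): number of sequences with sum s times the per-sequence probability."""
    return comb(n, s) * sequence_probability(n, s)

n = 10
for s in range(n + 1):
    print(s, sequence_probability(n, s), sum_probability(n, s))
```

The per-sequence probability is largest at S = 0 and S = n, while every value of the sum gets probability 1/(n + 1) under the uniform prior.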
There may be other tractable cases, but I do not think there can be a
general answer without knowing something about the prior distribution.
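
The interval statement in case (1) can also be checked directly. The sketch below is an illustration under assumed values (p = 0.3 and the interval (0.1, 0.3] are arbitrary choices, and the function names are hypothetical). It sums the exact binomial probabilities of the integers k for which (k - np)/\sqrt{n} falls in the interval and compares the result with a normal distribution of mean 0 and variance p(1 - p).

```python
import math

def binomial_log_pmf(n, p, k):
    """log P(S = k) for S ~ Binomial(n, p), via lgamma to avoid underflow."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1.0 - p))

def interval_probability(n, p, a, b):
    """Exact P(a < (S - n p) / sqrt(n) <= b)."""
    root = math.sqrt(n)
    lo = max(0, math.floor(n * p + a * root) + 1)  # smallest k above the lower edge
    hi = min(n, math.floor(n * p + b * root))      # largest k at or below the upper edge
    return sum(math.exp(binomial_log_pmf(n, p, k)) for k in range(lo, hi + 1))

def normal_interval(p, a, b):
    """Probability of (a, b] under a normal with mean 0 and variance p (1 - p)."""
    sd = math.sqrt(p * (1.0 - p))
    def phi(x):
        return 0.5 * (1.0 + math.erf(x / (sd * math.sqrt(2.0))))
    return phi(b) - phi(a)

p, a, b = 0.3, 0.1, 0.3  # assumed example values
limit = normal_interval(p, a, b)
for n in (100, 10_000, 1_000_000):
    exact = interval_probability(n, p, a, b)
    print(n, exact, limit, abs(exact - limit))
```

For small n only a couple of integers fall in the interval, so the discreteness shows; as n grows the gap to the normal value should shrink.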
illywhacker
All times are GMT - 5 Hours