Dora...
When averages are quite small, say in the neighborhood of 0.5 to
perhaps 15 or 20, the distribution of the data is usually
asymmetrical. That alone does not prove the data came from a Poisson
distribution. But oftentimes it's close enough, so the Poisson gets
used a lot (whether proper or not) for data of that sort.
But if the data really is from a Poisson distribution, then as the
average increases the distribution becomes more symmetrical...
eventually very close to symmetrical. So close that we can't tell
the difference between symmetrical and non-symmetrical.
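To put a number on that: the skewness of a Poisson distribution is
1/sqrt(mean), so it shrinks toward zero as the mean grows. A quick
Python sketch (the function name is just for illustration) shows how
fast that happens:

```python
import math

def poisson_skewness(mean):
    """Skewness of a Poisson distribution, which equals 1 / sqrt(mean)."""
    return 1.0 / math.sqrt(mean)

# Small means are visibly lopsided; a mean of 617,000 is essentially symmetric.
for mean in (0.5, 5, 20, 100, 617_000):
    print(f"mean = {mean:>10,}  skewness = {poisson_skewness(mean):.4f}")
```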
I think you have correctly identified the source of your problem.
Calculating Poisson probs for the full spectrum of outcomes when the
mean is 617,000 is (1) not practical and (2) really not meaningful.
The way I am understanding Poisson probabilities, they are used to
compute probabilities of very small, discrete numbers of things
occurring, given only the mean number of occurrences in some unit,
like an hour, a day, or five pages of typing.
Yes, and yes. Typically small "counts" (including even 0) for each
"exposure" (1 hour, 1 day, five pages of typing).
The formula for a Poisson probability clearly tells me why a computer
with limited resources might have trouble if the mean is 617,000,
particularly if you want a cumulative probability. I am getting the
idea that Poisson probabilities are used to track traffic, but the sort
of problems they are applied to is how many cars will run a light in an
hour. If he wants to compute how many cars will pass his van in an hour
on a particular highway then a Poisson probability might prove useful.
Yes, that is a classic application of the Poisson distribution.
Are poisson probabilities applicable to my application? Can they be
successfully computed for large numbers?
Probably not applicable. Calculating this for large numbers such as
prob(621,102) is not meaningful. Or even for the cumulative probs in
the tails.
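On the purely computational side, the overflow problem with the
formula can be sidestepped by working with logarithms. A minimal Python
sketch, assuming only the standard Poisson formula
mean^k * exp(-mean) / k! (poisson_log_pmf is a made-up name, not an
Excel or library function):

```python
import math

def poisson_log_pmf(k, mean):
    """Natural log of P(X = k) for X ~ Poisson(mean), avoiding overflow."""
    # log(mean**k * exp(-mean) / k!), using lgamma(k + 1) = log(k!)
    return k * math.log(mean) - mean - math.lgamma(k + 1)

mean = 617_000
for k in (617_000, 621_102):
    print(f"P(X = {k:,}) = {math.exp(poisson_log_pmf(k, mean)):.3g}")
```

Even at the most likely count the probability is only about 0.0005,
which is one way to see why the probability of a single exact count
this large says very little.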
If the distribution of the data is approx. symmetrical, then I'd
immediately use a normal approximation to the Poisson. I'm assuming
the purpose here is to estimate the "number or frequency in either one
or both tails of the distribution".
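A rough sketch of that normal approximation in Python, assuming the
counts really are close to Poisson so that the standard deviation is
sqrt(mean) (about 785 for a mean of 617,000). The function name and the
example cutoffs are illustrative only:

```python
import math

def poisson_upper_tail_normal(k, mean):
    """Approximate P(X >= k) for X ~ Poisson(mean) with a normal curve."""
    # Same mean and variance as the Poisson, with a 0.5 continuity correction.
    z = (k - 0.5 - mean) / math.sqrt(mean)
    # Upper-tail area of the standard normal, via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

mean = 617_000
for k in (617_000, 618_000, 619_000, 621_102):
    print(f"P(X >= {k:,}) ~ {poisson_upper_tail_normal(k, mean):.3g}")
```

The lower tail works the same way by symmetry. Far out in the tails
the approximation gets less trustworthy in relative terms, so treat
very small values as order-of-magnitude estimates.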
If it's really Poisson, then with a mean of 617,000 the data should
form a nice symmetrical pattern.
If it's not symmetrical... then, with a mean of 617,000... the data is
surely not Poisson. Even a normal approximation may not be good
enough in this case.
In many instances well-intentioned people use a distribution they
learned about in a course in basic statistics because (1) it's all
they know to use and/or (2) they don't realize that data seldom spawns
from precisely the classic distributions they learned about in
college. There are many choices other than Poisson, Normal, and
Binomial.
In short, data from "perfect distributions" is often generated in a
classroom setting... but seldom occurs in practice.
How do we deal with data that doesn't fit into these "perfect"
frameworks?
Well, we can use some of the many other types of distributions.
Personally, I prefer to step around the situation by plotting the
cumulative distribution on one or more types of probability paper and
go with the one that seems "best" because it approximates a straight
line on a cumulative plot. When that is so... when we resolve this to
a reasonable straight line on cumulative prob paper... then we can
extrapolate into the tails to estimate the probs in those tails. No
calculations required... just a straightedge.
OMU
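For anyone without probability paper handy, the straightedge approach
can be imitated in code. This is only a sketch for the normal
probability paper case: it assumes the (i - 0.5)/n plotting positions
(one convention among several) and a least-squares line in place of an
eyeballed one. The function names and the monthly counts are invented
for illustration and are not real data.

```python
from statistics import NormalDist, mean

def probability_plot_line(data):
    """Least-squares line through (normal z-score, sorted value) points."""
    values = sorted(data)
    n = len(values)
    # Plotting positions (i - 0.5) / n, converted to standard normal z-scores.
    z = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    z_bar, v_bar = mean(z), mean(values)
    sxy = sum((a - z_bar) * (b - v_bar) for a, b in zip(z, values))
    sxx = sum((a - z_bar) ** 2 for a in z)
    slope = sxy / sxx
    return slope, v_bar - slope * z_bar

def upper_tail_from_line(threshold, slope, intercept):
    """Extrapolate along the fitted line to estimate P(value > threshold)."""
    return 1.0 - NormalDist().cdf((threshold - intercept) / slope)

# Invented monthly counts, for illustration only.
monthly_counts = [598_400, 606_900, 611_200, 614_800, 617_300,
                  619_900, 623_500, 627_100, 633_800]
slope, intercept = probability_plot_line(monthly_counts)
print(f"line: value = {intercept:,.0f} + {slope:,.0f} * z")
print(f"P(count > 640,000) ~ {upper_tail_from_line(640_000, slope, intercept):.4f}")
```

If the sorted points bend away from the line, that is the cue to try a
different type of paper (a different transform of the values) before
extrapolating.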
On Feb 3, 10:12 pm, "Dora Smith" <villan...@austin.rr.com> wrote:
I work for a firm that compiles a database of license plate numbers,
some of which correspond to cars that flunk roadside emissions testing.
We process on the average 617,000 records a month, about 30% of which
(more or less) meet various criteria of our contract.
My boss has me using Excel to compute graphs, charts and statistics on
our data. I know Excel and have basic knowledge of statistics and
regression, but am not familiar with everything he has asked for.
My boss specifically wants me to give him Poisson probabilities of
meeting our goals.
I am having two problems, and I don't know if you can help me with
computing Poisson probabilities with Excel or not, but since when I use
Excel to do Poisson probability examples on the Web I get the correct
answers, I suspect this part of my problem is related to the main
question I have for you. When I tried to use an online Poisson
computer, it told me my numbers were far too large!
The way I am understanding Poisson probabilities, they are used to
compute probabilities of very small, discrete numbers of things
occurring, given only the mean number of occurrences in some unit,
like an hour, a day, or five pages of typing. The formula for a
Poisson probability clearly tells me why a computer with limited
resources might have trouble if the mean is 617,000, particularly if
you want a cumulative probability. I am getting the idea that Poisson
probabilities are used to track traffic, but the sort of problems they
are applied to is how many cars will run a light in an hour. If he
wants to compute how many cars will pass his van in an hour on a
particular highway then a Poisson probability might prove useful.
Are poisson probabilities applicable to my application? Can they be
successfully computed for large numbers?
If not, can you just suggest what statistics might be helpful for my
boss's forecasting. Frankly, what my boss says he wants to know sounds
more like goal seeking - but he insists he specifically wants Poisson
probabilities.
Thanks!
Yours,
Dora Smith
Austin, TX
tiggernu...@yahoo.com