Estimate peak rate from 'low resolution' histogram?...
Steve...
Posted: Mon Jun 30, 2008 10:48 pm
Guest
Hi folks,

It's my first time here and 20 years since university, so please be
gentle with me!

I'm doing a piece of analysis for which I need to understand peak
demand for a resource based on historical data. The scenario is
transactions in a retail store. My problem is that the historical data
is available only in the form of total transactions per hour, and the
resource in question is demanded only for a period of around two
seconds for each transaction.

The hourly data takes much the form which might be expected, with a
symmetric curve looking very roughly like a normal distribution (as
much as I can tell based on only 10 trading hours).

My question is this: does distribution theory provide any way to
estimate to what extent the absolute peak exceeded the mean volume in
the busiest hour?

For example, suppose a subset of the data is:

1100-1200  270 transactions per hour (mean 4.5 transactions per minute)
1200-1300  300 transactions per hour (mean 5.0 transactions per minute)
1300-1400  360 transactions per hour (mean 6.0 transactions per minute)
1400-1500  300 transactions per hour (mean 5.0 transactions per minute)
1500-1600  270 transactions per hour (mean 4.5 transactions per minute)

It is obvious that there will be some minutes in the 1300-1400 period
where the actual demand was over 6 transactions per minute, but can I
make an informed estimate about how far over?

Secondly, in an ideal world I'd like to extend the analysis to predict
a maximum transactions per SECOND, and I have a "gut feel" that if I
can prove the busiest minute is (say) 2.5 times the mean transactions
per minute in that hour, it's also valid to consider that the busiest
second is 2.5 times the mean transactions per second in that minute,
but I'm not sure whether this "gut feel" bears any logical analysis.

Many thanks,

Steve
Paul Rubin...
Posted: Tue Jul 01, 2008 10:00 am
Guest
If the source of the demand is an aggregation of independent choices
whether or not to conduct a transaction, from a large (think "infinite",
for finite values of "infinite" :-) ) population, it's not unreasonable
to postulate the incidence of transactions as a Poisson process. This
is a common assumption in certain kinds of queueing models. Since
you're looking at transactions in a retail store (I assume this means
human customers), the key assumption is that customer X conducting (or
not conducting) a transaction does not affect the decision by customer Y
to conduct (or not conduct) a transaction. That might be a bit iffy if
the situation is one where customers are either attracted by crowds or
shy away from them. Still, the Poisson process is pretty widely used
(and accepted) in a variety of retail situations.

Since the rate of transactions varies during the day, what you're
looking at is a nonhomogeneous Poisson process. You can use your
histogram means to estimate the mean transaction rate as a function of
time, either as a step function (value 4.5 from 1100 to 1200, etc.) or
as a piecewise linear function (set the mean to 4.5 at 1130, 5.0 at 1230
etc. and connect the dots) or as some sort of smooth function (repeat
the previous step but use splines or something to connect the dots).
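
To make the first two options concrete, here is a minimal sketch in Python using the hourly figures from the original post. The function names (step_rate, linear_rate) and the choice to measure time in minutes after midnight are assumptions made for illustration only, and the linear version simply holds the end values flat outside the first and last half-hour marks.

```python
from bisect import bisect_right

# Hourly means from the original post, in transactions per minute.
# Times are minutes after midnight (illustrative convention).
HOUR_STARTS = [660, 720, 780, 840, 900]  # 1100, 1200, 1300, 1400, 1500
MEANS = [4.5, 5.0, 6.0, 5.0, 4.5]


def step_rate(t):
    """Step function: the hourly mean applies throughout its hour."""
    if t < HOUR_STARTS[0] or t >= HOUR_STARTS[-1] + 60:
        raise ValueError("time outside the tabulated hours")
    return MEANS[bisect_right(HOUR_STARTS, t) - 1]


def linear_rate(t):
    """Piecewise linear: pin each mean to the half hour and connect the dots."""
    mids = [s + 30 for s in HOUR_STARTS]  # 1130, 1230, 1330, ...
    if t <= mids[0]:
        return MEANS[0]   # held flat before the first dot (assumption)
    if t >= mids[-1]:
        return MEANS[-1]  # held flat after the last dot (assumption)
    i = bisect_right(mids, t) - 1
    frac = (t - mids[i]) / (mids[i + 1] - mids[i])
    return MEANS[i] + frac * (MEANS[i + 1] - MEANS[i])
```

One thing to watch with the connect-the-dots version: it no longer reproduces the hourly totals exactly. With these figures the rate over 1300-1400 averages 5.75 per minute, which gives 345 transactions rather than the observed 360, so the step function is the more conservative choice for the peak hour.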

For a Poisson process the number of events (transactions) in a given
span of time is a Poisson random variable. You can compute a
cumulative distribution function for it (it's a bit more complicated
when the process is nonhomogeneous). From that you can compute, in any
time period (say 1300-1400), the probability that the volume will exceed
any threshold you care to set. (I'm guessing this is at heart a
capacity planning question.) Since you have to estimate the mean rate,
your probability estimate for exceeding the threshold will be subject to
the vagaries of sampling error.
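
As a rough sketch of that calculation, the Python below computes the Poisson tail probability directly. For a nonhomogeneous process the count in an interval is still Poisson, with mean equal to the rate integrated over that interval, so the same function applies once you have that mean. The function names are made up for illustration, and the last one assumes the rate is constant within the hour, so that the 60 one-minute counts are independent with the same mean.

```python
import math


def poisson_cdf(k, mean):
    """P(N <= k) for N ~ Poisson(mean), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mean)  # P(N = 0)
    total = term
    for i in range(1, k + 1):
        term *= mean / i    # P(N = i) obtained from P(N = i - 1)
        total += term
    return min(total, 1.0)  # guard against rounding slightly above 1


def prob_exceeds(threshold, mean):
    """P(N > threshold) for a single interval with the given mean count."""
    return 1.0 - poisson_cdf(threshold, mean)


def prob_some_minute_exceeds(threshold, mean_per_minute, minutes=60):
    """P(at least one of `minutes` one-minute counts exceeds `threshold`).

    Assumes a constant rate, so the minutes are independent and identical.
    """
    return 1.0 - poisson_cdf(threshold, mean_per_minute) ** minutes
```

For example, prob_exceeds(6, 6.0) is the chance that a given minute in the 1300-1400 hour sees more than 6 transactions, and prob_some_minute_exceeds(12, 6.0) is the chance that at least one minute in that hour sees more than 12. Both are only as good as the estimated mean of 6.0 per minute.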

If the computations get too funky, you can probably get a reasonable
guesstimate using a simple Monte Carlo simulation.
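
A minimal sketch of such a simulation, in Python with only the standard library. It assumes a constant rate of 6 transactions per minute for the whole peak hour (the step-function estimate), generates arrivals from exponential gaps, and records the busiest minute and busiest second in each simulated hour. The function names and default trial count are illustrative choices, not anything from the original data.

```python
import random


def simulate_peak_hour(rate_per_minute=6.0, rng=random):
    """Simulate one hour of a constant-rate Poisson process.

    Returns (busiest-minute count, busiest-second count).
    """
    rate_per_second = rate_per_minute / 60.0
    per_minute = [0] * 60
    per_second = [0] * 3600
    t = rng.expovariate(rate_per_second)  # time of first arrival, in seconds
    while t < 3600.0:
        per_minute[int(t // 60)] += 1
        per_second[int(t)] += 1
        t += rng.expovariate(rate_per_second)  # exponential gap to next arrival
    return max(per_minute), max(per_second)


def run(trials=2000, seed=1):
    """Average the busiest-minute and busiest-second counts over many hours."""
    rng = random.Random(seed)
    results = [simulate_peak_hour(6.0, rng) for _ in range(trials)]
    mean_max_minute = sum(r[0] for r in results) / trials
    mean_max_second = sum(r[1] for r in results) / trials
    return mean_max_minute, mean_max_second
```

Calling run() gives the average busiest minute and busiest second across the simulated hours. Dividing each by its own mean (6 per minute, 0.1 per second) lets you compare the two peak-to-mean ratios directly, which is one way to check the "2.5 times" scaling idea in the original question under the Poisson assumption.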

HTH,
Paul
 
Page 1 of 1       All times are GMT - 5 Hours
The time now is Thu Oct 16, 2008 3:57 pm