How to ensure data correlation
Author: Raj (Guest)
Posted: Tue Dec 12, 2006 7:15 am
Hi,
I am a postgraduate student doing a project in the area of soft
computing. The problem I am trying to solve is that of predicting
operational risk. I have a small dataset of 25 data points that includes
five input variables (system downtime, number of employees, data
quality, number of transactions, number of losses) and one output
variable, the loss amount. Because the dataset is so small, I went
through the following process to generate additional data points:
Step 1: Select one variable at a time and fit various distributions
to it.
Step 2: Based on the goodness-of-fit tests, select the best
distribution for each variable separately.
Step 3: Generate random numbers for each variable separately from its
selected distribution.
Step 4: Tabulate the values.
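
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the two columns are placeholder values drawn at random (not the actual 25-point dataset), the candidate distributions are limited to normal and exponential, and the Kolmogorov-Smirnov statistic stands in for whichever goodness-of-fit tests were really used. The helper names (`ks_statistic`, `fit_candidates`, `best_fit`, `generate_table`) are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and a fitted `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n)
        for i, x in enumerate(xs)
    )


def fit_candidates(sample):
    """Step 1: fit each candidate distribution to one variable.

    Returns {name: (cdf, sampler)}. Assumes the sample mean is positive,
    which the exponential fit requires.
    """
    mu, sigma = mean(sample), stdev(sample)
    rate = 1.0 / mu  # maximum-likelihood rate of an exponential
    normal = NormalDist(mu, sigma)
    return {
        "normal": (normal.cdf, lambda rng: rng.gauss(mu, sigma)),
        "exponential": (
            lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0,
            lambda rng: rng.expovariate(rate),
        ),
    }


def best_fit(sample):
    """Step 2: keep the candidate with the smallest KS statistic."""
    fits = fit_candidates(sample)
    name = min(fits, key=lambda k: ks_statistic(sample, fits[k][0]))
    return name, fits[name][1]


def generate_table(columns, n_new, seed=0):
    """Steps 3 and 4: sample every variable separately, then tabulate rows."""
    rng = random.Random(seed)
    generated = {}
    for var, sample in columns.items():
        _, sampler = best_fit(sample)
        generated[var] = [sampler(rng) for _ in range(n_new)]
    names = list(columns)
    return [tuple(generated[v][i] for v in names) for i in range(n_new)]


# Placeholder data only: two made-up variables with 25 points each.
data_rng = random.Random(42)
columns = {
    "downtime": [data_rng.expovariate(0.5) for _ in range(25)],
    "transactions": [data_rng.gauss(1000, 50) for _ in range(25)],
}
table = generate_table(columns, n_new=100)
print(best_fit(columns["transactions"])[0], len(table))
```

Note that each column is sampled on its own, so nothing in this procedure ties the value of one variable in a row to the values of the others.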
My question is: how do we ensure that the correlation among the variables
in the original sample data is also present in the randomly generated
data? Because the random numbers were generated separately for
each variable, I could not find in the generated data any of the
correlation that was present in the original sample data.
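
The effect can be reproduced with a small sketch. Everything here is an assumption made for illustration: the two toy variables (`downtime`, `loss`), their linear relationship, and the choice of a normal distribution for each marginal are invented, not taken from the real dataset.

```python
import random
from statistics import mean, stdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(1)

# Toy "original" sample: 25 points where loss depends strongly on downtime.
downtime = [rng.gauss(10, 3) for _ in range(25)]
loss = [50 * d + rng.gauss(0, 30) for d in downtime]

# Fit each marginal on its own (a normal distribution is assumed here) ...
m_d, s_d = mean(downtime), stdev(downtime)
m_l, s_l = mean(loss), stdev(loss)

# ... and draw each variable separately, as in steps 3 and 4.
new_downtime = [rng.gauss(m_d, s_d) for _ in range(2000)]
new_loss = [rng.gauss(m_l, s_l) for _ in range(2000)]

r_original = pearson(downtime, loss)
r_generated = pearson(new_downtime, new_loss)
print(f"original r = {r_original:.2f}, generated r = {r_generated:.2f}")
```

The generated columns reproduce each variable's mean and spread, but because the draws are independent, their correlation is close to zero however strong it was in the original sample.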
Please do give me some suggestions; I will be waiting for them.
Regards,
Raj Kiran
All times are GMT - 5 Hours