Computers Forum Index » Computer Artificial Intelligence - Genetic » need some techniques...
vasu...
Posted: Wed Aug 05, 2009 12:20 pm
Guest
Dear All,
I am working on a classification problem with a highly
unbalanced data set in data mining. Can anyone suggest some
good techniques for achieving better results (in terms of
sensitivity, specificity, and accuracy)?
Thanks in advance.
Phil Sherrod...
Posted: Thu Aug 06, 2009 5:15 am
Guest
On 5-Aug-2009, vasu <vasu.anss at (no spam) gmail.com> wrote:
Quote: I am working on a classification problem with a highly
unbalanced data set in data mining. Can anyone suggest some
good techniques for achieving better results (in terms of
sensitivity, specificity, and accuracy)?

Shift the probability threshold you are using as the decision point away from 0.5, in whichever
direction balances the two kinds of error. If the rare class is the positive one, that usually
means lowering the threshold.
--
Phil Sherrod
(PhilSherrod 'at' comcast.net)
http://www.dtreg.com (Decision trees, Neural networks, SVM)
http://www.nlreg.com (Nonlinear Regression)
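
A minimal sketch of the threshold shift described above, using only the Python standard library. The labels and scores are made up for illustration, and the function names (`confusion_counts`, `metrics`) are hypothetical, not taken from any package mentioned in this thread. It assumes the rare class is labelled 1 and that the model outputs a score in [0, 1] for that class.

```python
def confusion_counts(y_true, scores, threshold):
    """Count TP, FP, TN, FN when predicting class 1 for score >= threshold."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(y_true, scores, threshold):
    """Return (sensitivity, specificity, accuracy) at the given threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, scores, threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    accuracy = (tp + tn) / len(y_true)
    return sensitivity, specificity, accuracy


# Made-up data: 2 rare positives among 10 cases, with hypothetical model scores.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.45, 0.30, 0.35, 0.20, 0.10, 0.10, 0.05, 0.05, 0.02, 0.01]

for threshold in (0.5, 0.25):
    sens, spec, acc = metrics(y_true, scores, threshold)
    print(f"threshold={threshold}: sensitivity={sens:.3f} "
          f"specificity={spec:.3f} accuracy={acc:.3f}")
```

On this toy data the 0.5 threshold never predicts the rare class, so sensitivity is 0. Lowering the threshold to 0.25 finds both positives at the cost of one false positive, so sensitivity rises while specificity drops. Whether accuracy goes up or down depends on the data.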
msnel...
Posted: Thu Aug 06, 2009 10:10 am
Guest
I'm not sure Phil's technique will give you higher accuracy:
although it will probably give you more true positives
(higher sensitivity), it might also give you more false
positives, and therefore lower specificity (fewer true negatives).
A simple approach is to rebalance your training set by duplicating (or
upweighting) the examples from the rare class(es), or similarly by
discarding (or downweighting) examples from the frequent class. Also,
training class-conditional models (e.g. naive Bayes, Gaussian mixture
models) can work better in this case than using discriminative
classifiers (e.g. SVMs, decision trees, nearest neighbour).
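
The rebalancing idea above could be sketched roughly as follows. This is an illustration in plain Python, not a recommendation of a specific tool. The helper names (`oversample_by_duplication`, `inverse_frequency_weights`) and the toy data are hypothetical, and the weighting formula n / (k * count) is just one common choice, assumed here.

```python
import random
from collections import Counter


def oversample_by_duplication(examples, labels, seed=0):
    """Duplicate randomly chosen examples of smaller classes until every
    class has as many examples as the largest one."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(items) for items in by_class.values())
    out_x, out_y = [], []
    for y, items in by_class.items():
        # Draw the missing examples (with replacement) from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for x in items + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


def inverse_frequency_weights(labels):
    """Per-class weights n / (k * count), so each class carries the same
    total weight. An alternative to physically duplicating examples."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {y: n / (k * c) for y, c in counts.items()}


# Toy training set: 8 examples of class 0, 2 of the rare class 1.
labels = [0] * 8 + [1] * 2
examples = list(range(10))

balanced_x, balanced_y = oversample_by_duplication(examples, labels)
print(Counter(balanced_y))
print(inverse_frequency_weights(labels))
```

Only the training set should be rebalanced this way. Sensitivity, specificity and accuracy should still be measured on data with the original class proportions.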
Finally, if you're using a model that outputs a probability of
belonging to a class, you might consider ranking the examples
by that probability and then assigning the top x% of the ranking
to the class. This way you might be able to make reasonable
predictions even if thresholding at 0.5 would give you 0 true positives.
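
A small sketch of the ranking idea, again in plain Python with made-up scores. The function name `top_fraction_positive` and the choice of x = 20% are assumptions for illustration. In practice x would be chosen from what you know about how common the rare class is.

```python
def top_fraction_positive(scores, fraction):
    """Label the highest-scoring `fraction` of examples 1 and the rest 0.

    The number of positives is fraction * len(scores), rounded to the
    nearest integer.
    """
    n_positive = round(fraction * len(scores))
    # Indices of the examples, sorted from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    predictions = [0] * len(scores)
    for i in ranked[:n_positive]:
        predictions[i] = 1
    return predictions


# Hypothetical scores: none reaches 0.5, so a 0.5 threshold predicts no positives.
scores = [0.45, 0.30, 0.35, 0.20, 0.10, 0.10, 0.05, 0.05, 0.02, 0.01]

print(top_fraction_positive(scores, 0.2))
```

Here the two highest-scoring examples are assigned to the class even though neither score exceeds 0.5.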