need some techniques...

vasu...
Posted: Wed Aug 05, 2009 12:20 pm
Guest
Dear All,

I am working on a classification problem with a highly
unbalanced data set in data mining. Can anyone please suggest some
good techniques for achieving better results (in terms of
sensitivity, specificity, and accuracy)?

Thanks in advance
 
Phil Sherrod...
Posted: Thu Aug 06, 2009 5:15 am
Guest
On 5-Aug-2009, vasu <vasu.anss at (no spam) gmail.com> wrote:

Quote:
I am working on a classification problem with a highly
unbalanced data set in data mining. Can anyone please suggest some
good techniques for achieving better results (in terms of
sensitivity, specificity, and accuracy)?

Shift the probability threshold you use as the decision point away from 0.5, in whichever
direction balances the errors.
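
To make the threshold idea concrete, here is a minimal sketch in plain Python. The labels and scores are made up for illustration, and the helper names (confusion_counts, sensitivity_specificity) are hypothetical, not taken from any particular package. It only shows how sensitivity and specificity move when the cut-off is lowered from 0.5.

```python
def confusion_counts(y_true, probs, threshold):
    """Count TP, FP, TN, FN when predicting positive for probs >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, probs, threshold):
    """Return (sensitivity, specificity) at the given threshold."""
    tp, fp, tn, fn = confusion_counts(y_true, probs, threshold)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


# Made-up example: 2 positives among 10 cases, all scored below 0.5.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.45, 0.32, 0.35, 0.20, 0.10, 0.05, 0.15, 0.25, 0.02, 0.08]

for t in (0.5, 0.3):
    sens, spec = sensitivity_specificity(y_true, probs, t)
    print(f"threshold={t}: sensitivity={sens:.3f}, specificity={spec:.3f}")
```

On this toy data, a cut-off of 0.5 finds none of the rare cases, while 0.3 finds both at the cost of one false positive. In practice the threshold would be chosen on a held-out validation set, not on the data used for the final evaluation.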

--
Phil Sherrod
(PhilSherrod 'at' comcast.net)
http://www.dtreg.com (Decision trees, Neural networks, SVM)
http://www.nlreg.com (Nonlinear Regression)
 
msnel...
Posted: Thu Aug 06, 2009 10:10 am
Guest
I'm not sure Phil's technique will give you higher accuracy:
although it will probably give you more true positives
(higher sensitivity), it might also give you more false
positives, and therefore lower specificity (fewer true negatives).

A simple approach is to rebalance your training set by duplicating (or
upweighting) the examples from the rare class(es), or, similarly, by
discarding (or downweighting) examples from the frequent class. Also,
in my experience, training class-conditional models (e.g. naive Bayes,
Gaussian mixture models) often works better in this case than using
discriminative classifiers (e.g. SVMs, decision trees, nearest neighbour).
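
As a rough illustration of the rebalancing idea, the sketch below duplicates rare-class examples until every class is as large as the biggest one, and also computes inverse-frequency class weights as an alternative to duplication. The function names and the toy data are assumptions made for this example. The weighting formula (n_samples / (n_classes * class_count)) is one common convention, not the only possible choice.

```python
import random
from collections import Counter


def oversample_minority(examples, labels, seed=0):
    """Duplicate randomly chosen rare-class examples until all classes
    have as many examples as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(examples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(examples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def inverse_frequency_weights(labels):
    """Per-class weight: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}


# Made-up training set: 8 frequent-class and 2 rare-class examples.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9], [1.0]]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = oversample_minority(X, y)
print(Counter(y_bal))                 # class counts after duplication
print(inverse_frequency_weights(y))   # weights to use instead of duplication
```

Rebalancing of this kind should be applied to the training set only. The evaluation set should keep the original class proportions, otherwise the reported sensitivity and specificity will not reflect real use.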

Finally, if you're using a model that outputs a probability of
belonging to a class, you might consider ranking the examples
by that probability and then taking the top x% of the ranking to
belong to the class. In this case you might be able to make a reasonable
prediction even if thresholding at 0.5 gives you 0 true positives.
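
The ranking approach can be sketched as follows. This is only an illustration under assumptions: the scores are made up, the function name top_percent_positive is hypothetical, and ties are broken by original order, which is an arbitrary choice.

```python
import math


def top_percent_positive(probs, percent):
    """Label the top `percent` % of examples (ranked by score) as positive."""
    k = math.ceil(percent * len(probs) / 100)
    # Indices sorted from highest to lowest score; sorted() is stable,
    # so tied scores keep their original relative order.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = set(ranked[:k])
    return [1 if i in chosen else 0 for i in range(len(probs))]


# Made-up scores: no example reaches 0.5, so a 0.5 cut-off predicts no positives.
probs = [0.45, 0.32, 0.35, 0.20, 0.10, 0.05, 0.15, 0.25, 0.02, 0.08]

preds = top_percent_positive(probs, 20)
print(preds)
```

Here the two highest-scoring examples are labelled positive even though neither score exceeds 0.5. The value of x would normally be set from the expected prevalence of the rare class or from how many cases can be followed up.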
 
 
All times are GMT