Darwin  1.10(beta)
drwnDecisionTree Class Reference

Implements a (binary-split) decision tree classifier of arbitrary depth.

Inheritance diagram for drwnDecisionTree (base classes, direct and indirect):
drwnClassifier, drwnStdObjIface, drwnProperties, drwnWriteable, drwnCloneable, drwnTypeable

Public Member Functions

 drwnDecisionTree ()
 default constructor
 
 drwnDecisionTree (unsigned n, unsigned k=2)
 construct a classifier object for n features and k classes
 
 drwnDecisionTree (const drwnDecisionTree &c)
 copy constructor
 
virtual const char * type () const
 returns object type as a string (e.g., Foo::type() { return "Foo"; })
 
virtual drwnDecisionTree * clone () const
 returns a copy of the class usually implemented as virtual Foo* clone() { return new Foo(*this); }
 
virtual void initialize (unsigned n, unsigned k=2)
 initialize the classifier object for n features and k classes
 
virtual bool save (drwnXMLNode &node) const
 write object to XML node (see also write)
 
virtual bool load (drwnXMLNode &node)
 read object from XML node (see also read)
 
virtual double train (const drwnClassifierDataset &dataset)
 train the parameters of the classifier from a drwnClassifierDataset object
 
virtual double train (const vector< vector< double > > &features, const vector< int > &targets)
 train the parameters of the classifier from a set of features and corresponding labels
 
virtual double train (const vector< vector< double > > &features, const vector< int > &targets, const vector< double > &weights)
 train the parameters of the classifier from a weighted set of features and corresponding labels
 
virtual void getClassScores (const vector< double > &features, vector< double > &outputScores) const
 compute the unnormalized log-probability for a single feature vector
 
virtual int getClassification (const vector< double > &features) const
 return the most likely class label for a single feature vector
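
The type() and clone() members listed above follow the usual virtual-constructor idiom. A minimal, self-contained sketch of that idiom is given below. Shape, Circle and duplicate are hypothetical names invented for illustration; they are not part of Darwin, and the sketch does not reproduce the library's base classes.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// Hypothetical stand-in for a typeable/cloneable interface (not a Darwin class).
class Shape {
public:
    virtual ~Shape() {}
    virtual const char *type() const = 0;  // object type as a string
    virtual Shape *clone() const = 0;      // heap-allocated copy of the object
};

class Circle : public Shape {
public:
    explicit Circle(double r = 1.0) : radius(r) {}
    virtual const char *type() const { return "Circle"; }
    // Covariant return type: a caller holding a Circle gets a Circle* back.
    virtual Circle *clone() const { return new Circle(*this); }
    double radius;
};

// Copies an object through a base-class reference without knowing its
// concrete type; the unique_ptr takes ownership of the new object.
inline std::unique_ptr<Shape> duplicate(const Shape &s) {
    return std::unique_ptr<Shape>(s.clone());
}
```

The covariant return type is what lets a derived class declare clone() as returning a pointer to itself while still overriding the base-class member.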
 
- Public Member Functions inherited from drwnClassifier
 drwnClassifier ()
 default constructor
 
 drwnClassifier (unsigned n, unsigned k=2)
 construct a classifier with n features and k classes
 
 drwnClassifier (const drwnClassifier &c)
 copy constructor
 
int numFeatures () const
 returns the number of features expected by the classifier object
 
int numClasses () const
 returns the number of classes predicted by the classifier object
 
virtual bool valid () const
 returns true if the classifier is valid (has been initialized and trained)
 
virtual double train (const char *filename)
 train the parameters of the classifier from data stored in filename
 
virtual void getClassScores (const vector< vector< double > > &features, vector< vector< double > > &outputScores) const
 compute the unnormalized log-probability for a set of feature vectors
 
virtual void getClassMarginals (const vector< double > &features, vector< double > &outputMarginals) const
 compute the class marginal probabilities for a single feature vector
 
virtual void getClassMarginals (const vector< vector< double > > &features, vector< vector< double > > &outputMarginals) const
 compute the class marginal probabilities for a set of feature vectors
 
virtual void getClassifications (const vector< vector< double > > &features, vector< int > &outputLabels) const
 compute the most likely class labels for a set of feature vectors
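
The members above distinguish unnormalized log-probabilities (getClassScores) from class marginals (getClassMarginals) and the most likely label (getClassification). One common way to relate the three is exponentiate-and-normalize followed by an argmax, sketched below. The function names scoresToMarginals and argmaxClass are hypothetical, and this page does not confirm that Darwin computes its marginals this way.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Converts unnormalized log-probabilities into probabilities that sum to one.
// Subtracting the maximum score first keeps exp() from overflowing.
std::vector<double> scoresToMarginals(const std::vector<double> &scores) {
    assert(!scores.empty());
    const double maxScore = *std::max_element(scores.begin(), scores.end());
    std::vector<double> marginals(scores.size());
    double total = 0.0;
    for (std::size_t k = 0; k < scores.size(); k++) {
        marginals[k] = std::exp(scores[k] - maxScore);
        total += marginals[k];
    }
    for (std::size_t k = 0; k < marginals.size(); k++) {
        marginals[k] /= total;
    }
    return marginals;
}

// Index of the largest score, i.e. the most likely class label.
int argmaxClass(const std::vector<double> &scores) {
    assert(!scores.empty());
    return static_cast<int>(
        std::max_element(scores.begin(), scores.end()) - scores.begin());
}
```

Because normalization preserves ordering, the argmax of the scores and the argmax of the marginals pick the same class.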
 
- Public Member Functions inherited from drwnWriteable
bool write (const char *filename) const
 write object to file (calls save)
 
bool read (const char *filename)
 read object from file (calls load)
 
void dump () const
 print object's current state to standard output (for debugging)
 
- Public Member Functions inherited from drwnProperties
unsigned numProperties () const
 
bool hasProperty (const string &name) const
 
bool hasProperty (const char *name) const
 
unsigned findProperty (const string &name) const
 
unsigned findProperty (const char *name) const
 
void setProperty (unsigned indx, bool value)
 
void setProperty (unsigned indx, int value)
 
void setProperty (unsigned indx, double value)
 
void setProperty (unsigned indx, const string &value)
 
void setProperty (unsigned indx, const char *value)
 
void setProperty (unsigned indx, const Eigen::VectorXd &value)
 
void setProperty (unsigned indx, const Eigen::MatrixXd &value)
 
void setProperty (const char *name, bool value)
 
void setProperty (const char *name, int value)
 
void setProperty (const char *name, double value)
 
void setProperty (const char *name, const string &value)
 
void setProperty (const char *name, const char *value)
 
void setProperty (const char *name, const Eigen::VectorXd &value)
 
void setProperty (const char *name, const Eigen::MatrixXd &value)
 
string getPropertyAsString (unsigned indx) const
 
drwnPropertyType getPropertyType (unsigned indx) const
 
bool isReadOnly (unsigned indx) const
 
const drwnPropertyInterface * getProperty (unsigned indx) const
 
const drwnPropertyInterface * getProperty (const char *name) const
 
bool getBoolProperty (unsigned indx) const
 
int getIntProperty (unsigned indx) const
 
double getDoubleProperty (unsigned indx) const
 
const string & getStringProperty (unsigned indx) const
 
const list< string > & getListProperty (unsigned indx) const
 
int getSelectionProperty (unsigned indx) const
 
const Eigen::VectorXd & getVectorProperty (unsigned indx) const
 
const Eigen::MatrixXd & getMatrixProperty (unsigned indx) const
 
const string & getPropertyName (unsigned indx) const
 
vector< string > getPropertyNames () const
 
void readProperties (drwnXMLNode &xml, const char *tag="property")
 
void writeProperties (drwnXMLNode &xml, const char *tag="property") const
 
void printProperties (ostream &os) const
 

Static Public Attributes

static int MAX_DEPTH = 1
 default maximum tree depth
 
static int MAX_FEATURE_THRESHOLDS = 1000
 maximum number of thresholds to try during learning
 
static int MIN_SAMPLES = 0
 minimum number of samples (after first split)
 
static double LEAKAGE = 0.0
 probability that a training sample leaks to both splits
 
static drwnTreeSplitCriterion SPLIT_CRITERION = DRWN_DT_SPLIT_ENTROPY
 tree split criterion used during learning
 
static bool CACHE_SORTED_INDEXES = true
 pre-cache indexes of sorted features
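
To make the default SPLIT_CRITERION (DRWN_DT_SPLIT_ENTROPY) more concrete, the sketch below scores one candidate threshold on one feature by weighted information gain. It is a textbook formulation under stated assumptions, not necessarily how Darwin scores a split; entropy and splitGain are hypothetical helper names. A learner would evaluate such a score for each candidate threshold (the number tried being capped by something like MAX_FEATURE_THRESHOLDS) and keep the best one.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Shannon entropy (in nats) of a weighted class histogram.
double entropy(const std::vector<double> &classWeights) {
    double total = 0.0;
    for (std::size_t k = 0; k < classWeights.size(); k++) total += classWeights[k];
    if (total <= 0.0) return 0.0;
    double h = 0.0;
    for (std::size_t k = 0; k < classWeights.size(); k++) {
        if (classWeights[k] > 0.0) {
            const double p = classWeights[k] / total;
            h -= p * std::log(p);
        }
    }
    return h;
}

// Information gain of the split "x[i] < threshold" for a single feature,
// given labels y in [0, nClasses) and non-negative sample weights w.
double splitGain(const std::vector<double> &x, const std::vector<int> &y,
                 const std::vector<double> &w, int nClasses, double threshold) {
    assert(x.size() == y.size() && x.size() == w.size());
    std::vector<double> all(nClasses, 0.0), left(nClasses, 0.0), right(nClasses, 0.0);
    double wLeft = 0.0, wRight = 0.0;
    for (std::size_t i = 0; i < x.size(); i++) {
        all[y[i]] += w[i];
        if (x[i] < threshold) { left[y[i]] += w[i]; wLeft += w[i]; }
        else { right[y[i]] += w[i]; wRight += w[i]; }
    }
    const double wTotal = wLeft + wRight;
    if (wTotal <= 0.0) return 0.0;
    // Parent entropy minus the weight-averaged entropy of the two children.
    return entropy(all) - (wLeft / wTotal) * entropy(left)
                        - (wRight / wTotal) * entropy(right);
}
```

A threshold that sends every sample to one side yields zero gain, while a threshold that separates the classes perfectly recovers the full parent entropy.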
 

Protected Member Functions

const Eigen::VectorXd & evaluate (const Eigen::VectorXd &x) const
 
void learnDecisionTree (const vector< vector< double > > &x, const vector< int > &y, const vector< double > &w, const vector< vector< int > > &sortIndex, const drwnBitArray &sampleIndex)
 
- Protected Member Functions inherited from drwnProperties
void declareProperty (const string &name, drwnPropertyInterface *optif)
 
void undeclareProperty (const string &name)
 
void exposeProperties (drwnProperties *opts, const string &prefix=string(""), bool bSerializable=false)
 
virtual void propertyChanged (const string &name)
 

Static Protected Member Functions

static void computeSortedFeatureIndex (const vector< vector< double > > &x, const drwnBitArray &sampleIndex, int featureIndx, vector< int > &featureSortIndex)
 
static void computeSortedFeatureIndex (const vector< vector< double > > &x, vector< vector< int > > &sortIndex)
 

Protected Attributes

int _splitIndx
 variable index on which to split
 
double _splitValue
 split value (go left if less than)
 
drwnDecisionTree * _leftChild
 left child (or NULL)
 
drwnDecisionTree * _rightChild
 right child (or NULL)
 
Eigen::VectorXd _scores
 log-marginal for each class at this node
 
int _predictedClass
 argmax of _scores
 
int _maxDepth
 maximum depth of decision tree
 
- Protected Attributes inherited from drwnClassifier
int _nFeatures
 number of features
 
int _nClasses
 number of classes
 
bool _bValid
 true if the classifier has been trained or loaded
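
The protected attributes above suggest how a prediction is made: compare one feature against the split value, go left if it is smaller, and stop at a node with no children. The sketch below walks such a tree. TreeNode and classify are hypothetical names that only mirror the listed attributes; per-class scores are omitted, and the fallback for a missing child is an assumption, not documented Darwin behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical node type mirroring the attributes listed above.
struct TreeNode {
    int splitIndx;         // feature index tested at this node (-1 at a leaf)
    double splitValue;     // go left if the feature is less than this value
    TreeNode *leftChild;   // null at a leaf
    TreeNode *rightChild;  // null at a leaf
    int predictedClass;    // class label predicted at this node

    explicit TreeNode(int cls)
        : splitIndx(-1), splitValue(0.0), leftChild(nullptr),
          rightChild(nullptr), predictedClass(cls) {}
};

// Descends from the given node until a leaf is reached and returns its label.
int classify(const TreeNode *node, const std::vector<double> &features) {
    assert(node != nullptr);
    while (node->splitIndx >= 0) {
        assert(static_cast<std::size_t>(node->splitIndx) < features.size());
        const TreeNode *next = (features[node->splitIndx] < node->splitValue)
            ? node->leftChild : node->rightChild;
        if (next == nullptr) break;  // assumed fallback: use this node's label
        node = next;
    }
    return node->predictedClass;
}
```

Note that the comparison is strict, so a feature exactly equal to the split value goes to the right child.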
 

Friends

class drwnDecisionTreeThread
 
class drwnDecisionTreeConfig
 
class drwnBoostedClassifier
 
class drwnRandomForest
 

Detailed Description

Implements a (binary-split) decision tree classifier of arbitrary depth.

The following code snippet shows an example of learning a decision tree classifier on a training dataset and then testing it on a held-out evaluation dataset.

// load training dataset
drwnClassifierDataset dataset;
dataset.read("training_data.bin");
// train the classifier
const int nFeatures = dataset.numFeatures();
const int nClasses = dataset.maxTarget() + 1;
drwnDecisionTree model(nFeatures, nClasses);
model.train(dataset);
// load evaluation set
dataset.read("testing_data.bin", false);
// predict labels
vector<int> predictions;
model.getClassifications(dataset.features, predictions);
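
Once the predictions are available, a natural next step is to compare them with the ground-truth labels of the evaluation set. The helper below is a minimal sketch of that comparison. The name accuracy is hypothetical, and it assumes the true labels are available as a vector of int (for example from the dataset object, whose member for targets is not shown on this page).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of predictions that match the ground-truth labels.
// Returns 0.0 for an empty evaluation set.
double accuracy(const std::vector<int> &predictions,
                const std::vector<int> &targets) {
    assert(predictions.size() == targets.size());
    if (targets.empty()) return 0.0;
    std::size_t correct = 0;
    for (std::size_t i = 0; i < targets.size(); i++) {
        if (predictions[i] == targets[i]) correct++;
    }
    return static_cast<double>(correct) / static_cast<double>(targets.size());
}
```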

The decision tree classifier has a number of parameters for controlling its operation during training. See drwnDecisionTree::MAX_DEPTH, drwnDecisionTree::MAX_FEATURE_THRESHOLDS, drwnDecisionTree::MIN_SAMPLES, and drwnDecisionTree::SPLIT_CRITERION for details.

See Also
drwnClassifier, drwnML Tutorial

The documentation for this class was generated from the following files: