weka.classifiers.meta
Class Dagging

java.lang.Object
  extended by weka.classifiers.Classifier
      extended by weka.classifiers.SingleClassifierEnhancer
          extended by weka.classifiers.RandomizableSingleClassifierEnhancer
              extended by weka.classifiers.meta.Dagging
All Implemented Interfaces:
java.io.Serializable, java.lang.Cloneable, CapabilitiesHandler, OptionHandler, Randomizable, RevisionHandler, TechnicalInformationHandler

public class Dagging
extends RandomizableSingleClassifierEnhancer
implements TechnicalInformationHandler

This meta classifier splits the training data into a number of disjoint, stratified folds and trains a copy of the supplied base classifier on each fold. Predictions are made by majority vote, since all the generated base classifiers are put into the Vote meta classifier.
Useful for base classifiers whose training time is quadratic or worse in the number of training instances.
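
The splitting and voting steps described above can be sketched in plain Java. This is only an illustration: the class DaggingSketch and its methods stratifiedFolds and majorityVote are hypothetical names, class labels are assumed to be small non-negative integers, and the round-robin stratification shown here is a stand-in that does not claim to reproduce Weka's own fold-generation code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.TreeMap;

// Hypothetical illustration class; not part of Weka.
final class DaggingSketch {

    /** Splits instance indices into numFolds disjoint, roughly stratified folds. */
    public static List<List<Integer>> stratifiedFolds(int[] labels, int numFolds, long seed) {
        // Group instance indices by class label (TreeMap gives a deterministic class order).
        Map<Integer, List<Integer>> byClass = new TreeMap<>();
        for (int i = 0; i < labels.length; i++) {
            byClass.computeIfAbsent(labels[i], k -> new ArrayList<>()).add(i);
        }
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            folds.add(new ArrayList<>());
        }
        Random rnd = new Random(seed);
        int next = 0;
        for (List<Integer> members : byClass.values()) {
            // Shuffle within each class, then deal the indices round-robin so that
            // every fold receives about the same share of each class.
            Collections.shuffle(members, rnd);
            for (int idx : members) {
                folds.get(next % numFolds).add(idx);
                next++;
            }
        }
        return folds;
    }

    /** Returns the class predicted most often; ties go to the lowest class index. */
    public static int majorityVote(int[] predictions, int numClasses) {
        int[] counts = new int[numClasses];
        for (int p : predictions) {
            counts[p]++;
        }
        int best = 0;
        for (int c = 1; c < numClasses; c++) {
            if (counts[c] > counts[best]) {
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] labels = {0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1};
        List<List<Integer>> folds = stratifiedFolds(labels, 3, 1L);
        for (List<Integer> fold : folds) {
            System.out.println("fold of size " + fold.size() + ": " + fold);
        }
        // Three base classifiers predict classes 1, 0 and 1 for the same instance.
        System.out.println("vote: " + majorityVote(new int[] {1, 0, 1}, 2));
    }
}
```

In the real classifier each fold would be used to train one copy of the base classifier; the sketch only shows how the folds stay disjoint and how the individual predictions are combined.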

For more information, see:
Ting, K. M., Witten, I. H.: Stacking Bagged and Dagged Models. In: Fourteenth International Conference on Machine Learning, San Francisco, CA, 367-375, 1997.

BibTeX:

 @inproceedings{Ting1997,
    address = {San Francisco, CA},
    author = {Ting, K. M. and Witten, I. H.},
    booktitle = {Fourteenth International Conference on Machine Learning},
    editor = {D. H. Fisher},
    pages = {367-375},
    publisher = {Morgan Kaufmann Publishers},
    title = {Stacking Bagged and Dagged Models},
    year = {1997}
 }
 

Valid options are:

 -F <folds>
  The number of folds for splitting the training set into
  smaller chunks for the base classifier.
  (default 10)
 -verbose
  Whether to print some more information during building the
  classifier.
  (default is off)
 -S <num>
  Random number seed.
  (default 1)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -W
  Full name of base classifier.
  (default: weka.classifiers.functions.SMO)
 
 Options specific to classifier weka.classifiers.functions.SMO:
 
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -no-checks
  Turns off all checks - use with caution!
  Turning them off assumes that data is purely numeric, doesn't
  contain any missing values, and has a nominal class. Turning them
  off also means that no header information will be stored if the
  machine is linear. Finally, it also assumes that no instance has
  a weight equal to 0.
  (default: checks on)
 -C <double>
  The complexity constant C. (default 1)
 -N
  Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
 -L <double>
  The tolerance parameter. (default 1.0e-3)
 -P <double>
  The epsilon for round-off error. (default 1.0e-12)
 -M
  Fit logistic models to SVM outputs. 
 -V <double>
  The number of folds for the internal
  cross-validation. (default -1, use training data)
 -W <double>
  The random number seed. (default 1)
 -K <classname and parameters>
  The Kernel to use.
  (default: weka.classifiers.functions.supportVector.PolyKernel)
 
 Options specific to kernel weka.classifiers.functions.supportVector.PolyKernel:
 
 -D
  Enables debugging output (if available) to be printed.
  (default: off)
 -no-checks
  Turns off all checks - use with caution!
  (default: checks on)
 -C <num>
  The size of the cache (a prime number), 0 for full cache and 
  -1 to turn it off.
  (default: 250007)
 -E <num>
  The Exponent to use.
  (default: 1.0)
 -L
  Use lower-order terms.
  (default: no)
Options after -- are passed to the designated classifier.
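
To make the role of the -- separator concrete, the following plain-Java sketch splits an option array into the part meant for the meta classifier and the part handed on to the base classifier. The class OptionSplitSketch and its two helper methods are hypothetical and written only for illustration; this is not Weka's own parsing code, and the option values are arbitrary examples.

```java
import java.util.Arrays;

// Hypothetical illustration class; not part of Weka.
final class OptionSplitSketch {

    /** Options before the first "--": these belong to the meta classifier. */
    public static String[] metaOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? options.clone() : Arrays.copyOfRange(options, 0, sep);
    }

    /** Options after the first "--": these are passed to the base classifier. */
    public static String[] baseOptions(String[] options) {
        int sep = Arrays.asList(options).indexOf("--");
        return sep < 0 ? new String[0] : Arrays.copyOfRange(options, sep + 1, options.length);
    }

    public static void main(String[] args) {
        String[] options = {
            "-F", "5", "-S", "42", "-W", "weka.classifiers.functions.SMO",
            "--", "-C", "2.0", "-N", "1"
        };
        // prints [-F, 5, -S, 42, -W, weka.classifiers.functions.SMO]
        System.out.println(Arrays.toString(metaOptions(options)));
        // prints [-C, 2.0, -N, 1]
        System.out.println(Arrays.toString(baseOptions(options)));
    }
}
```

Here -F, -S and -W are read by Dagging, while -C and -N reach SMO, which is how the same letter (for example -C or -W) can carry a different meaning on each side of the separator.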

Version:
$Revision: 5306 $
Author:
Bernhard Pfahringer (bernhard at cs dot waikato dot ac dot nz), FracPete (fracpete at waikato dot ac dot nz)
See Also:
Vote, Serialized Form

Constructor Summary
Dagging()
          Constructor.
 
Method Summary
 void buildClassifier(Instances data)
          Builds the Dagging classifier from the given training data.
 double[] distributionForInstance(Instance instance)
          Calculates the class membership probabilities for the given test instance.
 int getNumFolds()
          Gets the number of folds to use for splitting the training set.
 java.lang.String[] getOptions()
          Gets the current settings of the Classifier.
 java.lang.String getRevision()
          Returns the revision string.
 TechnicalInformation getTechnicalInformation()
          Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.
 boolean getVerbose()
          Gets the verbose state.
 java.lang.String globalInfo()
          Returns a string describing the classifier.
 java.util.Enumeration listOptions()
          Returns an enumeration describing the available options.
static void main(java.lang.String[] args)
          Main method for testing this class.
 java.lang.String numFoldsTipText()
          Returns the tip text for this property.
 void setNumFolds(int value)
          Sets the number of folds to use for splitting the training set.
 void setOptions(java.lang.String[] options)
          Parses a given list of options.
 void setVerbose(boolean value)
          Set the verbose state.
 java.lang.String toString()
          Returns description of the classifier.
 java.lang.String verboseTipText()
          Returns the tip text for this property.
 
Methods inherited from class weka.classifiers.RandomizableSingleClassifierEnhancer
getSeed, seedTipText, setSeed
 
Methods inherited from class weka.classifiers.SingleClassifierEnhancer
classifierTipText, getCapabilities, getClassifier, setClassifier
 
Methods inherited from class weka.classifiers.Classifier
classifyInstance, debugTipText, forName, getDebug, makeCopies, makeCopy, setDebug
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Dagging

public Dagging()
Constructor.

Method Detail

globalInfo

public java.lang.String globalInfo()
Returns a string describing the classifier.

Returns:
a description suitable for displaying in the Explorer/Experimenter GUI

getTechnicalInformation

public TechnicalInformation getTechnicalInformation()
Returns an instance of a TechnicalInformation object, containing detailed information about the technical background of this class, e.g., paper reference or book this class is based on.

Specified by:
getTechnicalInformation in interface TechnicalInformationHandler
Returns:
the technical information about this class

listOptions

public java.util.Enumeration listOptions()
Returns an enumeration describing the available options.

Specified by:
listOptions in interface OptionHandler
Overrides:
listOptions in class RandomizableSingleClassifierEnhancer
Returns:
an enumeration of all the available options.

setOptions

public void setOptions(java.lang.String[] options)
                throws java.lang.Exception
Parses a given list of options.

Valid options are:

 -F <folds>
  The number of folds for splitting the training set into
  smaller chunks for the base classifier.
  (default 10)
 -verbose
  Whether to print some more information during building the
  classifier.
  (default is off)
 -S <num>
  Random number seed.
  (default 1)
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -W
  Full name of base classifier.
  (default: weka.classifiers.functions.SMO)
 
 Options specific to classifier weka.classifiers.functions.SMO:
 
 -D
  If set, classifier is run in debug mode and
  may output additional info to the console
 -no-checks
  Turns off all checks - use with caution!
  Turning them off assumes that data is purely numeric, doesn't
  contain any missing values, and has a nominal class. Turning them
  off also means that no header information will be stored if the
  machine is linear. Finally, it also assumes that no instance has
  a weight equal to 0.
  (default: checks on)
 -C <double>
  The complexity constant C. (default 1)
 -N
  Whether to 0=normalize/1=standardize/2=neither. (default 0=normalize)
 -L <double>
  The tolerance parameter. (default 1.0e-3)
 -P <double>
  The epsilon for round-off error. (default 1.0e-12)
 -M
  Fit logistic models to SVM outputs. 
 -V <double>
  The number of folds for the internal
  cross-validation. (default -1, use training data)
 -W <double>
  The random number seed. (default 1)
 -K <classname and parameters>
  The Kernel to use.
  (default: weka.classifiers.functions.supportVector.PolyKernel)
 
 Options specific to kernel weka.classifiers.functions.supportVector.PolyKernel:
 
 -D
  Enables debugging output (if available) to be printed.
  (default: off)
 -no-checks
  Turns off all checks - use with caution!
  (default: checks on)
 -C <num>
  The size of the cache (a prime number), 0 for full cache and 
  -1 to turn it off.
  (default: 250007)
 -E <num>
  The Exponent to use.
  (default: 1.0)
 -L
  Use lower-order terms.
  (default: no)
Options after -- are passed to the designated classifier.

Specified by:
setOptions in interface OptionHandler
Overrides:
setOptions in class RandomizableSingleClassifierEnhancer
Parameters:
options - the list of options as an array of strings
Throws:
java.lang.Exception - if an option is not supported

getOptions

public java.lang.String[] getOptions()
Gets the current settings of the Classifier.

Specified by:
getOptions in interface OptionHandler
Overrides:
getOptions in class RandomizableSingleClassifierEnhancer
Returns:
an array of strings suitable for passing to setOptions

getNumFolds

public int getNumFolds()
Gets the number of folds to use for splitting the training set.

Returns:
the number of folds

setNumFolds

public void setNumFolds(int value)
Sets the number of folds to use for splitting the training set.

Parameters:
value - the new number of folds
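
The number of folds determines how much training time is saved for an expensive base classifier. As a rough, idealized illustration (assuming the base classifier's training cost grows exactly as c n^2 for n instances, where c is a hypothetical constant, and ignoring the overhead of splitting and voting), training k models on folds of n/k instances each compares to training one model on all the data as follows:

```latex
\[
T_{\text{single}}(n) = c\,n^{2},
\qquad
T_{\text{dagging}}(n,k) = k \cdot c\left(\frac{n}{k}\right)^{2} = \frac{c\,n^{2}}{k}
\]
```

Under these assumptions the default of 10 folds cuts the training cost to one tenth, at the price that each base classifier sees only a tenth of the instances.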

numFoldsTipText

public java.lang.String numFoldsTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

setVerbose

public void setVerbose(boolean value)
Set the verbose state.

Parameters:
value - the verbose state

getVerbose

public boolean getVerbose()
Gets the verbose state.

Returns:
the verbose state

verboseTipText

public java.lang.String verboseTipText()
Returns the tip text for this property.

Returns:
tip text for this property suitable for displaying in the Explorer/Experimenter GUI

buildClassifier

public void buildClassifier(Instances data)
                     throws java.lang.Exception
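Builds the Dagging classifier: the training data is split into disjoint, stratified folds and a copy of the base classifier is trained on each fold.

Specified by:
buildClassifier in class Classifier
Parameters:
data - the training data to be used for generating the dagged classifier.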
Throws:
java.lang.Exception - if the classifier could not be built successfully

distributionForInstance

public double[] distributionForInstance(Instance instance)
                                 throws java.lang.Exception
Calculates the class membership probabilities for the given test instance.

Overrides:
distributionForInstance in class Classifier
Parameters:
instance - the instance to be classified
Returns:
predicted class probability distribution
Throws:
java.lang.Exception - if distribution can't be computed successfully

toString

public java.lang.String toString()
Returns description of the classifier.

Overrides:
toString in class java.lang.Object
Returns:
description of the classifier as a string

getRevision

public java.lang.String getRevision()
Returns the revision string.

Specified by:
getRevision in interface RevisionHandler
Overrides:
getRevision in class Classifier
Returns:
the revision

main

public static void main(java.lang.String[] args)
Main method for testing this class.

Parameters:
args - the options