Class Evaluation
- java.lang.Object
-
- org.processmining.plugins.workshop.Yaguang.WekaDiscriminationTree.Evaluation
-
- All Implemented Interfaces:
Summarizable
public class Evaluation extends java.lang.Object implements Summarizable
Class for evaluating machine learning models.
-------------------------------------------------------------------
General options when evaluating a learning scheme from the command-line:
-t filename
Name of the file with the training data. (required)
-T filename
Name of the file with the test data. If missing a cross-validation is performed.
-c index
Index of the class attribute (1, 2, ...; default: last).
-x number
The number of folds for the cross-validation (default: 10).
-s seed
Random number seed for the cross-validation (default: 1).
-m filename
The name of a file containing a cost matrix.
-l filename
Loads classifier from the given file.
-d filename
Saves classifier built from the training data into the given file.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p range
Outputs predictions for test instances, along with the attributes in the specified range (and nothing else). Use '-p 0' if no attributes are desired.
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs the graph representation of the classifier (and nothing else).
-------------------------------------------------------------------
Example usage as the main of a classifier (called FunkyClassifier):
    public static void main(String [] args) {
      try {
        Classifier scheme = new FunkyClassifier();
        System.out.println(Evaluation.evaluateModel(scheme, args));
      } catch (Exception e) {
        System.err.println(e.getMessage());
      }
    }
------------------------------------------------------------------
Example usage from within an application:
    Instances trainInstances = ... instances got from somewhere
    Instances testInstances = ... instances got from somewhere
    Classifier scheme = ... scheme got from somewhere
    Evaluation evaluation = new Evaluation(trainInstances);
    evaluation.evaluateModel(scheme, testInstances);
    System.out.println(evaluation.toSummaryString());
- Version:
- $Revision: 1.53.2.6 $
- Author:
- Eibe Frank (eibe@cs.waikato.ac.nz), Len Trigg (trigg@cs.waikato.ac.nz)
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static double[] | acc |
protected static int | k_MarginResolution | Resolution of the margin histogram
protected boolean | m_ClassIsNominal | Is the class nominal or numeric?
protected java.lang.String[] | m_ClassNames | The names of the classes.
protected double[] | m_ClassPriors | The prior probabilities of the classes
protected double | m_ClassPriorsSum | The sum of counts for priors
protected double[][] | m_ConfusionMatrix | Array for storing the confusion matrix.
protected double | m_Correct | The weight of all correctly classified instances.
protected CostMatrix | m_CostMatrix | The cost matrix (if given).
protected Estimator | m_ErrorEstimator | Numeric class error estimator for scheme
protected double | m_Incorrect | The weight of all incorrectly classified instances.
protected double[] | m_MarginCounts | Cumulative margin distribution
protected double | m_MissingClass | The weight of all instances that had no class assigned to them.
protected boolean | m_NoPriors | Enables/disables the use of priors, e.g., if no training set is present in case of de-serialized schemes
protected int | m_NumClasses | The number of classes.
protected int | m_NumFolds | The number of folds for a cross-validation.
protected int | m_NumTrainClassVals | Number of non-missing class training instances seen
protected Estimator | m_PriorErrorEstimator | Numeric class error estimator for prior
protected double | m_SumAbsErr | Sum of absolute errors.
protected double | m_SumClass | Sum of class values.
protected double | m_SumClassPredicted | Sum of predicted * class values.
protected double | m_SumErr | Sum of errors.
protected double | m_SumKBInfo | Total Kononenko & Bratko Information
protected double | m_SumPredicted | Sum of predicted values.
protected double | m_SumPriorAbsErr | Sum of absolute errors of the prior
protected double | m_SumPriorEntropy | Total entropy of prior predictions
protected double | m_SumPriorSqrErr | Sum of squared errors of the prior
protected double | m_SumSchemeEntropy | Total entropy of scheme predictions
protected double | m_SumSqrClass | Sum of squared class values.
protected double | m_SumSqrErr | Sum of squared errors.
protected double | m_SumSqrPredicted | Sum of squared predicted values.
protected double | m_TotalCost | The total cost of predictions (includes instance weights)
protected double[] | m_TrainClassVals | Array containing all numeric training class values seen
protected double[] | m_TrainClassWeights | Array containing all numeric training class weights
protected double | m_Unclassified | The weight of all unclassified instances.
protected double | m_WithClass | The weight of all instances that had a class assigned to them.
protected static double | MIN_SF_PROB | The minimum probability accepted from an estimator to avoid taking log(0) in Sf calculations.
-
Constructor Summary
Constructors
Constructor | Description
Evaluation(Instances data) | Initializes all the counters for the evaluation.
Evaluation(Instances data, CostMatrix costMatrix) | Initializes all the counters for the evaluation and also takes a cost matrix as parameter.
-
Method Summary
All Methods Static Methods Instance Methods Concrete Methods
Modifier and Type | Method | Description
protected void | addNumericTrainClass(double classValue, double weight) | Adds a numeric (non-missing) training class value and weight to the buffer of stored values.
protected static java.lang.String | attributeValuesString(Instance instance, Range attRange) | Builds a string listing the attribute values in a specified range of indices, separated by commas and enclosed in brackets.
double | avgCost() | Gets the average cost, that is, total cost of misclassifications (incorrect plus unclassified) over the total number of instances.
double[][] | confusionMatrix() | Returns a copy of the confusion matrix.
double | correct() | Gets the number of instances correctly classified (that is, for which a correct prediction was made).
double | correlationCoefficient() | Returns the correlation coefficient if the class is numeric.
void | crossValidateModel(java.lang.String classifierString, Instances data, int numFolds, java.lang.String[] options, java.util.Random random) | Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.
void | crossValidateModel(Classifier classifier, Instances data, int numFolds, java.util.Random random) | Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.
double[] | doEvaluateModelOnce(Classifier classifier, Instance instance) |
boolean | equals(java.lang.Object obj) | Tests whether the current evaluation object is equal to another evaluation object.
double | errorRate() | Returns the estimated error rate or the root mean squared error (if the class is numeric).
static java.lang.String | evaluateModel(java.lang.String classifierString, java.lang.String[] options) | Evaluates a classifier with the options given in an array of strings.
static java.lang.String | evaluateModel(Classifier classifier, java.lang.String[] options) | Evaluates a classifier with the options given in an array of strings.
double[] | evaluateModel(Classifier classifier, Instances data) | Evaluates the classifier on a given set of instances.
double | evaluateModelOnce(double[] dist, Instance instance) | Evaluates the supplied distribution on a single instance.
void | evaluateModelOnce(double prediction, Instance instance) | Evaluates the supplied prediction on a single instance.
double | evaluateModelOnce(Classifier classifier, Instance instance) |
double | falseNegativeRate(int classIndex) | Calculate the false negative rate with respect to a particular class.
double | falsePositiveRate(int classIndex) | Calculate the false positive rate with respect to a particular class.
double | fMeasure(int classIndex) | Calculate the F-Measure with respect to a particular class.
protected static CostMatrix | handleCostOption(java.lang.String costFileName, int numClasses) | Attempts to load a cost matrix.
double | incorrect() | Gets the number of instances incorrectly classified (that is, for which an incorrect prediction was made).
double | kappa() | Returns value of kappa statistic if class is nominal.
double | KBInformation() | Return the total Kononenko & Bratko Information score in bits.
double | KBMeanInformation() | Return the Kononenko & Bratko Information score in bits per instance.
double | KBRelativeInformation() | Return the Kononenko & Bratko Relative Information score.
static void | main(java.lang.String[] args) | A test method for this class.
protected double[] | makeDistribution(double predictedClass) | Convert a single prediction into a probability distribution with all zero probabilities except the predicted value which has probability 1.0.
protected static java.lang.String | makeOptionString(Classifier classifier) | Make up the help string giving all the command line options.
double | meanAbsoluteError() | Returns the mean absolute error.
double | meanPriorAbsoluteError() | Returns the mean absolute error of the prior.
protected java.lang.String | num2ShortID(int num, char[] IDChars, int IDWidth) | Method for generating indices for the confusion matrix.
double | numFalseNegatives(int classIndex) | Calculate number of false negatives with respect to a particular class.
double | numFalsePositives(int classIndex) | Calculate number of false positives with respect to a particular class.
double | numInstances() | Gets the number of test instances that had a known class value (actually the sum of the weights of test instances with known class value).
double | numTrueNegatives(int classIndex) | Calculate the number of true negatives with respect to a particular class.
double | numTruePositives(int classIndex) | Calculate the number of true positives with respect to a particular class.
double | pctCorrect() | Gets the percentage of instances correctly classified (that is, for which a correct prediction was made).
double | pctIncorrect() | Gets the percentage of instances incorrectly classified (that is, for which an incorrect prediction was made).
double | pctUnclassified() | Gets the percentage of instances not classified (that is, for which no prediction was made by the classifier).
double | precision(int classIndex) | Calculate the precision with respect to a particular class.
protected static java.lang.String | printClassifications(Classifier classifier, Instances train, java.lang.String testFileName, int classIndex, Range attributesToOutput) | Prints the predictions for the given dataset into a String variable.
double | priorEntropy() | Calculate the entropy of the prior distribution.
double | recall(int classIndex) | Calculate the recall with respect to a particular class.
double | relativeAbsoluteError() | Returns the relative absolute error.
double | rootMeanPriorSquaredError() | Returns the root mean prior squared error.
double | rootMeanSquaredError() | Returns the root mean squared error.
double | rootRelativeSquaredError() | Returns the root relative squared error if the class is numeric.
protected void | setNumericPriorsFromBuffer() | Sets up the priors for numeric class attributes from the training class values that have been seen so far.
void | setPriors(Instances train) | Sets the class prior probabilities.
double | SFEntropyGain() | Returns the total SF, which is the null model entropy minus the scheme entropy.
double | SFMeanEntropyGain() | Returns the SF per instance, which is the null model entropy minus the scheme entropy, per instance.
double | SFMeanPriorEntropy() | Returns the entropy per instance for the null model.
double | SFMeanSchemeEntropy() | Returns the entropy per instance for the scheme.
double | SFPriorEntropy() | Returns the total entropy for the null model.
double | SFSchemeEntropy() | Returns the total entropy for the scheme.
java.lang.String | toClassDetailsString() | Generates a breakdown of the accuracy for each class (with default title), incorporating various information-retrieval statistics, such as true/false positive rate, precision/recall/F-Measure.
java.lang.String | toClassDetailsString(java.lang.String title) | Generates a breakdown of the accuracy for each class, incorporating various information-retrieval statistics, such as true/false positive rate, precision/recall/F-Measure.
java.lang.String | toCumulativeMarginDistributionString() | Output the cumulative margin distribution as a string suitable for input for gnuplot or similar package.
java.lang.String | toMatrixString() | Calls toMatrixString() with a default title.
java.lang.String | toMatrixString(java.lang.String title) | Outputs the performance statistics as a classification confusion matrix.
java.lang.String | toSummaryString() | Calls toSummaryString() with no title and no complexity stats.
java.lang.String | toSummaryString(boolean printComplexityStatistics) | Calls toSummaryString() with a default title.
java.lang.String | toSummaryString(java.lang.String title, boolean printComplexityStatistics) | Outputs the performance statistics in summary form.
double | totalCost() | Gets the total cost, that is, the cost of each prediction times the weight of the instance, summed over all instances.
double | trueNegativeRate(int classIndex) | Calculate the true negative rate with respect to a particular class.
double | truePositiveRate(int classIndex) | Calculate the true positive rate with respect to a particular class.
double | unclassified() | Gets the number of instances not classified (that is, for which no prediction was made by the classifier).
protected void | updateMargins(double[] predictedDistribution, int actualClass, double weight) | Update the cumulative record of classification margins.
protected void | updateNumericScores(double[] predicted, double[] actual, double weight) | Update the numeric accuracy measures.
void | updatePriors(Instance instance) | Updates the class prior probabilities (when incrementally training).
protected void | updateStatsForClassifier(double[] predictedDistribution, Instance instance) | Updates all the statistics about a classifier's performance for the current test instance.
protected void | updateStatsForPredictor(double predictedValue, Instance instance) | Updates all the statistics about a predictor's performance for the current test instance.
void | useNoPriors() | Disables the use of priors, e.g., in case of de-serialized schemes that have no access to the original training set, but are evaluated on a test set.
protected static java.lang.String | wekaStaticWrapper(Sourcable classifier, java.lang.String className) | Wraps a static classifier in enough source to test using the weka class libraries.
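The per-class methods listed above (numTruePositives, numFalsePositives, numFalseNegatives, precision, recall, fMeasure) can all be derived from a confusion matrix. The following is a minimal standalone sketch, not this class's actual code. It assumes that rows index the actual class and columns the predicted class, and that a ratio with a zero denominator is reported as 0; the class name PerClassStatsSketch is hypothetical.

```java
/** Standalone sketch of per-class statistics computed from a confusion matrix. */
class PerClassStatsSketch {
    // Assumption: m[actual][predicted] holds summed instance weights.

    static double truePositives(double[][] m, int c) {
        return m[c][c];
    }

    /** Weight predicted as class c whose actual class is different. */
    static double falsePositives(double[][] m, int c) {
        double sum = 0;
        for (int i = 0; i < m.length; i++) {
            if (i != c) sum += m[i][c];
        }
        return sum;
    }

    /** Weight of actual class c that was predicted as some other class. */
    static double falseNegatives(double[][] m, int c) {
        double sum = 0;
        for (int j = 0; j < m[c].length; j++) {
            if (j != c) sum += m[c][j];
        }
        return sum;
    }

    static double precision(double[][] m, int c) {
        double tp = truePositives(m, c);
        double denom = tp + falsePositives(m, c);
        return denom == 0 ? 0 : tp / denom;
    }

    static double recall(double[][] m, int c) {
        double tp = truePositives(m, c);
        double denom = tp + falseNegatives(m, c);
        return denom == 0 ? 0 : tp / denom;
    }

    /** Harmonic mean of precision and recall. */
    static double fMeasure(double[][] m, int c) {
        double p = precision(m, c);
        double r = recall(m, c);
        return (p + r) == 0 ? 0 : 2 * p * r / (p + r);
    }
}
```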
-
-
-
Field Detail
-
m_NumClasses
protected int m_NumClasses
The number of classes.
-
m_NumFolds
protected int m_NumFolds
The number of folds for a cross-validation.
-
m_Incorrect
protected double m_Incorrect
The weight of all incorrectly classified instances.
-
m_Correct
protected double m_Correct
The weight of all correctly classified instances.
-
m_Unclassified
protected double m_Unclassified
The weight of all unclassified instances.
-
m_MissingClass
protected double m_MissingClass
The weight of all instances that had no class assigned to them.
-
m_WithClass
protected double m_WithClass
The weight of all instances that had a class assigned to them.
-
m_ConfusionMatrix
protected double[][] m_ConfusionMatrix
Array for storing the confusion matrix.
-
m_ClassNames
protected java.lang.String[] m_ClassNames
The names of the classes.
-
m_ClassIsNominal
protected boolean m_ClassIsNominal
Is the class nominal or numeric?
-
m_ClassPriors
protected double[] m_ClassPriors
The prior probabilities of the classes
-
m_ClassPriorsSum
protected double m_ClassPriorsSum
The sum of counts for priors
-
m_CostMatrix
protected CostMatrix m_CostMatrix
The cost matrix (if given).
-
m_TotalCost
protected double m_TotalCost
The total cost of predictions (includes instance weights)
-
m_SumErr
protected double m_SumErr
Sum of errors.
-
m_SumAbsErr
protected double m_SumAbsErr
Sum of absolute errors.
-
m_SumSqrErr
protected double m_SumSqrErr
Sum of squared errors.
-
m_SumClass
protected double m_SumClass
Sum of class values.
-
m_SumSqrClass
protected double m_SumSqrClass
Sum of squared class values.
-
m_SumPredicted
protected double m_SumPredicted
Sum of predicted values.
-
m_SumSqrPredicted
protected double m_SumSqrPredicted
Sum of squared predicted values.
-
m_SumClassPredicted
protected double m_SumClassPredicted
Sum of predicted * class values.
-
m_SumPriorAbsErr
protected double m_SumPriorAbsErr
Sum of absolute errors of the prior
-
m_SumPriorSqrErr
protected double m_SumPriorSqrErr
Sum of squared errors of the prior
-
m_SumKBInfo
protected double m_SumKBInfo
Total Kononenko & Bratko Information
-
k_MarginResolution
protected static int k_MarginResolution
Resolution of the margin histogram
-
m_MarginCounts
protected double[] m_MarginCounts
Cumulative margin distribution
-
m_NumTrainClassVals
protected int m_NumTrainClassVals
Number of non-missing class training instances seen
-
m_TrainClassVals
protected double[] m_TrainClassVals
Array containing all numeric training class values seen
-
m_TrainClassWeights
protected double[] m_TrainClassWeights
Array containing all numeric training class weights
-
m_PriorErrorEstimator
protected Estimator m_PriorErrorEstimator
Numeric class error estimator for prior
-
m_ErrorEstimator
protected Estimator m_ErrorEstimator
Numeric class error estimator for scheme
-
MIN_SF_PROB
protected static final double MIN_SF_PROB
The minimum probability accepted from an estimator to avoid taking log(0) in Sf calculations.
- See Also:
- Constant Field Values
-
m_SumPriorEntropy
protected double m_SumPriorEntropy
Total entropy of prior predictions
-
m_SumSchemeEntropy
protected double m_SumSchemeEntropy
Total entropy of scheme predictions
-
m_NoPriors
protected boolean m_NoPriors
Enables or disables the use of priors, e.g., when no training set is available for de-serialized schemes.
-
acc
public static double[] acc
-
-
Constructor Detail
-
Evaluation
public Evaluation(Instances data) throws java.lang.Exception
Initializes all the counters for the evaluation. Use useNoPriors() if the dataset is the test set and you can't initialize with the priors from the training set via setPriors(Instances).
- Parameters:
data - set of training instances, to get some header information and prior class distribution information
- Throws:
java.lang.Exception - if the class is not defined
- See Also:
useNoPriors(), setPriors(Instances)
-
Evaluation
public Evaluation(Instances data, CostMatrix costMatrix) throws java.lang.Exception
Initializes all the counters for the evaluation and also takes a cost matrix as parameter. Use useNoPriors() if the dataset is the test set and you can't initialize with the priors from the training set via setPriors(Instances).
- Parameters:
data - set of training instances, to get some header information and prior class distribution information
costMatrix - the cost matrix; if null, default costs will be used
- Throws:
java.lang.Exception - if cost matrix is not compatible with data, the class is not defined or the class is numeric
- See Also:
useNoPriors(), setPriors(Instances)
-
-
Method Detail
-
doEvaluateModelOnce
public double[] doEvaluateModelOnce(Classifier classifier, Instance instance) throws java.lang.Exception
- Throws:
java.lang.Exception
-
confusionMatrix
public double[][] confusionMatrix()
Returns a copy of the confusion matrix.
- Returns:
- a copy of the confusion matrix as a two-dimensional array
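Returning a copy means callers cannot modify the evaluation's internal counts. A minimal sketch of that idea follows, assuming weighted counts indexed as [actual][predicted]; the class ConfusionMatrixSketch and its method names are hypothetical and only illustrate the defensive-copy pattern, not this class's implementation.

```java
/** Standalone sketch: accumulate a weighted confusion matrix and hand out copies. */
class ConfusionMatrixSketch {
    // Assumption: matrix[actual][predicted] accumulates instance weights.
    private final double[][] matrix;

    ConfusionMatrixSketch(int numClasses) {
        matrix = new double[numClasses][numClasses];
    }

    /** Records one prediction with the given instance weight. */
    void add(int actualClass, int predictedClass, double weight) {
        matrix[actualClass][predictedClass] += weight;
    }

    /** Returns a deep copy, so changes by the caller do not affect the counts. */
    double[][] copy() {
        double[][] out = new double[matrix.length][];
        for (int i = 0; i < matrix.length; i++) {
            out[i] = matrix[i].clone();
        }
        return out;
    }
}
```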
-
crossValidateModel
public void crossValidateModel(Classifier classifier, Instances data, int numFolds, java.util.Random random) throws java.lang.Exception
Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances. Now performs a deep copy of the classifier before each call to buildClassifier() (just in case the classifier is not initialized properly).
- Parameters:
classifier - the classifier with any options set.
data - the data on which the cross-validation is to be performed
numFolds - the number of folds for the cross-validation
random - random number generator for randomization
- Throws:
java.lang.Exception - if a classifier could not be generated successfully or the class is not defined
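Stratification keeps the class proportions roughly equal across folds. The sketch below shows one simple way to do that: shuffle the instances, then deal them out to folds class by class. It is an illustration under that assumption, not necessarily the procedure this method uses; StratifiedFoldsSketch and assignFolds are hypothetical names, and class labels are assumed to be integers in [0, numClasses).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Standalone sketch of stratified fold assignment for a nominal class. */
class StratifiedFoldsSketch {

    /** Returns fold[i] in [0, numFolds) for each instance i. */
    static int[] assignFolds(int[] classLabels, int numClasses, int numFolds, Random random) {
        // Randomize the visiting order so fold membership varies with the seed.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < classLabels.length; i++) {
            order.add(i);
        }
        Collections.shuffle(order, random);

        int[] fold = new int[classLabels.length];
        int next = 0; // running counter, continues across classes
        for (int c = 0; c < numClasses; c++) {
            for (int idx : order) {
                if (classLabels[idx] == c) {
                    fold[idx] = next % numFolds; // deal round-robin within the class
                    next++;
                }
            }
        }
        return fold;
    }
}
```

For each fold f, the instances with fold[i] == f would form the test set and the remaining instances the training set.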
-
crossValidateModel
public void crossValidateModel(java.lang.String classifierString, Instances data, int numFolds, java.lang.String[] options, java.util.Random random) throws java.lang.Exception
Performs a (stratified if class is nominal) cross-validation for a classifier on a set of instances.
- Parameters:
classifierString - a string naming the class of the classifier
data - the data on which the cross-validation is to be performed
numFolds - the number of folds for the cross-validation
options - the options to the classifier. Any options accepted by the classifier will be removed from this array.
random - the random number generator for randomizing the data
- Throws:
java.lang.Exception - if a classifier could not be generated successfully or the class is not defined
-
evaluateModel
public static java.lang.String evaluateModel(java.lang.String classifierString, java.lang.String[] options) throws java.lang.Exception
Evaluates a classifier with the options given in an array of strings. Valid options are:
-t filename
Name of the file with the training data. (required)
-T filename
Name of the file with the test data. If missing a cross-validation is performed.
-c index
Index of the class attribute (1, 2, ...; default: last).
-x number
The number of folds for the cross-validation (default: 10).
-s seed
Random number seed for the cross-validation (default: 1).
-m filename
The name of a file containing a cost matrix.
-l filename
Loads classifier from the given file.
-d filename
Saves classifier built from the training data into the given file.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs detailed information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p range
Outputs predictions for test instances, along with the attributes in the specified range (and nothing else). Use '-p 0' if no attributes are desired.
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs the graph representation of the classifier (and nothing else).
- Parameters:
classifierString - class of machine learning classifier as a string
options - the array of string containing the options
- Returns:
- a string describing the results
- Throws:
java.lang.Exception - if model could not be evaluated successfully
-
main
public static void main(java.lang.String[] args)
A test method for this class. Just extracts the first command line argument as a classifier class name and calls evaluateModel.
- Parameters:
args - an array of command line arguments, the first of which must be the class name of a classifier.
-
evaluateModel
public static java.lang.String evaluateModel(Classifier classifier, java.lang.String[] options) throws java.lang.Exception
Evaluates a classifier with the options given in an array of strings. Valid options are:
-t name of training file
Name of the file with the training data. (required)
-T name of test file
Name of the file with the test data. If missing a cross-validation is performed.
-c class index
Index of the class attribute (1, 2, ...; default: last).
-x number of folds
The number of folds for the cross-validation (default: 10).
-s random number seed
Random number seed for the cross-validation (default: 1).
-m file with cost matrix
The name of a file containing a cost matrix.
-l name of model input file
Loads classifier from the given file.
-d name of model output file
Saves classifier built from the training data into the given file.
-v
Outputs no statistics for the training data.
-o
Outputs statistics only, not the classifier.
-i
Outputs detailed information-retrieval statistics per class.
-k
Outputs information-theoretic statistics.
-p
Outputs predictions for test instances (and nothing else).
-r
Outputs cumulative margin distribution (and nothing else).
-g
Only for classifiers that implement "Graphable." Outputs the graph representation of the classifier (and nothing else).
- Parameters:
classifier - machine learning classifier
options - the array of string containing the options
- Returns:
- a string describing the results
- Throws:
java.lang.Exception - if model could not be evaluated successfully
-
handleCostOption
protected static CostMatrix handleCostOption(java.lang.String costFileName, int numClasses) throws java.lang.Exception
Attempts to load a cost matrix.
- Parameters:
costFileName - the filename of the cost matrix
numClasses - the number of classes that should be in the cost matrix (only used if the cost file is in old format).
- Returns:
- a CostMatrix value, or null if costFileName is empty
- Throws:
java.lang.Exception - if an error occurs.
-
evaluateModel
public double[] evaluateModel(Classifier classifier, Instances data) throws java.lang.Exception
Evaluates the classifier on a given set of instances. Note that the data must have exactly the same format (e.g. order of attributes) as the data used to train the classifier! Otherwise the results will generally be meaningless.
- Parameters:
classifier - machine learning classifier
data - set of test instances for evaluation
- Returns:
- the predictions
- Throws:
java.lang.Exception - if model could not be evaluated successfully
-
evaluateModelOnce
public double evaluateModelOnce(Classifier classifier, Instance instance) throws java.lang.Exception
- Throws:
java.lang.Exception
-
evaluateModelOnce
public double evaluateModelOnce(double[] dist, Instance instance) throws java.lang.Exception
Evaluates the supplied distribution on a single instance.
- Parameters:
dist - the supplied distribution
instance - the test instance to be classified
- Returns:
- the prediction
- Throws:
java.lang.Exception - if model could not be evaluated successfully
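A class distribution can be turned into a single prediction by picking the most probable class. The sketch below assumes exactly that rule and uses -1 as a hypothetical marker for "no prediction" when every probability is zero; the page does not state how this method derives its return value, so treat both choices as assumptions.

```java
/** Standalone sketch: derive a predicted class index from a class distribution. */
class PredictionFromDistributionSketch {

    /** Returns the index of the largest probability, or -1 if all entries are zero. */
    static int predictedClass(double[] dist) {
        int best = -1;       // hypothetical "unclassified" marker
        double bestProb = 0;
        for (int i = 0; i < dist.length; i++) {
            if (dist[i] > bestProb) {
                best = i;
                bestProb = dist[i];
            }
        }
        return best;
    }
}
```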
-
evaluateModelOnce
public void evaluateModelOnce(double prediction, Instance instance) throws java.lang.Exception
Evaluates the supplied prediction on a single instance.
- Parameters:
prediction - the supplied prediction
instance - the test instance to be classified
- Throws:
java.lang.Exception - if model could not be evaluated successfully
-
wekaStaticWrapper
protected static java.lang.String wekaStaticWrapper(Sourcable classifier, java.lang.String className) throws java.lang.Exception
Wraps a static classifier in enough source to test using the weka class libraries.
- Parameters:
classifier - a Sourcable Classifier
className - the name to give to the source code class
- Returns:
- the source for a static classifier that can be tested with weka libraries.
- Throws:
java.lang.Exception - if code-generation fails
-
numInstances
public final double numInstances()
Gets the number of test instances that had a known class value (actually the sum of the weights of test instances with known class value).
- Returns:
- the number of test instances with known class
-
incorrect
public final double incorrect()
Gets the number of instances incorrectly classified (that is, for which an incorrect prediction was made). (Actually the sum of the weights of these instances)
- Returns:
- the number of incorrectly classified instances
-
pctIncorrect
public final double pctIncorrect()
Gets the percentage of instances incorrectly classified (that is, for which an incorrect prediction was made).
- Returns:
- the percent of incorrectly classified instances (between 0 and 100)
-
totalCost
public final double totalCost()
Gets the total cost, that is, the cost of each prediction times the weight of the instance, summed over all instances.
- Returns:
- the total cost
-
avgCost
public final double avgCost()
Gets the average cost, that is, total cost of misclassifications (incorrect plus unclassified) over the total number of instances.
- Returns:
- the average cost.
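The relationship between totalCost() and avgCost() can be sketched from a weighted confusion matrix and a cost matrix of the same shape. This is a simplified illustration: it assumes both matrices are indexed as [actual][predicted], ignores unclassified instances, and uses hypothetical names (CostSketch, plain double[][] in place of CostMatrix).

```java
/** Standalone sketch of total and average misclassification cost. */
class CostSketch {

    /** Sum over all cells of (weight of predictions in the cell) * (cost of the cell). */
    static double totalCost(double[][] confusion, double[][] cost) {
        double total = 0;
        for (int i = 0; i < confusion.length; i++) {
            for (int j = 0; j < confusion[i].length; j++) {
                total += confusion[i][j] * cost[i][j];
            }
        }
        return total;
    }

    /** Total cost divided by the total weight of the evaluated instances. */
    static double avgCost(double[][] confusion, double[][] cost) {
        double weight = 0;
        for (double[] row : confusion) {
            for (double v : row) {
                weight += v;
            }
        }
        return totalCost(confusion, cost) / weight;
    }
}
```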
-
correct
public final double correct()
Gets the number of instances correctly classified (that is, for which a correct prediction was made). (Actually the sum of the weights of these instances)
- Returns:
- the number of correctly classified instances
-
pctCorrect
public final double pctCorrect()
Gets the percentage of instances correctly classified (that is, for which a correct prediction was made).
- Returns:
- the percent of correctly classified instances (between 0 and 100)
-
unclassified
public final double unclassified()
Gets the number of instances not classified (that is, for which no prediction was made by the classifier). (Actually the sum of the weights of these instances)
- Returns:
- the number of unclassified instances
-
pctUnclassified
public final double pctUnclassified()
Gets the percentage of instances not classified (that is, for which no prediction was made by the classifier).
- Returns:
- the percent of unclassified instances (between 0 and 100)
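The counts and percentages above are sums of instance weights rather than plain tallies. A minimal sketch of how the three percentages relate, assuming the denominator is the total weight of instances with a known class (correct plus incorrect plus unclassified); OutcomeCounterSketch and its methods are hypothetical names.

```java
/** Standalone sketch of weighted correct / incorrect / unclassified percentages. */
class OutcomeCounterSketch {
    private double correct;
    private double incorrect;
    private double unclassified;

    void addCorrect(double weight) { correct += weight; }
    void addIncorrect(double weight) { incorrect += weight; }
    void addUnclassified(double weight) { unclassified += weight; }

    /** Assumption: total weight of instances that had a class value. */
    double withClass() { return correct + incorrect + unclassified; }

    double pctCorrect() { return 100 * correct / withClass(); }
    double pctIncorrect() { return 100 * incorrect / withClass(); }
    double pctUnclassified() { return 100 * unclassified / withClass(); }
}
```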
-
errorRate
public final double errorRate()
Returns the estimated error rate or the root mean squared error (if the class is numeric). If a cost matrix was given this error rate gives the average cost.
- Returns:
- the estimated error rate (between 0 and 1, or between 0 and maximum cost)
-
kappa
public final double kappa()
Returns value of kappa statistic if class is nominal.
- Returns:
- the value of the kappa statistic
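Cohen's kappa compares the observed agreement between actual and predicted classes with the agreement expected by chance. The sketch below uses the standard textbook definition computed from a confusion matrix; it is not taken from this class's source, and the handling of the degenerate case (chance agreement of 1) is an assumption.

```java
/** Standalone sketch of the kappa statistic from a confusion matrix. */
class KappaSketch {

    static double kappa(double[][] m) {
        int n = m.length;
        double[] rowSum = new double[n];
        double[] colSum = new double[n];
        double total = 0;
        double diagonal = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                rowSum[i] += m[i][j];
                colSum[j] += m[i][j];
                total += m[i][j];
                if (i == j) diagonal += m[i][j];
            }
        }
        double observed = diagonal / total;      // observed agreement
        double chance = 0;                       // agreement expected by chance
        for (int i = 0; i < n; i++) {
            chance += rowSum[i] * colSum[i];
        }
        chance /= total * total;
        // Assumption: report 1 when chance agreement is already perfect.
        return chance < 1 ? (observed - chance) / (1 - chance) : 1;
    }
}
```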
-
correlationCoefficient
public final double correlationCoefficient() throws java.lang.Exception
Returns the correlation coefficient if the class is numeric.
- Returns:
- the correlation coefficient
- Throws:
java.lang.Exception - if class is not numeric
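The fields m_SumClass, m_SumSqrClass, m_SumPredicted, m_SumSqrPredicted and m_SumClassPredicted are the running sums needed for a Pearson correlation. A minimal sketch of that computation follows, assuming weighted sums and a result of 0 when either variance is not positive; it mirrors the field names for readability but is not this class's code.

```java
/** Standalone sketch: Pearson correlation from running (weighted) sums. */
class CorrelationSketch {
    private double sumClass;
    private double sumSqrClass;
    private double sumPredicted;
    private double sumSqrPredicted;
    private double sumClassPredicted;
    private double weightSum;

    void add(double actual, double predicted, double weight) {
        sumClass += weight * actual;
        sumSqrClass += weight * actual * actual;
        sumPredicted += weight * predicted;
        sumSqrPredicted += weight * predicted * predicted;
        sumClassPredicted += weight * actual * predicted;
        weightSum += weight;
    }

    double correlation() {
        double varActual = sumSqrClass - sumClass * sumClass / weightSum;
        double varPredicted = sumSqrPredicted - sumPredicted * sumPredicted / weightSum;
        double varProd = sumClassPredicted - sumClass * sumPredicted / weightSum;
        double denom = varActual * varPredicted;
        // Assumption: a constant actual or predicted series yields 0.
        return denom <= 0 ? 0 : varProd / Math.sqrt(denom);
    }
}
```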
-
meanAbsoluteError
public final double meanAbsoluteError()
Returns the mean absolute error. Refers to the error of the predicted values for numeric classes, and the error of the predicted probability distribution for nominal classes.
- Returns:
- the mean absolute error
-
meanPriorAbsoluteError
public final double meanPriorAbsoluteError()
Returns the mean absolute error of the prior.
- Returns:
- the mean absolute error
-
relativeAbsoluteError
public final double relativeAbsoluteError() throws java.lang.Exception
Returns the relative absolute error.
- Returns:
- the relative absolute error
- Throws:
java.lang.Exception - if it can't be computed
-
rootMeanSquaredError
public final double rootMeanSquaredError()
Returns the root mean squared error.
- Returns:
- the root mean squared error
-
rootMeanPriorSquaredError
public final double rootMeanPriorSquaredError()
Returns the root mean prior squared error.
- Returns:
- the root mean prior squared error
-
rootRelativeSquaredError
public final double rootRelativeSquaredError()
Returns the root relative squared error if the class is numeric.
- Returns:
- the root relative squared error
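For a numeric class, the absolute and squared error measures above are related through the "prior" predictor. The sketch below assumes the prior always predicts the mean of the training class values and that the relative measures are reported as percentages; NumericErrorSketch is a hypothetical name and the code only illustrates the usual definitions.

```java
/** Standalone sketch of numeric error measures relative to a prior predictor. */
class NumericErrorSketch {
    private final double priorPrediction; // assumption: mean of training class values
    private double sumAbsErr;
    private double sumSqrErr;
    private double sumPriorAbsErr;
    private double sumPriorSqrErr;
    private double weightSum;

    NumericErrorSketch(double trainingClassMean) {
        this.priorPrediction = trainingClassMean;
    }

    void add(double actual, double predicted, double weight) {
        double err = predicted - actual;
        double priorErr = priorPrediction - actual;
        sumAbsErr += weight * Math.abs(err);
        sumSqrErr += weight * err * err;
        sumPriorAbsErr += weight * Math.abs(priorErr);
        sumPriorSqrErr += weight * priorErr * priorErr;
        weightSum += weight;
    }

    double meanAbsoluteError() { return sumAbsErr / weightSum; }

    double rootMeanSquaredError() { return Math.sqrt(sumSqrErr / weightSum); }

    /** Scheme's absolute error as a percentage of the prior's absolute error. */
    double relativeAbsoluteError() { return 100 * sumAbsErr / sumPriorAbsErr; }

    /** Scheme's RMSE as a percentage of the prior's RMSE. */
    double rootRelativeSquaredError() { return 100 * Math.sqrt(sumSqrErr / sumPriorSqrErr); }
}
```

A relative error below 100 then means the scheme does better than always predicting the training mean.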
-
priorEntropy
public final double priorEntropy() throws java.lang.Exception
Calculate the entropy of the prior distribution.
- Returns:
- the entropy of the prior distribution
- Throws:
java.lang.Exception - if the class is not nominal
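The entropy of a prior class distribution is the usual Shannon entropy of the normalized class counts. A minimal sketch, assuming the result is measured in bits (log base 2) and that classes with zero count contribute nothing; PriorEntropySketch is a hypothetical name.

```java
/** Standalone sketch: Shannon entropy (in bits) of a prior class distribution. */
class PriorEntropySketch {

    /** classCounts holds the (weighted) count of training instances per class. */
    static double priorEntropy(double[] classCounts) {
        double total = 0;
        for (double c : classCounts) {
            total += c;
        }
        double entropy = 0;
        for (double c : classCounts) {
            if (c > 0) {
                double p = c / total;
                entropy -= p * Math.log(p) / Math.log(2);
            }
        }
        return entropy;
    }
}
```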
-
KBInformation
public final double KBInformation() throws java.lang.Exception
Return the total Kononenko & Bratko Information score in bits.
- Returns:
- the K&B information score
- Throws:
java.lang.Exception - if the class is not nominal
-
KBMeanInformation
public final double KBMeanInformation() throws java.lang.Exception
Return the Kononenko & Bratko Information score in bits per instance.
- Returns:
- the K&B information score
- Throws:
java.lang.Exception - if the class is not nominal
-
KBRelativeInformation
public final double KBRelativeInformation() throws java.lang.Exception
Return the Kononenko & Bratko Relative Information score.
- Returns:
- the K&B relative information score
- Throws:
java.lang.Exception - if the class is not nominal
-
SFPriorEntropy
public final double SFPriorEntropy()
Returns the total entropy for the null model.
- Returns:
- the total null model entropy
-
SFMeanPriorEntropy
public final double SFMeanPriorEntropy()
Returns the entropy per instance for the null model.
- Returns:
- the null model entropy per instance
-
SFSchemeEntropy
public final double SFSchemeEntropy()
Returns the total entropy for the scheme.
- Returns:
- the total scheme entropy
-
SFMeanSchemeEntropy
public final double SFMeanSchemeEntropy()
Returns the entropy per instance for the scheme.
- Returns:
- the scheme entropy per instance
-
SFEntropyGain
public final double SFEntropyGain()
Returns the total SF, which is the null model entropy minus the scheme entropy.- Returns:
- the total SF
-
SFMeanEntropyGain
public final double SFMeanEntropyGain()
Returns the SF per instance, which is the null model entropy minus the scheme entropy, per instance.- Returns:
- the SF per instance
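The relationship between the SF methods can be illustrated with a small standalone sketch. It assumes that each entropy is accumulated as the sum of -log2(p), where p is the probability the model assigned to an instance's actual class, and that all instances have weight 1. Both points are assumptions of the sketch, not statements about this class's internals, and the class and method names are hypothetical.

```java
/** Illustrative sketch only; not the Evaluation implementation. */
final class EntropyGainSketch {

    /** Base-2 logarithm. */
    static double log2(double x) {
        return Math.log(x) / Math.log(2.0);
    }

    /**
     * Total entropy in bits: the sum of -log2(p) over the probability p that a
     * model assigned to each instance's actual class (assumed definition).
     */
    static double totalEntropy(double[] probOfActualClass) {
        double bits = 0.0;
        for (double p : probOfActualClass) {
            bits -= log2(p);
        }
        return bits;
    }

    /** Total gain: null model entropy minus scheme entropy. */
    static double entropyGain(double[] nullModelProbs, double[] schemeProbs) {
        return totalEntropy(nullModelProbs) - totalEntropy(schemeProbs);
    }

    /** Gain per instance: the total gain divided by the number of instances. */
    static double meanEntropyGain(double[] nullModelProbs, double[] schemeProbs) {
        return entropyGain(nullModelProbs, schemeProbs) / nullModelProbs.length;
    }

    public static void main(String[] args) {
        // Probability assigned to the actual class of four instances.
        double[] nullModel = {0.5, 0.5, 0.5, 0.5};
        double[] scheme = {1.0, 1.0, 0.5, 0.25};
        System.out.println("total gain (bits): " + entropyGain(nullModel, scheme));
        System.out.println("gain per instance: " + meanEntropyGain(nullModel, scheme));
    }
}
```

A positive gain means the scheme assigned, on average, more probability to the actual classes than the null model did.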
-
toCumulativeMarginDistributionString
public java.lang.String toCumulativeMarginDistributionString() throws java.lang.Exception
Outputs the cumulative margin distribution as a string suitable as input for gnuplot or a similar package.
- Returns:
- the cumulative margin distribution
- Throws:
java.lang.Exception - if the class attribute is not nominal
-
toSummaryString
public java.lang.String toSummaryString()
Calls toSummaryString() with no title and no complexity statistics.
- Specified by:
toSummaryString in interface Summarizable
- Returns:
- a summary description of the classifier evaluation
-
toSummaryString
public java.lang.String toSummaryString(boolean printComplexityStatistics)
Calls toSummaryString() with a default title.
- Parameters:
printComplexityStatistics - if true, complexity statistics are returned as well
- Returns:
- the summary string
-
toSummaryString
public java.lang.String toSummaryString(java.lang.String title, boolean printComplexityStatistics)
Outputs the performance statistics in summary form. Lists the number (and percentage) of instances classified correctly, incorrectly, and unclassified. Outputs the total number of instances classified, and the number of instances (if any) that had no class value provided.
- Parameters:
title - the title for the statistics
printComplexityStatistics - if true, complexity statistics are returned as well
- Returns:
- the summary as a String
-
toMatrixString
public java.lang.String toMatrixString() throws java.lang.Exception
Calls toMatrixString() with a default title.
- Returns:
- the confusion matrix as a string
- Throws:
java.lang.Exception - if the class is numeric
-
toMatrixString
public java.lang.String toMatrixString(java.lang.String title) throws java.lang.Exception
Outputs the performance statistics as a classification confusion matrix. For each class value, shows the distribution of predicted class values.
- Parameters:
title - the title for the confusion matrix
- Returns:
- the confusion matrix as a String
- Throws:
java.lang.Exception - if the class is numeric
-
toClassDetailsString
public java.lang.String toClassDetailsString() throws java.lang.Exception
Generates a breakdown of the accuracy for each class (with default title), incorporating various information-retrieval statistics, such as true/false positive rate and precision/recall/F-Measure. Should be useful for ROC curves and recall/precision curves.
- Returns:
- the statistics presented as a string
- Throws:
java.lang.Exception - if the class is not nominal
-
toClassDetailsString
public java.lang.String toClassDetailsString(java.lang.String title) throws java.lang.Exception
Generates a breakdown of the accuracy for each class, incorporating various information-retrieval statistics, such as true/false positive rate and precision/recall/F-Measure. Should be useful for ROC curves and recall/precision curves.
- Parameters:
title - the title to prepend the stats string with
- Returns:
- the statistics presented as a string
- Throws:
java.lang.Exception - if the class is not nominal
-
numTruePositives
public double numTruePositives(int classIndex)
Calculate the number of true positives with respect to a particular class. This is defined as
correctly classified positives
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the number of true positives
-
truePositiveRate
public double truePositiveRate(int classIndex)
Calculate the true positive rate with respect to a particular class. This is defined as
correctly classified positives
------------------------------
       total positives
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the true positive rate
-
numTrueNegatives
public double numTrueNegatives(int classIndex)
Calculate the number of true negatives with respect to a particular class. This is defined as
correctly classified negatives
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the number of true negatives
-
trueNegativeRate
public double trueNegativeRate(int classIndex)
Calculate the true negative rate with respect to a particular class. This is defined as
correctly classified negatives
------------------------------
       total negatives
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the true negative rate
-
numFalsePositives
public double numFalsePositives(int classIndex)
Calculate the number of false positives with respect to a particular class. This is defined as
incorrectly classified negatives
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the number of false positives
-
falsePositiveRate
public double falsePositiveRate(int classIndex)
Calculate the false positive rate with respect to a particular class. This is defined as
incorrectly classified negatives
--------------------------------
        total negatives
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the false positive rate
-
numFalseNegatives
public double numFalseNegatives(int classIndex)
Calculate the number of false negatives with respect to a particular class. This is defined as
incorrectly classified positives
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the number of false negatives
-
falseNegativeRate
public double falseNegativeRate(int classIndex)
Calculate the false negative rate with respect to a particular class. This is defined as
incorrectly classified positives
--------------------------------
        total positives
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the false negative rate
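The four counts and four rates defined above can all be read off a confusion matrix. The following is a minimal standalone sketch, assuming the matrix is indexed as matrix[actual][predicted] and that a rate with a zero denominator is reported as 0. Both conventions, and the class and method names, are assumptions of the sketch rather than a description of this class's internals.

```java
/** Illustrative sketch only; not the Evaluation implementation. */
final class ConfusionCountsSketch {

    /** Instances of class c that were predicted as c. */
    static double truePositives(double[][] m, int c) {
        return m[c][c];
    }

    /** Instances of class c that were predicted as some other class. */
    static double falseNegatives(double[][] m, int c) {
        double sum = 0.0;
        for (int j = 0; j < m.length; j++) {
            if (j != c) sum += m[c][j];
        }
        return sum;
    }

    /** Instances of other classes that were predicted as c. */
    static double falsePositives(double[][] m, int c) {
        double sum = 0.0;
        for (int i = 0; i < m.length; i++) {
            if (i != c) sum += m[i][c];
        }
        return sum;
    }

    /** Instances of other classes that were not predicted as c. */
    static double trueNegatives(double[][] m, int c) {
        double sum = 0.0;
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m.length; j++) {
                if (i != c && j != c) sum += m[i][j];
            }
        }
        return sum;
    }

    /** Division that yields 0 for an empty denominator (sketch convention). */
    static double ratio(double num, double den) {
        return den == 0.0 ? 0.0 : num / den;
    }

    static double truePositiveRate(double[][] m, int c) {
        return ratio(truePositives(m, c), truePositives(m, c) + falseNegatives(m, c));
    }

    static double falseNegativeRate(double[][] m, int c) {
        return ratio(falseNegatives(m, c), truePositives(m, c) + falseNegatives(m, c));
    }

    static double trueNegativeRate(double[][] m, int c) {
        return ratio(trueNegatives(m, c), trueNegatives(m, c) + falsePositives(m, c));
    }

    static double falsePositiveRate(double[][] m, int c) {
        return ratio(falsePositives(m, c), trueNegatives(m, c) + falsePositives(m, c));
    }

    public static void main(String[] args) {
        double[][] m = {{5, 1, 0}, {2, 3, 1}, {0, 2, 6}};
        System.out.println("TPR for class 0: " + truePositiveRate(m, 0));
        System.out.println("FPR for class 0: " + falsePositiveRate(m, 0));
    }
}
```

With this layout, "total positives" is the sum of row c and "total negatives" is the sum of all other rows, so each pair of complementary rates sums to 1 whenever its denominator is non-zero.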
-
recall
public double recall(int classIndex)
Calculate the recall with respect to a particular class. This is defined as
correctly classified positives
------------------------------
       total positives
(This is the same as the truePositiveRate.)
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the recall
-
precision
public double precision(int classIndex)
Calculate the precision with respect to a particular class. This is defined as
correctly classified positives
------------------------------
  total predicted as positive
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the precision
-
fMeasure
public double fMeasure(int classIndex)
Calculate the F-Measure with respect to a particular class. This is defined as
2 * recall * precision
----------------------
  recall + precision
- Parameters:
classIndex - the index of the class to consider as "positive"
- Returns:
- the F-Measure
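As a worked example of the recall, precision, and F-Measure definitions, the sketch below computes all three from raw counts for one class. The class name, the method signatures, and the choice to return 0 for an empty denominator are assumptions made for illustration.

```java
/** Illustrative sketch only; not the Evaluation implementation. */
final class FMeasureSketch {

    /** correctly classified positives / total positives */
    static double recall(double tp, double fn) {
        double totalPositives = tp + fn;
        return totalPositives == 0.0 ? 0.0 : tp / totalPositives;
    }

    /** correctly classified positives / total predicted as positive */
    static double precision(double tp, double fp) {
        double predictedPositive = tp + fp;
        return predictedPositive == 0.0 ? 0.0 : tp / predictedPositive;
    }

    /** 2 * recall * precision / (recall + precision) */
    static double fMeasure(double tp, double fp, double fn) {
        double r = recall(tp, fn);
        double p = precision(tp, fp);
        return (r + p) == 0.0 ? 0.0 : 2.0 * r * p / (r + p);
    }

    public static void main(String[] args) {
        // 5 true positives, 2 false positives, 1 false negative.
        System.out.println("recall:    " + recall(5, 1));
        System.out.println("precision: " + precision(5, 2));
        System.out.println("F-Measure: " + fMeasure(5, 2, 1));
    }
}
```

For 5 true positives, 2 false positives, and 1 false negative, recall is 5/6, precision is 5/7, and the F-Measure is 10/13, which lies between the two as a harmonic mean should.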
-
setPriors
public void setPriors(Instances train) throws java.lang.Exception
Sets the class prior probabilities.
- Parameters:
train - the training instances used to determine the prior probabilities
- Throws:
java.lang.Exception - if the class attribute of the instances is not set
-
updatePriors
public void updatePriors(Instance instance) throws java.lang.Exception
Updates the class prior probabilities (when incrementally training).
- Parameters:
instance - the new training instance seen
- Throws:
java.lang.Exception - if the class of the instance is not set
-
useNoPriors
public void useNoPriors()
Disables the use of priors, e.g., for de-serialized schemes that have no access to the original training set but are evaluated on a test set.
-
equals
public boolean equals(java.lang.Object obj)
Tests whether the current evaluation object is equal to another evaluation object.
- Overrides:
equals in class java.lang.Object
- Parameters:
obj - the object to compare against
- Returns:
- true if the two objects are equal
-
printClassifications
protected static java.lang.String printClassifications(Classifier classifier, Instances train, java.lang.String testFileName, int classIndex, Range attributesToOutput) throws java.lang.Exception
Prints the predictions for the given dataset into a String variable.
- Parameters:
classifier - the classifier to use
train - the training data
testFileName - the name of the test file
classIndex - the class index
attributesToOutput - the indices of the attributes to output
- Returns:
- the generated predictions for the attribute range
- Throws:
java.lang.Exception - if the test file cannot be opened
-
attributeValuesString
protected static java.lang.String attributeValuesString(Instance instance, Range attRange)
Builds a string listing the attribute values in a specified range of indices, separated by commas and enclosed in brackets.
- Parameters:
instance - the instance to print the values from
attRange - the range of the attributes to list
- Returns:
- a string listing values of the attributes in the range
-
makeOptionString
protected static java.lang.String makeOptionString(Classifier classifier)
Builds the help string listing all the command-line options.
- Parameters:
classifier - the classifier to include options for
- Returns:
- a string detailing the valid command line options
-
num2ShortID
protected java.lang.String num2ShortID(int num, char[] IDChars, int IDWidth)
Method for generating indices for the confusion matrix.
- Parameters:
num - integer to format
IDChars - the characters to use
IDWidth - the width of the entry
- Returns:
- the formatted integer as a string
-
makeDistribution
protected double[] makeDistribution(double predictedClass)
Converts a single prediction into a probability distribution with all zero probabilities except the predicted value, which has probability 1.0.
- Parameters:
predictedClass - the index of the predicted class
- Returns:
- the probability distribution
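The conversion described here is a one-hot encoding of the predicted class. A minimal standalone sketch follows. Unlike the method above, it takes an int index plus an explicit numClasses parameter, and it assumes the index is valid. The class name, method name, and signature are hypothetical.

```java
/** Illustrative sketch only; not the Evaluation implementation. */
final class OneHotSketch {

    /**
     * Returns a distribution of length numClasses that is 0.0 everywhere
     * except at predictedClass, where it is 1.0.
     */
    static double[] oneHot(int predictedClass, int numClasses) {
        double[] dist = new double[numClasses]; // Java zero-initialises arrays
        dist[predictedClass] = 1.0;
        return dist;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(oneHot(1, 3)));
    }
}
```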
-
updateStatsForClassifier
protected void updateStatsForClassifier(double[] predictedDistribution, Instance instance) throws java.lang.Exception
Updates all the statistics about a classifier's performance for the current test instance.
- Parameters:
predictedDistribution - the probabilities assigned to each class
instance - the instance to be classified
- Throws:
java.lang.Exception - if the class of the instance is not set
-
updateStatsForPredictor
protected void updateStatsForPredictor(double predictedValue, Instance instance) throws java.lang.Exception
Updates all the statistics about a predictor's performance for the current test instance.
- Parameters:
predictedValue - the numeric value the classifier predicts
instance - the instance to be classified
- Throws:
java.lang.Exception - if the class of the instance is not set
-
updateMargins
protected void updateMargins(double[] predictedDistribution, int actualClass, double weight)
Updates the cumulative record of classification margins.
- Parameters:
predictedDistribution - the probability distribution predicted for the current instance
actualClass - the index of the actual instance class
weight - the weight assigned to the instance
-
updateNumericScores
protected void updateNumericScores(double[] predicted, double[] actual, double weight)
Updates the numeric accuracy measures. For numeric classes, accuracy is measured between the actual and predicted class values. For nominal classes, accuracy is measured between the actual and predicted class probabilities.
- Parameters:
predicted - the predicted values
actual - the actual values
weight - the weight associated with this prediction
-
addNumericTrainClass
protected void addNumericTrainClass(double classValue, double weight)
Adds a numeric (non-missing) training class value and weight to the buffer of stored values.
- Parameters:
classValue - the class value
weight - the instance weight
-
setNumericPriorsFromBuffer
protected void setNumericPriorsFromBuffer()
Sets up the priors for numeric class attributes from the training class values that have been seen so far.
-
-