Class GaussianKernelDistribution
- java.lang.Object
-
- org.apache.commons.math3.distribution.AbstractRealDistribution
-
- org.processmining.plugins.stochasticpetrinet.distribution.AnotherAbstractRealDistribution
-
- org.processmining.plugins.stochasticpetrinet.distribution.GaussianKernelDistribution
-
- All Implemented Interfaces:
java.io.Serializable, org.apache.commons.math3.analysis.UnivariateFunction, org.apache.commons.math3.distribution.RealDistribution
- Direct Known Subclasses:
GaussianReflectionKernelDistribution
public class GaussianKernelDistribution extends AnotherAbstractRealDistribution
Simple Gaussian kernel estimator. Adds a Gaussian kernel for each data point with the specified smoothing parameter (the kernel bandwidth). A precision parameter controls how finely sample values are grouped, such that for a precision of 0.1, all sample values between -0.05 and 0.05 will be treated as 0.0.
The bandwidth of the kernel is adjusted to the data distribution and works best for normally distributed data. It uses the formula proposed in:
Scott, D. W. (1992) Multivariate Density Estimation: Theory, Practice, and Visualization. Wiley
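The description above can be illustrated with a minimal sketch of a binned Gaussian kernel density. It assumes that each bin index maps to a center at index * precision and that the density is the weight-normalized sum of Gaussian kernels with a shared bandwidth h. The class name BinnedKernelDensitySketch and its method signature are hypothetical and are not taken from this library.

```java
import java.util.Map;

/** Illustrative sketch only; not the implementation documented on this page. */
final class BinnedKernelDensitySketch {

    /**
     * Weighted average of Gaussian kernels, one kernel per occupied bin.
     * binWeights maps a bin index to the number of observations in that bin.
     * Assumption: the center of a bin is index * precision.
     */
    static double density(Map<Long, Double> binWeights, double precision, double h, double x) {
        double totalWeight = 0.0;
        double sum = 0.0;
        for (Map.Entry<Long, Double> e : binWeights.entrySet()) {
            double center = e.getKey() * precision;
            double z = (x - center) / h;
            // Standard normal density scaled by the bandwidth h
            double kernel = Math.exp(-0.5 * z * z) / (h * Math.sqrt(2.0 * Math.PI));
            sum += e.getValue() * kernel;
            totalWeight += e.getValue();
        }
        // An estimator without observations has no mass anywhere
        return totalWeight == 0.0 ? 0.0 : sum / totalWeight;
    }
}
```

Storing one weighted kernel per bin keeps the cost of a density evaluation proportional to the number of occupied bins rather than the number of observations.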
- Author:
- Andreas Rogge-Solti
- See Also:
- Serialized Form
-
-
Field Summary
Fields
Modifier and Type | Field | Description
protected double | h | smoothing parameter
protected java.util.Map<java.lang.Long,java.lang.Double> | kernelPointsAndWeights | This map stores the number of occurrences of values in defined intervals.
protected org.apache.commons.math3.distribution.NormalDistribution | ndist |
static int | NUMBER_OF_BINS | Grid over the data
protected double | precision | The precision parameter determines the interval size for kernels for improved efficiency.
protected java.util.List<java.lang.Double> | sampleValues | All observed values in an array (easier for sampling)
protected static java.math.MathContext | veryPrecise |
-
Constructor Summary
Constructors
Constructor | Description
GaussianKernelDistribution() |
GaussianKernelDistribution(double precision) | Creates a kernel distribution that groups kernels whose values fall within the range of the precision parameter into one "bin" with added weight. A precision of 0.1, for example, creates ten bins per unit; a precision of 0.5 creates two bins.
-
Method Summary
Modifier and Type | Method | Description
void | addValue(double val) |
void | addValues(double[] values) |
double | cumulativeProbability(double x) |
double | cumulativeProbability(double arg0, double arg1) |
double | density(double x) |
protected double[] | getDoubleArray(java.util.List<java.lang.Double> values) |
double | getH() | The smoothing parameter for the density estimation that depends on the number of nodes and the interquartile range
double | getNumericalMean() | The expected value:
double | getReasonableLowerBound() |
double | getReasonableUpperBound() |
double | getSupportLowerBound() |
double | getSupportUpperBound() |
java.util.List<java.lang.Double> | getValues() |
boolean | isSupportConnected() |
boolean | isSupportLowerBoundInclusive() |
boolean | isSupportUpperBoundInclusive() |
double | probability(double x) | Should use density, as P(X=x) is zero for real-valued distributions
double | sample() | Simply selects one value from the observations at random and samples from its Gaussian kernel.
protected void | updateKernels() |
void | updateSmoothingParameter() | Uses the 'rule of thumb' for the kernel bandwidth combined with the more robust quantile-based approximation.
-
Methods inherited from class org.processmining.plugins.stochasticpetrinet.distribution.AnotherAbstractRealDistribution
getNumericalVariance, value
-
-
-
-
Field Detail
-
precision
protected double precision
The precision parameter determines the interval size for kernels for improved efficiency. We do not store n kernels for n observations, but group kernels falling into a particular interval into one, with the weight factor capturing the number of occurrences. Change: Make this dynamic depending on the range of values (make it
-
NUMBER_OF_BINS
public static final int NUMBER_OF_BINS
Grid over the data
- See Also:
- Constant Field Values
-
kernelPointsAndWeights
protected java.util.Map<java.lang.Long,java.lang.Double> kernelPointsAndWeights
This map stores the number of occurrences of values in defined intervals. The interval size is regulated by the precision argument.
-
sampleValues
protected java.util.List<java.lang.Double> sampleValues
All observed values in an array (easier for sampling)
-
veryPrecise
protected static java.math.MathContext veryPrecise
-
h
protected double h
smoothing parameter
-
ndist
protected org.apache.commons.math3.distribution.NormalDistribution ndist
-
-
Constructor Detail
-
GaussianKernelDistribution
public GaussianKernelDistribution()
-
GaussianKernelDistribution
public GaussianKernelDistribution(double precision)
Creates a kernel distribution that groups kernels whose values fall within the range of the precision parameter into one "bin" with added weight. A precision of 0.1, for example, creates ten bins per unit; a precision of 0.5 creates two bins. Values in the interval [-0.05, 0.05[ fall into bin "0".
- Parameters:
precision- the interval size to be captured by one bin. Instead of creating n kernels for n values, we reduce the kernel count by grouping similar values and adjusting the weight of the shared kernel.
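The grouping described above can be sketched as follows. This is only an illustration under the assumption that a value is assigned to a bin by rounding value / precision to the nearest integer, which places the half-open interval [-0.05, 0.05[ in bin 0 for a precision of 0.1. The class PrecisionBinningSketch and its methods are hypothetical names, not part of this class.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; the rounding rule is an assumption. */
final class PrecisionBinningSketch {

    /** Index of the bin that a value falls into for the given precision. */
    static long binIndex(double value, double precision) {
        // Math.round rounds ties towards positive infinity, so -0.5 maps to 0 and 0.5 maps to 1
        return Math.round(value / precision);
    }

    /** Counts the observations per bin instead of keeping one kernel per observation. */
    static Map<Long, Double> bin(double[] values, double precision) {
        Map<Long, Double> weights = new HashMap<>();
        for (double v : values) {
            weights.merge(binIndex(v, precision), 1.0, Double::sum);
        }
        return weights;
    }
}
```

With a precision of 0.1, the values -0.04, 0.0 and 0.049 would share bin 0 with weight 3, so three observations are represented by a single kernel.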
-
-
Method Detail
-
addValues
public void addValues(double[] values)
-
addValue
public void addValue(double val)
-
updateKernels
protected void updateKernels()
-
updateSmoothingParameter
public void updateSmoothingParameter()
Uses the 'rule of thumb' for the kernel bandwidth combined with the more robust quantile-based approximation.
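One commonly cited form of such a rule of thumb is h = 1.06 * min(sd, IQR / 1.34) * n^(-1/5), where sd is the sample standard deviation and IQR the interquartile range. The sketch below assumes that form; the constants, the linear-interpolation quantile and the class name BandwidthSketch are assumptions for illustration and may differ from what this method actually computes.

```java
import java.util.Arrays;

/** Illustrative sketch only; constants and quantile method are assumptions. */
final class BandwidthSketch {

    /** Quantile of an ascending-sorted array using linear interpolation. */
    static double quantile(double[] sorted, double p) {
        double pos = p * (sorted.length - 1);
        int lo = (int) Math.floor(pos);
        int hi = (int) Math.ceil(pos);
        return sorted[lo] + (pos - lo) * (sorted[hi] - sorted[lo]);
    }

    /** Rule-of-thumb bandwidth: 1.06 * min(sd, IQR / 1.34) * n^(-1/5). */
    static double ruleOfThumb(double[] values) {
        int n = values.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double mean = 0.0;
        for (double v : sorted) {
            mean += v;
        }
        mean /= n;
        double squares = 0.0;
        for (double v : sorted) {
            squares += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(squares / (n - 1)); // sample standard deviation
        double iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25);
        // Fall back to the standard deviation when the IQR collapses to zero
        double spread = iqr > 0.0 ? Math.min(sd, iqr / 1.34) : sd;
        return 1.06 * spread * Math.pow(n, -0.2);
    }
}
```

Taking the minimum of the two spread estimates makes the bandwidth less sensitive to outliers than the standard deviation alone.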
-
getDoubleArray
protected double[] getDoubleArray(java.util.List<java.lang.Double> values)
-
cumulativeProbability
public double cumulativeProbability(double x)
- Specified by:
cumulativeProbability in interface org.apache.commons.math3.distribution.RealDistribution
- Overrides:
cumulativeProbability in class AnotherAbstractRealDistribution
-
cumulativeProbability
public double cumulativeProbability(double arg0, double arg1) throws org.apache.commons.math3.exception.NumberIsTooLargeException
- Specified by:
cumulativeProbability in interface org.apache.commons.math3.distribution.RealDistribution
- Overrides:
cumulativeProbability in class org.apache.commons.math3.distribution.AbstractRealDistribution
- Throws:
org.apache.commons.math3.exception.NumberIsTooLargeException
-
density
public double density(double x)
-
getNumericalMean
public double getNumericalMean()
Description copied from class: AnotherAbstractRealDistribution
The expected value:
- Specified by:
getNumericalMean in interface org.apache.commons.math3.distribution.RealDistribution
- Overrides:
getNumericalMean in class AnotherAbstractRealDistribution
-
getSupportLowerBound
public double getSupportLowerBound()
-
getSupportUpperBound
public double getSupportUpperBound()
-
isSupportConnected
public boolean isSupportConnected()
-
isSupportLowerBoundInclusive
public boolean isSupportLowerBoundInclusive()
-
isSupportUpperBoundInclusive
public boolean isSupportUpperBoundInclusive()
-
probability
public double probability(double x)
Should use density, as P(X=x) is zero for real-valued distributions.
- Specified by:
probability in interface org.apache.commons.math3.distribution.RealDistribution
- Overrides:
probability in class org.apache.commons.math3.distribution.AbstractRealDistribution
-
sample
public double sample()
Simply selects one value from the observations at random and samples from its Gaussian kernel.
- Specified by:
sample in interface org.apache.commons.math3.distribution.RealDistribution
- Overrides:
sample in class org.apache.commons.math3.distribution.AbstractRealDistribution
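The sampling idea described for this method can be sketched in a few lines: pick one observation uniformly at random, then draw from a normal distribution centered on it with the bandwidth as standard deviation. The class KernelSamplingSketch, its signature and the use of java.util.Random are assumptions for illustration, not the actual implementation.

```java
import java.util.List;
import java.util.Random;

/** Illustrative sketch only; not the implementation documented on this page. */
final class KernelSamplingSketch {

    /** Draws one value from the kernel of a randomly chosen observation. */
    static double sample(List<Double> observations, double h, Random rng) {
        if (observations.isEmpty()) {
            throw new IllegalArgumentException("no observations to sample from");
        }
        // Step 1: choose the kernel by picking an observation uniformly at random
        double center = observations.get(rng.nextInt(observations.size()));
        // Step 2: draw from N(center, h^2) by shifting and scaling a standard normal draw
        return center + h * rng.nextGaussian();
    }
}
```

Because every observation is equally likely to be chosen, repeated values contribute proportionally more samples, which matches a mixture of equally weighted kernels.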
-
getValues
public java.util.List<java.lang.Double> getValues()
-
getH
public double getH()
The smoothing parameter for the density estimation that depends on the number of nodes and the interquartile range.
- Returns:
the smoothing parameter h
-
getReasonableUpperBound
public double getReasonableUpperBound()
-
getReasonableLowerBound
public double getReasonableLowerBound()
-
-