Package com.flaptor.hist4j
Class AdaptiveHistogram
Object
AdaptiveHistogram
- All Implemented Interfaces:
Serializable
This class implements a histogram that adapts to an unknown data distribution.
It keeps a more or less constant resolution throughout the data range by increasing
the resolution where the data is denser. For example, if the data is distributed
so that most of the values lie in the 0-5 range and only a few are
in the 5-10 range, the histogram adapts and assigns more counting buckets to
the 0-5 range and fewer to the 5-10 range.
This implementation provides a method to obtain the cumulative distribution function
for a given data point, and a method to obtain the data point that splits the
data set at a given percentile.
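The adaptation described above can be illustrated with a minimal, self-contained sketch. It is not the hist4j implementation: the class `AdaptiveBucketsSketch`, its `Bucket` type, the fixed initial range, and the split-at-midpoint rule are all assumptions made for illustration. A bucket that exceeds a per-bucket limit is split in two, so dense regions end up with more, narrower buckets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of adaptive bucketing; NOT the hist4j implementation.
final class AdaptiveBucketsSketch {
    static final class Bucket {
        final double lo;   // inclusive lower bound
        final double hi;   // exclusive upper bound
        long count;        // approximate number of points in [lo, hi)

        Bucket(double lo, double hi, long count) {
            this.lo = lo;
            this.hi = hi;
            this.count = count;
        }
    }

    final List<Bucket> buckets = new ArrayList<>();
    final long limit;      // assumed maximum count per bucket before a split

    AdaptiveBucketsSketch(double min, double max, long limit) {
        this.limit = limit;
        buckets.add(new Bucket(min, max, 0));
    }

    // Counts the value; values outside the initial range are ignored here.
    void add(double value) {
        for (int i = 0; i < buckets.size(); i++) {
            Bucket b = buckets.get(i);
            if (value >= b.lo && value < b.hi) {
                b.count++;
                if (b.count > limit) {
                    // Split at the midpoint and divide the count evenly
                    // (an approximation: the true positions are not kept).
                    double mid = (b.lo + b.hi) / 2.0;
                    long half = b.count / 2;
                    buckets.set(i, new Bucket(b.lo, mid, half));
                    buckets.add(i + 1, new Bucket(mid, b.hi, b.count - half));
                }
                return;
            }
        }
    }

    // Number of buckets lying entirely inside [lo, hi].
    int bucketsWithin(double lo, double hi) {
        int n = 0;
        for (Bucket b : buckets) {
            if (b.lo >= lo && b.hi <= hi) n++;
        }
        return n;
    }

    long totalCount() {
        long total = 0;
        for (Bucket b : buckets) total += b.count;
        return total;
    }

    public static void main(String[] args) {
        AdaptiveBucketsSketch h = new AdaptiveBucketsSketch(0.0, 10.0, 10);
        for (int i = 0; i < 900; i++) h.add(5.0 * i / 900);        // dense: 0-5
        for (int i = 0; i < 100; i++) h.add(5.0 + 5.0 * i / 100);  // sparse: 5-10
        System.out.println("buckets in 0-5:  " + h.bucketsWithin(0.0, 5.0));
        System.out.println("buckets in 5-10: " + h.bucketsWithin(5.0, 10.0));
    }
}
```

In this sketch the 0-5 range receives many more buckets than the 5-10 range, mirroring the example in the class description. The real class does not require the data range to be known in advance.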
- Author:
- Jorge Handl
Nested Class Summary
Modifier and Type | Class | Description
protected static interface | | Auxiliary interface for inline functor object.
Constructor Summary
Constructor | Description
AdaptiveHistogram() | Class constructor.
Method Summary
Modifier and Type | Method | Description
void | addValue(double value) | Adds a data point to the histogram.
long | getAccumCount(double value) | Returns the cumulative count (the unnormalized cumulative distribution function) for a given data point.
long | getCount(double value) | Returns the number of data points stored in the same bucket as a given value.
protected int | getCountPerNodeLimit() | Used by the histogram's internal data structure to get the maximum number of data points that should be counted in one bucket.
double | getValueForPercentile(int percentile) | Returns the data point that splits the data set at a given percentile.
void | normalize(double targetMin, double targetMax) | Normalizes all the values to the desired range.
void | reset() | Erases all data from the histogram.
void | show() | Shows the histogram's underlying data structure.
ArrayList<Cell> | toTable() | Returns a table representing the data in this histogram.
-
Constructor Details
-
AdaptiveHistogram
public AdaptiveHistogram()
Class constructor.
-
-
Method Details
-
reset
public void reset()
Erases all data from the histogram.
-
addValue
public void addValue(double value)
Adds a data point to the histogram.
- Parameters:
- value: the data point to add.
-
getCount
public long getCount(double value)
Returns the number of data points stored in the same bucket as a given value.
- Parameters:
- value: the reference data point.
- Returns:
- the number of data points stored in the same bucket as the reference point.
-
getAccumCount
public long getAccumCount(double value)
Returns the cumulative count (the unnormalized cumulative distribution function) for a given data point.
- Parameters:
- value: the reference data point.
- Returns:
- the cumulative count for the reference point.
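As a rough illustration of what a cumulative count over buckets means, the sketch below sums the counts of all buckets lying below a value and takes a proportional share of the bucket that contains it. The class `AccumCountSketch`, the parallel-array bucket layout, and the linear interpolation are assumptions for illustration, not the library's actual algorithm.

```java
// Hypothetical sketch; NOT the hist4j implementation.
final class AccumCountSketch {
    // Buckets are given as parallel arrays: bucket i covers [lo[i], hi[i])
    // and holds count[i] data points.
    static long accumCount(double[] lo, double[] hi, long[] count, double value) {
        double total = 0.0;
        for (int i = 0; i < count.length; i++) {
            if (value >= hi[i]) {
                // The whole bucket lies below the reference value.
                total += count[i];
            } else if (value > lo[i]) {
                // The value falls inside this bucket: assume the points are
                // spread uniformly and take the proportional share.
                total += count[i] * (value - lo[i]) / (hi[i] - lo[i]);
            }
        }
        return Math.round(total);
    }

    public static void main(String[] args) {
        double[] lo = {0.0, 5.0};
        double[] hi = {5.0, 10.0};
        long[] count = {80, 20};
        System.out.println(accumCount(lo, hi, count, 7.5));
    }
}
```

Dividing the result by the total number of data points would give the fraction of the data at or below the value.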
-
getValueForPercentile
public double getValueForPercentile(int percentile)
Returns the data point that splits the data set at a given percentile.
- Parameters:
- percentile: the percentile at which the data set is split.
- Returns:
- the data point that splits the data set at the given percentile.
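One way to picture a percentile lookup over bucketed data is to walk the buckets in order until the running count reaches the requested share of the total, then interpolate inside that bucket. The sketch below does this under stated assumptions: the class `PercentileSketch`, the parallel-array bucket layout, and the linear interpolation are hypothetical and may differ from what the library does internally.

```java
// Hypothetical sketch; NOT the hist4j implementation.
final class PercentileSketch {
    // Bucket i covers [lo[i], hi[i]) and holds count[i] data points.
    static double valueForPercentile(double[] lo, double[] hi, long[] count, int percentile) {
        if (percentile < 0 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be in 0..100");
        }
        long total = 0;
        for (long c : count) total += c;
        if (total == 0) {
            throw new IllegalStateException("no data");
        }
        // Number of data points that must lie at or below the answer.
        double target = (double) total * percentile / 100.0;
        long seen = 0;
        for (int i = 0; i < count.length; i++) {
            if (count[i] > 0 && seen + count[i] >= target) {
                // Interpolate linearly inside the bucket that crosses the target.
                double fraction = (target - seen) / count[i];
                return lo[i] + fraction * (hi[i] - lo[i]);
            }
            seen += count[i];
        }
        return hi[hi.length - 1];
    }

    public static void main(String[] args) {
        double[] lo = {0.0, 5.0};
        double[] hi = {5.0, 10.0};
        long[] count = {80, 20};
        System.out.println(valueForPercentile(lo, hi, count, 50));
        System.out.println(valueForPercentile(lo, hi, count, 90));
    }
}
```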
-
getCountPerNodeLimit
protected int getCountPerNodeLimit()
Used by the histogram's internal data structure to get the maximum number of data points that should be counted in one bucket.
- Returns:
- the maximum number of data points to store in one bucket.
-
normalize
public void normalize(double targetMin, double targetMax)
Normalizes all the values to the desired range.
- Parameters:
- targetMin: the target new minimum value.
- targetMax: the target new maximum value.
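Normalizing to a target range is commonly done with a linear rescale that maps the current minimum to targetMin and the current maximum to targetMax. The sketch below applies that mapping to a plain array. Whether the library uses exactly this mapping is an assumption, and the class `NormalizeSketch` is hypothetical.

```java
// Hypothetical sketch of a linear rescale; NOT the hist4j implementation.
final class NormalizeSketch {
    static double[] normalize(double[] values, double targetMin, double targetMax) {
        double[] out = new double[values.length];
        if (values.length == 0) {
            return out;
        }
        double min = values[0];
        double max = values[0];
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        for (int i = 0; i < values.length; i++) {
            if (max == min) {
                // Degenerate case: all values equal; map them to targetMin.
                out[i] = targetMin;
            } else {
                out[i] = targetMin + (values[i] - min) * (targetMax - targetMin) / (max - min);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[] scaled = normalize(new double[] {2.0, 4.0, 6.0}, 0.0, 1.0);
        System.out.println(java.util.Arrays.toString(scaled));
    }
}
```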
-
show
public void show()
Shows the histogram's underlying data structure.
-
toTable
public ArrayList<Cell> toTable()
Returns a table representing the data in this histogram. Each element is a table cell containing the range limit values and the count for that range.
-