Package com.flaptor.hist4j
Class AdaptiveHistogram
- Object
  - AdaptiveHistogram
- All Implemented Interfaces:
  Serializable
public class AdaptiveHistogram extends Object implements Serializable
This class implements a histogram that adapts to an unknown data distribution. It maintains a roughly constant resolution throughout the data range by increasing the resolution where the data is denser. For example, if most of the values lie in the 0-5 range and only a few lie in the 5-10 range, the histogram adapts by assigning more counting buckets to the 0-5 range and fewer to the 5-10 range. This implementation provides a method to obtain the cumulative density function for a given data point, and a method to obtain the data point that splits the data set at a given percentile.
- Author:
  Jorge Handl
- See Also:
  Serialized Form
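The adaptive behaviour described in the class summary (more buckets where the data is dense) can be illustrated with a small self-contained sketch. It is not the library's implementation: the class `SplittingHistogramSketch`, its fixed value range, and the midpoint-split rule are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified illustration of adaptive bucketing.
// This is NOT the hist4j implementation; all names here are assumptions.
class SplittingHistogramSketch {

    // One counting bucket covering the half-open range [lo, hi).
    static final class Bucket {
        final double lo;
        final double hi;
        long count;

        Bucket(double lo, double hi, long count) {
            this.lo = lo;
            this.hi = hi;
            this.count = count;
        }
    }

    private final List<Bucket> buckets = new ArrayList<>();
    private final long limit; // assumed maximum count a bucket may hold before it splits

    SplittingHistogramSketch(double min, double max, long limit) {
        this.limit = limit;
        buckets.add(new Bucket(min, max, 0));
    }

    // Counts the value in its bucket and splits the bucket once it exceeds the limit.
    void add(double value) {
        for (int i = 0; i < buckets.size(); i++) {
            Bucket b = buckets.get(i);
            if (value >= b.lo && value < b.hi) {
                b.count++;
                if (b.count > limit) {
                    // Split at the midpoint. The count is divided evenly, which
                    // only approximates how the points were actually distributed.
                    double mid = (b.lo + b.hi) / 2;
                    long half = b.count / 2;
                    buckets.set(i, new Bucket(b.lo, mid, half));
                    buckets.add(i + 1, new Bucket(mid, b.hi, b.count - half));
                }
                return;
            }
        }
        // Values outside [min, max) are ignored in this sketch.
    }

    // Number of buckets that lie entirely inside [lo, hi].
    int bucketsWithin(double lo, double hi) {
        int n = 0;
        for (Bucket b : buckets) {
            if (b.lo >= lo && b.hi <= hi) {
                n++;
            }
        }
        return n;
    }

    // Sum of all bucket counts; splitting never changes this total.
    long totalCount() {
        long total = 0;
        for (Bucket b : buckets) {
            total += b.count;
        }
        return total;
    }
}
```

Feeding this sketch many values in the 0-5 range and only a few in the 5-10 range leaves far more buckets below 5 than above it, which is the effect the class description refers to.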
-
Nested Class Summary
Nested Classes
- protected static interface AdaptiveHistogram.ValueConversion
  Auxiliary interface for inline functor object.
-
Constructor Summary
Constructors
- AdaptiveHistogram()
  Class constructor.
-
Method Summary
Methods
- void addValue(double value)
  Adds a data point to the histogram.
- long getAccumCount(double value)
  Returns the cumulative density function for a given data point.
- long getCount(double value)
  Returns the number of data points stored in the same bucket as a given value.
- protected int getCountPerNodeLimit()
  Used by the internal data structure of the histogram to get the maximum number of data points that should be counted in one bucket.
- double getValueForPercentile(int percentile)
  Returns the data point that splits the data set at a given percentile.
- void normalize(double targetMin, double targetMax)
  Normalizes all the values to the desired range.
- void reset()
  Erases all data from the histogram.
- void show()
  Shows the histogram's underlying data structure.
- ArrayList<Cell> toTable()
  Returns a table representing the data in this histogram.
-
Method Detail
-
reset
public void reset()
Erases all data from the histogram.
-
addValue
public void addValue(double value)
Adds a data point to the histogram.
- Parameters:
  value - the data point to add.
-
getCount
public long getCount(double value)
Returns the number of data points stored in the same bucket as a given value.
- Parameters:
  value - the reference data point.
- Returns:
  the number of data points stored in the same bucket as the reference point.
-
getAccumCount
public long getAccumCount(double value)
Returns the cumulative density function for a given data point.
- Parameters:
  value - the reference data point.
- Returns:
  the cumulative density function for the reference point.
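As a rough illustration of what a cumulative value over histogram buckets looks like, the sketch below sums the counts of every bucket whose upper limit does not exceed the reference point. Since the method returns a long, the result is presumably a count of data points rather than a normalized probability; that reading, the helper `accumCount`, and the parallel-array bucket layout are assumptions, not the library's code.

```java
// Hypothetical helper, not part of hist4j: cumulative count over a bucket table.
class AccumCountSketch {

    // upperBounds[i] is the upper limit of bucket i (ascending order) and
    // counts[i] is the number of data points counted in that bucket.
    static long accumCount(double[] upperBounds, long[] counts, double value) {
        long total = 0;
        for (int i = 0; i < counts.length; i++) {
            // Only buckets that end at or below the reference point contribute.
            if (upperBounds[i] <= value) {
                total += counts[i];
            }
        }
        return total;
    }
}
```

A bucket that only partly lies below the reference point contributes nothing here; an implementation could instead interpolate inside that bucket, at the cost of assuming a uniform spread within it.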
-
getValueForPercentile
public double getValueForPercentile(int percentile)
Returns the data point that splits the data set at a given percentile.
- Parameters:
  percentile - the percentile at which the data set is split.
- Returns:
  the data point that splits the data set at the given percentile.
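One way to picture a percentile lookup over bucketed data is to walk the buckets in ascending order until the running count reaches the requested share of the total. The sketch below does that over plain arrays; the helper `valueForPercentile`, the choice to return a bucket's upper limit, and the 0-100 percentile scale are assumptions for illustration, not the library's documented behaviour.

```java
// Hypothetical helper, not part of hist4j: percentile lookup over a bucket table.
class PercentileSketch {

    // upperBounds[i] is the upper limit of bucket i (ascending order) and
    // counts[i] is the number of data points counted in that bucket.
    static double valueForPercentile(double[] upperBounds, long[] counts, int percentile) {
        if (counts.length == 0 || upperBounds.length != counts.length) {
            throw new IllegalArgumentException("bucket arrays must be non-empty and of equal length");
        }
        long total = 0;
        for (long c : counts) {
            total += c;
        }
        // Number of data points that must lie at or below the returned value.
        long target = (long) Math.ceil(total * percentile / 100.0);
        long running = 0;
        for (int i = 0; i < counts.length; i++) {
            running += counts[i];
            if (running >= target) {
                return upperBounds[i];
            }
        }
        return upperBounds[upperBounds.length - 1];
    }
}
```

The result can only be as precise as the bucket widths, which is why a histogram that uses narrower buckets in dense regions gives tighter percentile estimates there.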
-
getCountPerNodeLimit
protected int getCountPerNodeLimit()
This method is used by the internal data structure of the histogram to get the maximum number of data points that should be counted in one bucket.
- Returns:
  the maximum number of data points to store in one bucket.
-
normalize
public void normalize(double targetMin, double targetMax)
Normalizes all the values to the desired range.
- Parameters:
  targetMin - the target new minimum value.
  targetMax - the target new maximum value.
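Normalizing to a target range is commonly done with a linear (min-max) rescaling, sketched below. Whether the library uses exactly this mapping is an assumption; the helper `rescale` and its `min`/`max` parameters (the current extremes of the data) are hypothetical names introduced for the example.

```java
// Hypothetical helper, not part of hist4j: linear rescaling of one value.
class NormalizeSketch {

    // Maps value from the range [min, max] onto [targetMin, targetMax].
    static double rescale(double value, double min, double max,
                          double targetMin, double targetMax) {
        if (max == min) {
            throw new IllegalArgumentException("source range must not be empty");
        }
        return targetMin + (value - min) * (targetMax - targetMin) / (max - min);
    }
}
```

Under this mapping the current minimum lands on targetMin, the current maximum lands on targetMax, and the relative spacing of every value in between is preserved.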
-
show
public void show()
Shows the histogram's underlying data structure.
-
toTable
public ArrayList<Cell> toTable()
Returns a table representing the data in this histogram. Each element is a table cell containing the range limit values and the count for that range.