Class AdaptiveHistogram

Object
AdaptiveHistogram
All Implemented Interfaces:
Serializable

public class AdaptiveHistogram extends Object implements Serializable
This class implements a histogram that adapts to an unknown data distribution. It keeps a more or less constant resolution throughout the data range by increasing the resolution where the data is denser. For example, if most of the values lie in the 0-5 range and only a few lie in the 5-10 range, the histogram adapts by assigning more counting buckets to the 0-5 range and fewer to the 5-10 range. This implementation provides a method to obtain the cumulative density function for a given data point, and a method to obtain the data point that splits the data set at a given percentile.
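The adaptive behaviour described above can be illustrated with a minimal sketch. This is not the source of AdaptiveHistogram: the class name `AdaptiveBucketsSketch`, the fixed initial range, the split-in-half rule, and the even division of counts on a split are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the AdaptiveHistogram source.
class AdaptiveBucketsSketch {
    // One bucket covering the half-open range [lo, hi).
    static final class Bucket {
        final double lo, hi;
        long count;
        Bucket(double lo, double hi, long count) {
            this.lo = lo; this.hi = hi; this.count = count;
        }
    }

    final List<Bucket> buckets = new ArrayList<>();
    final long limit; // assumed per-bucket threshold that triggers a split

    AdaptiveBucketsSketch(double min, double max, long limit) {
        this.limit = limit;
        buckets.add(new Bucket(min, max, 0));
    }

    // Counts a value, then splits its bucket in half if it grew past the limit.
    void add(double v) {
        for (int i = 0; i < buckets.size(); i++) {
            Bucket b = buckets.get(i);
            if (v >= b.lo && v < b.hi) {
                b.count++;
                if (b.count > limit) {
                    double mid = (b.lo + b.hi) / 2;
                    long left = b.count / 2; // approximation: assume an even spread
                    buckets.set(i, new Bucket(b.lo, mid, left));
                    buckets.add(i + 1, new Bucket(mid, b.hi, b.count - left));
                }
                return;
            }
        }
        // Values outside [min, max) are ignored in this sketch.
    }

    // Number of buckets that lie entirely inside [lo, hi].
    int bucketsIn(double lo, double hi) {
        int n = 0;
        for (Bucket b : buckets) if (b.lo >= lo && b.hi <= hi) n++;
        return n;
    }

    // Total of all bucket counts.
    long total() {
        long sum = 0;
        for (Bucket b : buckets) sum += b.count;
        return sum;
    }

    public static void main(String[] args) {
        AdaptiveBucketsSketch h = new AdaptiveBucketsSketch(0.0, 10.0, 4);
        for (int i = 0; i < 40; i++) h.add((i % 10) * 0.5);          // dense: 0-5
        for (double v : new double[] {5.5, 6.5, 7.5, 8.5}) h.add(v); // sparse: 5-10
        System.out.println("buckets in 0-5:  " + h.bucketsIn(0, 5));
        System.out.println("buckets in 5-10: " + h.bucketsIn(5, 10));
    }
}
```

With most values falling in the 0-5 range, that range ends up covered by more, narrower buckets than the 5-10 range, which is the effect the class description refers to.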
Author:
Jorge Handl
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    protected static interface 
    Auxiliary interface for inline functor object.
  • Constructor Summary

    Constructors
    Constructor
    Description
    AdaptiveHistogram()
    Class constructor.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    addValue(double value)
    Adds a data point to the histogram.
    long
    getAccumCount(double value)
    Returns the cumulative density function for a given data point.
    long
    getCount(double value)
    Returns the number of data points stored in the same bucket as a given value.
    protected int
    getCountPerNodeLimit()
    This method is used by the internal data structure of the histogram to get the limit of data points that should be counted in one bucket.
    double
    getValueForPercentile(int percentile)
    Returns the data point that splits the data set at a given percentile.
    void
    normalize(double targetMin, double targetMax)
    Normalizes all the values to the desired range.
    void
    reset()
    Erases all data from the histogram.
    void
    show()
    Shows the histogram's underlying data structure.
    ArrayList<Cell>
    toTable()
    Returns a table representing the data in this histogram.

    Methods inherited from class Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • AdaptiveHistogram

      public AdaptiveHistogram()
      Class constructor.
  • Method Details

    • reset

      public void reset()
      Erases all data from the histogram.
    • addValue

      public void addValue(double value)
      Adds a data point to the histogram.
      Parameters:
      value - the data point to add.
    • getCount

      public long getCount(double value)
      Returns the number of data points stored in the same bucket as a given value.
      Parameters:
      value - the reference data point.
      Returns:
      the number of data points stored in the same bucket as the reference point.
    • getAccumCount

      public long getAccumCount(double value)
      Returns the cumulative density function for a given data point.
      Parameters:
      value - the reference data point.
      Returns:
      the cumulative density function for the reference point.
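As a point of reference for what a cumulative count means, the sketch below computes it exactly on a sorted array. It assumes the `long` result is the number of data points at or below the reference value; whether getAccumCount includes the reference point itself is not stated here, and the histogram would return a bucket-based approximation rather than an exact count. The class and method names are hypothetical.

```java
// Hypothetical reference computation; not part of AdaptiveHistogram.
class CumulativeCountSketch {
    // Returns how many entries of an ascending-sorted array are <= v
    // (upper-bound binary search).
    static long countAtOrBelow(double[] sorted, double v) {
        int lo = 0, hi = sorted.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted[mid] <= v) {
                lo = mid + 1; // everything up to mid is <= v
            } else {
                hi = mid;     // mid and everything after it is > v
            }
        }
        return lo;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 2, 3, 5};
        System.out.println(countAtOrBelow(data, 2.0));
    }
}
```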
    • getValueForPercentile

      public double getValueForPercentile(int percentile)
      Returns the data point that splits the data set at a given percentile.
      Parameters:
      percentile - the percentile at which the data set is split.
      Returns:
      the data point that splits the data set at the given percentile.
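To make "the data point that splits the data set at a given percentile" concrete, here is a minimal sketch of the exact nearest-rank percentile on sorted data. The nearest-rank definition is an assumption chosen for illustration; the method documented above does not say which definition it uses, and it works from bucket counts, so its result would be an estimate. The names below are hypothetical.

```java
// Hypothetical reference computation; not part of AdaptiveHistogram.
class PercentileSketch {
    // Nearest-rank percentile of an ascending-sorted, non-empty array.
    // percentile is expected to be in the range 0..100.
    static double valueForPercentile(double[] sorted, int percentile) {
        if (sorted.length == 0) {
            throw new IllegalArgumentException("no data");
        }
        if (percentile < 0 || percentile > 100) {
            throw new IllegalArgumentException("percentile must be 0..100");
        }
        // rank = ceil(percentile / 100 * n), computed with integer arithmetic
        long rank = ((long) percentile * sorted.length + 99) / 100;
        if (rank < 1) {
            rank = 1; // percentile 0 maps to the smallest value
        }
        return sorted[(int) rank - 1];
    }

    public static void main(String[] args) {
        double[] data = {15, 20, 35, 40, 50};
        System.out.println(valueForPercentile(data, 50));
    }
}
```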
    • getCountPerNodeLimit

      protected int getCountPerNodeLimit()
      This method is used by the internal data structure of the histogram to get the limit of data points that should be counted in one bucket.
      Returns:
      the limit of data points to store in one bucket.
    • normalize

      public void normalize(double targetMin, double targetMax)
      Normalizes all the values to the desired range.
      Parameters:
      targetMin - the target new minimum value.
      targetMax - the target new maximum value.
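One common way to normalize values to a target range is linear min-max rescaling, sketched below. That normalize uses this exact mapping is an assumption; the documentation above only says the values are brought into the desired range. The class name, the `rescale` helper, and its `srcMin`/`srcMax` parameters are hypothetical.

```java
// Hypothetical illustration of min-max rescaling; not part of AdaptiveHistogram.
class RescaleSketch {
    // Maps v linearly from [srcMin, srcMax] onto [targetMin, targetMax].
    static double rescale(double v, double srcMin, double srcMax,
                          double targetMin, double targetMax) {
        if (srcMax == srcMin) {
            return targetMin; // degenerate source range: every value collapses to one point
        }
        return targetMin + (v - srcMin) * (targetMax - targetMin) / (srcMax - srcMin);
    }

    public static void main(String[] args) {
        // A value halfway through 0..10 lands halfway through -1..1.
        System.out.println(rescale(5.0, 0.0, 10.0, -1.0, 1.0));
    }
}
```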
    • show

      public void show()
      Shows the histogram's underlying data structure.
    • toTable

      public ArrayList<Cell> toTable()
      Returns a table representing the data in this histogram. Each element is a table cell containing the range limit values and the count for that range.