Interface Matrix2D

All Superinterfaces:
Iterable<DMUtil.FieldValues>, Serializable

public interface Matrix2D extends Serializable, Iterable<DMUtil.FieldValues>
Provides access to a QueryResult in a formula context, in the form of a two-dimensional matrix. The matrix columns correspond to the query projections, and the matrix rows correspond to the rows in the query result set, in the same order. Row and column indexes are zero-based. A query result may be obtained by

  • executing a Query in the DatamartContext
  • executing a org.apache.metamodel.query Query in the TableContext
  • calling the DatamartQuery function, or the equivalent SandboxAPI api.datamartQuery() (both deprecated)

  • Method Details

    • getRowCount

      long getRowCount()
      Returns:
      Number of rows in the matrix
    • getColumnCount

      long getColumnCount()
      Returns:
      Number of columns in the matrix
    • getColumnLabel

      String getColumnLabel(long col)
      Parameters:
      col - Index of the column to get the label for. The label is the corresponding projection's alias.
      Returns:
      The column label
    • isEmpty

      boolean isEmpty()
      A matrix is empty if it has no rows.
      Returns:
      true if no rows
    • getColumnLabels

      List<String> getColumnLabels()
      Returns:
      The column labels from left to right.
    • getColumn

      long getColumn(String label)
      Parameters:
      label - Label of a column in the matrix
      Returns:
      The index of that column
    • getColumns

      long[] getColumns(Collection<String> labels)
      Parameters:
      labels - Labels of a set of columns in the matrix
      Returns:
      The indexes of those columns, in the same order
    • getColumnDataType

      DataType getColumnDataType(String colLabel)
      Parameters:
      colLabel - Label of the column to get the DataType for
      Returns:
      DataType of the given column; null if the matrix does not have any metadata defined.
    • selectRow

      Matrix2D selectRow(long row)
      Selects one row in the matrix and returns the result as a new Matrix2D object that does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
      Parameters:
      row - Index of row to select
      Returns:
      Sub-matrix containing that one row
    • selectRows

      Matrix2D selectRows(long startRow, long endRow)
      Selects a subset of rows in the matrix and returns the result as a new Matrix2D object that does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
      Parameters:
      startRow - Index of first row to select, inclusive
      endRow - Index of last row to select, inclusive
      Returns:
      Sub-matrix containing all rows from the start to end row
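The view behaviour and the inclusive bounds can be pictured with plain Java collections. The sketch below is not the Matrix2D implementation: it assumes rows are held as a list of maps and uses a hypothetical helper, selectRows, built on List.subList, which also returns a view rather than a copy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RowViewSketch {

    // Hypothetical stand-in for selectRows(startRow, endRow): both bounds are inclusive,
    // so the exclusive upper bound expected by subList is endRow + 1.
    static List<Map<String, Object>> selectRows(List<Map<String, Object>> rows, int startRow, int endRow) {
        return rows.subList(startRow, endRow + 1); // a view, no rows are copied
    }

    // Four sample rows with a single "qty" column holding 0, 10, 20, 30.
    static List<Map<String, Object>> sampleRows() {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Map<String, Object> row = new HashMap<>();
            row.put("qty", i * 10);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = sampleRows();
        List<Map<String, Object>> view = selectRows(rows, 1, 2);
        view.get(0).put("qty", 99);                 // change made through the view
        System.out.println(view.size());            // 2
        System.out.println(rows.get(1).get("qty")); // 99, visible in the original too
    }
}
```

Because nothing is copied, a caller who needs an independent sub-matrix would have to copy the selected rows explicitly.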
    • selectColumn

      Matrix2D selectColumn(String label)
      Selects a column in the matrix and returns the result as a new Matrix2D object that does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
      Parameters:
      label - Label of the column to select
      Returns:
      Sub-matrix containing that one column
    • selectColumn

      Matrix2D selectColumn(long col)
      Selects a column in the matrix and returns the result as a new Matrix2D object that does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
      Parameters:
      col - Index of the column to select
      Returns:
      Sub-matrix containing that one column
    • selectColumns

      Matrix2D selectColumns(long startCol, long endCol)
      Selects a subset of columns in the matrix and returns the result as a new Matrix2D object that does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
      Parameters:
      startCol - Index of first column to select, inclusive
      endCol - Index of last column to select, inclusive
      Returns:
      Sub-matrix containing all columns from the start to end column
    • select

      Matrix2D select(long startRow, long endRow, long startColumn, long endColumn)
      Selects a subset of rows and columns in the matrix and returns the result as a new Matrix2D object that does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
      Parameters:
      startRow - Index of first row to select, inclusive
      endRow - Index of last row to select, inclusive
      startColumn - Index of first column to select, inclusive
      endColumn - Index of last column to select, inclusive
      Returns:
      Sub-matrix containing all cells from the rectangular area defined by the coordinates (startRow,startColumn) to (endRow,endColumn)
    • selectRows

      Matrix2D selectRows(long[] rows)
      Selects a subset of rows in the matrix and returns the result as a new Matrix2D object that does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
      Parameters:
      rows - Array of row indices to select.
      Returns:
      Sub-matrix containing the rows matching those indices, preserving the order from the original matrix
    • selectRows

      Matrix2D selectRows(Collection<? extends Number> rows)
      Selects a subset of rows in the matrix and returns the result as a new Matrix2D object that does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
      Parameters:
      rows - Collection of row indices to select.
      Returns:
      Sub-matrix containing the rows matching those indices, preserving the order from the original matrix
    • search

      long search(BigDecimal value, long col)
      Finds the index of the row containing a given value in a given column, assuming this column is sorted high --> low. If there is no exact match, the previous or next row is picked, whichever is closest to the given threshold value.
      Parameters:
      value - Threshold value
      col - Index of the column to search
      Returns:
      Index of the row closest to the threshold value
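The closest-match lookup on a column sorted high --> low can be sketched in plain Java as a binary search followed by a comparison of the two neighbouring entries. This is only an illustration on a double array, not the library code; the class and method names are hypothetical, and the tie rule (the earlier row wins) is an assumption.

```java
final class NearestSearchSketch {

    // Returns the index of the entry closest to 'value' in a column sorted high to low.
    // Assumes at least one entry. Assumption: on a tie, the earlier (higher) entry wins.
    static int searchClosest(double[] descending, double value) {
        int lo = 0;
        int hi = descending.length - 1;
        while (lo < hi) {                 // binary search for the first entry <= value
            int mid = (lo + hi) >>> 1;
            if (descending[mid] > value) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // 'lo' is the first entry <= value (or the last index); compare it with its predecessor.
        if (lo > 0 && Math.abs(descending[lo - 1] - value) <= Math.abs(descending[lo] - value)) {
            return lo - 1;
        }
        return lo;
    }

    public static void main(String[] args) {
        double[] col = {90, 70, 40, 10};
        System.out.println(searchClosest(col, 70)); // 1 (exact match)
        System.out.println(searchClosest(col, 45)); // 2 (40 is closer than 70)
        System.out.println(searchClosest(col, 5));  // 3 (below the smallest entry)
    }
}
```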
    • search

      long[] search(Collection<?> values, long col)
      Find all rows for which the value in the given column is contained in the passed in collection. This would typically be done only on columns containing discrete data.
      Parameters:
      values - Collection of values to match to
      col - Index of the column to search
      Returns:
      Array of indices of matching rows
    • max

      Matrix2D max()
      Returns:
      A new one row matrix, containing the maximum (of the population of values) for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
    • min

      Matrix2D min()
      Returns:
      A new one row matrix, containing the minimum (of the population of values) for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
    • sum

      Matrix2D sum()
      Returns:
      A new one row matrix, containing the sum of the values for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
    • mean

      Matrix2D mean()
      Returns:
      A new one row matrix, containing the mean (of the population of values) for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
    • std

      Matrix2D std()
      Returns:
      A new one row matrix, containing the standard deviation (of the population of values) for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
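If "of the population of values" is read as population statistics (dividing by N rather than N - 1), the mean and standard deviation of one numeric column could be computed as in the following sketch. That reading is an assumption, and the class and method names are hypothetical.

```java
final class PopulationStatsSketch {

    // Arithmetic mean of the column values.
    static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Population standard deviation: square root of the mean squared deviation (divides by N).
    static double populationStd(double[] values) {
        double m = mean(values);
        double squared = 0;
        for (double v : values) {
            squared += (v - m) * (v - m);
        }
        return Math.sqrt(squared / values.length);
    }

    public static void main(String[] args) {
        double[] col = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println(mean(col));          // 5.0
        System.out.println(populationStd(col)); // 2.0
    }
}
```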
    • indexOfMin

      Matrix2D indexOfMin()
      Returns:
      A new one row matrix, containing the index of the row that holds the minimum value, for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
    • indexOfMax

      Matrix2D indexOfMax()
      Returns:
      A new one row matrix, containing the index of the row that holds the maximum value, for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
    • normalize

      Matrix2D normalize(long... cols)
      Normalizes numeric column values, by dividing each value in the column by the total for the column. If the total = 0, the normalized value is also 0.
      Parameters:
      cols - The indices of the columns to normalize. These columns should contain numerical data for the normalization to be useful.
      Returns:
      A reference to this Matrix2D, in which the values have been normalized 'in situ'.
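The column normalization described above amounts to dividing by the column total, with a guard for a zero total. A minimal plain-Java sketch on a single column follows; it uses a double array as a stand-in and hypothetical names, and is not the Matrix2D implementation.

```java
import java.util.Arrays;

final class ColumnNormalizeSketch {

    // Divides each value by the column total, in place; a zero total yields zeros.
    static double[] normalizeInPlace(double[] column) {
        double total = 0;
        for (double v : column) {
            total += v;
        }
        for (int i = 0; i < column.length; i++) {
            column[i] = (total == 0) ? 0 : column[i] / total;
        }
        return column; // same array, mirroring the "reference to this Matrix2D" return value
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(normalizeInPlace(new double[] {1, 3, 4}))); // [0.125, 0.375, 0.5]
        System.out.println(Arrays.toString(normalizeInPlace(new double[] {0, 0})));    // [0.0, 0.0]
    }
}
```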
    • getValue

      Object getValue()
      Convenience getter of values in the matrix.
      Returns:

      • a single object value if the matrix has just one cell
      • a map if the matrix has just one row
      • a list of object values if the matrix has just one column
      • a list of maps if the matrix has multiple rows and columns
      • null for a 0-dim matrix (with 0 columns and/or 0 rows)
      • else just the matrix itself

    • getValue

      Object getValue(long row, long col)
      Parameters:
      row - Row index
      col - Column index
      Returns:
      Value of cell at coordinates (row,col)
    • getRowValues

      DMUtil.FieldValues getRowValues(long row)
      Parameters:
      row - Row index
      Returns:
      Values on that row, in FieldValues (Map) format, i.e. [colLabel = value,...]
    • getColumnValues

      List<Object> getColumnValues(long col)
      Parameters:
      col - Column index
      Returns:
      The values in that column in List format, in the same order.
    • setValue

      void setValue(Object val, long row, long col)
      Modifies the value at the given coordinates (row,col)
      Parameters:
      val - Value to set
      row - Row index
      col - Column index
    • setRowValues

      void setRowValues(Map<String,Object> val, long row)
      Modifies the values in the given row, setting those values whose keys match column names in the matrix.
      Parameters:
      val - Map of (column name, value) pairs.
      row - Row index
    • getPercentileValue

      BigDecimal getPercentileValue(long col, int percentile)
      Calculates the value at the given percentile in the given numeric column.
      Parameters:
      col - Column index
      percentile - Integer number from 0 to 100
      Returns:
      The value such that percentile% of the values (samples) in the column are smaller than or equal to this value
    • getCumulativeDistributionValue

      Long getCumulativeDistributionValue(long col, BigDecimal value)
      Determines the total number of values (samples) smaller or equal to the given value, in the given numeric column.
      Parameters:
      col - Column index
      value - Input to the CDF ('cumulative distribution function')
      Returns:
      CDF evaluated at that value
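The two distribution helpers, getPercentileValue and getCumulativeDistributionValue, can be pictured with the plain-Java sketch below. The count of samples smaller than or equal to a value follows directly from the description. The percentile uses the nearest-rank method, which is an assumption, since the exact method is not stated here. All names are hypothetical.

```java
import java.util.Arrays;

final class DistributionSketch {

    // Number of samples smaller than or equal to 'value'.
    static long cumulativeCount(double[] column, double value) {
        long count = 0;
        for (double v : column) {
            if (v <= value) {
                count++;
            }
        }
        return count;
    }

    // Nearest-rank percentile (assumed method): the smallest sample such that at least
    // 'percentile' percent of the samples are smaller than or equal to it.
    static double percentile(double[] column, int percentile) {
        double[] sorted = column.clone();
        Arrays.sort(sorted);
        int rank = (percentile * sorted.length + 99) / 100; // integer ceiling of percentile * n / 100
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        double[] col = {40, 15, 50, 20, 35};
        System.out.println(cumulativeCount(col, 35)); // 3
        System.out.println(percentile(col, 40));      // 20.0
        System.out.println(percentile(col, 50));      // 35.0
        System.out.println(percentile(col, 100));     // 50.0
    }
}
```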
    • getExcelPercentileValues

      BigDecimal[] getExcelPercentileValues(long c)
      Computes percentiles compatible with Excel
      Parameters:
      c - Column index
      Returns:
      0 to 100th percentile values in a 101-sized BigDecimal array (i.e. result[75] gives the 75th percentile).
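As an illustration of what "compatible with Excel" might mean, the sketch below fills a 101-element array using the linear interpolation of Excel's PERCENTILE.INC function. Whether the method follows the inclusive or the exclusive Excel variant is not stated here, so the choice of PERCENTILE.INC is an assumption. The sketch also uses double instead of BigDecimal for brevity, and its names are hypothetical.

```java
import java.util.Arrays;

final class ExcelPercentileSketch {

    // PERCENTILE.INC-style values for p = 0..100; assumes a non-empty column.
    static double[] excelPercentiles(double[] column) {
        double[] sorted = column.clone();
        Arrays.sort(sorted);
        double[] result = new double[101];
        for (int p = 0; p <= 100; p++) {
            double pos = p * (sorted.length - 1) / 100.0;       // fractional position in the sorted data
            int lower = (int) Math.floor(pos);
            int upper = Math.min(lower + 1, sorted.length - 1);
            // Linear interpolation between the two neighbouring samples.
            result[p] = sorted[lower] + (pos - lower) * (sorted[upper] - sorted[lower]);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] result = excelPercentiles(new double[] {5, 1, 4, 2, 3});
        System.out.println(result.length); // 101
        System.out.println(result[50]);    // 3.0
        System.out.println(result[75]);    // 4.0
    }
}
```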
    • getExcelPercentRank

      double getExcelPercentRank(long c, double value)
      Computes the rank of a value, as a percentage, in the distribution of values in the given column. The values in the column do not need to be sorted upfront. The algorithm uses interpolation when there is no exact match with the passed in value.
      Parameters:
      c - Column index
      value - The value for which to return the rank
      Returns:
      The rank, from 0 to 1, denoting the percentile of the distribution in which the given value falls.
    • normalizeRowValues

      Matrix2D normalizeRowValues(Collection<String> colLabels, int precision)
      Normalizes the values on a row-by-row basis. Each included value is divided by the sum of the included values on the same row. If this sum is zero, the normalized values will also be zero.
      Parameters:
      colLabels - Labels of the columns to include in the normalization
      precision - Maximum number of decimals in the normalized value
      Returns:
      A reference to this Matrix2D, in which the values have been normalized 'in situ'.
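Row-wise normalization with a maximum number of decimals can be sketched with BigDecimal as follows. One row is modelled as a map from column label to value, the names are hypothetical, and HALF_UP rounding is an assumption, because the rounding mode is not stated here.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class RowNormalizeSketch {

    // Divides each included value by the sum of the included values on the same row.
    // Assumption: HALF_UP rounding to 'precision' decimals.
    static void normalizeRow(Map<String, BigDecimal> row, List<String> colLabels, int precision) {
        BigDecimal sum = BigDecimal.ZERO;
        for (String label : colLabels) {
            sum = sum.add(row.get(label));
        }
        for (String label : colLabels) {
            BigDecimal normalized = sum.signum() == 0
                    ? BigDecimal.ZERO                                            // zero sum yields zeros
                    : row.get(label).divide(sum, precision, RoundingMode.HALF_UP);
            row.put(label, normalized);
        }
    }

    public static void main(String[] args) {
        Map<String, BigDecimal> row = new LinkedHashMap<>();
        row.put("a", new BigDecimal("1"));
        row.put("b", new BigDecimal("2"));
        row.put("c", new BigDecimal("100")); // not included, left untouched
        normalizeRow(row, Arrays.asList("a", "b"), 3);
        System.out.println(row); // {a=0.333, b=0.667, c=100}
    }
}
```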
    • kurtosis

      double kurtosis(String label)
      Parameters:
      label - Label of a numeric column
      Returns:
      Kurtosis of the distribution of values in that column
    • sortRows

      Matrix2D sortRows(long col)
      Parameters:
      col - Column index
      Returns:
      A reference to this Matrix2D, in which the rows are now ordered so that the values in the given column are sorted from low --> high
    • sortRowsReverse

      Matrix2D sortRowsReverse(long col)
      Parameters:
      col - Column index
      Returns:
      A reference to this Matrix2D, in which the rows are now ordered so that the values in the given column are sorted from high --> low
    • getMap

      Map<String,Object> getMap(String keyColLabel, String valColLabel)
      Extracts a map from the values in the nominated key and value columns. The assumption is that the value column is functionally dependent on the key column; otherwise a key-->value pair on one row may overwrite the pair taken from a previous row.
      Parameters:
      keyColLabel - Label of the column for which the values are to be used as keys in the map
      valColLabel - Label of the column for which the values are to be used as values in the map
      Returns:
      Key-->value map in the same order as in this Matrix2D
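The overwrite risk mentioned above is easy to see in a plain-Java sketch. Rows are modelled as a list of maps and the helper names are hypothetical; an insertion-ordered map is used so that keys keep the order in which they are first encountered.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class KeyValueMapSketch {

    // Builds a key --> value map from two columns; a repeated key keeps the value of the last row.
    static Map<String, Object> toMap(List<Map<String, Object>> rows, String keyCol, String valCol) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            result.put(String.valueOf(row.get(keyCol)), row.get(valCol));
        }
        return result;
    }

    static Map<String, Object> row(String sku, int price) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("sku", sku);
        r.put("price", price);
        return r;
    }

    // "price" is not functionally dependent on "sku" here: sku A appears with two prices.
    static List<Map<String, Object>> sampleRows() {
        return Arrays.asList(row("A", 10), row("B", 20), row("A", 15));
    }

    public static void main(String[] args) {
        System.out.println(toMap(sampleRows(), "sku", "price")); // {A=15, B=20}
    }
}
```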
    • getMap

      Map<MultiKey<String>,Object> getMap(List<String> colLabels, String valColLabel)
      Extracts a map from the values in the nominated key columns (resulting in keys of type MultiKey) and value column. The assumption is that the value column is functionally dependent on the key columns; otherwise a key-->value pair on one row may overwrite the pair taken from a previous row.
      Parameters:
      colLabels - Labels of the columns for which the values are to be used as MultiKeys in the map
      valColLabel - Label of the column for which the values are to be used as values in the map
      Returns:
      MultiKey-->value map in the same order as in this Matrix2D
    • addMappedColumn

      Matrix2D addMappedColumn(String colLabel, String mappedColLabel, Map<String,Object> mapping)
      Add a column to this Matrix2D, with the value on each row looked up from the given map, using the mapped column's value as the lookup key.
      Parameters:
      colLabel - Label of the new column
      mappedColLabel - Label of the column holding the lookup keys
      mapping - Map holding the values to map those lookup keys to
      Returns:
      A reference to this matrix, to which the new column is added
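The lookup performed by the single-key variant can be sketched in plain Java on a list of row maps. The helper below is hypothetical and only mirrors the parameter names; leaving the new cell null when a key has no entry in the mapping is an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class MappedColumnSketch {

    // Adds column 'colLabel' to every row, looked up from 'mapping' by the value in 'mappedColLabel'.
    static void addMappedColumn(List<Map<String, Object>> rows, String colLabel,
                                String mappedColLabel, Map<String, Object> mapping) {
        for (Map<String, Object> row : rows) {
            Object key = row.get(mappedColLabel);
            row.put(colLabel, mapping.get(key)); // null when the key has no mapping (assumed)
        }
    }

    static List<Map<String, Object>> sampleRows() {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (String country : new String[] {"DE", "FR", "US"}) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("country", country);
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, Object> regions = new HashMap<>();
        regions.put("DE", "EMEA");
        regions.put("FR", "EMEA");
        List<Map<String, Object>> rows = sampleRows();
        addMappedColumn(rows, "region", "country", regions);
        System.out.println(rows); // [{country=DE, region=EMEA}, {country=FR, region=EMEA}, {country=US, region=null}]
    }
}
```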
    • addMappedColumn

      Matrix2D addMappedColumn(String colLabel, List<String> fromColLabels, Map<MultiKey<String>,Object> mapping)
      Add a column to this Matrix2D, with the value on each row looked up from the given map, using a set of existing columns' values as a composite lookup key. The keys in the map are assumed to have been instantiated by PublicGroovyAPI.multiKey(Object...).
      Parameters:
      colLabel - Label of the new column
      fromColLabels - Labels of the columns holding the composite lookup key values
      mapping - Map holding the values to map those lookup keys to
      Returns:
      A reference to this matrix, to which the new column is added
    • addColumn

      Matrix2D addColumn(String colLabel, Object[] colValues)
      Add a column to this Matrix2D, with the value on each row retrieved from the given values array.
      Parameters:
      colLabel - Label of the new column
      colValues - Array of values to populate the new column with
      Returns:
      A reference to this matrix, to which the new column is added
    • addColumn

      Matrix2D addColumn(String colLabel)
      Add a column to this Matrix2D, without any values
      Parameters:
      colLabel - Label of the new column
      Returns:
      A reference to this matrix, to which the new column is added
    • renameColumns

      Matrix2D renameColumns(Map<String,String> old2NewColLabels)
      Renames the columns in this matrix so that the old column label = key, and the new column label = value in the given map.
      Parameters:
      old2NewColLabels - Old --> new column label mapping
      Returns:
      A reference to this matrix, in which the matching columns have been renamed
    • replaceMissingValues

      Matrix2D replaceMissingValues(String colLabel, Object val)
      Replaces nulls in the given column with the given value.
      Parameters:
      colLabel - Label of column to process
      val - Value to set
      Returns:
      A reference to this matrix, in which the null values have been replaced.
    • quantize

      List<String> quantize(int numberOfBuckets, String newColLabelPostfix, String bucketNaming)
      Convenience method to quantize all numeric columns in one go.
      Parameters:
      numberOfBuckets -
      newColLabelPostfix - Example: the bins for column 'margin' go into a new column named 'margin_q' if the postfix is set to '_q'.
      bucketNaming -
    • quantize

      void quantize(String colLabel, int numberOfBuckets, String newColLabel, String bucketNaming)
      Performs equal width quantization of a scalar distribution in a given column, and maps each value to its nearest lower quantile. A column is added if none with the given label exists, or when the label is null.
      Parameters:
      colLabel - Label of the column holding the scalar values
      numberOfBuckets - Number of quantiles/buckets to calculate (max 100). Typically 10 deciles or 4 quartiles.
      newColLabel - Label of the new column to add, holding the mapped quantile
      bucketNaming - One of the following values:

      • "ABC": quantiles are named "A","B","C" etc.
      • "INT": quantiles are numbered 0,2,3,...
      • "RANGE": quantile names reflect the range of the bucket (lower bound → upper bound)
      • "RANGE_PCT": as "RANGE" but now the bounds are formatted as percentages (ex. 8% vs 0.08)
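Equal-width quantization splits the range between the column minimum and maximum into numberOfBuckets intervals of the same width and assigns each value to the interval it falls in. The plain-Java sketch below returns integer bucket numbers in the spirit of the "INT" naming. The zero-based numbering, the handling of the maximum value and all names are assumptions, not the library's behaviour.

```java
import java.util.Arrays;

final class EqualWidthQuantizeSketch {

    // Maps each value to the index of its equal-width bucket (0 .. numberOfBuckets - 1).
    static int[] quantize(double[] column, int numberOfBuckets) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : column) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double width = (max - min) / numberOfBuckets;
        int[] buckets = new int[column.length];
        for (int i = 0; i < column.length; i++) {
            int b = (width == 0) ? 0 : (int) ((column[i] - min) / width);
            buckets[i] = Math.min(b, numberOfBuckets - 1); // the maximum falls into the last bucket
        }
        return buckets;
    }

    public static void main(String[] args) {
        double[] margin = {0, 2.5, 5, 7.5, 10};
        System.out.println(Arrays.toString(quantize(margin, 4))); // [0, 1, 2, 3, 3]
    }
}
```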

    • toResultMatrix

      ResultMatrix toResultMatrix()
    • toResultMatrix

      ResultMatrix toResultMatrix(boolean withColumnFormats)
      Converts this Matrix2D into a ResultMatrix.
      Parameters:
      withColumnFormats - boolean flag to add column formats and normalize datetime fields.
      Returns:
      ResultMatrix
    • toResultMatrix

      ResultMatrix toResultMatrix(Map<String,String> columnMapping)
      Create a ResultMatrix from a Matrix2D, selecting only the columns included in the mapping if such a mapping is provided.
      Parameters:
      columnMapping - Mapping ResultMatrix column name --> Matrix2D column label
      Returns:
      ResultMatrix
    • toResultMatrix

      ResultMatrix toResultMatrix(Map<String,String> columnMapping, boolean withColumnFormats)
    • fromResultMatrix

      static Matrix2DImpl fromResultMatrix(ResultMatrix resultMatrix)
    • fromListOfMaps

      static Matrix2DImpl fromListOfMaps(Collection<Map<String,Object>> rows)
    • fromListOfMaps

      static Matrix2DImpl fromListOfMaps(Collection<String> colLabels, Collection<Map<String,Object>> rows)