Interface Matrix2D
- All Superinterfaces:
Iterable<DMUtil.FieldValues>, Serializable
Provides access to a QueryResult in a formula context, in the form of a two-dimensional matrix. The matrix columns correspond to the query projections, and the matrix rows correspond to the rows in the query result set, in the same order. Row and column indexes are zero-based.
A query result may be obtained by
- executing a Query in the DatamartContext
- executing an org.apache.metamodel.query.Query in the TableContext
- calling the DatamartQuery function, or the equivalent SandboxAPI api.datamartQuery() (both deprecated)
-
Method Summary
Modifier and Type, Method, Description
- addColumn(colLabel): Add a column to this Matrix2D, without any values.
- addColumn(colLabel, colValues): Add a column to this Matrix2D, with the value on each row retrieved from the given values array.
- addMappedColumn(colLabel, mappedColLabel, mapping): Add a column to this Matrix2D, with the value on each row looked up from the given map, using the mapped column's value as the lookup key.
- Matrix2D addMappedColumn(String colLabel, List<String> fromColLabels, Map<MultiKey<String>, Object> mapping): Add a column to this Matrix2D, with the value on each row looked up from the given map, using a set of existing columns' values as a composite lookup key.
- static Matrix2DImpl fromListOfMaps(Collection<String> colLabels, Collection<Map<String, Object>> rows)
- static Matrix2DImpl fromListOfMaps(Collection<Map<String, Object>> rows)
- static Matrix2DImpl fromResultMatrix(ResultMatrix resultMatrix)
- long getColumn(label)
- long getColumnCount()
- getColumnDataType(String colLabel)
- getColumnLabel(long col)
- getColumnLabels()
- long[] getColumns(Collection<String> labels)
- getColumnValues(long col)
- getCumulativeDistributionValue(long col, BigDecimal value): Determines the total number of values (samples) smaller than or equal to the given value, in the given numeric column.
- getExcelPercentileValues(long c): Computes percentiles compatible with Excel.
- double getExcelPercentRank(long c, double value): Computes the rank of a value, as a percentage, in the distribution of values in the given column.
- getMap(keyColLabel, valColLabel): Extracts a map from the values in the nominated key and value columns.
- getMap(colLabels, valColLabel): Extracts a map from the values in the nominated key columns (resulting in keys of type MultiKey) and value columns.
- getPercentileValue(long col, int percentile): Calculates the value at the given percentile in the given numeric column.
- long getRowCount()
- DMUtil.FieldValues getRowValues(long row)
- Object getValue(): Convenience getter of values in the matrix.
- getValue(long row, long col)
- Matrix2D indexOfMax()
- Matrix2D indexOfMin()
- boolean isEmpty(): A matrix is empty if it has no rows.
- double kurtosis(label)
- Matrix2D max()
- Matrix2D mean()
- Matrix2D min()
- normalize(long... cols): Normalizes numeric column values, by dividing each value in the column by the total for the column.
- normalizeRowValues(Collection<String> colLabels, int precision): Normalizes the values on a row-by-row basis.
- quantize(numberOfBuckets, newColLabelPostfix, bucketNaming): Convenience method to quantize all numeric columns in one go.
- void quantize(colLabel, numberOfBuckets, newColLabel, bucketNaming): Performs equal width quantization of a scalar distribution in a given column, and maps each value to its nearest lower quantile.
- renameColumns(Map<String, String> old2NewColLabels): Renames the columns in this matrix so that the old column label = key, and the new column label = value in the given map.
- replaceMissingValues(String colLabel, Object val): Replaces nulls in the given column with the given value.
- long search(BigDecimal value, long col): Finds the index of the row containing a given value in a given column, assuming this column is sorted high --> low.
- long[] search(Collection<?> values, long col): Finds all rows for which the value in the given column is contained in the passed-in collection.
- select(long startRow, long endRow, long startColumn, long endColumn): Selects a subset of rows and columns in the matrix, and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows.
- selectColumn(long col): Selects a column in the matrix, and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows.
- selectColumn(String label): Selects a column in the matrix, and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows.
- selectColumns(long startCol, long endCol): Selects a subset of columns in the matrix, and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows.
- selectRow(long row): Selects one row in the matrix, and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows.
- selectRows(long[] rows): Selects a subset of rows in the matrix, and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows.
- selectRows(long startRow, long endRow): Selects a subset of rows in the matrix, and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows.
- selectRows(Collection<? extends Number> rows): Selects a subset of rows in the matrix, and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows.
- void setRowValues(Map<String, Object> val, long row): Modifies the values in the given row, setting those values that match column names in the matrix.
- void setValue(val, row, col): Modifies the value at the given coordinates (row,col).
- sortRows(long col)
- sortRowsReverse(long col)
- Matrix2D std()
- Matrix2D sum()
- ResultMatrix toResultMatrix()
- toResultMatrix(boolean withColumnFormats): Converts this Matrix2D into a ResultMatrix.
- toResultMatrix(Map<String, String> columnMapping): Creates a ResultMatrix from a Matrix2D, selecting only the columns included in the mapping if such a mapping is provided.
- toResultMatrix(Map<String, String> columnMapping, boolean withColumnFormats)
Methods inherited from interface Iterable
forEach, iterator, spliterator
-
Method Details
-
getRowCount
long getRowCount()
- Returns:
- Number of rows in the matrix
-
getColumnCount
long getColumnCount()
- Returns:
- Number of columns in the matrix
-
getColumnLabel
- Parameters:
col
- Index of the column to get the label for. The label is the corresponding projection's alias.
- Returns:
- The column label
-
isEmpty
boolean isEmpty()
A matrix is empty if it has no rows.
- Returns:
- true if the matrix has no rows
-
getColumnLabels
- Returns:
- The column labels from left to right.
-
getColumn
- Parameters:
label
- Label of a column in the matrix
- Returns:
- The index of that column
-
getColumns
- Parameters:
labels
- Labels of a set of columns in the matrix
- Returns:
- The indexes of those columns, in the same order
-
getColumnDataType
- Parameters:
colLabel
- Label of the column to get the DataType for
- Returns:
- DataType of the given column; null if the matrix does not have any metadata defined.
-
selectRow
Selects one row in the matrix and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
- Parameters:
row
- Index of the row to select
- Returns:
- Sub-matrix containing that one row
-
selectRows
Selects a subset of rows in the matrix and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
- Parameters:
startRow
- Index of first row to select, inclusive
endRow
- Index of last row to select, inclusive
- Returns:
- Sub-matrix containing all rows from the start to end row
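The view semantics described above can be illustrated with plain Java collections. This is only a sketch, not the Matrix2D implementation: the class `RowViewSketch` and its `selectRows` helper are hypothetical, and rows are modelled as maps from column label to value. The selection copies references to the row objects, not the rows themselves, so a change made through the selection is visible in the original.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a matrix: a list of rows, each row a label -> value map.
final class RowViewSketch {

    // Selects rows startRow..endRow (both inclusive) without duplicating the row objects.
    static List<Map<String, Object>> selectRows(List<Map<String, Object>> rows, int startRow, int endRow) {
        return new ArrayList<>(rows.subList(startRow, endRow + 1));
    }

    public static void main(String[] args) {
        List<Map<String, Object>> matrix = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("sku", "P" + i);
            row.put("margin", i * 0.1);
            matrix.add(row);
        }
        List<Map<String, Object>> sub = selectRows(matrix, 1, 2);
        sub.get(0).put("margin", 0.99);                    // change made through the selection
        System.out.println(matrix.get(1).get("margin"));   // prints 0.99: the original row changed too
    }
}
```

If independent data is needed, the rows would have to be copied explicitly before being modified.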
-
selectColumn
Selects a column in the matrix and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
- Parameters:
label
- Label of the column to select
- Returns:
- Sub-matrix containing that one column
-
selectColumn
Selects a column in the matrix and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
- Parameters:
col
- Index of the column to select
- Returns:
- Sub-matrix containing that one column
-
selectColumns
Selects a subset of columns in the matrix and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
- Parameters:
startCol
- Index of first column to select, inclusive
endCol
- Index of last column to select, inclusive
- Returns:
- Sub-matrix containing all columns from the start to end column
-
select
Selects a subset of rows and columns in the matrix and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
- Parameters:
startRow
- Index of first row to select, inclusive
endRow
- Index of last row to select, inclusive
startColumn
- Index of first column to select, inclusive
endColumn
- Index of last column to select, inclusive
- Returns:
- Sub-matrix containing all cells from the rectangular area defined by the coordinates (startRow,startColumn) to (endRow,endColumn)
-
selectRows
Selects a subset of rows in the matrix and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
- Parameters:
rows
- Array of row indices to select.
- Returns:
- Sub-matrix containing the rows matching those indices, preserving the order from the original matrix
-
selectRows
Selects a subset of rows in the matrix and returns the result as a new Matrix2D object which, crucially, does not duplicate any of the original rows. In other words, the resulting sub-matrix is a view on the original one: any change to the row data in one matrix is reflected in the other.
- Parameters:
rows
- Collection of row indices to select.
- Returns:
- Sub-matrix containing the rows matching those indices, preserving the order from the original matrix
-
search
Finds the index of the row containing a given value in a given column, assuming this column is sorted high --> low. If there is no exact match, the previous or next row is picked, whichever is closest to the given threshold value.
- Parameters:
value
- Threshold value
col
- Index of the column to search
- Returns:
- Index of the row closest to the threshold value
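A closest-match lookup in a column sorted high --> low can be sketched with a binary search. The class and method names below are hypothetical, the column is modelled as a `double[]`, and the tie-breaking rule (the earlier row wins when both neighbours are equally close) is an assumption, since the description above does not specify it.

```java
// Sketch of a closest-match search in a descending-sorted column (not the Matrix2D implementation).
final class DescendingSearchSketch {

    // Returns the index of the value closest to threshold, or -1 for an empty column.
    static int searchClosest(double[] sortedDesc, double threshold) {
        if (sortedDesc.length == 0) {
            return -1;
        }
        int lo = 0;
        int hi = sortedDesc.length - 1;
        // Find the first index whose value is <= threshold (or the last index if there is none).
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sortedDesc[mid] > threshold) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        // Compare with the previous row, which holds the nearest larger value.
        if (lo > 0 && Math.abs(sortedDesc[lo - 1] - threshold) <= Math.abs(sortedDesc[lo] - threshold)) {
            return lo - 1; // assumption: ties go to the earlier row
        }
        return lo;
    }

    public static void main(String[] args) {
        double[] prices = {10.0, 8.0, 5.0, 1.0};
        System.out.println(searchClosest(prices, 7.0)); // prints 1, because 8.0 is closer to 7.0 than 5.0
    }
}
```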
-
search
Finds all rows for which the value in the given column is contained in the passed-in collection. This would typically be done only on columns containing discrete data.
- Parameters:
values
- Collection of values to match
col
- Index of the column to search
- Returns:
- Array of indices of matching rows
-
max
Matrix2D max()
- Returns:
- A new one-row matrix, containing the maximum (of the population of values) for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
-
min
Matrix2D min()
- Returns:
- A new one-row matrix, containing the minimum (of the population of values) for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
-
sum
Matrix2D sum()
- Returns:
- A new one-row matrix, containing the sum of the values for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
-
mean
Matrix2D mean()
- Returns:
- A new one-row matrix, containing the mean (of the population of values) for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
-
std
Matrix2D std()
- Returns:
- A new one-row matrix, containing the standard deviation (of the population of values) for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
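For reference, a population standard deviation divides the sum of squared deviations by n rather than n - 1. The sketch below shows that formula for a single column modelled as a `double[]`. The class name is hypothetical, and reading "of the population of values" as the divide-by-n variant is an assumption.

```java
// Sketch of the population mean and standard deviation of one numeric column.
final class PopulationStdSketch {

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length; // NaN for an empty column
    }

    static double populationStd(double[] values) {
        double m = mean(values);
        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - m) * (v - m);
        }
        return Math.sqrt(squaredDeviations / values.length); // divide by n, not n - 1
    }

    public static void main(String[] args) {
        double[] column = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println(mean(column));          // prints 5.0
        System.out.println(populationStd(column)); // prints 2.0
    }
}
```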
-
indexOfMin
Matrix2D indexOfMin()
- Returns:
- A new one-row matrix, containing the index of the row that holds the minimum value, for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
-
indexOfMax
Matrix2D indexOfMax()
- Returns:
- A new one-row matrix, containing the index of the row that holds the maximum value, for each numeric column in this matrix, at the same column indices. Non-numeric columns result in NaN.
-
normalize
Normalizes numeric column values, by dividing each value in the column by the total for the column. If the total is 0, the normalized values are also 0.
- Parameters:
cols
- The indices of the columns to normalize. These columns should contain numerical data for the normalization to be useful.
- Returns:
- A reference to this Matrix2D, in which the values have been normalized 'in situ'.
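The rule above (divide by the column total, or use 0 when the total is 0) is sketched below for a single column held in a `double[]`. The class and method names are hypothetical. The array is modified in place, mirroring the 'in situ' behaviour described.

```java
// Sketch of in-place column normalization: each value is divided by the column total.
final class ColumnNormalizeSketch {

    static void normalizeInPlace(double[] column) {
        double total = 0.0;
        for (double v : column) {
            total += v;
        }
        for (int i = 0; i < column.length; i++) {
            // A zero total would otherwise produce NaN or Infinity, so the result is defined as 0.
            column[i] = (total == 0.0) ? 0.0 : column[i] / total;
        }
    }

    public static void main(String[] args) {
        double[] revenue = {1.0, 3.0};
        normalizeInPlace(revenue);
        System.out.println(revenue[0] + ", " + revenue[1]); // prints 0.25, 0.75
    }
}
```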
-
getValue
Object getValue()
Convenience getter of values in the matrix.
- Returns:
- a single object value if the matrix has just one cell
- a map if the matrix has just one row
- a list of object values if the matrix has just one column
- a list of maps if the matrix has multiple rows and columns
- null for a 0-dim matrix (with 0 columns and/or 0 rows)
- else just the matrix itself
-
getValue
- Parameters:
row
- Row index
col
- Column index
- Returns:
- Value of cell at coordinates (row,col)
-
getRowValues
DMUtil.FieldValues getRowValues(long row)
- Parameters:
row
- Row index
- Returns:
- Values on that row, in FieldValues (Map) format, i.e. [colLabel = value,...]
-
getColumnValues
- Parameters:
col
- Column index
- Returns:
- The values in that column in List format, in the same order.
-
setValue
Modifies the value at the given coordinates (row,col).
- Parameters:
val
- Value to set
row
- Row index
col
- Column index
-
setRowValues
Modifies the values in the given row, setting only those values whose keys match column names in the matrix.
- Parameters:
val
- Map of (column name, value) pairs.
row
- Row index
-
getPercentileValue
Calculates the value at the given percentile in the given numeric column.
- Parameters:
col
- Column index
percentile
- Integer number from 0 to 100
- Returns:
- The value such that percentile% of the values (samples) in the column are smaller than or equal to this value
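One common way to satisfy the definition above is the nearest-rank method: sort the column and take the value at rank ceil(percentile / 100 * n). The sketch below uses that method as an assumption; the documentation does not say which percentile definition Matrix2D applies, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Sketch of a nearest-rank percentile on one numeric column (an assumed definition).
final class NearestRankPercentileSketch {

    static double percentileValue(double[] column, int percentile) {
        if (column.length == 0 || percentile < 0 || percentile > 100) {
            throw new IllegalArgumentException("need a non-empty column and a percentile in 0..100");
        }
        double[] sorted = column.clone();
        Arrays.sort(sorted);
        // Integer ceiling of percentile * n / 100, so no floating point rounding is involved.
        long rank = ((long) percentile * sorted.length + 99) / 100;
        return sorted[(int) Math.max(rank, 1) - 1]; // rank 0 (percentile 0) maps to the smallest value
    }

    public static void main(String[] args) {
        double[] column = {50, 15, 40, 20, 35};
        System.out.println(percentileValue(column, 30)); // prints 20.0
    }
}
```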
-
getCumulativeDistributionValue
Determines the total number of values (samples) smaller than or equal to the given value, in the given numeric column.
- Parameters:
col
- Column index
value
- Input to the CDF ('cumulative distribution function')
- Returns:
- CDF evaluated at that value
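Read literally, the description above amounts to counting the samples at or below the input value. The sketch below does exactly that for a column modelled as a list of `BigDecimal`. The names are hypothetical, and whether the actual method returns this raw count or a fraction of the row count is not stated here, so the count is an assumption.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.List;

// Sketch of an empirical cumulative count: number of samples <= value.
final class CumulativeCountSketch {

    static long countAtOrBelow(List<BigDecimal> column, BigDecimal value) {
        long count = 0;
        for (BigDecimal sample : column) {
            // compareTo ignores scale, so 2.50 and 2.5 are treated as equal; nulls are skipped.
            if (sample != null && sample.compareTo(value) <= 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<BigDecimal> column = Arrays.asList(
                new BigDecimal("1"), new BigDecimal("2.0"), new BigDecimal("2.50"), new BigDecimal("4"));
        System.out.println(countAtOrBelow(column, new BigDecimal("2.5"))); // prints 3
    }
}
```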
-
getExcelPercentileValues
Computes percentiles compatible with Excel.
- Parameters:
c
- Column index
- Returns:
- 0 to 100th percentile values in a 101-sized BigDecimal array (i.e. result[75] gives the 75th percentile).
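Excel's inclusive percentile function places the k-th percentile at position k/100 * (n - 1) in the sorted data and interpolates linearly between the two neighbouring samples. The sketch below applies that rule to fill a 101-element array. It is an illustration under the assumption that "compatible with Excel" refers to this inclusive variant; it uses `double` rather than `BigDecimal` for brevity, and the class name is hypothetical.

```java
import java.util.Arrays;

// Sketch of Excel-style (inclusive) percentiles with linear interpolation.
final class ExcelPercentileSketch {

    static double[] percentiles(double[] column) {
        if (column.length == 0) {
            throw new IllegalArgumentException("column must not be empty");
        }
        double[] sorted = column.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        double[] result = new double[101];
        for (int k = 0; k <= 100; k++) {
            double position = k * (n - 1) / 100.0;   // fractional index into the sorted data
            int lower = (int) Math.floor(position);
            if (lower >= n - 1) {
                result[k] = sorted[n - 1];           // 100th percentile, or a single-value column
                continue;
            }
            double fraction = position - lower;
            result[k] = sorted[lower] + fraction * (sorted[lower + 1] - sorted[lower]);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] p = percentiles(new double[] {4, 1, 3, 2});
        System.out.println(p[25] + " " + p[50] + " " + p[100]); // prints 1.75 2.5 4.0
    }
}
```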
-
getExcelPercentRank
double getExcelPercentRank(long c, double value)
Computes the rank of a value, as a percentage, in the distribution of values in the given column. The values in the column do not need to be sorted upfront. The algorithm uses interpolation when there is no exact match with the passed-in value.
- Parameters:
c
- Column index
value
- The value for which to return the rank
- Returns:
- The rank, from 0 to 1, denoting the percentile of the distribution in which the given value falls.
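A percent rank with interpolation can be sketched as follows: for an exact match, the rank is the number of samples strictly below the value divided by n - 1; otherwise the ranks of the two neighbouring samples are interpolated linearly. This mirrors the usual description of Excel's inclusive percent rank, but treating it as the method's exact behaviour is an assumption. Returning NaN for out-of-range values or columns with fewer than two samples is also a choice made for this sketch, and Excel's default rounding to three significant digits is not reproduced.

```java
import java.util.Arrays;

// Sketch of an interpolating percent rank on an unsorted numeric column.
final class ExcelPercentRankSketch {

    static double percentRank(double[] column, double value) {
        double[] s = column.clone();
        Arrays.sort(s); // the input does not need to be sorted upfront
        int n = s.length;
        if (n < 2 || Double.isNaN(value) || value < s[0] || value > s[n - 1]) {
            return Double.NaN; // assumption: undefined outside the observed range
        }
        int below = 0;
        while (s[below] < value) {
            below++; // number of samples strictly below the value
        }
        double upperRank = (double) below / (n - 1);
        if (s[below] == value) {
            return upperRank; // exact match
        }
        // No exact match: interpolate between the ranks of the two neighbouring samples.
        double lowerValue = s[below - 1];
        int belowLower = 0;
        while (s[belowLower] < lowerValue) {
            belowLower++;
        }
        double lowerRank = (double) belowLower / (n - 1);
        double fraction = (value - lowerValue) / (s[below] - lowerValue);
        return lowerRank + fraction * (upperRank - lowerRank);
    }

    public static void main(String[] args) {
        double[] column = {5, 1, 4, 2, 3};
        System.out.println(percentRank(column, 3.0)); // prints 0.5
        System.out.println(percentRank(column, 2.5)); // prints 0.375
    }
}
```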
-
normalizeRowValues
Normalizes the values on a row-by-row basis. Each included value is divided by the sum of the included values on the same row. If this sum is zero, the normalized values will also be zero.
- Parameters:
colLabels
- Labels of the columns to include in the normalization
precision
- Maximum number of decimals in the normalized value
- Returns:
- A reference to this Matrix2D, in which the values have been normalized 'in situ'.
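Row-wise normalization can be sketched on a single row modelled as a map from column label to `BigDecimal`. The names below are hypothetical, the included values are assumed to be non-null, and the rounding mode (`HALF_UP`) is an assumption because the description only mentions the number of decimals.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of normalizing selected columns of one row so that they sum to (roughly) 1.
final class RowNormalizeSketch {

    static void normalizeRow(Map<String, BigDecimal> row, List<String> colLabels, int precision) {
        BigDecimal sum = BigDecimal.ZERO;
        for (String label : colLabels) {
            sum = sum.add(row.get(label));
        }
        for (String label : colLabels) {
            BigDecimal normalized = (sum.signum() == 0)
                    ? BigDecimal.ZERO                                               // zero sum: values become zero
                    : row.get(label).divide(sum, precision, RoundingMode.HALF_UP);  // assumed rounding mode
            row.put(label, normalized);
        }
    }

    public static void main(String[] args) {
        Map<String, BigDecimal> row = new LinkedHashMap<>();
        row.put("north", new BigDecimal("1"));
        row.put("south", new BigDecimal("3"));
        row.put("total", new BigDecimal("100")); // not included, so left untouched
        normalizeRow(row, Arrays.asList("north", "south"), 2);
        System.out.println(row); // prints {north=0.25, south=0.75, total=100}
    }
}
```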
-
kurtosis
- Parameters:
label
- Label of a numeric column
- Returns:
- Kurtosis of the distribution of values in that column
-
sortRows
- Parameters:
col
- Column index
- Returns:
- A reference to this Matrix2D, in which the rows are now ordered so that the values in the given column are sorted from low --> high
-
sortRowsReverse
- Parameters:
col
- Column index
- Returns:
- A reference to this Matrix2D, in which the rows are now ordered so that the values in the given column are sorted from high --> low
-
getMap
Extracts a map from the values in the nominated key and value columns. The assumption is that the value column is functionally dependent on the key column; otherwise, a key-->value pair on one row may overwrite the pair with the same key from a previous row.
- Parameters:
keyColLabel
- Label of the column for which the values are to be used as keys in the map
valColLabel
- Label of the column for which the values are to be used as values in the map
- Returns:
- Key-->value map in the same order as in this Matrix2D
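The overwrite behaviour mentioned above is easy to see with a `LinkedHashMap`, which keeps first-insertion order of the keys while later rows replace the values of repeated keys. The sketch uses hypothetical names and models rows as maps from column label to value. It is not the Matrix2D implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of extracting a key -> value map from two columns, in row order.
final class GetMapSketch {

    static Map<Object, Object> toMap(List<Map<String, Object>> rows, String keyColLabel, String valColLabel) {
        Map<Object, Object> result = new LinkedHashMap<>();
        for (Map<String, Object> row : rows) {
            // If a key repeats, the later row silently replaces the earlier value.
            result.put(row.get(keyColLabel), row.get(valColLabel));
        }
        return result;
    }

    static Map<String, Object> row(String sku, int price) {
        Map<String, Object> r = new LinkedHashMap<>();
        r.put("sku", sku);
        r.put("price", price);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(row("A", 1));
        rows.add(row("B", 2));
        rows.add(row("A", 3)); // "price" is not functionally dependent on "sku" here
        System.out.println(toMap(rows, "sku", "price")); // prints {A=3, B=2}
    }
}
```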
-
getMap
Extracts a map from the values in the nominated key columns (resulting in keys of type MultiKey) and value columns. The assumption is that the value column is functionally dependent on the key columns; otherwise, a key-->value pair on one row may overwrite the pair with the same key from a previous row.
- Parameters:
colLabels
- Labels of the columns for which the values are to be used as MultiKeys in the map
valColLabel
- Label of the column for which the values are to be used as values in the map
- Returns:
- MultiKey-->value map in the same order as in this Matrix2D
-
addMappedColumn
Add a column to this Matrix2D, with the value on each row looked up from the given map, using the mapped column's value as the lookup key.
- Parameters:
colLabel
- Label of the new column
mappedColLabel
- Label of the column holding the lookup keys
mapping
- Map holding the values to map those lookup keys to
- Returns:
- A reference to this matrix, to which the new column is added
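The lookup described above can be sketched on rows modelled as maps from column label to value. The names are hypothetical, and the handling of lookup keys that are absent from the mapping (the new cell becomes null) is an assumption, since the description does not cover that case.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch of adding a column whose values are looked up from a map, keyed by an existing column.
final class MappedColumnSketch {

    static void addMappedColumn(List<Map<String, Object>> rows, String colLabel,
                                String mappedColLabel, Map<?, ?> mapping) {
        for (Map<String, Object> row : rows) {
            Object lookupKey = row.get(mappedColLabel);
            row.put(colLabel, mapping.get(lookupKey)); // assumption: null when the key is not mapped
        }
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (String country : new String[] {"DE", "FR", "XX"}) {
            Map<String, Object> row = new LinkedHashMap<>();
            row.put("country", country);
            rows.add(row);
        }
        Map<String, String> regionByCountry = new HashMap<>();
        regionByCountry.put("DE", "EMEA");
        regionByCountry.put("FR", "EMEA");
        addMappedColumn(rows, "region", "country", regionByCountry);
        System.out.println(rows.get(0)); // prints {country=DE, region=EMEA}
        System.out.println(rows.get(2)); // prints {country=XX, region=null}
    }
}
```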
-
addMappedColumn
Matrix2D addMappedColumn(String colLabel, List<String> fromColLabels, Map<MultiKey<String>, Object> mapping)
Add a column to this Matrix2D, with the value on each row looked up from the given map, using a set of existing columns' values as a composite lookup key. The keys in the map are assumed to have been instantiated by PublicGroovyAPI.multiKey(Object...).
- Parameters:
colLabel
- Label of the new column
fromColLabels
- Labels of the columns holding the composite lookup key values
mapping
- Map holding the values to map those lookup keys to
- Returns:
- A reference to this matrix, to which the new column is added
-
addColumn
Add a column to this Matrix2D, with the value on each row retrieved from the given values array.
- Parameters:
colLabel
- Label of the new column
colValues
- Array of values to populate the new column with
- Returns:
- A reference to this matrix, to which the new column is added
-
addColumn
Add a column to this Matrix2D, without any values.
- Parameters:
colLabel
- Label of the new column
- Returns:
- A reference to this matrix, to which the new column is added
-
renameColumns
Renames the columns in this matrix so that the old column label = key, and the new column label = value in the given map.
- Parameters:
old2NewColLabels
- Old --> new column label mapping
- Returns:
- A reference to this matrix, in which the matching columns have been renamed
-
replaceMissingValues
Replaces nulls in the given column with the given value.
- Parameters:
colLabel
- Label of column to process
val
- Value to set
- Returns:
- A reference to this matrix, in which the null values have been replaced.
-
quantize
Convenience method to quantize all numeric columns in one go.
- Parameters:
numberOfBuckets
-
newColLabelPostfix
- Example: the bins for column 'margin' go into a new column named 'margin_q' if the postfix is set to '_q'.
bucketNaming
-
-
quantize
Performs equal width quantization of a scalar distribution in a given column, and maps each value to its nearest lower quantile. A column is added if none with the given label exists, or when the label is null.
- Parameters:
colLabel
- Label of the column holding the scalar values
numberOfBuckets
- Number of quantiles/buckets to calculate (max 100). Typically 10 deciles or 4 quartiles.
newColLabel
- Label of the new column to add, holding the mapped quantile
bucketNaming
- One of the following values:
- "ABC": quantiles are named "A","B","C" etc.
- "INT": quantiles are numbered 0,2,3,...
- "RANGE": quantile names reflect the range of the bucket (lower bound → upper bound)
- "RANGE_PCT": as "RANGE", but with the bounds formatted as percentages (e.g. 8% vs 0.08)
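Equal width quantization splits the range from the column minimum to the column maximum into numberOfBuckets intervals of the same width, and assigns each value to the interval it falls in. The sketch below returns zero-based bucket indices for a column held in a `double[]`. The names are hypothetical, the zero-based numbering and the clamping of the maximum into the last bucket are assumptions, and the bucket naming schemes listed above are not reproduced.

```java
import java.util.Arrays;

// Sketch of equal width quantization of one numeric column into bucket indices.
final class EqualWidthQuantizeSketch {

    static int[] quantize(double[] column, int numberOfBuckets) {
        if (numberOfBuckets < 1) {
            throw new IllegalArgumentException("numberOfBuckets must be at least 1");
        }
        int[] buckets = new int[column.length];
        if (column.length == 0) {
            return buckets;
        }
        double min = column[0];
        double max = column[0];
        for (double v : column) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double width = (max - min) / numberOfBuckets;
        for (int i = 0; i < column.length; i++) {
            // All values equal: width is 0 and everything falls into bucket 0.
            int index = (width == 0.0) ? 0 : (int) Math.floor((column[i] - min) / width);
            buckets[i] = Math.min(index, numberOfBuckets - 1); // the maximum belongs to the last bucket
        }
        return buckets;
    }

    public static void main(String[] args) {
        double[] margin = {0.0, 2.5, 5.0, 7.5, 10.0};
        System.out.println(Arrays.toString(quantize(margin, 4))); // prints [0, 1, 2, 3, 3]
    }
}
```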
-
toResultMatrix
ResultMatrix toResultMatrix()
toResultMatrix
Converts this Matrix2D into a ResultMatrix.
- Parameters:
withColumnFormats
- boolean flag to add column formats and normalize datetime fields.
- Returns:
- ResultMatrix
-
toResultMatrix
Creates a ResultMatrix from a Matrix2D, selecting only the columns included in the mapping if such a mapping is provided.
- Parameters:
columnMapping
- Mapping ResultMatrix column name --> Matrix2D column label
- Returns:
- ResultMatrix
-
toResultMatrix
-
fromResultMatrix
-
fromListOfMaps
-
fromListOfMaps
static Matrix2DImpl fromListOfMaps(Collection<String> colLabels, Collection<Map<String, Object>> rows)
-