Module pyfx.internal.pandasutil
Utility functions to convert pandas DataFrames to avro and pricefx FieldCollections.
Also contains schema type conversion functions: from pandas to avro, and from pandas to pricefx FieldCollections.
Functions
def to_avro_type(column: pandas.core.series.Series) ‑> Union[str, Dict[str, Any], None]
-
Returns the avro type declaration corresponding to the given pandas Series data.
Returns None if the column's numpy type is not supported.
Args
column
- the column data as pd.Series
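The return type above allows three shapes: a plain string, a dict, or None. The following is a minimal sketch of what such a mapping could look like, not the pyfx implementation. The table `_DTYPE_TO_AVRO` and the helper `sketch_avro_type` are hypothetical, and the sketch takes a dtype name (for example `str(column.dtype)`) instead of a Series so that it runs without pandas.

```python
from typing import Any, Dict, Optional, Union

AvroType = Union[str, Dict[str, Any]]

# Hypothetical mapping from numpy dtype names to avro type declarations.
# The mapping actually used by pyfx is not documented here and may differ.
_DTYPE_TO_AVRO: Dict[str, AvroType] = {
    "bool": "boolean",
    "int32": "int",
    "int64": "long",
    "float32": "float",
    "float64": "double",
    "object": "string",  # assumption: object columns hold text
    # Logical types are declared as a dict on top of a primitive type.
    "datetime64[ns]": {"type": "long", "logicalType": "timestamp-micros"},
}


def sketch_avro_type(dtype_name: str) -> Optional[AvroType]:
    """Return an avro type declaration for a dtype name, or None if unsupported."""
    return _DTYPE_TO_AVRO.get(dtype_name)


print(sketch_avro_type("int64"))           # a primitive type, given as a string
print(sketch_avro_type("datetime64[ns]"))  # a logical type, given as a dict
print(sketch_avro_type("complex128"))      # not in the table, so None
```

Returning None instead of raising leaves the decision about unsupported columns to the caller.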
def to_field_collection_spec(dataframe: pandas.core.frame.DataFrame, dimensions: Optional[List[str]] = None, on_unsupported_type: str = 'error', inplace: bool = False, column_labels: Optional[Dict[str, str]] = None) ‑> Tuple[List[Dict[str, Any]], pandas.core.frame.DataFrame]
-
Returns the pricefx field collection spec and the corresponding DataFrame.
Args
dataframe
- the dataframe to push
dimensions
- columns that should be used as dimension (optional, default: None)
on_unsupported_type
- defines the behavior when encountering a dataframe column that contains an incompatible type (optional, default: "error"):
- "error" (default value) raises an error
- "drop" will drop the column
- "coerce" will try to convert this column to strings
inplace
- performs the required data preparation in place (mutates the source dataframe; optional, default: False)
column_labels
- a dictionary mapping each dataframe column name (including the index name) to its desired label (optional, default: None).
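The three on_unsupported_type policies can be illustrated with a small sketch. This is not the pyfx code: `sketch_prepare`, the `SUPPORTED` tuple, and the choice of TypeError are assumptions, and columns are modelled as a plain dict of lists so the example runs without pandas. Like the default inplace=False, it returns a new structure and leaves the input untouched.

```python
from typing import Any, Dict, List

# Assumption: the value types treated as "supported" in this sketch.
SUPPORTED = (bool, int, float, str)


def sketch_prepare(
    columns: Dict[str, List[Any]], on_unsupported_type: str = "error"
) -> Dict[str, List[Any]]:
    """Apply an 'error' / 'drop' / 'coerce' policy to unsupported columns."""
    if on_unsupported_type not in ("error", "drop", "coerce"):
        raise ValueError(f"unknown policy: {on_unsupported_type!r}")
    prepared: Dict[str, List[Any]] = {}
    for name, values in columns.items():
        if all(isinstance(v, SUPPORTED) for v in values):
            prepared[name] = list(values)  # copy, so the input is not mutated
        elif on_unsupported_type == "drop":
            continue  # leave the column out of the result
        elif on_unsupported_type == "coerce":
            prepared[name] = [str(v) for v in values]  # convert to strings
        else:
            raise TypeError(f"column {name!r} contains an unsupported type")
    return prepared


data = {"sku": ["A", "B"], "price": [1.5, 2.0], "ratio": [1 + 2j, 3 + 0j]}
print(list(sketch_prepare(data, "drop")))       # ['sku', 'price']
print(sketch_prepare(data, "coerce")["ratio"])  # ['(1+2j)', '(3+0j)']
```

With the default "error" policy the same input raises, which makes incompatible columns visible before anything is pushed.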
def to_pricefx_type(column: pandas.core.series.Series) ‑> Optional[str]
-
Returns the pricefx column type corresponding to the given pandas Series data.
Returns None if the column's numpy type is not supported.
Args
column
- the column data as pd.Series
Classes
class FieldSpecs
-
Structure for keeping user specifications for exported table fields.
Class variables
var MeasureTypes
-
Enum of the allowed measure types.
var base_type
var extended_type
Instance variables
var field_type_to_dtype : Dict[str, pyfx.internal.pandasutil.FieldSpecs._DtypeAndCheck]
-
Returns a dict with the check and retype for each FieldType.
Methods
def set_col_specs(self, col_name: str, name: Optional[str] = None, label: Optional[str] = None, type: Optional[str] = None, key: Optional[bool] = False, distribution_key: Optional[bool] = False, dimension: Optional[bool] = False, format: Optional[str] = None, measure_type: Optional[str] = None) ‑> None
-
Sets the specs for a single column.
Args
col_name
- current name of the column as it appears in the dataframe
name
- name of the column in the exported table; if no value is provided, col_name is used (optional, default: None)
label
- column label in the exported table; if no value is provided, name is used (optional, default: None)
type
- FieldType of the column in the exported table; if no value is provided, the type is assigned automatically (optional, default: None)
key
- boolean flag for key columns (optional, default: False)
distribution_key
- boolean flag for distribution key (optional, default: False)
dimension
- boolean flag for dimension (optional, default: False)
format
- sets the formatting of the values in the column (optional, default: None)
measure_type
- the column's measure type (optional, default: None)
Raises
ValueError
- If type is set to an unknown FieldType
ValueError
- If measure_type is set to an unsupported value
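The defaulting chain (name falls back to col_name, label falls back to name) and the two ValueError cases can be sketched as follows. This is an illustration under stated assumptions, not the FieldSpecs implementation: `sketch_col_spec` is hypothetical, and the members of `ALLOWED_TYPES` and `ALLOWED_MEASURE_TYPES` are placeholders, since this reference does not list the real FieldType or MeasureTypes values.

```python
from typing import Any, Dict, Optional

# Placeholder values only; the real FieldType and MeasureTypes members
# are not listed in this reference.
ALLOWED_TYPES = {"TEXT", "NUMBER", "DATE"}
ALLOWED_MEASURE_TYPES = {"MEASURE_A", "MEASURE_B"}


def sketch_col_spec(
    col_name: str,
    name: Optional[str] = None,
    label: Optional[str] = None,
    type: Optional[str] = None,  # parameter name mirrors the documented signature
    measure_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Validate the spec and resolve the name and label defaults."""
    if type is not None and type not in ALLOWED_TYPES:
        raise ValueError(f"unknown FieldType: {type!r}")
    if measure_type is not None and measure_type not in ALLOWED_MEASURE_TYPES:
        raise ValueError(f"unsupported measure type: {measure_type!r}")
    resolved_name = name if name is not None else col_name
    resolved_label = label if label is not None else resolved_name
    return {
        "col_name": col_name,
        "name": resolved_name,
        "label": resolved_label,
        "type": type,  # None stands for "assign automatically later"
        "measure_type": measure_type,
    }


print(sketch_col_spec("net_price", label="Net price"))
print(sketch_col_spec("net_price", name="NetPrice")["label"])  # NetPrice
```

Validating before resolving the defaults means a rejected call leaves no partially built spec behind.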