pymrio.extension_convert

pymrio.extension_convert(*extensions, df_map, new_extension_name, extension_col_name='extension', agg_func='sum', drop_not_bridged_index=True, unit_column_orig='unit_orig', unit_column_new='unit_new', ignore_columns=None, reindex=None)

Apply the convert function to a list of extensions.

Internally, this calls the Extension.convert function for each of the extensions.

Parameters

extensions (list of extensions) – Extensions to convert. All extensions passed must have an index structure (index names) as described in df_map.
df_map –
The DataFrame with the mapping of the old to the new classification. This requires a specific structure:
- Constraining data (e.g. stressors, regions, sectors) can be either in the index or in the columns of df_orig. The constraining columns in df_map need to have the same names as the corresponding named index or column in df_orig. The algorithm searches for matching data in df_orig based on all constraining columns in df_map.
- Bridge columns are columns with ‘__’ in the name. These are used to map (bridge) some/all of the constraining columns in df_orig to the new classification.
- One column “factor”, which gives the multiplication factor for the conversion. If it is missing, it is set to 1.
This is better explained with an example.

Assuming an original dataframe df_orig with index names ‘stressor’ and ‘compartment’ and column name ‘region’, the characterizing dataframe could have the following structure (column names):
- stressor: original index name
- compartment: original index name
- region: original column name
- factor: the factor for multiplication/characterization
  If no factor is given, the factor is assumed to be 1. This can be used to simplify renaming/aggregation mappings.
- impact__stressor: the new index name,
  replacing the previous index name “stressor”. Thus here “stressor” will be renamed to “impact”, and the row index will be renamed by the entries here.
- compartment__compartment: the new compartment,
  replacing the original compartment. The index name is not renamed here, but the row index entries are still renamed as given here.
The columns with __ are called bridge columns; they are used to match the original index. The new dataframe will have index names based on the first part of the bridge column, in the order in which the bridge columns are given in the mapping dataframe.
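The structure described above can be sketched with plain pandas. The example below builds a hypothetical df_orig and a matching df_map, then reproduces the matching, multiplication and aggregation steps by hand. All data, including the factor values, are made up, constraint entries are matched literally, and the code illustrates the idea only; it is not pymrio's implementation.

```python
import pandas as pd

# Hypothetical original data: index levels 'stressor' and 'compartment',
# column level 'region'. All values are invented for illustration.
df_orig = pd.DataFrame(
    data=[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    index=pd.MultiIndex.from_tuples(
        [("CO2", "air"), ("CH4", "air"), ("CH4", "water")],
        names=["stressor", "compartment"],
    ),
    columns=pd.Index(["reg1", "reg2"], name="region"),
)

# Mapping with constraining columns, a factor and two bridge columns.
rows = []
for region in ["reg1", "reg2"]:
    rows += [
        ["CO2", "air", region, 1.0, "GWP100", "total"],
        ["CH4", "air", region, 25.0, "GWP100", "total"],
        ["CH4", "water", region, 25.0, "GWP100", "total"],
    ]
df_map = pd.DataFrame(
    rows,
    columns=["stressor", "compartment", "region", "factor",
             "impact__stressor", "compartment__compartment"],
)

# Bridge columns contain '__'; the part before '__' is the new index name.
bridge_cols = [c for c in df_map.columns if "__" in c]
constraint_cols = [c for c in df_map.columns if c not in bridge_cols + ["factor"]]
new_names = [c.split("__")[0] for c in bridge_cols]

# Hand-rolled conversion: long format, match on constraints, apply factor.
long_df = df_orig.reset_index().melt(
    id_vars=["stressor", "compartment"], var_name="region", value_name="value"
)
merged = long_df.merge(df_map, on=constraint_cols, how="inner")
merged["value"] = merged["value"] * merged["factor"]

# Aggregate (sum) over the new classification and restore the wide layout.
result = (
    merged[bridge_cols + ["region", "value"]]
    .rename(columns=dict(zip(bridge_cols, new_names)))
    .groupby(new_names + ["region"])["value"]
    .sum()
    .unstack("region")
)
print(result)
```

The resulting index names are ‘impact’ and ‘compartment’, in the order in which the bridge columns appear in df_map.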

new_extension_name: str

The name of the new extension returned.

extension_col_name: str, optional

Name of the column specifying the extension name in df_map. The entry in df_map here can either be the name returned by Extension.name or the name of the Extension instance. Default: ‘extension’
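As a rough sketch of how the extension column ties mapping rows to extensions, the example below builds a df_map that covers two extensions and splits it by the entries in the ‘extension’ column. The extension names (“emissions”, “water”), stressors, factors and units are hypothetical, and the split only illustrates the concept; it is not taken from pymrio's code. The call at the end is left as a comment because it needs real Extension objects and has not been run here.

```python
import pandas as pd

# Hypothetical mapping covering two extensions named "emissions" and "water".
df_map = pd.DataFrame(
    columns=["extension", "stressor", "impact__stressor",
             "factor", "unit_orig", "unit_new"],
    data=[
        ["emissions", "CO2", "GWP100", 1.0, "kg", "kg CO2eq"],
        ["emissions", "CH4", "GWP100", 25.0, "kg", "kg CO2eq"],
        ["water", "blue water", "Water use", 1.0, "m3", "m3"],
    ],
)

# Mapping rows that belong to each extension, keyed by the 'extension' entry.
rows_per_extension = {
    name: part.drop(columns="extension")
    for name, part in df_map.groupby("extension")
}

for name, part in rows_per_extension.items():
    print(name, list(part["stressor"]))

# Untested sketch of the call; emissions_ext and water_ext stand in for
# pymrio Extension objects whose names match the 'extension' entries:
# new_ext = pymrio.extension_convert(
#     emissions_ext, water_ext,
#     df_map=df_map,
#     new_extension_name="impacts",
# )
```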

agg_func: str or func

The aggregation function to use when multiple entries match the same new index entry (summation by default).
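To illustrate what “multiple matchings” means, the sketch below aggregates already matched and multiplied values that map to the same new entry, once with a string and once with a function. It assumes that agg_func behaves like the argument of a pandas groupby aggregation; the data are made up.

```python
import pandas as pd

# Hypothetical matched values: two original rows map to "GWP100".
matched = pd.DataFrame(
    {
        "impact": ["GWP100", "GWP100", "Water use"],
        "value": [1.0, 75.0, 4.0],
    }
)

grouped = matched.groupby("impact")["value"]

# Aggregation given as a string (the default behaviour is a sum).
summed = grouped.agg("sum")

# Aggregation given as a function, here the largest matched value.
largest = grouped.agg(lambda s: s.max())

print(summed)
print(largest)
```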

drop_not_bridged_index: bool, optional

What to do with index levels in df_orig that do not appear in the bridge columns. If True, aggregate across them and drop them; if False, pass them through to the result.

Note: Only index levels will be dropped, not columns.

If some index levels need to be dropped and others kept, make a bridge column for each level to be dropped and map all of its entries to the same name. Then drop this index level after the conversion.
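The workaround can be sketched in plain pandas. Here the level ‘compartment’ should disappear while ‘source’ is kept: all compartments are mapped to one name (a stand-in for what a bridge column would do), the values are aggregated, and the now constant level is dropped. The level names and data are hypothetical.

```python
import pandas as pd

# Hypothetical data with three index levels.
df = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0, 4.0]},
    index=pd.MultiIndex.from_tuples(
        [
            ("CO2", "air", "A"),
            ("CO2", "water", "A"),
            ("CO2", "air", "B"),
            ("CO2", "water", "B"),
        ],
        names=["stressor", "compartment", "source"],
    ),
)

# Stand-in for the conversion: map every compartment to the same name
# and aggregate, as a bridge column with a single target name would.
converted = (
    df.rename(index=lambda _: "all", level="compartment")
    .groupby(level=["stressor", "compartment", "source"])
    .sum()
)

# Drop the level that now carries no information.
result = converted.droplevel("compartment")
print(result)
```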

unit_column_orig: str, optional

Name of the column in df_map with the original unit. This will be used to check if the unit matches the original unit in the extension. Default is “unit_orig”; if None, no check is performed.

unit_column_new: str, optional

Name of the column in df_map with the new unit to be assigned to the new extension. Default is “unit_new”; if None, the same unit as in df_orig is used.

ignore_columns: list, optional

List of column names in df_map which should be ignored. These could be columns with additional information, etc. The unit columns given in unit_column_orig and unit_column_new are ignored by default.

reindex: str, None or collection

Wrapper for pandas’ reindex method to control return order.
- If None: sorts the index alphabetically.
- If str: uses the unique value order from the specified bridge column as the index order.
- For other types (e.g., collections): passes the value directly to pandas’ reindex method.
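The three cases can be illustrated with pandas on a small, hypothetical converted result with a single index level. This sketch shows the resulting row order for each option as described above; it does not reproduce pymrio's own code path.

```python
import pandas as pd

# Hypothetical mapping; "GWP100" appears first in the bridge column.
df_map = pd.DataFrame(
    {
        "stressor": ["CO2", "SO2", "CH4"],
        "impact__stressor": ["GWP100", "Acidification", "GWP100"],
    }
)

# Hypothetical converted result with one index level named 'impact'.
converted = pd.DataFrame(
    {"reg1": [2.0, 76.0]},
    index=pd.Index(["Acidification", "GWP100"], name="impact"),
)

# reindex=None: alphabetical order of the index.
alphabetical = converted.sort_index()

# reindex="impact__stressor": order of first appearance in the bridge column.
bridge_order = df_map["impact__stressor"].unique()
by_bridge = converted.reindex(bridge_order)

# reindex=<collection>: the collection is used as the target index.
explicit = converted.reindex(["GWP100"])

print(list(alphabetical.index), list(by_bridge.index), list(explicit.index))
```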