Characterization of stressors
Stressor characterization allows the calculation of environmental and social impacts of economic activities. It transforms raw stressor data into meaningful impact indicators.
Pymrio implements an innovative string-matching approach for characterization. This method matches stressors in the characterization table (provided in long format) with available stressors in the MRIO system. This brings the following benefits:
Ensures stressor correspondence across the MRIO system and characterization table
Performs automatic unit verification
Works regardless of entry order in the characterization table
Handles characterization tables that include factors for stressors not present in the satellite account
Efficiently manages region and sector-specific characterization factors
Enables characterization across different extensions
This contrasts with traditional approaches that rely on matrix multiplication between stressor and characterization matrices, requiring strict 1:1 correspondence between matrix dimensions and precise ordering of entries.
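The matching idea can be illustrated with a minimal sketch in plain Python. This is a conceptual illustration only and not Pymrio's implementation: the stressor totals and the helper name `characterize_by_matching` are made up for the example, while the factors follow the style of the test table used further down.

```python
# Conceptual sketch only: NOT Pymrio's implementation.
# Stressor totals keyed by (stressor, compartment); the numbers are made up.
stressor_totals = {
    ("emission_type1", "air"): 100.0,
    ("emission_type2", "water"): 50.0,
}

# Characterization table in long format; the row order is deliberately mixed.
factors = [
    {"stressor": "emission_type2", "compartment": "water",
     "impact": "total emissions", "factor": 1.0},
    {"stressor": "emission_type1", "compartment": "air",
     "impact": "air water impact", "factor": 0.002},
    {"stressor": "emission_type3", "compartment": "land",
     "impact": "total emissions", "factor": 1.0},
    {"stressor": "emission_type1", "compartment": "air",
     "impact": "total emissions", "factor": 1.0},
    {"stressor": "emission_type2", "compartment": "water",
     "impact": "air water impact", "factor": 0.001},
]


def characterize_by_matching(totals, table):
    """Sum factor * stressor amount per impact, matching rows by name."""
    impacts = {}
    for row in table:
        key = (row["stressor"], row["compartment"])
        # A stressor that is missing from the account contributes nothing.
        amount = totals.get(key, 0.0)
        impacts[row["impact"]] = impacts.get(row["impact"], 0.0) + row["factor"] * amount
    return impacts


impacts = characterize_by_matching(stressor_totals, factors)
# Reversing the table shows that the entry order does not matter.
impacts_reversed = characterize_by_matching(stressor_totals, factors[::-1])
print(impacts)
```

Because rows are matched by name, the factor for emission_type3 (absent from the totals) is simply skipped, and no reordering of the table is needed before the calculation.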
The characterization functionality is available both as an extension object method and as a top-level function accepting complete MRIO objects or extension collections.
In the following, we give some examples of how to use both methods, starting with a simple example and then advancing to more complex cases with region-specific factors.
Basic Example
For this example we use the test MRIO included in Pymrio. We also need the Pandas library for loading the characterization table and pathlib for some folder manipulation.
[ ]:
from pathlib import Path
[ ]:
import pandas as pd
[1]:
import pymrio
from pymrio.core.constants import PYMRIO_PATH # noqa
To load the test MRIO we use:
[2]:
io = pymrio.load_test()
and the characterization table with some dummy factors can be loaded by
[3]:
charact_table = pd.read_csv(
(PYMRIO_PATH["test_mrio"] / Path("concordance") / "emissions_charact.tsv"),
sep="\t",
)
charact_table
[3]:
| | stressor | compartment | impact | factor | impact_unit | stressor_unit |
|---|---|---|---|---|---|---|
| 0 | emission_type1 | air | air water impact | 0.002 | t | kg |
| 1 | emission_type2 | water | air water impact | 0.001 | t | kg |
| 2 | emission_type1 | air | total emissions | 1.000 | kg | kg |
| 3 | emission_type2 | water | total emissions | 1.000 | kg | kg |
| 4 | emission_type3 | land | total emissions | 1.000 | kg | kg |
| 5 | emission_type1 | air | total air emissions | 0.001 | t | kg |
This table contains the columns ‘stressor’ and ‘compartment’ which correspond to the index names of the test_mrio emission satellite accounts:
[4]:
io.emissions.F
[4]:
| region | reg1 | reg2 | ... | reg5 | reg6 | |||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sector | food | mining | manufactoring | electricity | construction | trade | transport | other | food | mining | ... | transport | other | food | mining | manufactoring | electricity | construction | trade | transport | other | |
| stressor | compartment | |||||||||||||||||||||
| emission_type1 | air | 1848064.80 | 986448.090 | 23613787.00 | 28139100.00 | 2584141.80 | 4132656.3 | 21766987.0 | 7842090.6 | 1697937.30 | 347378.150 | ... | 42299319 | 10773826.0 | 15777996.0 | 6420955.5 | 113172450.0 | 56022534.0 | 4861838.5 | 18195621 | 47046542.0 | 21632868 |
| emission_type2 | water | 139250.47 | 22343.295 | 763569.18 | 273981.55 | 317396.51 | 1254477.8 | 1012999.1 | 2449178.0 | 204835.44 | 29463.944 | ... | 4199841 | 7191006.3 | 4826108.1 | 1865625.1 | 12700193.0 | 753213.7 | 2699288.3 | 13892313 | 8765784.3 | 16782553 |
2 rows × 48 columns
These index names and column names must be identical so that the characterization factors can be matched to the stressors.
The names of the other columns can be passed to the characterization method. By default the method assumes the following column names:
impact: name of the characterization/impact
factor: the numerical (float) multiplication value for a specific stressor to derive the impact/characterized account
impact_unit: the unit of the calculated characterization/impact
stressor_unit: the unit of the stressor in the extension
Alternative names can be passed through the parameters characterized_name_column, characterization_factors_column, characterized_unit_column and orig_unit_column.
To calculate the characterization we use
[5]:
char_emis = io.emissions.characterize(charact_table, name="impacts")
The parameter name is optional. If omitted, the name is set to extension_name + _characterized. If the passed name starts with an underscore, the returned name is the name of the original extension concatenated with the passed name.
The return value is a named tuple with validation and extension as attributes.
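The naming rule described above can be summarized in a small sketch. The helper `resolve_characterized_name` is hypothetical and only mirrors the rule as stated in the text; it is not part of Pymrio.

```python
def resolve_characterized_name(extension_name, name=None):
    """Hypothetical helper mirroring the naming rule described in the text."""
    if name is None:
        # No name passed: fall back to the default suffix.
        return extension_name + "_characterized"
    if name.startswith("_"):
        # Leading underscore: treat the passed name as a suffix.
        return extension_name + name
    # Otherwise the passed name is used as is.
    return name


print(resolve_characterized_name("emissions"))
print(resolve_characterized_name("emissions", "_impacts"))
print(resolve_characterized_name("emissions", "impacts"))
```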
[6]:
print(char_emis.extension)
Extension impacts with parameters: name, F, F_Y, unit
[7]:
char_emis.validation
[7]:
| | stressor | compartment | impact | factor | impact_unit | stressor_unit | error_unit_impact | error_unit_stressor | error_missing_stressor |
|---|---|---|---|---|---|---|---|---|---|
| 0 | emission_type1 | air | air water impact | 0.002 | t | kg | False | False | False |
| 1 | emission_type1 | air | total emissions | 1.000 | kg | kg | False | False | False |
| 2 | emission_type1 | air | total air emissions | 0.001 | t | kg | False | False | False |
| 3 | emission_type2 | water | air water impact | 0.001 | t | kg | False | False | False |
| 4 | emission_type2 | water | total emissions | 1.000 | kg | kg | False | False | False |
| 5 | emission_type3 | land | total emissions | 1.000 | kg | kg | False | False | True |
Checking the validation table is a recommended step that ensures accuracy and completeness before impact calculations. The validation process helps identify potential issues such as:
Missing characterization factors for specific region/sector/stressor combinations
Spelling mistakes or inconsistencies in stressor, sector, or region names
Unit mismatches between the MRIO system and characterization factors
Incomplete coverage that could affect impact assessment results
By systematically checking these elements, users can avoid calculation errors and ensure their impact assessment captures all relevant environmental and social dimensions with the proper characterization factors.
In the current case, the charact_table contains a characterization called ‘total emissions’, for which the calculation requires a stressor not present in the satellite account. This is indicated in the validation table in the error_missing_stressor column. The calculation can proceed, but for all impacts containing the stressor it is assumed to be 0.
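For larger tables it can help to show only the rows where at least one error flag is set. The following is a minimal sketch with pandas; the DataFrame is a hand-built stand-in for char_emis.validation (with a subset of its columns, values copied from the output above), so that the example runs without Pymrio.

```python
import pandas as pd

# Hand-built stand-in for char_emis.validation (subset of columns).
validation = pd.DataFrame(
    {
        "stressor": ["emission_type1"] * 3 + ["emission_type2"] * 2 + ["emission_type3"],
        "compartment": ["air"] * 3 + ["water"] * 2 + ["land"],
        "impact": [
            "air water impact",
            "total emissions",
            "total air emissions",
            "air water impact",
            "total emissions",
            "total emissions",
        ],
        "error_unit_impact": [False] * 6,
        "error_unit_stressor": [False] * 6,
        "error_missing_stressor": [False] * 5 + [True],
    }
)

# Collect all columns whose name starts with "error_" ...
error_columns = [col for col in validation.columns if col.startswith("error_")]
# ... and keep only the rows where any of these flags is True.
flagged = validation[validation[error_columns].any(axis=1)]
print(flagged[["stressor", "compartment", "impact"]])
```

With the real object, the same filter can be applied to char_emis.validation directly.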
It is possible to run only the validation before doing any calculation with
[8]:
only_val = io.emissions.characterize(charact_table, only_validation=True)
only_val.validation
[8]:
| | stressor | compartment | impact | factor | impact_unit | stressor_unit | error_unit_impact | error_unit_stressor | error_missing_stressor |
|---|---|---|---|---|---|---|---|---|---|
| 0 | emission_type1 | air | air water impact | 0.002 | t | kg | False | False | False |
| 1 | emission_type1 | air | total emissions | 1.000 | kg | kg | False | False | False |
| 2 | emission_type1 | air | total air emissions | 0.001 | t | kg | False | False | False |
| 3 | emission_type2 | water | air water impact | 0.001 | t | kg | False | False | False |
| 4 | emission_type2 | water | total emissions | 1.000 | kg | kg | False | False | False |
| 5 | emission_type3 | land | total emissions | 1.000 | kg | kg | False | False | True |
In that case the extension attribute is set to None. The same applies if a characterization needs to be aborted due to unit inconsistencies.
If everything works as expected, the extension can be attached to the MRIO object
[9]:
io.impacts = char_emis.extension
and used for subsequent calculations:
[10]:
io.calc_all()
io.impacts.D_cba
[10]:
| region | reg1 | reg2 | ... | reg5 | reg6 | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sector | food | mining | manufactoring | electricity | construction | trade | transport | other | food | mining | ... | transport | other | food | mining | manufactoring | electricity | construction | trade | transport | other |
| impact | |||||||||||||||||||||
| air water impact | 4.354677e+03 | 384.125264 | 2.116984e+05 | 2.391231e+04 | 7.032641e+03 | 8.548388e+03 | 2.200050e+04 | 4.012355e+04 | 3.800328e+03 | 42.024811 | ... | 8.843384e+04 | 3.008044e+04 | 3.476528e+04 | 3.227857e+03 | 1.535829e+05 | 7.423616e+04 | 4.580343e+03 | 1.393218e+05 | 1.049447e+05 | 1.183949e+05 |
| total air emissions | 2.056183e+03 | 179.423536 | 9.749300e+04 | 1.188759e+04 | 3.342906e+03 | 3.885884e+03 | 1.075027e+04 | 1.582152e+04 | 1.793338e+03 | 19.145605 | ... | 4.209505e+04 | 1.138661e+04 | 1.517235e+04 | 1.345318e+03 | 7.145075e+04 | 3.683167e+04 | 1.836696e+03 | 4.241568e+04 | 4.805409e+04 | 3.602298e+04 |
| total emissions | 2.298494e+06 | 204701.727979 | 1.142054e+08 | 1.202472e+07 | 3.689735e+06 | 4.662504e+06 | 1.125023e+07 | 2.430203e+07 | 2.006991e+06 | 22879.206385 | ... | 4.633879e+07 | 1.869382e+07 | 1.959293e+07 | 1.882540e+06 | 8.213219e+07 | 3.740449e+07 | 2.743647e+06 | 9.690613e+07 | 5.689057e+07 | 8.237196e+07 |
3 rows × 48 columns
Note that units are checked against the unit specification of the extension. Thus, any mismatch of units will abort the calculation. The validation table helps to identify the issue.
[11]:
charact_table.loc[charact_table.stressor == "emission_type1", "stressor_unit"] = "t"
[12]:
ret_error = io.emissions.characterize(charact_table)
/home/konstans/proj/pymrio/pymrio/core/mriosystem.py:1869: UserWarning: Unit errors/inconsistencies between passed units and extension units - check validation
warnings.warn("Unit errors/inconsistencies between passed units and extension units - check validation")
[13]:
ret_error.extension
[14]:
ret_error.validation
[14]:
| | stressor | compartment | impact | factor | impact_unit | stressor_unit | error_unit_impact | error_unit_stressor | error_missing_stressor |
|---|---|---|---|---|---|---|---|---|---|
| 0 | emission_type1 | air | air water impact | 0.002 | t | t | False | True | False |
| 1 | emission_type1 | air | total emissions | 1.000 | kg | t | False | True | False |
| 2 | emission_type1 | air | total air emissions | 0.001 | t | t | False | True | False |
| 3 | emission_type2 | water | air water impact | 0.001 | t | kg | False | False | False |
| 4 | emission_type2 | water | total emissions | 1.000 | kg | kg | False | False | False |
| 5 | emission_type3 | land | total emissions | 1.000 | kg | kg | False | False | True |
The error_unit_stressor column indicates the stressors with the unit mismatch.
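The logic behind the stressor unit check can be sketched in a few lines of plain Python. This is an illustration of the idea under the assumption that all stressors of the extension are stored in kg (as in the test system); it is not Pymrio's implementation.

```python
# Units as stored in the extension (assumption for this sketch: all in kg).
extension_units = {
    ("emission_type1", "air"): "kg",
    ("emission_type2", "water"): "kg",
}

# (stressor, compartment, stressor_unit) as given in the modified table.
table_rows = [
    ("emission_type1", "air", "t"),
    ("emission_type2", "water", "kg"),
    ("emission_type3", "land", "kg"),
]

report = []
for stressor, compartment, unit in table_rows:
    expected = extension_units.get((stressor, compartment))
    report.append(
        {
            "stressor": stressor,
            # The stressor is unknown to the extension.
            "error_missing_stressor": expected is None,
            # Units are only compared when the stressor exists in the extension.
            "error_unit_stressor": expected is not None and unit != expected,
        }
    )

for row in report:
    print(row)
```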
Regional specific characterization factors
Here we use a table of regionally specific characterization factors. The factors in this table are initially the same as in the basic example; we will modify them after loading. We will also investigate cases with missing data or conflicting units. The same principles can be used for sector specific characterization factors.
We use the same test MRIO system as before:
[15]:
io = pymrio.load_test()
with the regional specific characterization factors from
[16]:
charact_table_reg = pd.read_csv(
(PYMRIO_PATH["test_mrio"] / Path("concordance") / "emissions_charact_reg_spec.tsv"),
sep="\t",
)
charact_table_reg
[16]:
| | region | stressor | compartment | impact | factor | impact_unit | stressor_unit |
|---|---|---|---|---|---|---|---|
| 0 | reg1 | emission_type1 | air | air water impact | 0.002 | t | kg |
| 1 | reg1 | emission_type2 | water | air water impact | 0.001 | t | kg |
| 2 | reg1 | emission_type1 | air | total emissions | 1.000 | kg | kg |
| 3 | reg1 | emission_type2 | water | total emissions | 1.000 | kg | kg |
| 4 | reg1 | emission_type3 | land | total emissions | 1.000 | kg | kg |
| 5 | reg1 | emission_type1 | air | total air emissions | 0.001 | t | kg |
| 6 | reg2 | emission_type1 | air | air water impact | 0.002 | t | kg |
| 7 | reg2 | emission_type2 | water | air water impact | 0.001 | t | kg |
| 8 | reg2 | emission_type1 | air | total emissions | 1.000 | kg | kg |
| 9 | reg2 | emission_type2 | water | total emissions | 1.000 | kg | kg |
| 10 | reg2 | emission_type3 | land | total emissions | 1.000 | kg | kg |
| 11 | reg2 | emission_type1 | air | total air emissions | 0.001 | t | kg |
| 12 | reg3 | emission_type1 | air | air water impact | 0.002 | t | kg |
| 13 | reg3 | emission_type2 | water | air water impact | 0.001 | t | kg |
| 14 | reg3 | emission_type1 | air | total emissions | 1.000 | kg | kg |
| 15 | reg3 | emission_type2 | water | total emissions | 1.000 | kg | kg |
| 16 | reg3 | emission_type3 | land | total emissions | 1.000 | kg | kg |
| 17 | reg3 | emission_type1 | air | total air emissions | 0.001 | t | kg |
| 18 | reg4 | emission_type1 | air | air water impact | 0.002 | t | kg |
| 19 | reg4 | emission_type2 | water | air water impact | 0.001 | t | kg |
| 20 | reg4 | emission_type1 | air | total emissions | 1.000 | kg | kg |
| 21 | reg4 | emission_type2 | water | total emissions | 1.000 | kg | kg |
| 22 | reg4 | emission_type3 | land | total emissions | 1.000 | kg | kg |
| 23 | reg4 | emission_type1 | air | total air emissions | 0.001 | t | kg |
| 24 | reg5 | emission_type1 | air | air water impact | 0.002 | t | kg |
| 25 | reg5 | emission_type2 | water | air water impact | 0.001 | t | kg |
| 26 | reg5 | emission_type1 | air | total emissions | 1.000 | kg | kg |
| 27 | reg5 | emission_type2 | water | total emissions | 1.000 | kg | kg |
| 28 | reg5 | emission_type3 | land | total emissions | 1.000 | kg | kg |
| 29 | reg5 | emission_type1 | air | total air emissions | 0.001 | t | kg |
| 30 | reg6 | emission_type1 | air | air water impact | 0.002 | t | kg |
| 31 | reg6 | emission_type2 | water | air water impact | 0.001 | t | kg |
| 32 | reg6 | emission_type1 | air | total emissions | 1.000 | kg | kg |
| 33 | reg6 | emission_type2 | water | total emissions | 1.000 | kg | kg |
| 34 | reg6 | emission_type3 | land | total emissions | 1.000 | kg | kg |
| 35 | reg6 | emission_type1 | air | total air emissions | 0.001 | t | kg |
Compared with the previous table (charact_table), this table has an additional column region, which holds the region-specific data. For now the factors are the same as before, so the characterization is run in the same way:
[17]:
char_reg = io.emissions.characterize(charact_table_reg)
For regional specific characterization, the validation table contains information per region
[18]:
char_reg.validation
[18]:
| | stressor | compartment | region | impact | factor | impact_unit | stressor_unit | error_unit_impact | error_unit_stressor | error_missing_stressor | error_missing_region |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | emission_type1 | air | reg1 | air water impact | 0.002 | t | kg | False | False | False | False |
| 1 | emission_type1 | air | reg1 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 2 | emission_type1 | air | reg1 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 3 | emission_type1 | air | reg2 | air water impact | 0.002 | t | kg | False | False | False | False |
| 4 | emission_type1 | air | reg2 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 5 | emission_type1 | air | reg2 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 6 | emission_type1 | air | reg3 | air water impact | 0.002 | t | kg | False | False | False | False |
| 7 | emission_type1 | air | reg3 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 8 | emission_type1 | air | reg3 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 9 | emission_type1 | air | reg4 | air water impact | 0.002 | t | kg | False | False | False | False |
| 10 | emission_type1 | air | reg4 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 11 | emission_type1 | air | reg4 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 12 | emission_type1 | air | reg5 | air water impact | 0.002 | t | kg | False | False | False | False |
| 13 | emission_type1 | air | reg5 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 14 | emission_type1 | air | reg5 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 15 | emission_type1 | air | reg6 | air water impact | 0.002 | t | kg | False | False | False | False |
| 16 | emission_type1 | air | reg6 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 17 | emission_type1 | air | reg6 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 18 | emission_type2 | water | reg1 | air water impact | 0.001 | t | kg | False | False | False | False |
| 19 | emission_type2 | water | reg1 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 20 | emission_type2 | water | reg2 | air water impact | 0.001 | t | kg | False | False | False | False |
| 21 | emission_type2 | water | reg2 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 22 | emission_type2 | water | reg3 | air water impact | 0.001 | t | kg | False | False | False | False |
| 23 | emission_type2 | water | reg3 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 24 | emission_type2 | water | reg4 | air water impact | 0.001 | t | kg | False | False | False | False |
| 25 | emission_type2 | water | reg4 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 26 | emission_type2 | water | reg5 | air water impact | 0.001 | t | kg | False | False | False | False |
| 27 | emission_type2 | water | reg5 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 28 | emission_type2 | water | reg6 | air water impact | 0.001 | t | kg | False | False | False | False |
| 29 | emission_type2 | water | reg6 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 30 | emission_type3 | land | reg1 | total emissions | 1.000 | kg | kg | False | False | True | False |
| 31 | emission_type3 | land | reg2 | total emissions | 1.000 | kg | kg | False | False | True | False |
| 32 | emission_type3 | land | reg3 | total emissions | 1.000 | kg | kg | False | False | True | False |
| 33 | emission_type3 | land | reg4 | total emissions | 1.000 | kg | kg | False | False | True | False |
| 34 | emission_type3 | land | reg5 | total emissions | 1.000 | kg | kg | False | False | True | False |
| 35 | emission_type3 | land | reg6 | total emissions | 1.000 | kg | kg | False | False | True | False |
The extension is again available in the extension attribute
[19]:
char_reg.extension.F
[19]:
| region | reg1 | reg2 | ... | reg5 | reg6 | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sector | food | mining | manufactoring | electricity | construction | trade | transport | other | food | mining | ... | transport | other | food | mining | manufactoring | electricity | construction | trade | transport | other |
| impact | |||||||||||||||||||||
| air water impact | 3.835380e+03 | 1.995239e+03 | 4.799114e+04 | 5.655218e+04 | 5.485680e+03 | 9.519790e+03 | 4.454697e+04 | 1.813336e+04 | 3.600710e+03 | 724.220244 | ... | 8.879848e+04 | 2.873866e+04 | 3.638210e+04 | 1.470754e+04 | 2.390451e+05 | 1.127983e+05 | 1.242297e+04 | 5.028356e+04 | 1.028589e+05 | 6.004829e+04 |
| total air emissions | 1.848065e+03 | 9.864481e+02 | 2.361379e+04 | 2.813910e+04 | 2.584142e+03 | 4.132656e+03 | 2.176699e+04 | 7.842091e+03 | 1.697937e+03 | 347.378150 | ... | 4.229932e+04 | 1.077383e+04 | 1.577800e+04 | 6.420956e+03 | 1.131724e+05 | 5.602253e+04 | 4.861838e+03 | 1.819562e+04 | 4.704654e+04 | 2.163287e+04 |
| total emissions | 1.987315e+06 | 1.008791e+06 | 2.437736e+07 | 2.841308e+07 | 2.901538e+06 | 5.387134e+06 | 2.277999e+07 | 1.029127e+07 | 1.902773e+06 | 376842.094000 | ... | 4.649916e+07 | 1.796483e+07 | 2.060410e+07 | 8.286581e+06 | 1.258726e+08 | 5.677575e+07 | 7.561127e+06 | 3.208793e+07 | 5.581233e+07 | 3.841542e+07 |
3 rows × 48 columns
gives the same result as before. To highlight regional specificity, we double the total emission factors of region 3.
[20]:
# Select the "total emissions" rows of region 3 and double their factors
reg3_total = (charact_table_reg.region == "reg3") & (
    charact_table_reg.impact == "total emissions"
)
charact_table_reg.loc[reg3_total, "factor"] *= 2
and calculate the new impacts
[21]:
char_reg_dbl = io.emissions.characterize(charact_table_reg).extension
char_reg_dbl.F.loc["total emissions"]
[21]:
region sector
reg1 food 1.987315e+06
mining 1.008791e+06
manufactoring 2.437736e+07
electricity 2.841308e+07
construction 2.901538e+06
trade 5.387134e+06
transport 2.277999e+07
other 1.029127e+07
reg2 food 1.902773e+06
mining 3.768421e+05
manufactoring 1.598022e+07
electricity 1.660779e+07
construction 1.868660e+06
trade 3.511220e+06
transport 6.836824e+06
other 6.185187e+06
reg3 food 1.100035e+07
mining 9.531717e+06
manufactoring 2.150874e+08
electricity 1.503010e+08
construction 3.996900e+07
trade 1.213563e+08
transport 1.301629e+08
other 3.714520e+08
reg4 food 6.479508e+06
mining 9.508597e+06
manufactoring 4.755267e+07
electricity 4.478230e+07
construction 6.364480e+06
trade 1.444928e+07
transport 3.071924e+07
other 2.961040e+07
reg5 food 6.628066e+06
mining 4.867714e+06
manufactoring 1.192319e+08
electricity 4.324482e+07
construction 3.465382e+06
trade 1.967876e+07
transport 4.649916e+07
other 1.796483e+07
reg6 food 2.060410e+07
mining 8.286581e+06
manufactoring 1.258726e+08
electricity 5.677575e+07
construction 7.561127e+06
trade 3.208793e+07
transport 5.581233e+07
other 3.841542e+07
Name: total emissions, dtype: float64
compared to
[22]:
char_reg.extension.F.loc["total emissions"]
[22]:
region sector
reg1 food 1.987315e+06
mining 1.008791e+06
manufactoring 2.437736e+07
electricity 2.841308e+07
construction 2.901538e+06
trade 5.387134e+06
transport 2.277999e+07
other 1.029127e+07
reg2 food 1.902773e+06
mining 3.768421e+05
manufactoring 1.598022e+07
electricity 1.660779e+07
construction 1.868660e+06
trade 3.511220e+06
transport 6.836824e+06
other 6.185187e+06
reg3 food 5.500174e+06
mining 4.765858e+06
manufactoring 1.075437e+08
electricity 7.515049e+07
construction 1.998450e+07
trade 6.067817e+07
transport 6.508145e+07
other 1.857260e+08
reg4 food 6.479508e+06
mining 9.508597e+06
manufactoring 4.755267e+07
electricity 4.478230e+07
construction 6.364480e+06
trade 1.444928e+07
transport 3.071924e+07
other 2.961040e+07
reg5 food 6.628066e+06
mining 4.867714e+06
manufactoring 1.192319e+08
electricity 4.324482e+07
construction 3.465382e+06
trade 1.967876e+07
transport 4.649916e+07
other 1.796483e+07
reg6 food 2.060410e+07
mining 8.286581e+06
manufactoring 1.258726e+08
electricity 5.677575e+07
construction 7.561127e+06
trade 3.208793e+07
transport 5.581233e+07
other 3.841542e+07
Name: total emissions, dtype: float64
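Instead of comparing the two listings by eye, the ratio between both results can be computed per region. The sketch below uses a hand-built excerpt of the two series (four values each, copied from the outputs above) so that it runs without Pymrio; the variable names are chosen for the example.

```python
import pandas as pd

# Excerpt of the two "total emissions" rows, values copied from the outputs above.
index = pd.MultiIndex.from_tuples(
    [("reg1", "food"), ("reg1", "mining"), ("reg3", "food"), ("reg3", "mining")],
    names=["region", "sector"],
)
original = pd.Series([1.987315e06, 1.008791e06, 5.500174e06, 4.765858e06], index=index)
doubled = pd.Series([1.987315e06, 1.008791e06, 1.100035e07, 9.531717e06], index=index)

# Element-wise ratio, averaged per region and rounded to hide display precision.
ratio = (doubled / original).groupby(level="region").mean().round(3)
print(ratio)
```

The ratio is 2 for reg3 and 1 for reg1. In the notebook, the same expression can be applied to char_reg_dbl.F.loc["total emissions"] and char_reg.extension.F.loc["total emissions"].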
Some more notes on validation
We can introduce some more inconsistencies into the table to showcase the validation process. First, a unit error in the stressors:
[23]:
charact_table_reg.loc[
(charact_table_reg.region == "reg4")
& (charact_table_reg.stressor == "emission_type1"),
"stressor_unit",
] = "s"
Some inconsistent impact units:
[24]:
charact_table_reg.loc[
(charact_table_reg.region == "reg2")
& (charact_table_reg.impact == "total emissions"),
"impact_unit",
] = "kt"
A spelling mistake in region 2 for one stressor:
[25]:
charact_table_reg.loc[
(charact_table_reg.region == "reg2")
& (charact_table_reg.stressor == "emission_type2"),
"region",
] = "reg22"
Data for an additional region which is not available in the extension:
[26]:
new_data = charact_table_reg.iloc[[0]].copy()  # explicit copy, so the original table is not touched
new_data.loc[:, "region"] = "reg_additional"
charact_table_reg = charact_table_reg.merge(new_data, how="outer")
[27]:
report = io.emissions.characterize(charact_table_reg, only_validation=True).validation
The unit errors are reported for each row, and the one additional region not present in the extension is reported under error_missing_region. The column error_unit_impact indicates the impact with inconsistent units.
[28]:
report[report.stressor == "emission_type1"]
[28]:
| | stressor | compartment | region | impact | factor | impact_unit | stressor_unit | error_unit_impact | error_unit_stressor | error_missing_stressor | error_missing_region |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | emission_type1 | air | reg1 | air water impact | 0.002 | t | kg | False | False | False | False |
| 1 | emission_type1 | air | reg1 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 2 | emission_type1 | air | reg1 | total emissions | 1.000 | kg | kg | True | False | False | False |
| 3 | emission_type1 | air | reg2 | air water impact | 0.002 | t | kg | False | False | False | False |
| 4 | emission_type1 | air | reg2 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 5 | emission_type1 | air | reg2 | total emissions | 1.000 | kt | kg | True | False | False | False |
| 6 | emission_type1 | air | reg3 | air water impact | 0.002 | t | kg | False | False | False | False |
| 7 | emission_type1 | air | reg3 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 8 | emission_type1 | air | reg3 | total emissions | 2.000 | kg | kg | True | False | False | False |
| 9 | emission_type1 | air | reg4 | air water impact | 0.002 | t | s | False | True | False | False |
| 10 | emission_type1 | air | reg4 | total air emissions | 0.001 | t | s | False | True | False | False |
| 11 | emission_type1 | air | reg4 | total emissions | 1.000 | kg | s | True | True | False | False |
| 12 | emission_type1 | air | reg5 | air water impact | 0.002 | t | kg | False | False | False | False |
| 13 | emission_type1 | air | reg5 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 14 | emission_type1 | air | reg5 | total emissions | 1.000 | kg | kg | True | False | False | False |
| 15 | emission_type1 | air | reg6 | air water impact | 0.002 | t | kg | False | False | False | False |
| 16 | emission_type1 | air | reg6 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 17 | emission_type1 | air | reg6 | total emissions | 1.000 | kg | kg | True | False | False | False |
| 18 | emission_type1 | air | reg_additional | air water impact | 0.002 | t | kg | False | False | False | True |
In the case of emission_type2, error_missing_region is True for the whole stressor, since reg2 is “no longer present” in the factor sheets due to the spelling mistake. Thus, not all regions are covered in the specifications. Again, the column error_unit_impact indicates the impact with inconsistent units.
[29]:
report[report.stressor == "emission_type2"]
[29]:
| | stressor | compartment | region | impact | factor | impact_unit | stressor_unit | error_unit_impact | error_unit_stressor | error_missing_stressor | error_missing_region |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 19 | emission_type2 | water | reg1 | air water impact | 0.001 | t | kg | False | False | False | True |
| 20 | emission_type2 | water | reg1 | total emissions | 1.000 | kg | kg | True | False | False | True |
| 21 | emission_type2 | water | reg22 | air water impact | 0.001 | t | kg | False | False | False | True |
| 22 | emission_type2 | water | reg22 | total emissions | 1.000 | kt | kg | True | False | False | True |
| 23 | emission_type2 | water | reg3 | air water impact | 0.001 | t | kg | False | False | False | True |
| 24 | emission_type2 | water | reg3 | total emissions | 2.000 | kg | kg | True | False | False | True |
| 25 | emission_type2 | water | reg4 | air water impact | 0.001 | t | kg | False | False | False | True |
| 26 | emission_type2 | water | reg4 | total emissions | 1.000 | kg | kg | True | False | False | True |
| 27 | emission_type2 | water | reg5 | air water impact | 0.001 | t | kg | False | False | False | True |
| 28 | emission_type2 | water | reg5 | total emissions | 1.000 | kg | kg | True | False | False | True |
| 29 | emission_type2 | water | reg6 | air water impact | 0.001 | t | kg | False | False | False | True |
| 30 | emission_type2 | water | reg6 | total emissions | 1.000 | kg | kg | True | False | False | True |
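When error_missing_region is set for a whole stressor, a quick set comparison helps to find the offending region names. The sketch below is plain Python and uses the region names from this example; with real data the two sets would be taken from the extension and from the region column of the characterization table.

```python
# Regions available in the extension of the test system.
extension_regions = {"reg1", "reg2", "reg3", "reg4", "reg5", "reg6"}
# Regions listed for emission_type2 after introducing the spelling mistake.
table_regions = {"reg1", "reg22", "reg3", "reg4", "reg5", "reg6"}

# Regions of the extension for which no factors are given.
not_covered = sorted(extension_regions - table_regions)
# Region names in the table that the extension does not know.
unknown = sorted(table_regions - extension_regions)

print("Regions without factors:", not_covered)
print("Unknown region names:", unknown)
```

A pair of one uncovered and one unknown name with similar spelling, as here, usually points to a typo.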
Characterization across multiple extensions
In addition to characterizing a single extension, pymrio also offers functionality to apply characterization across multiple extensions simultaneously. This is useful when your impacts depend on stressors that are distributed across different satellite accounts.
Let’s demonstrate this using our test MRIO system:
[30]:
io = pymrio.load_test()
First, let’s create multiple extensions from our emissions data to better showcase this functionality:
[31]:
# Create copies of the emissions extension with different names and data subsets
io.water = io.emissions.copy("water")
io.air = io.emissions.copy("air")
[32]:
# Keep only water emissions in the water extension
io.water.F = io.water.F.loc[[("emission_type2", "water")], :]
io.water.F_Y = io.water.F_Y.loc[[("emission_type2", "water")], :]
[33]:
# Keep only air emissions in the air extension
io.air.F = io.air.F.loc[[("emission_type1", "air")], :]
io.air.F_Y = io.air.F_Y.loc[[("emission_type1", "air")], :]
Examining the extensions:
[34]:
io.air.F
[34]:
| region | reg1 | reg2 | ... | reg5 | reg6 | |||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sector | food | mining | manufactoring | electricity | construction | trade | transport | other | food | mining | ... | transport | other | food | mining | manufactoring | electricity | construction | trade | transport | other | |
| stressor | compartment | |||||||||||||||||||||
| emission_type1 | air | 1848064.8 | 986448.09 | 23613787.0 | 28139100.0 | 2584141.8 | 4132656.3 | 21766987.0 | 7842090.6 | 1697937.3 | 347378.15 | ... | 42299319 | 10773826.0 | 15777996.0 | 6420955.5 | 113172450.0 | 56022534.0 | 4861838.5 | 18195621 | 47046542.0 | 21632868 |
1 rows × 48 columns
[35]:
io.water.F
[35]:
| region | reg1 | reg2 | ... | reg5 | reg6 | |||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sector | food | mining | manufactoring | electricity | construction | trade | transport | other | food | mining | ... | transport | other | food | mining | manufactoring | electricity | construction | trade | transport | other | |
| stressor | compartment | |||||||||||||||||||||
| emission_type2 | water | 139250.47 | 22343.295 | 763569.18 | 273981.55 | 317396.51 | 1254477.8 | 1012999.1 | 2449178.0 | 204835.44 | 29463.944 | ... | 4199841 | 7191006.3 | 4826108.1 | 1865625.1 | 12700193.0 | 753213.7 | 2699288.3 | 13892313 | 8765784.3 | 16782553 |
1 rows × 48 columns
To characterize across multiple extensions, we need a characterization table that includes an ‘extension’ column specifying which extension each stressor belongs to:
[36]:
# Start with our regional characterization table
factors_reg_spec = pd.read_csv(
(PYMRIO_PATH["test_mrio"] / Path("concordance") / "emissions_charact_reg_spec.tsv"),
sep="\t",
)
[37]:
# Create a copy and add an extension column based on compartment
factors_reg_ext = factors_reg_spec.copy()
factors_reg_ext.loc[:, "extension"] = factors_reg_ext.loc[:, "compartment"]
[38]:
# Filter out any entries that don't correspond to our extensions
factors_reg_ext = factors_reg_ext[factors_reg_ext.compartment.isin(["air", "water"])]
[39]:
# Examine our multi-extension characterization table:
factors_reg_ext.head(10)
[39]:
| | region | stressor | compartment | impact | factor | impact_unit | stressor_unit | extension |
|---|---|---|---|---|---|---|---|---|
| 0 | reg1 | emission_type1 | air | air water impact | 0.002 | t | kg | air |
| 1 | reg1 | emission_type2 | water | air water impact | 0.001 | t | kg | water |
| 2 | reg1 | emission_type1 | air | total emissions | 1.000 | kg | kg | air |
| 3 | reg1 | emission_type2 | water | total emissions | 1.000 | kg | kg | water |
| 5 | reg1 | emission_type1 | air | total air emissions | 0.001 | t | kg | air |
| 6 | reg2 | emission_type1 | air | air water impact | 0.002 | t | kg | air |
| 7 | reg2 | emission_type2 | water | air water impact | 0.001 | t | kg | water |
| 8 | reg2 | emission_type1 | air | total emissions | 1.000 | kg | kg | air |
| 9 | reg2 | emission_type2 | water | total emissions | 1.000 | kg | kg | water |
| 11 | reg2 | emission_type1 | air | total air emissions | 0.001 | t | kg | air |
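To make the factor table concrete, the following sketch works through by hand what the characterization is expected to produce for a single column. It is plain Python, not a Pymrio call. The stressor values are copied from the first column (reg1, food) of the two F tables shown above, and the factors are the reg1 rows of the table above. That each impact is the sum of factor times stressor is the assumption being illustrated.

```python
# Stressor values for the first column (reg1, food), copied from the
# io.air.F and io.water.F outputs shown above (units: kg).
stressors = {
    ("emission_type1", "air"): 1848064.8,
    ("emission_type2", "water"): 139250.47,
}

# reg1 rows of the characterization table:
# (stressor, compartment, impact, factor)
factors = [
    ("emission_type1", "air", "air water impact", 0.002),
    ("emission_type2", "water", "air water impact", 0.001),
    ("emission_type1", "air", "total emissions", 1.0),
    ("emission_type2", "water", "total emissions", 1.0),
    ("emission_type1", "air", "total air emissions", 0.001),
]

# Each impact is assumed to be the sum of factor * stressor over all
# matching stressors, regardless of which extension they come from.
impacts = {}
for stressor, compartment, impact, factor in factors:
    contribution = factor * stressors[(stressor, compartment)]
    impacts[impact] = impacts.get(impact, 0.0) + contribution

for impact, value in impacts.items():
    print(f"{impact}: {value:.5f}")
```

These are direct (F) values for one column only. Consumption-based accounts such as D_cba also include upstream contributions, so they are not expected to match these numbers.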
There are two ways to characterize across multiple extensions:
[40]:
# 1. Using the top-level function with specific extensions:
ex_reg_multi = pymrio.extension_characterize(
io.air,
io.water, # List the extensions you want to include
factors=factors_reg_ext,
new_extension_name="multi_top_level",
).extension
[41]:
# 2. Using the MRIO object's method which automatically includes all available extensions:
ex_reg_mrio = io.extension_characterize(
factors=factors_reg_ext, new_extension_name="multi_mrio_method"
).extension
[42]:
# Both approaches produce the same result when the same extensions are involved:
print("Are the characterized F matrices equal?", ex_reg_multi.F.equals(ex_reg_mrio.F))
Are the characterized F matrices equal? True
[43]:
# Add the extension to our MRIO and calculate results:
io.multi = ex_reg_multi
io.calc_all()
io.multi.D_cba
[43]:
| region | reg1 | reg2 | ... | reg5 | reg6 | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| sector | food | mining | manufactoring | electricity | construction | trade | transport | other | food | mining | ... | transport | other | food | mining | manufactoring | electricity | construction | trade | transport | other |
| impact | |||||||||||||||||||||
| air water impact | 4.354677e+03 | 384.125264 | 2.116984e+05 | 2.391231e+04 | 7.032641e+03 | 8.548388e+03 | 2.200050e+04 | 4.012355e+04 | 3.800328e+03 | 42.024811 | ... | 8.843384e+04 | 3.008044e+04 | 3.476528e+04 | 3.227857e+03 | 1.535829e+05 | 7.423616e+04 | 4.580343e+03 | 1.393218e+05 | 1.049447e+05 | 1.183949e+05 |
| total air emissions | 2.056183e+03 | 179.423536 | 9.749300e+04 | 1.188759e+04 | 3.342906e+03 | 3.885884e+03 | 1.075027e+04 | 1.582152e+04 | 1.793338e+03 | 19.145605 | ... | 4.209505e+04 | 1.138661e+04 | 1.517235e+04 | 1.345318e+03 | 7.145075e+04 | 3.683167e+04 | 1.836696e+03 | 4.241568e+04 | 4.805409e+04 | 3.602298e+04 |
| total emissions | 2.298494e+06 | 204701.727979 | 1.142054e+08 | 1.202472e+07 | 3.689735e+06 | 4.662504e+06 | 1.125023e+07 | 2.430203e+07 | 2.006991e+06 | 22879.206385 | ... | 4.633879e+07 | 1.869382e+07 | 1.959293e+07 | 1.882540e+06 | 8.213219e+07 | 3.740449e+07 | 2.743647e+06 | 9.690613e+07 | 5.689057e+07 | 8.237196e+07 |
3 rows × 48 columns
[44]:
# As with single-extension characterization, validate the factors first:
validation_report = pymrio.extension_characterize(
io.air, io.water, factors=factors_reg_ext, only_validation=True
).validation
[45]:
print("Validation report:")
validation_report
Validation report:
[45]:
| stressor | compartment | region | impact | factor | impact_unit | stressor_unit | error_unit_impact | error_unit_stressor | error_missing_stressor | error_missing_region | |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | emission_type1 | air | reg1 | air water impact | 0.002 | t | kg | False | False | False | False |
| 1 | emission_type1 | air | reg1 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 2 | emission_type1 | air | reg1 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 3 | emission_type1 | air | reg2 | air water impact | 0.002 | t | kg | False | False | False | False |
| 4 | emission_type1 | air | reg2 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 5 | emission_type1 | air | reg2 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 6 | emission_type1 | air | reg3 | air water impact | 0.002 | t | kg | False | False | False | False |
| 7 | emission_type1 | air | reg3 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 8 | emission_type1 | air | reg3 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 9 | emission_type1 | air | reg4 | air water impact | 0.002 | t | kg | False | False | False | False |
| 10 | emission_type1 | air | reg4 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 11 | emission_type1 | air | reg4 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 12 | emission_type1 | air | reg5 | air water impact | 0.002 | t | kg | False | False | False | False |
| 13 | emission_type1 | air | reg5 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 14 | emission_type1 | air | reg5 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 15 | emission_type1 | air | reg6 | air water impact | 0.002 | t | kg | False | False | False | False |
| 16 | emission_type1 | air | reg6 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 17 | emission_type1 | air | reg6 | total air emissions | 0.001 | t | kg | False | False | False | False |
| 18 | emission_type2 | water | reg1 | air water impact | 0.001 | t | kg | False | False | False | False |
| 19 | emission_type2 | water | reg1 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 20 | emission_type2 | water | reg2 | air water impact | 0.001 | t | kg | False | False | False | False |
| 21 | emission_type2 | water | reg2 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 22 | emission_type2 | water | reg3 | air water impact | 0.001 | t | kg | False | False | False | False |
| 23 | emission_type2 | water | reg3 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 24 | emission_type2 | water | reg4 | air water impact | 0.001 | t | kg | False | False | False | False |
| 25 | emission_type2 | water | reg4 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 26 | emission_type2 | water | reg5 | air water impact | 0.001 | t | kg | False | False | False | False |
| 27 | emission_type2 | water | reg5 | total emissions | 1.000 | kg | kg | False | False | False | False |
| 28 | emission_type2 | water | reg6 | air water impact | 0.001 | t | kg | False | False | False | False |
| 29 | emission_type2 | water | reg6 | total emissions | 1.000 | kg | kg | False | False | False | False |
The validation process helps identify issues such as:
Missing stressors or extensions
Unit inconsistencies
Missing regions or sectors
Extension name mismatches
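Because the report is an ordinary pandas DataFrame, the rows that need attention can be isolated by looking at the `error_` columns. The sketch below uses a small made-up report: its column names follow the validation report shown above, but the rows, including the `emission_type3` entry, are invented for illustration.

```python
import pandas as pd

# Toy stand-in for a validation report. Column names mirror the report
# shown above; the rows are hypothetical.
report = pd.DataFrame(
    {
        "stressor": ["emission_type1", "emission_type2", "emission_type3"],
        "region": ["reg1", "reg1", "reg1"],
        "error_unit_impact": [False, False, False],
        "error_unit_stressor": [False, True, False],
        "error_missing_stressor": [False, False, True],
        "error_missing_region": [False, False, False],
    }
)

# Collect all boolean error flags by their common prefix.
error_cols = [col for col in report.columns if col.startswith("error_")]

# Keep only rows where at least one flag is set.
problems = report[report[error_cols].any(axis=1)]

# Count how often each kind of error occurs.
error_counts = report[error_cols].sum()

print(problems)
print(error_counts)
```

An empty `problems` frame means no flag was raised, which is the case for the report shown above, where every error column is False.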
Important considerations for multi-extension characterization:
The ‘extension’ column in your characterization table must match the extension names in your MRIO
All extensions must have compatible region and sector classifications
Units must be consistent across extensions and characterization factors
If a characterization table references an extension that doesn’t exist, it will be noted in the validation report
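The first consideration can be checked in plain pandas before any characterization runs. The sketch below uses a toy factor table and a hard-coded set of extension names. Both are assumptions for illustration, and the `land` entry is hypothetical.

```python
import pandas as pd

# Toy characterization table; only the columns needed for the check.
factors = pd.DataFrame(
    {
        "stressor": ["emission_type1", "emission_type2", "emission_type3"],
        "extension": ["air", "water", "land"],
    }
)

# Extension names assumed to exist in the MRIO system (hard-coded here).
available_extensions = {"air", "water"}

# Extension names used in the table but not available in the MRIO.
unknown = sorted(set(factors["extension"]) - available_extensions)

# Rows that can actually be matched to an extension.
usable = factors[factors["extension"].isin(available_extensions)]

print("Unknown extensions:", unknown)
print(usable)
```

Whether to drop the unmatched rows or fix the names depends on the data. The validation report remains the authoritative check.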