Using the aggregation functionality of pymrio

Pymrio offers several ways to aggregate an existing MRIO system. This section presents each of them in turn, using the test MRIO system included in pymrio. The same concepts apply to real-life MRIOs.

Some of the examples rely on the country converter coco. The minimum version required is coco >= 0.6.3. Install the latest version with

pip install country_converter --upgrade

Coco can also be installed from the Anaconda Cloud; see the coco readme for further information.

Loading the test MRIO

First, we load and explore the test MRIO included in pymrio:

[1]:
import numpy as np
import pymrio
[2]:
io = pymrio.load_test()
io.calc_all()
[2]:
<pymrio.core.mriosystem.IOSystem at 0x7f0e90c8beb0>
[3]:
print(
    "Sectors: {sec},\nRegions: {reg}".format(
        sec=io.get_sectors().tolist(), reg=io.get_regions().tolist()
    )
)
Sectors: ['food', 'mining', 'manufactoring', 'electricity', 'construction', 'trade', 'transport', 'other'],
Regions: ['reg1', 'reg2', 'reg3', 'reg4', 'reg5', 'reg6']

Aggregation using a numerical concordance matrix

This is the standard way to aggregate MRIOs when you work in Matlab. To do so, we need to set up a concordance matrix in which the columns correspond to the original classification and the rows to the aggregated one. An entry of 1 assigns the original sector or region (column) to the aggregated group (row).

[4]:
sec_agg_matrix = np.array(
    [[1, 0, 0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1, 1, 1]]
)

reg_agg_matrix = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]])
[5]:
io.aggregate(region_agg=reg_agg_matrix, sector_agg=sec_agg_matrix)
[5]:
<pymrio.core.mriosystem.IOSystem at 0x7f0e90c8beb0>
[6]:
print(
    "Sectors: {sec},\nRegions: {reg}".format(
        sec=io.get_sectors().tolist(), reg=io.get_regions().tolist()
    )
)
Sectors: ['sec0', 'sec1', 'sec2'],
Regions: ['reg0', 'reg1']
[7]:
io.calc_all()
[7]:
<pymrio.core.mriosystem.IOSystem at 0x7f0e90c8beb0>
[8]:
io.emissions.D_cba
[8]:
region                              reg0                                      reg1
sector                              sec0          sec1          sec2          sec0          sec1          sec2
stressor       compartment
emission_type1 air          9.041149e+06  3.018791e+08  1.523236e+08  2.469465e+07  3.468742e+08  2.454117e+08
emission_type2 water        2.123543e+06  4.884509e+07  9.889757e+07  6.000239e+06  4.594530e+07  1.892731e+08
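
To make explicit what a concordance matrix does, the following sketch applies the two matrices from above to a dummy flow matrix with plain NumPy. It assumes that rows and columns are ordered by region first and sector second (as in the region/sector header shown above), so that the full concordance can be written as the Kronecker product of the regional and the sectoral matrix. The dummy matrix `Z` and the name `full_conc` are illustrative only; this is not necessarily how pymrio implements the aggregation internally.

```python
import numpy as np

sec_agg_matrix = np.array(
    [[1, 0, 0, 0, 0, 0, 0, 0], [0, 1, 1, 1, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1, 1, 1]]
)
reg_agg_matrix = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]])

# Every original entry is assigned to exactly one aggregated group,
# so each column of a concordance matrix sums to 1.
assert (sec_agg_matrix.sum(axis=0) == 1).all()
assert (reg_agg_matrix.sum(axis=0) == 1).all()

# Full concordance for a (region, sector) ordered system: shape (6, 48).
# Assumption: regions are the outer index level, sectors the inner one.
full_conc = np.kron(reg_agg_matrix, sec_agg_matrix)

# Dummy 48 x 48 flow matrix (6 regions x 8 sectors), illustrative values only
n = reg_agg_matrix.shape[1] * sec_agg_matrix.shape[1]
Z = np.arange(n * n, dtype=float).reshape(n, n)

# Aggregating rows and columns sums all flows between the aggregated groups
Z_agg = full_conc @ Z @ full_conc.T

print(Z_agg.shape)  # (6, 6)
print(np.isclose(Z_agg.sum(), Z.sum()))  # True
```

The total of all flows is preserved because each original row and column ends up in exactly one aggregated group.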

To use custom names for the aggregated sectors or regions, pass a list of names in the order of the rows of the concordance matrix:

[9]:
io = (
    pymrio.load_test()
    .calc_all()
    .aggregate(
        region_agg=reg_agg_matrix,
        region_names=["World Region A", "World Region B"],
        inplace=False,
    )
)
[10]:
io.get_regions()
[10]:
Index(['World Region A', 'World Region B'], dtype='object', name='region')
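
The correspondence between the list of names and the rows of the concordance matrix can be checked with a few lines of plain Python. This is a small sketch, not part of pymrio; the names `original_regions` and `membership` are introduced here for illustration.

```python
import numpy as np

original_regions = ["reg1", "reg2", "reg3", "reg4", "reg5", "reg6"]
reg_agg_matrix = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]])
region_names = ["World Region A", "World Region B"]

# Row i of the concordance matrix belongs to region_names[i];
# a 1 in column j means original region j is part of that group.
membership = {
    name: [reg for reg, flag in zip(original_regions, row) if flag]
    for name, row in zip(region_names, reg_agg_matrix)
}

for name, members in membership.items():
    print(f"{name}: {members}")
```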

Aggregation using a numerical vector

Pymrio also accepts the aggregation information as a numerical or string vector. Each entry in the vector assigns the corresponding sector/region to an aggregation group. Thus the two aggregation matrices from above (sec_agg_matrix and reg_agg_matrix) can also be represented as numerical or string vectors/lists:

[11]:
sec_agg_vec = np.array([0, 1, 1, 1, 1, 2, 2, 2])
reg_agg_vec = ["R1", "R1", "R1", "R2", "R2", "R2"]

These vectors can be passed to aggregate in the same way as the matrices:

[12]:
io_vec_agg = (
    pymrio.load_test()
    .calc_all()
    .aggregate(region_agg=reg_agg_vec, sector_agg=sec_agg_vec, inplace=False)
)
[13]:
print(
    "Sectors: {sec},\nRegions: {reg}".format(
        sec=io_vec_agg.get_sectors().tolist(), reg=io_vec_agg.get_regions().tolist()
    )
)
Sectors: ['sec0', 'sec1', 'sec2'],
Regions: ['R1', 'R2']
[14]:
io_vec_agg.emissions.D_cba_reg
[14]:
region                                R1            R2
stressor       compartment
emission_type1 air          6.690192e+08  1.686954e+09
emission_type2 water        5.337682e+08  5.902081e+08
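
The equivalence between the vector and the matrix representation can be illustrated by expanding a vector into a concordance matrix. The helper `vector_to_concordance` below is hypothetical (it is not a pymrio function), and it assumes that groups are ordered by their first appearance in the vector.

```python
import numpy as np


def vector_to_concordance(agg_vec):
    """Expand an aggregation vector into (group labels, concordance matrix)."""
    agg_vec = list(agg_vec)
    # Unique group labels in order of first appearance (assumption)
    labels = list(dict.fromkeys(agg_vec))
    conc = np.zeros((len(labels), len(agg_vec)), dtype=int)
    for col, label in enumerate(agg_vec):
        conc[labels.index(label), col] = 1
    return labels, conc


sec_labels, sec_conc = vector_to_concordance([0, 1, 1, 1, 1, 2, 2, 2])
reg_labels, reg_conc = vector_to_concordance(["R1", "R1", "R1", "R2", "R2", "R2"])

print(sec_conc)
print(reg_labels)
print(reg_conc)
```

Under this ordering assumption, both results reproduce the matrices sec_agg_matrix and reg_agg_matrix defined earlier.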

Regional aggregation using the country converter coco

The previous examples are best suited if you want to reuse existing aggregation information. For new or ad hoc aggregations, the most user-friendly solution is to build the concordance with the country converter coco (see the version requirement stated at the beginning of this section). You can either use coco to build independent aggregations (first case below) or use the predefined classifications included in coco (second case, the WIOD example below).

[15]:
import country_converter as coco

Independent aggregation

[16]:
io = pymrio.load_test().calc_all()
[17]:
reg_agg_coco = coco.agg_conc(
    original_countries=io.get_regions(),
    aggregates={
        "reg1": "World Region A",
        "reg2": "World Region A",
        "reg3": "World Region A",
    },
    missing_countries="World Region B",
)
[18]:
io.aggregate(region_agg=reg_agg_coco)
[18]:
<pymrio.core.mriosystem.IOSystem at 0x7f0e8eb6aa60>
[19]:
print(
    "Sectors: {sec},\nRegions: {reg}".format(
        sec=io.get_sectors().tolist(), reg=io.get_regions().tolist()
    )
)
Sectors: ['food', 'mining', 'manufactoring', 'electricity', 'construction', 'trade', 'transport', 'other'],
Regions: ['World Region A', 'World Region B']

The regional footprint accounts are then available for the aggregated regions:

[20]:
io.emissions.D_cba_reg
[20]:
region                      World Region A  World Region B
stressor       compartment
emission_type1 air            6.690192e+08    1.686954e+09
emission_type2 water          5.337682e+08    5.902081e+08

A pandas DataFrame corresponding to the output from coco can also be passed to sector_agg for aggregation. A sector aggregation package similar to the country converter is planned.
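
As a rough sketch of what such a DataFrame could look like for the sectors of the test system: the code below builds a two-column table that mirrors the grouping of sec_agg_matrix. The column names `original` and `aggregated` are an assumption based on the sparse output format of coco, and the group names ("Food", "Industry", "Services") are made up for illustration; check the documentation of aggregate for the format expected by your pymrio version.

```python
import pandas as pd

# Hypothetical mapping from the test sectors to three illustrative groups
sector_mapping = {
    "food": "Food",
    "mining": "Industry",
    "manufactoring": "Industry",
    "electricity": "Industry",
    "construction": "Industry",
    "trade": "Services",
    "transport": "Services",
    "other": "Services",
}

# Assumed layout: one row per original sector, with its aggregated group
sec_agg_df = pd.DataFrame(
    {
        "original": list(sector_mapping.keys()),
        "aggregated": list(sector_mapping.values()),
    }
)

print(sec_agg_df)
```

Under these assumptions, the frame would be passed as `sector_agg=sec_agg_df` in the call to aggregate.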

Using the built-in classifications - WIOD example

The country converter is most useful when you work with an MRIO that is included in coco. In that case you can just pass the desired country aggregation to coco and it returns the required aggregation matrix.

For the example here, we assume that a raw WIOD download is available at:

[21]:
wiod_raw = "/tmp/mrios/WIOD2013"

We will parse the year 2000 and calculate the results:

[22]:
wiod_orig = pymrio.parse_wiod(path=wiod_raw, year=2000).calc_all()

We then aggregate the database: first to the EU countries, then grouping the remaining countries based on OECD membership. In the example below, we single out Germany (DEU) so that it is not included in any of the aggregates:

[23]:
wiod_agg_DEU_EU_OECD = wiod_orig.aggregate(
    region_agg=coco.agg_conc(
        original_countries="WIOD",
        aggregates=[{"DEU": "DEU"}, "EU", "OECD"],
        missing_countries="Other",
        merge_multiple_string=None,
    ),
    inplace=False,
)

We can then rename the regions to make the membership clearer:

[24]:
wiod_agg_DEU_EU_OECD.rename_regions({"OECD": "OECDwoEU", "EU": "EUwoGermany"})
[24]:
<pymrio.core.mriosystem.IOSystem at 0x7f0e8a1e9be0>

To see the result for the air emission footprints:

[25]:
wiod_agg_DEU_EU_OECD.AIR.D_cba_reg
[25]:
region        OECDwoEU   EUwoGermany         Other           DEU
stressor
CO2       1.029626e+07  3.120346e+06  9.232742e+06  1.123772e+06
CH4       6.595964e+07  2.605212e+07  1.487615e+08  7.953304e+06
N2O       2.312456e+06  1.055048e+06  6.166586e+06  2.941486e+05
NOX       3.986117e+07  1.130268e+07  5.103133e+07  3.164278e+06
SOX       3.567011e+07  1.034605e+07  5.137882e+07  2.045926e+06
CO        2.032219e+08  4.789266e+07  4.424992e+08  1.296816e+07
NMVOC     3.680383e+07  1.280083e+07  8.186918e+07  2.870176e+06
NH3       6.013446e+06  3.307710e+06  1.674807e+07  8.656818e+05

For further examples of the capabilities of the country converter, see the coco tutorial notebook.

Aggregation by renaming

An alternative method for aggregating the MRIO system is to rename specific regions and/or sectors to duplicated names. Duplicated sectors and regions can then be aggregated automatically. This makes most sense when sectors fall into broader categories (e.g. consumption categories) or when a detailed classification can easily be broadened (e.g. A01 and A02, which could both be renamed to A). In the example below, we aggregate sectors to consumption categories using some predefined categories included in pymrio. Check the Adjusting, Renaming and Restructuring notebook for more details.

[ ]:
mrio = pymrio.load_test()
[ ]:
class_info = pymrio.get_classification("test")
rename_dict = class_info.get_sector_dict(
    orig=class_info.sectors.TestMrioName, new=class_info.sectors.Type
)

If we take a look at the rename_dict, we see that it maps several sectors of the original MRIO to combined categories (technically a many-to-one mapping).

[ ]:
rename_dict

Using this dict to rename sectors leads to an index with overlapping labels.

[ ]:
mrio.rename_sectors(rename_dict)
mrio.Z

These duplicated labels can then be aggregated with:

[ ]:
mrio.aggregate_duplicates()
mrio.Z

This method also comes in handy when aggregating only some of the MRIO regions. For example:

[ ]:
region_convert = {"reg1": "Antarctica", "reg2": "Antarctica"}
mrio.rename_regions(region_convert).aggregate_duplicates()
mrio.Z

This lets us calculate the footprint of the consumption category ‘eat’ in ‘Antarctica’:

[ ]:
mrio.calc_all()
mrio.emissions.D_cba.loc[:, ("Antarctica", "eat")]
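
Conceptually, aggregating duplicates amounts to summing all rows and columns that share a label. The following sketch reproduces this idea on a small pandas DataFrame; it is an illustration, not pymrio's implementation, and the three-sector matrix as well as the mapping in `rename` are made-up values.

```python
import pandas as pd

sectors = ["food", "mining", "trade"]
Z = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]],
    index=sectors,
    columns=sectors,
)

# Hypothetical many-to-one mapping of sectors to categories
rename = {"food": "eat", "mining": "other", "trade": "other"}

# Renaming leads to duplicated labels in the index and the columns
Z_renamed = Z.rename(index=rename, columns=rename)

# Sum rows with the same label, then columns with the same label
Z_agg = (
    Z_renamed.groupby(level=0, sort=False)
    .sum()
    .T.groupby(level=0, sort=False)
    .sum()
    .T
)

print(Z_agg)
```

The entry for ("other", "other") collects the four original flows between mining and trade, while the total over all entries stays unchanged.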

Aggregation to one total sector / region

Both region_agg and sector_agg also accept a string as argument. This aggregates the full IO system to one total region or sector, named after the given string.

[26]:
pymrio.load_test().calc_all().aggregate(
    region_agg="global", sector_agg="total"
).emissions.D_cba
[26]:
region                            global
sector                             total
stressor       compartment
emission_type1 air          1.080224e+09
emission_type2 water        3.910848e+08
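
In terms of the concordance matrices used earlier, aggregating to one total corresponds to a concordance matrix with a single row of ones. A minimal NumPy sketch with a made-up 3 x 3 matrix:

```python
import numpy as np

# Illustrative flow matrix with three original entries
Z = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])

# One aggregated group containing every original entry
total_conc = np.ones((1, Z.shape[0]))

Z_total = total_conc @ Z @ total_conc.T
print(Z_total)  # [[45.]]
```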

Pre- vs post-aggregation account calculations

It is generally recommended to calculate MRIO accounts with the highest detail possible and to aggregate the results afterwards (post-aggregation; see for example Steen-Olsen et al. 2014, Stadler et al. 2014 or Koning et al. 2015).

Pre-aggregation, that is, the aggregation of MRIO sectors and regions before the calculation of footprint accounts, might be necessary when dealing with MRIOs on computers with limited RAM. However, one should be aware that the results might change.

Pymrio can handle both cases and can be used to highlight the differences. To do so, we use the two concordance matrices defined at the beginning (sec_agg_matrix and reg_agg_matrix) and aggregate the test system before and after the calculation of the accounts:

[27]:
io_pre = (
    pymrio.load_test()
    .aggregate(region_agg=reg_agg_matrix, sector_agg=sec_agg_matrix)
    .calc_all()
)
io_post = (
    pymrio.load_test()
    .calc_all()
    .aggregate(region_agg=reg_agg_matrix, sector_agg=sec_agg_matrix)
)
[28]:
io_pre.emissions.D_cba
[28]:
region                              reg0                                      reg1
sector                              sec0          sec1          sec2          sec0          sec1          sec2
stressor       compartment
emission_type1 air          7.722782e+06  3.494413e+08  1.388764e+08  2.695396e+07  3.354598e+08  2.217703e+08
emission_type2 water        1.862161e+06  5.240950e+07  1.583465e+08  6.399685e+06  4.080509e+07  1.312619e+08
[29]:
io_post.emissions.D_cba
[29]:
region                              reg0                                      reg1
sector                              sec0          sec1          sec2          sec0          sec1          sec2
stressor       compartment
emission_type1 air          9.041149e+06  3.018791e+08  1.523236e+08  2.469465e+07  3.468742e+08  2.454117e+08
emission_type2 water        2.123543e+06  4.884509e+07  9.889757e+07  6.000239e+06  4.594530e+07  1.892731e+08

The same results as in io_pre are obtained for io_post, if we recalculate the footprint accounts based on the aggregated system:

[30]:
io_post.reset_all_full().calc_all().emissions.D_cba
[30]:
region                              reg0                                      reg1
sector                              sec0          sec1          sec2          sec0          sec1          sec2
stressor       compartment
emission_type1 air          7.722782e+06  3.494413e+08  1.388764e+08  2.695396e+07  3.354598e+08  2.217703e+08
emission_type2 water        1.862161e+06  5.240950e+07  1.583465e+08  6.399685e+06  4.080509e+07  1.312619e+08
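
Why the two approaches differ can be seen in a minimal single-region sketch with plain NumPy. It uses a standard Leontief calculation on a made-up three-sector system and aggregates the last two sectors into one group. The numbers and the helper `footprint` are illustrative assumptions and do not reproduce pymrio's internal routines.

```python
import numpy as np

# Made-up three-sector system: inter-industry flows, final demand, emissions
Z = np.array([[10.0, 20.0, 5.0], [15.0, 5.0, 30.0], [5.0, 10.0, 10.0]])
y = np.array([65.0, 50.0, 75.0])
F = np.array([50.0, 10.0, 200.0])

# Concordance: keep sector 0, merge sectors 1 and 2
conc = np.array([[1, 0, 0], [0, 1, 1]])


def footprint(Z, y, F):
    """Emissions embodied in the final demand for each sector's product."""
    x = Z.sum(axis=1) + y  # total output
    A = Z / x  # input coefficients (column j divided by x[j])
    L = np.linalg.inv(np.eye(len(x)) - A)  # Leontief inverse
    S = F / x  # emission intensities
    return (S @ L) * y


# Post-aggregation: calculate in full detail, then aggregate the result
post = conc @ footprint(Z, y, F)

# Pre-aggregation: aggregate the system first, then calculate
pre = footprint(conc @ Z @ conc.T, conc @ y, conc @ F)

print("post:", post)
print("pre: ", pre)
print("totals equal:", np.isclose(post.sum(), pre.sum()))
```

In this sketch the total emissions are identical in both cases, but their allocation to the aggregated sectors differs, because pre-aggregation replaces the distinct emission intensities and input structures of the merged sectors by one average.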