Pymrio Tutorial

A complete tutorial covering all top-level functions of pymrio, using the test MRIO system.

Setup and Installation

Before starting this tutorial, make sure pymrio is installed. It is available from conda-forge and PyPI, so you can install it with pip, mamba, conda, or whatever package manager you prefer, for example with: pip install pymrio

Getting Started with Test MRIO Data

We begin by importing pymrio and loading the test MRIO system. This small test system contains six regions and eight sectors, making it ideal for learning purposes.

Note that any other MRIO database can be used with the same functions demonstrated here. The test system serves only as a representative example for larger, real-world datasets. See the other notebooks on MRIO downloading and handling (for example for EXIOBASE) for more details.

[1]:

import pymrio

# Load the test MRIO system
test_mrio = pymrio.load_test()

# Display basic information about the system
print(test_mrio)
print("Type of object:", type(test_mrio))
print("Available extensions:", test_mrio.extensions)

# Get regions and sectors
print("Regions:", test_mrio.regions)
print("Sectors:", test_mrio.sectors)
print("Final demand categories:", test_mrio.Y_categories)
print("Extensions:", test_mrio.extensions)
print("Rows in emissions extension:", test_mrio.emissions.rows)

IO System with parameters: Z, Y, unit, population, meta, factor_inputs, emissions
Type of object: <class 'pymrio.core.mriosystem.IOSystem'>
Available extensions: ['Factor Inputs', 'Emissions']
Regions: Index(['reg1', 'reg2', 'reg3', 'reg4', 'reg5', 'reg6'], dtype='object', name='region')
Sectors: Index(['food', 'mining', 'manufactoring', 'electricity', 'construction',
       'trade', 'transport', 'other'],
      dtype='object', name='sector')
Final demand categories: Index(['Final consumption expenditure by households',
       'Final consumption expenditure by non-profit organisations serving households (NPISH)',
       'Final consumption expenditure by government',
       'Gross fixed capital formation', 'Changes in inventories',
       'Changes in valuables', 'Export'],
      dtype='object', name='category')
Extensions: ['Factor Inputs', 'Emissions']
Rows in emissions extension: MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])

Search Functionality

Pymrio offers several search methods to find specific accounts, regions, sectors, stressors, and impacts. The method names (such as contains and match) are the same as the pandas string methods they build on, and they accept regular expressions in the same way. For more details, check out the explore notebook and the pandas regex documentation.

[2]:

# Search for specific terms across the system
search_results = test_mrio.find("food")
print("Search results for 'food':", search_results)

Search results for 'food': {'index': MultiIndex([('reg1', 'food'),
            ('reg2', 'food'),
            ('reg3', 'food'),
            ('reg4', 'food'),
            ('reg5', 'food'),
            ('reg6', 'food')],
           names=['region', 'sector']), 'sectors': Index(['food'], dtype='object', name='sector')}

[3]:

# More specific search methods
contains_results = test_mrio.contains("electricity")
print("Contains 'electricity':", contains_results)

Contains 'electricity': MultiIndex([('reg1', 'electricity'),
            ('reg2', 'electricity'),
            ('reg3', 'electricity'),
            ('reg4', 'electricity'),
            ('reg5', 'electricity'),
            ('reg6', 'electricity')],
           names=['region', 'sector'])

[4]:

# Search within extensions
extension_search = test_mrio.extension_contains("emission")
print("Extension search for 'emission':", extension_search)

# Match search (the pattern is anchored at the start of each entry)
match_results = test_mrio.match("reg1")
print("Match for 'reg1':", match_results)

Extension search for 'emission': {'Factor Inputs': Index([], dtype='object', name='inputtype'), 'Emissions': MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])}
Match for 'reg1': MultiIndex([('reg1',          'food'),
            ('reg1',        'mining'),
            ('reg1', 'manufactoring'),
            ('reg1',   'electricity'),
            ('reg1',  'construction'),
            ('reg1',         'trade'),
            ('reg1',     'transport'),
            ('reg1',         'other')],
           names=['region', 'sector'])
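The search flavours differ mainly in how the regular expression is anchored. The following is a minimal sketch of those semantics using only Python's standard re module on the sector names of the test system. It does not call pymrio; it only illustrates the contains/match/fullmatch behaviour that the pandas string methods follow.

```python
import re

sectors = ["food", "mining", "manufactoring", "electricity",
           "construction", "trade", "transport", "other"]

pattern = "tr"

# contains: the pattern may occur anywhere in the string
contains = [s for s in sectors if re.search(pattern, s)]
# match: the pattern must match at the start of the string
match = [s for s in sectors if re.match(pattern, s)]
# fullmatch: the pattern must cover the whole string
fullmatch = [s for s in sectors if re.fullmatch(pattern, s)]
# fullmatch with a wildcard that consumes the rest of the string
fullmatch_wild = [s for s in sectors if re.fullmatch("tr.*", s)]

print(contains)        # ['electricity', 'construction', 'trade', 'transport']
print(match)           # ['trade', 'transport']
print(fullmatch)       # []
print(fullmatch_wild)  # ['trade', 'transport']
```

The same pattern therefore returns progressively fewer hits as the anchoring becomes stricter.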

Tip: Use the find method to get a quick overview of where a specific term occurs, in particular for MRIO systems with multiple extensions. For example, the following finds “air” in the compartment information of one extension.

[5]:

print("Search for occurrence of >air< in the whole system:", test_mrio.find("air"))

Search for occurrence of >air< in the whole system: {'emissions_index': MultiIndex([('emission_type1', 'air')],
           names=['stressor', 'compartment'])}

Core Calculations with calc_all

The calc_all method is fundamental to pymrio analysis. It automatically identifies missing tables and calculates all necessary accounts.

Before the calculation, we have the following accounts available in the test MRIO system:

[6]:

print("Before calc_all:")
print(test_mrio.DataFrames)
print(test_mrio.emissions.DataFrames)

Before calc_all:
['Z', 'Y', 'unit', 'population']
['F', 'F_Y', 'unit']

Calculate all missing parts

[7]:

test_mrio.calc_all()

[7]:

<pymrio.core.mriosystem.IOSystem at 0x7fe341d63d00>

After calculation, these accounts are available

[8]:

print("After calc_all:")
print(test_mrio.DataFrames)

After calc_all:
['Z', 'Y', 'x', 'A', 'L', 'unit', 'population']
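To make the new core accounts concrete, here is a minimal NumPy sketch of the standard input-output relations behind x, A and L. The two-sector numbers are invented for illustration, and the code follows the textbook formulas rather than pymrio's internal implementation.

```python
import numpy as np

# Hypothetical 2-sector system (values invented for illustration)
Z = np.array([[10.0, 20.0],
              [30.0, 40.0]])    # inter-industry flows
Y = np.array([[50.0],
              [130.0]])         # final demand (one category)

# Total output: intermediate plus final deliveries of each sector
x = Z.sum(axis=1) + Y.sum(axis=1)

# Technical coefficients: each column of Z divided by that sector's output
A = Z / x

# Leontief inverse
L = np.linalg.inv(np.eye(len(x)) - A)

# Consistency check: L applied to final demand reproduces total output
print(np.allclose(L @ Y.sum(axis=1), x))
```

The final check is a useful sanity test whenever the matrices are assembled by hand.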

And we now also have several classical EE-MRIO results available:

[9]:

print("\nEmissions accounts:")
print(test_mrio.emissions.DataFrames)


Emissions accounts:
['F', 'F_Y', 'S', 'S_Y', 'M', 'D_cba', 'D_pba', 'D_imp', 'D_exp', 'unit', 'D_cba_reg', 'D_pba_reg', 'D_imp_reg', 'D_exp_reg', 'D_cba_cap', 'D_pba_cap', 'D_imp_cap', 'D_exp_cap']

For example, the consumption-based accounts:

[10]:

print("D_cba (consumption-based):", test_mrio.emissions.D_cba)

D_cba (consumption-based): region                              reg1                               \
sector                              food         mining manufactoring
stressor       compartment
emission_type1 air          2.056183e+06  179423.535893  9.749300e+07
emission_type2 water        2.423103e+05   25278.192086  1.671240e+07

region                                                                \
sector                       electricity  construction         trade
stressor       compartment
emission_type1 air          1.188759e+07  3.342906e+06  3.885884e+06
emission_type2 water        1.371303e+05  3.468292e+05  7.766205e+05

region                                                          reg2  \
sector                         transport         other          food
stressor       compartment
emission_type1 air          1.075027e+07  1.582152e+07  1.793338e+06
emission_type2 water        4.999628e+05  8.480505e+06  2.136528e+05

region                                    ...          reg5                \
sector                            mining  ...     transport         other
stressor       compartment                ...
emission_type1 air          19145.604911  ...  4.209505e+07  1.138661e+07
emission_type2 water         3733.601474  ...  4.243738e+06  7.307208e+06

region                              reg6                              \
sector                              food        mining manufactoring
stressor       compartment
emission_type1 air          1.517235e+07  1.345318e+06  7.145075e+07
emission_type2 water        4.420574e+06  5.372216e+05  1.068144e+07

region                                                                \
sector                       electricity  construction         trade
stressor       compartment
emission_type1 air          3.683167e+07  1.836696e+06  4.241568e+07
emission_type2 water        5.728136e+05  9.069515e+05  5.449044e+07

region
sector                         transport         other
stressor       compartment
emission_type1 air          4.805409e+07  3.602298e+07
emission_type2 water        8.836484e+06  4.634899e+07

[2 rows x 48 columns]
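The relation between the extension accounts can be sketched for a single-region toy system. The numbers below are invented, direct final-demand emissions (F_Y) are ignored, and the formulas are the textbook ones, so treat this as an illustration of the idea rather than pymrio's exact multi-regional calculation.

```python
import numpy as np

# Hypothetical single-region, 2-sector system (values invented)
Z = np.array([[10.0, 20.0],
              [30.0, 40.0]])     # inter-industry flows
y = np.array([50.0, 130.0])      # final demand per product
F = np.array([400.0, 100.0])     # direct emissions per sector

x = Z.sum(axis=1) + y                      # total output
L = np.linalg.inv(np.eye(2) - Z / x)       # Leontief inverse

S = F / x          # emission intensity per unit of output
M = S @ L          # multipliers: total emissions per unit of final demand
D_pba = F          # production-based account
D_cba = M * y      # emissions embodied in final demand for each product

# Both perspectives allocate the same total emissions
print(D_pba.sum(), D_cba.sum())
```

In a closed system like this one, both accounts sum to the same total; they only differ in which sector the emissions are attributed to.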

Ghosh Calculations in calc_all

When calc_all is executed, it can optionally calculate Ghosh inverse matrices for downstream analysis:

[11]:

test_mrio.calc_all(include_ghosh=True)
print(test_mrio)

IO System with parameters: Z, Y, x, A, B, L, G, unit, population, meta, factor_inputs, emissions

This also calculates the downstream multipliers M_down:

[12]:

test_mrio.emissions.M_down

[12]:

	region	reg1								reg2		...	reg5		reg6
	sector	food	mining	manufactoring	electricity	construction	trade	transport	other	food	mining	...	transport	other	food	mining	manufactoring	electricity	construction	trade	transport	other
stressor	compartment
emission_type1	air	1.045907	26.607878	0.015271	22.163630	0.034868	0.00822	0.030033	0.047107	0.001025	23.995315	...	0.076524	0.020483	0.010557	17.377205	0.025397	15.093929	0.210979	0.048656	0.029646	0.029898
emission_type2	water	0.074245	0.657819	0.000959	0.238178	0.000609	0.00032	0.001693	0.001882	0.000068	1.061891	...	0.011976	0.003882	0.001871	0.667438	0.002734	0.333868	0.023223	0.005806	0.003279	0.003230

2 rows × 48 columns

See the math section of the documentation for further details on the Ghosh calculations.
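As a rough orientation, the Ghosh matrices can be sketched with NumPy for a toy system. The numbers are invented and the formulas are the textbook definitions of the allocation coefficients B and the Ghosh inverse G; this is an assumption-based illustration, not pymrio's implementation.

```python
import numpy as np

# Hypothetical 2-sector system (values invented for illustration)
Z = np.array([[10.0, 20.0],
              [30.0, 40.0]])     # inter-industry flows
y = np.array([50.0, 130.0])      # final demand

x = Z.sum(axis=1) + y            # total output
v = x - Z.sum(axis=0)            # primary inputs (value added) per sector

# Allocation coefficients: each row of Z divided by that sector's output
B = Z / x[:, None]
# Ghosh inverse
G = np.linalg.inv(np.eye(2) - B)

# Leontief inverse for comparison
L = np.linalg.inv(np.eye(2) - Z / x)

# Supply-side identity: primary inputs propagated downstream give total output
print(np.allclose(v @ G, x))
# G and L are related through a rescaling by total output
print(np.allclose(G, np.diag(1 / x) @ L @ np.diag(x)))
```

The second check shows that G carries the same structural information as L, viewed from the supply side.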

Search and Extract Functionality

Extracting Specific Accounts

We can also extract consumption-based accounts for a specific stressor:

[13]:

cba_emission1 = test_mrio.emissions.D_cba.loc[["emission_type1"]]
print("CBA emissions by region for emission_type1:")
print(cba_emission1)

CBA emissions by region for emission_type1:
region                              reg1                               \
sector                              food         mining manufactoring
stressor       compartment
emission_type1 air          2.056183e+06  179423.535893  9.749300e+07

region                                                                \
sector                       electricity  construction         trade
stressor       compartment
emission_type1 air          1.188759e+07  3.342906e+06  3.885884e+06

region                                                          reg2  \
sector                         transport         other          food
stressor       compartment
emission_type1 air          1.075027e+07  1.582152e+07  1.793338e+06

region                                    ...          reg5                \
sector                            mining  ...     transport         other
stressor       compartment                ...
emission_type1 air          19145.604911  ...  4.209505e+07  1.138661e+07

region                              reg6                              \
sector                              food        mining manufactoring
stressor       compartment
emission_type1 air          1.517235e+07  1.345318e+06  7.145075e+07

region                                                                \
sector                       electricity  construction         trade
stressor       compartment
emission_type1 air          3.683167e+07  1.836696e+06  4.241568e+07

region
sector                         transport         other
stressor       compartment
emission_type1 air          4.805409e+07  3.602298e+07

[1 rows x 48 columns]

And extract data for specific regions:

[14]:

reg1_data = test_mrio.emissions.D_cba_reg[["reg1", "reg3"]]
print("\nTotal CBA emissions for the selected regions:")
print(reg1_data)


Total CBA emissions for the selected regions:
region                              reg1          reg3
stressor       compartment
emission_type1 air          2.077521e+08  3.457988e+08
emission_type2 water        8.642744e+07  3.753335e+08

Besides the direct access to the DataFrames explained above, one can also extract data into dictionaries for alternative access:

[15]:

emission_type1_data = test_mrio.emissions.get_row_data("emission_type1")

This extracts all data available for >emission_type1<:

[16]:

emission_type1_data.keys()

[16]:

dict_keys(['F', 'F_Y', 'S', 'S_Y', 'M', 'M_down', 'D_cba', 'D_pba', 'D_imp', 'D_exp', 'unit', 'D_cba_reg', 'D_pba_reg', 'D_imp_reg', 'D_exp_reg', 'D_cba_cap', 'D_pba_cap', 'D_imp_cap', 'D_exp_cap'])

Advanced Search Patterns

Use regular expressions for more complex searches:

[17]:

emis_search = test_mrio.find("emission.*")
print("Emission... occurrences:", emis_search)

all_extension_search = test_mrio.extension_contains("typ+")
print("Extensions containing 'type':", all_extension_search)

Emission... occurrences: {'emissions_index': MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])}
Extensions containing 'type': {'Factor Inputs': Index([], dtype='object', name='inputtype'), 'Emissions': MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])}

Using Functions from iomath

Pymrio’s iomath module provides low-level functions for specific calculations:

[18]:

import numpy as np

from pymrio.tools import iomath

# Calculate specific matrices manually
A_manual = iomath.calc_A(test_mrio.Z, test_mrio.x)
print("Manual A matrix calculation matches:", np.allclose(A_manual, test_mrio.A))

# Calculate Leontief matrix
L_manual = iomath.calc_L(test_mrio.A)
print("Manual L matrix calculation matches:", np.allclose(L_manual, test_mrio.L))

# Calculate multipliers
S = test_mrio.emissions.S
M_manual = iomath.calc_M(S, test_mrio.L)
print(
    "Manual multiplier calculation matches:",
    np.allclose(M_manual, test_mrio.emissions.M),
)

Manual A matrix calculation matches: True
Manual L matrix calculation matches: True
Manual multiplier calculation matches: True

Gross Trade Analysis

The get_gross_trade method provides insights into bilateral trade flows:

[19]:

gross_trade = test_mrio.get_gross_trade()

This gives the total trade flows from each region/sector to the other regions:

[20]:

gross_trade.bilat_flows.head()

[20]:

	region	reg1	reg2	reg3	reg4	reg5	reg6
region	sector
reg1	food	0.0	9.874311e+03	3.772336e+03	2.327343e+02	1.231784e+03	4.615724e+03
	mining	0.0	2.905520e+03	3.657874e+03	4.020829e+02	6.660429e+02	9.281742e+02
	manufactoring	0.0	6.027532e+07	5.111218e+07	2.709138e+07	3.349291e+07	3.814142e+07
	electricity	0.0	3.775794e+03	3.629075e+02	2.492309e+00	2.222702e+03	9.412412e+02
	construction	0.0	6.629450e+02	2.530807e+02	2.995250e+02	1.537160e+03	1.401676e+02

as well as the total exports and imports for each region and sector:

[21]:

gross_trade.totals.head()

[21]:

		exports	imports
region	sector
reg1	food	1.972689e+04	1.504225e+05
	mining	8.559694e+03	1.418970e+05
	manufactoring	2.101132e+08	3.888102e+08
	electricity	7.305137e+03	7.365582e+03
	construction	2.892878e+03	5.157738e+03
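The idea behind the bilateral flows can be sketched with pandas: deliveries to intermediate and final demand are summed per receiving region, and domestic deliveries are set to zero. The two-region system, its labels, and its values are invented, and the steps are an assumption about the general approach rather than a copy of pymrio's code.

```python
import numpy as np
import pandas as pd

regions = ["r1", "r2"]
sectors = ["a", "b"]
idx = pd.MultiIndex.from_product([regions, sectors], names=["region", "sector"])

# Hypothetical flows (values invented for illustration)
Z = pd.DataFrame(np.arange(16, dtype=float).reshape(4, 4), index=idx, columns=idx)
Y = pd.DataFrame(
    [[5.0, 1.0], [6.0, 2.0], [3.0, 7.0], [4.0, 8.0]],
    index=idx,
    columns=pd.Index(regions, name="region"),
)

# Sum intermediate deliveries per receiving region, then add final demand
flows = Z.T.groupby(level="region").sum().T + Y

# Deliveries within the same region are not trade: set them to zero
row_region = flows.index.get_level_values("region").to_numpy()[:, None]
col_region = flows.columns.to_numpy()[None, :]
flows = flows.mask(row_region == col_region, 0.0)

# Total exports per exporting region and sector
exports = flows.sum(axis=1)

print(flows)
print(exports)
```

Each row of flows is an exporting region/sector pair and each column a receiving region, which is the same layout as the bilat_flows table above.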

Extension Methods: Concatenate, Convert, and Characterize

Extension Concatenation

The top-level extension_concate function combines multiple extensions into one.

[22]:

# Create a copy for demonstration
ext_emis2 = test_mrio.emissions.copy()
# Combine two extensions with same index structure
new_ext = pymrio.extension_concate(
    test_mrio.emissions, ext_emis2, new_extension_name="emissions_combined"
)
new_ext.rows

[22]:

MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water'),
            ('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])

Combining extensions with different indices results in a new index level called >indicator<. Any index level not available in one of the extensions is set to NaN.

[23]:

all_ext = pymrio.extension_concate(
    test_mrio.emissions, test_mrio.factor_inputs, new_extension_name="All"
)
print(all_ext)
all_ext.rows

Extension All with parameters: name, F, F_Y, S, S_Y, M, M_down, D_cba, D_pba, D_imp, D_exp, unit, D_cba_cap, D_imp_reg, D_pba_reg, D_cba_reg, D_exp_reg, D_exp_cap, D_pba_cap, D_imp_cap

[23]:

MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water'),
            (   'Value Added',     nan)],
           names=['indicator', 'compartment'])

In any case, the extension can be attached to the mrio object and used alongside the others.

[24]:

test_mrio.all_ext = all_ext
print(test_mrio.extensions)
print(test_mrio.extensions_instance_names)

['Factor Inputs', 'Emissions', 'All']
['factor_inputs', 'emissions', 'all_ext']

Extension Conversion

The convert method and the extension_convert function transform extensions based on a mapping table:

[25]:

import pandas as pd

conversion_factors = pd.DataFrame(
    columns=[
        "stressor",
        "compartment",
        "total__stressor",
        "factor",
        "unit_orig",
        "unit_new",
    ],
    data=[
        ["emis.*", "air|water", "total_sum_tonnes", 1e-3, "kg", "t"],
        ["emission_type[12]", ".*", "total_sum", 1, "kg", "kg"],
        ["emission_type1", ".*", "air_emissions", 1e-3, "kg", "t"],
        ["emission_type2", ".*", "water_emissions", 1000, "kg", "g"],
        ["emission_type1", ".*", "char_emissions", 2, "kg", "kg_eq"],
        ["emission_type2", ".*", "char_emissions", 10, "kg", "kg_eq"],
    ],
)

Importantly, the column names >stressor< and >compartment< match the index names of the extension to be converted. Bridge columns are columns with ‘__’ in the name. They define the new index name (here >total<, the part before ‘__’) and the original column it is derived from (here >stressor<, the part after ‘__’). The column >factor< holds the conversion factor. The last two columns give the original unit (>unit_orig<, which is checked against the unit of the extension) and the unit of the converted values (>unit_new<).

[26]:

new_emis = test_mrio.emissions.convert(
    conversion_factors, new_extension_name="converted_emissions"
)
new_emis.F

[26]:

region	reg1								reg2		...	reg5		reg6
sector	food	mining	manufactoring	electricity	construction	trade	transport	other	food	mining	...	transport	other	food	mining	manufactoring	electricity	construction	trade	transport	other
total
air_emissions	1.848065e+03	9.864481e+02	2.361379e+04	2.813910e+04	2.584142e+03	4.132656e+03	2.176699e+04	7.842091e+03	1.697937e+03	3.473782e+02	...	4.229932e+04	1.077383e+04	1.577800e+04	6.420956e+03	1.131724e+05	5.602253e+04	4.861838e+03	1.819562e+04	4.704654e+04	2.163287e+04
char_emissions	5.088634e+06	2.196329e+06	5.486327e+07	5.901802e+07	8.342249e+06	2.081009e+07	5.366396e+07	4.017596e+07	5.444229e+06	9.893957e+05	...	1.265970e+08	9.345772e+07	7.981707e+07	3.149816e+07	3.533468e+08	1.195772e+08	3.671656e+07	1.753144e+08	1.817509e+08	2.110913e+08
total_sum	1.987315e+06	1.008791e+06	2.437736e+07	2.841308e+07	2.901538e+06	5.387134e+06	2.277999e+07	1.029127e+07	1.902773e+06	3.768421e+05	...	4.649916e+07	1.796483e+07	2.060410e+07	8.286581e+06	1.258726e+08	5.677575e+07	7.561127e+06	3.208793e+07	5.581233e+07	3.841542e+07
total_sum_tonnes	1.987315e+03	1.008791e+03	2.437736e+04	2.841308e+04	2.901538e+03	5.387134e+03	2.277999e+04	1.029127e+04	1.902773e+03	3.768421e+02	...	4.649916e+04	1.796483e+04	2.060410e+04	8.286581e+03	1.258726e+05	5.677575e+04	7.561127e+03	3.208793e+04	5.581233e+04	3.841542e+04
water_emissions	1.392505e+08	2.234330e+07	7.635692e+08	2.739816e+08	3.173965e+08	1.254478e+09	1.012999e+09	2.449178e+09	2.048354e+08	2.946394e+07	...	4.199841e+09	7.191006e+09	4.826108e+09	1.865625e+09	1.270019e+10	7.532137e+08	2.699288e+09	1.389231e+10	8.765784e+09	1.678255e+10

5 rows × 48 columns
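The mechanics of such a mapping table can be sketched in plain Python: every mapping row selects stressors by regular expression, scales them by its factor, and adds the result to the row named in the bridge column. The stressor values are the reg1/food entries in kg, back-calculated and rounded from the table above. Using re.fullmatch for the selection is an assumption made for this sketch, and unit checking is left out.

```python
import re
from collections import defaultdict

# Stressor values in kg for one column (reg1/food, rounded)
F = {
    ("emission_type1", "air"): 1.848065e6,
    ("emission_type2", "water"): 1.392505e5,
}

# (stressor regex, compartment regex, new row name, factor)
mapping = [
    ("emis.*", "air|water", "total_sum_tonnes", 1e-3),
    ("emission_type[12]", ".*", "total_sum", 1),
    ("emission_type1", ".*", "air_emissions", 1e-3),
    ("emission_type2", ".*", "water_emissions", 1000),
    ("emission_type1", ".*", "char_emissions", 2),
    ("emission_type2", ".*", "char_emissions", 10),
]

converted = defaultdict(float)
for stressor_re, compartment_re, new_name, factor in mapping:
    for (stressor, compartment), value in F.items():
        # A stressor contributes if both of its index entries match
        if re.fullmatch(stressor_re, stressor) and re.fullmatch(
            compartment_re, compartment
        ):
            converted[new_name] += factor * value

for name in sorted(converted):
    print(name, converted[name])
```

Rows that share a new name, such as the two char_emissions entries, are summed, which is how a simple characterization can be expressed in the same table.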

Tip: Due to the regular expression capabilities this function is quite powerful but also rather slow. Use it before doing the full analysis, and use the characterization function for “standard” characterization tasks (see below).

Characterization of stressors

Pymrio uses an innovative string-matching approach to characterize stressors. This method matches stressors in the characterization table (in long format) with those in the MRIO system, ensuring consistent stressor mapping, automatic unit verification, and flexibility regardless of entry order. It also handles characterization factors for stressors not present in the satellite account, efficiently manages region- and sector-specific factors, and supports characterization across different extensions.

Unlike traditional matrix multiplication methods, which require strict 1:1 correspondence and precise ordering, this approach is more flexible. Characterization can be performed using either an extension object method or a top-level function that accepts MRIO objects or extension collections.

To start, we need to first define a characterization factors table.

[27]:

char_factors = pd.DataFrame(
    {
        "stressor": ["emission_type1", "emission_type2", "emission_type3"],
        "compartment": ["air", "water", "land"],
        "impact": ["climate_change", "acidification", "eutrophication"],
        "factor": [25.0, 1.5, 0.8],  # kg CO2-eq, SO2-eq, PO4-eq
        "impact_unit": ["kg CO2-eq", "kg SO2-eq", "kg PO4-eq"],
        "stressor_unit": ["kg", "kg", "kg"],
    }
)

This can be used to characterize the emissions extension of the test MRIO system.

[28]:

characterization_result = test_mrio.emissions.characterize(
    factors=char_factors,
    characterized_name_column="impact",
    characterization_factors_column="factor",
    characterized_unit_column="impact_unit",
    orig_unit_column="stressor_unit",
)

The result contains a validation table, which reports the missing stressor.

[29]:

characterization_result.validation

[29]:

	stressor	compartment	impact	factor	impact_unit	stressor_unit	error_unit_impact	error_unit_stressor	error_missing_stressor
0	emission_type1	air	climate_change	25.0	kg CO2-eq	kg	False	False	False
1	emission_type2	water	acidification	1.5	kg SO2-eq	kg	False	False	False
2	emission_type3	land	eutrophication	0.8	kg PO4-eq	kg	False	False	True

Tip: Always verify and check via the validation table. It is also returned in cases when the characterization cannot be performed (e.g. due to unit errors).

The characterized extension is available as the second attribute of the result:

[30]:

characterization_result.extension

[30]:

<pymrio.core.mriosystem.Extension at 0x7fe339b9d9d0>

[31]:

characterization_result.extension.F

[31]:

region	reg1								reg2		...	reg5		reg6
sector	food	mining	manufactoring	electricity	construction	trade	transport	other	food	mining	...	transport	other	food	mining	manufactoring	electricity	construction	trade	transport	other
impact
acidification	2.088757e+05	3.351494e+04	1.145354e+06	4.109723e+05	4.760948e+05	1881716.7	1.519499e+06	3673767.0	307253.16	44195.916	...	6.299762e+06	1.078651e+07	7.239162e+06	2.798438e+06	1.905029e+07	1.129821e+06	4.048932e+06	20838469.5	1.314868e+07	25173829.5
climate_change	4.620162e+07	2.466120e+07	5.903447e+08	7.034775e+08	6.460354e+07	103316407.5	5.441747e+08	196052265.0	42448432.50	8684453.750	...	1.057483e+09	2.693456e+08	3.944499e+08	1.605239e+08	2.829311e+09	1.400563e+09	1.215460e+08	454890525.0	1.176164e+09	540821700.0
eutrophication	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.0	0.000000e+00	0.0	0.00	0.000	...	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.000000e+00	0.0	0.000000e+00	0.0

3 rows × 48 columns
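The matching logic can be illustrated with a small plain-Python sketch. Each row of the long-format factor table is looked up by name in the available stressors, a validation record is written, and only valid rows contribute to the impact. The stressor values are the rounded reg1/food entries in kg, and the data structures are hypothetical simplifications, not pymrio's internals.

```python
# Available stressors: (stressor, compartment) -> (value, unit)
stressors = {
    ("emission_type1", "air"): (1.848065e6, "kg"),
    ("emission_type2", "water"): (1.392505e5, "kg"),
}

# Long-format characterization table (same entries as char_factors)
factors = [
    {"stressor": "emission_type1", "compartment": "air",
     "impact": "climate_change", "factor": 25.0, "stressor_unit": "kg"},
    {"stressor": "emission_type2", "compartment": "water",
     "impact": "acidification", "factor": 1.5, "stressor_unit": "kg"},
    {"stressor": "emission_type3", "compartment": "land",
     "impact": "eutrophication", "factor": 0.8, "stressor_unit": "kg"},
]

validation = []
impacts = {}
for row in factors:
    key = (row["stressor"], row["compartment"])
    missing = key not in stressors
    unit_error = (not missing) and stressors[key][1] != row["stressor_unit"]
    validation.append(
        {**row, "error_unit_stressor": unit_error, "error_missing_stressor": missing}
    )
    # Every impact gets a row, even if no stressor contributes to it
    impacts.setdefault(row["impact"], 0.0)
    if not missing and not unit_error:
        impacts[row["impact"]] += stressors[key][0] * row["factor"]

print(impacts)
print([v["error_missing_stressor"] for v in validation])
```

Because rows are matched by name, the order of the factor table does not matter, and a factor for an absent stressor only produces a flag in the validation records and a zero impact.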

For more details on region-specific characterization and characterization across multiple extensions, see the notebook stressor_characterization.

Parsing, saving and loading MRIOs

Parsing MRIOs

Pymrio supports any symmetric MRIO table and provides automatic downloading and parsing for several common datasets. For details, see the sections “Automatic MRIO download” and “Handling MRIO data”.

Saving processed MRIOs

You can save your MRIO after you’ve parsed and analysed it. Pymrio lets you save in text, pickle or parquet formats. Parquet works well if your dataset is on the larger side.

[32]:

import tempfile
from pathlib import Path

# Create temporary directory for demonstration
temp_dir = Path(tempfile.mkdtemp())

The supported formats differ only in the table_format argument passed to the >save_all< method:

[33]:

# Save to text format
txt_path = temp_dir / "test_mrio_txt"
test_mrio.save_all(txt_path, table_format="txt")
list(txt_path.glob("**/*"))

[33]:

[PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/A.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/unit.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/B.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/x.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/L.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/metadata.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/G.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/population.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/Z.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/F.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/S.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/unit.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_cba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/M_down.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_exp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/M.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_imp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_cba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_imp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_pba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_imp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_pba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/F_Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_pba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_exp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_cba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/S_Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_exp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/F.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/S.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/unit.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_cba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/M_down.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_exp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/M.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_imp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_cba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_imp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_pba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_imp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_pba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/F_Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_pba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_exp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_cba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/S_Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_exp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/F.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/S.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/unit.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_cba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/M_down.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_exp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/M.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_imp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_cba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_imp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_pba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_imp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_pba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_pba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_exp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_cba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_exp_cap.txt')]

[34]:

# Save to parquet format
parquet_path = temp_dir / "test_mrio_parquet"
test_mrio.save_all(parquet_path, table_format="parquet")
list(parquet_path.glob("**/*"))

[34]:

[PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/x.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/Z.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/metadata.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/population.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/L.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/unit.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/G.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/A.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/B.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/F.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_imp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_exp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_cba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/F_Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/M.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_imp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/S_Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/S.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_pba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_cba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_imp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_cba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_exp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_pba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/unit.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/M_down.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_pba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_exp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/F.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_imp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_exp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_cba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/F_Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/M.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_imp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/S_Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/S.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_pba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_cba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_imp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_cba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_exp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_pba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/unit.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/M_down.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_pba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_exp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/F.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_imp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_exp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_cba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/M.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_imp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/S.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_pba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_cba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_imp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_cba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_exp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_pba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/unit.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/M_down.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_pba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_exp_cap.parquet')]

In both cases, each DataFrame (account) is stored as a separate file, with satellite accounts as subfolders. The file >file_parameters.json< stores the definition of index/columns such that files can be read back in correctly.
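Why such a parameter file is needed can be seen from a small pandas round trip. A text table does not record how many of its columns and rows belong to the index and header, so that information has to be stored alongside it. The file name params.json and its keys are hypothetical and chosen for this sketch only; they are not pymrio's actual file_parameters.json layout.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

idx = pd.MultiIndex.from_product(
    [["reg1", "reg2"], ["food", "mining"]], names=["region", "sector"]
)
# Hypothetical 4x4 flow table (values invented for illustration)
Z = pd.DataFrame(
    [[float(4 * i + j) for j in range(4)] for i in range(4)],
    index=idx,
    columns=idx,
)

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # Store the table as tab-separated text
    Z.to_csv(folder / "Z.txt", sep="\t")

    # Store how many index columns and header rows the file contains
    params = {
        "Z": {
            "file": "Z.txt",
            "index_cols": Z.index.nlevels,
            "header_rows": Z.columns.nlevels,
        }
    }
    (folder / "params.json").write_text(json.dumps(params))

    # Reading back: the parameters tell read_csv how to rebuild the labels
    spec = json.loads((folder / "params.json").read_text())["Z"]
    Z_back = pd.read_csv(
        folder / spec["file"],
        sep="\t",
        index_col=list(range(spec["index_cols"])),
        header=list(range(spec["header_rows"])),
    )

print(Z_back)
```

Without the stored index and header counts, the region and sector labels would be read back as ordinary data.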

[35]:

mrio_reload_txt = pymrio.load_all(txt_path)
mrio_reload_parquet = pymrio.load_all(parquet_path)
print(test_mrio)
print(mrio_reload_txt)
print(mrio_reload_parquet)

IO System with parameters: Z, Y, x, A, B, L, G, unit, population, meta, factor_inputs, emissions, all_ext
IO System with parameters: Z, Y, x, A, B, L, G, unit, population, meta, all_ext, emissions, factor_inputs
IO System with parameters: Z, Y, x, A, B, L, G, unit, population, meta, all_ext, emissions, factor_inputs

Clean up temporary directory

[36]:

import shutil

shutil.rmtree(temp_dir)
print("Temporary files cleaned up")

Temporary files cleaned up

Conclusion

We covered the main functionality of pymrio in this tutorial.

Pymrio has many more features, including aggregation, renaming and restructuring, and analysing the source of stressors. You can explore these topics in more detail in the example notebooks of the documentation.

For working with specific MRIO databases, see the notebooks on MRIO downloading and handling mentioned at the start of this tutorial.

You can also check the API Reference for a full overview of available functions and classes.

If you have questions or need help, please open an issue on our GitHub page. We’re happy to help!

Thank you for following the tutorial, and good luck with your MRIO analyses!