Pymrio Tutorial

A complete tutorial covering all top-level functions of pymrio, using the test MRIO system.

Setup and Installation

Before starting this tutorial, make sure pymrio is installed. It is available from conda-forge and PyPI, so you can install it with pip, mamba, conda, or whichever package manager you prefer.
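A quick way to confirm the installation from within Python is to query the package metadata. The sketch below uses only the standard library; the helper name installed_version is an illustrative choice and not part of pymrio.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


pymrio_version = installed_version("pymrio")
if pymrio_version is None:
    # Typical install commands: `pip install pymrio`
    # or `conda install -c conda-forge pymrio`
    print("pymrio is not installed in this environment")
else:
    print("pymrio version:", pymrio_version)
```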

Getting Started with Test MRIO Data

We begin by importing pymrio and loading the test MRIO system. This small test system contains six regions and eight sectors, making it ideal for learning purposes.

Note that any other MRIO database can be used with the same functions demonstrated here. The test system serves only as a representative example for larger, real-world datasets. See the other notebooks on MRIO downloading and handling (for example for EXIOBASE) for more details.

[1]:
import pymrio

# Load the test MRIO system
test_mrio = pymrio.load_test()

# Display basic information about the system
print(test_mrio)
print("Type of object:", type(test_mrio))
print("Available extensions:", test_mrio.extensions)

# Get regions and sectors
print("Regions:", test_mrio.regions)
print("Sectors:", test_mrio.sectors)
print("Final demand categories:", test_mrio.Y_categories)
print("Extensions:", test_mrio.extensions)
print("Rows in emissions extension:", test_mrio.emissions.rows)
IO System with parameters: Z, Y, unit, population, meta, factor_inputs, emissions
Type of object: <class 'pymrio.core.mriosystem.IOSystem'>
Available extensions: ['Factor Inputs', 'Emissions']
Regions: Index(['reg1', 'reg2', 'reg3', 'reg4', 'reg5', 'reg6'], dtype='object', name='region')
Sectors: Index(['food', 'mining', 'manufactoring', 'electricity', 'construction',
       'trade', 'transport', 'other'],
      dtype='object', name='sector')
Final demand categories: Index(['Final consumption expenditure by households',
       'Final consumption expenditure by non-profit organisations serving households (NPISH)',
       'Final consumption expenditure by government',
       'Gross fixed capital formation', 'Changes in inventories',
       'Changes in valuables', 'Export'],
      dtype='object', name='category')
Extensions: ['Factor Inputs', 'Emissions']
Rows in emissions extension: MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])

Search Functionality

Pymrio offers comprehensive search capabilities to find specific accounts, regions, sectors, stressors, and impacts. The search methods are named after the pandas regex string methods (for example contains and match) and work in the same way. For more details, check out the explore notebook and the pandas regex documentation.

[2]:
# Search for specific terms across the system
search_results = test_mrio.find("food")
print("Search results for 'food':", search_results)
Search results for 'food': {'index': MultiIndex([('reg1', 'food'),
            ('reg2', 'food'),
            ('reg3', 'food'),
            ('reg4', 'food'),
            ('reg5', 'food'),
            ('reg6', 'food')],
           names=['region', 'sector']), 'sectors': Index(['food'], dtype='object', name='sector')}
[3]:
# More specific search methods
contains_results = test_mrio.contains("electricity")
print("Contains 'electricity':", contains_results)
Contains 'electricity': MultiIndex([('reg1', 'electricity'),
            ('reg2', 'electricity'),
            ('reg3', 'electricity'),
            ('reg4', 'electricity'),
            ('reg5', 'electricity'),
            ('reg6', 'electricity')],
           names=['region', 'sector'])
[4]:
# Search within extensions
extension_search = test_mrio.extension_contains("emission")
print("Extension search for 'emission':", extension_search)

# Full match search
match_results = test_mrio.match("reg1")
print("Full match for 'reg1':", match_results)
Extension search for 'emission': {'Factor Inputs': Index([], dtype='object', name='inputtype'), 'Emissions': MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])}
Full match for 'reg1': MultiIndex([('reg1',          'food'),
            ('reg1',        'mining'),
            ('reg1', 'manufactoring'),
            ('reg1',   'electricity'),
            ('reg1',  'construction'),
            ('reg1',         'trade'),
            ('reg1',     'transport'),
            ('reg1',         'other')],
           names=['region', 'sector'])

Tip: Use the find method to get a quick overview of where a specific term occurs, in particular for MRIO systems with multiple extensions. For example, the following finds “air” in the compartment information of one extension.

[5]:
print("Search for occurrence of >air< in the whole system:", test_mrio.find("air"))
Search for occurrence of >air< in the whole system: {'emissions_index': MultiIndex([('emission_type1', 'air')],
           names=['stressor', 'compartment'])}
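
Because the search methods follow the pandas string methods, the difference between them can be tried out on a plain pandas Series. The labels below are hypothetical and only loosely modelled on the test system; the sketch shows the pandas behaviour, not pymrio internals.

```python
import pandas as pd

# Hypothetical sector labels
labels = pd.Series(["food", "mining", "electricity", "trade", "transport"])

# contains: the pattern may occur anywhere in the label
anywhere = labels[labels.str.contains("tr")].tolist()

# match: the pattern must match at the start of the label
at_start = labels[labels.str.match("tr")].tolist()

# fullmatch: the pattern must match the whole label
whole = labels[labels.str.fullmatch("tr.*t")].tolist()

print("contains:", anywhere)
print("match:", at_start)
print("fullmatch:", whole)
```

Here contains also picks up "electricity", match keeps only the labels starting with "tr", and fullmatch keeps only "transport".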

Core Calculations with calc_all

The calc_all method is fundamental to pymrio analysis. It automatically identifies missing tables and calculates all necessary accounts.

Before the calculation, we have the following accounts available in the test MRIO system:

[6]:
print("Before calc_all:")
print(test_mrio.DataFrames)
print(test_mrio.emissions.DataFrames)
Before calc_all:
['Z', 'Y', 'unit', 'population']
['F', 'F_Y', 'unit']

Calculate all missing parts

[7]:
test_mrio.calc_all()
[7]:
<pymrio.core.mriosystem.IOSystem at 0x7fe341d63d00>

After calculation, these accounts are available

[8]:
print("After calc_all:")
print(test_mrio.DataFrames)
After calc_all:
['Z', 'Y', 'x', 'A', 'L', 'unit', 'population']

And we now also have several classical EE-MRIO results available:

[9]:
print("\nEmissions accounts:")
print(test_mrio.emissions.DataFrames)

Emissions accounts:
['F', 'F_Y', 'S', 'S_Y', 'M', 'D_cba', 'D_pba', 'D_imp', 'D_exp', 'unit', 'D_cba_reg', 'D_pba_reg', 'D_imp_reg', 'D_exp_reg', 'D_cba_cap', 'D_pba_cap', 'D_imp_cap', 'D_exp_cap']

For example, the consumption-based accounts:

[10]:
print("D_cba (consumption-based):", test_mrio.emissions.D_cba)
D_cba (consumption-based): region                              reg1                               \
sector                              food         mining manufactoring
stressor       compartment
emission_type1 air          2.056183e+06  179423.535893  9.749300e+07
emission_type2 water        2.423103e+05   25278.192086  1.671240e+07

region                                                                \
sector                       electricity  construction         trade
stressor       compartment
emission_type1 air          1.188759e+07  3.342906e+06  3.885884e+06
emission_type2 water        1.371303e+05  3.468292e+05  7.766205e+05

region                                                          reg2  \
sector                         transport         other          food
stressor       compartment
emission_type1 air          1.075027e+07  1.582152e+07  1.793338e+06
emission_type2 water        4.999628e+05  8.480505e+06  2.136528e+05

region                                    ...          reg5                \
sector                            mining  ...     transport         other
stressor       compartment                ...
emission_type1 air          19145.604911  ...  4.209505e+07  1.138661e+07
emission_type2 water         3733.601474  ...  4.243738e+06  7.307208e+06

region                              reg6                              \
sector                              food        mining manufactoring
stressor       compartment
emission_type1 air          1.517235e+07  1.345318e+06  7.145075e+07
emission_type2 water        4.420574e+06  5.372216e+05  1.068144e+07

region                                                                \
sector                       electricity  construction         trade
stressor       compartment
emission_type1 air          3.683167e+07  1.836696e+06  4.241568e+07
emission_type2 water        5.728136e+05  9.069515e+05  5.449044e+07

region
sector                         transport         other
stressor       compartment
emission_type1 air          4.805409e+07  3.602298e+07
emission_type2 water        8.836484e+06  4.634899e+07

[2 rows x 48 columns]
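
To make the difference between production-based and consumption-based accounts concrete, here is a minimal NumPy sketch for a hypothetical two-region system with one sector per region. All numbers are invented, direct household emissions (F_Y) are ignored, and the variable names merely echo pymrio's account names; pymrio's own accounts are more detailed.

```python
import numpy as np

# Hypothetical two-region system, one sector per region
Z = np.array([[10.0, 5.0],
              [4.0, 20.0]])   # inter-industry flows (rows sell to columns)
Y = np.array([[30.0, 15.0],
              [6.0, 50.0]])   # final demand: rows = producing region, columns = consuming region
F = np.array([100.0, 300.0])  # direct emissions of each producing region

x = Z.sum(axis=1) + Y.sum(axis=1)  # gross output
A = Z / x                          # technical coefficients (column j divided by x[j])
L = np.linalg.inv(np.eye(2) - A)   # Leontief inverse
S = F / x                          # emissions per unit of output
M = S @ L                          # emission multipliers

D_pba_reg = F.copy()  # production-based: emissions where they occur
D_cba_reg = M @ Y     # consumption-based: emissions embodied in each region's final demand

print("PBA by region:", D_pba_reg)
print("CBA by region:", D_cba_reg)
```

Both perspectives allocate the same global total, so the two vectors differ by region but have equal sums.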

Ghosh Calculations in calc_all

When calc_all is executed, it can optionally calculate Ghosh inverse matrices for downstream analysis:

[11]:
test_mrio.calc_all(include_ghosh=True)
print(test_mrio)
IO System with parameters: Z, Y, x, A, B, L, G, unit, population, meta, factor_inputs, emissions

This also calculates the downstream multipliers M_down:

[12]:
test_mrio.emissions.M_down
[12]:
region                           reg1                                                    \
sector                           food     mining manufactoring electricity construction
stressor       compartment
emission_type1 air           1.045907  26.607878      0.015271   22.163630     0.034868
emission_type2 water         0.074245   0.657819      0.000959    0.238178     0.000609

region                                                         reg2             \
sector                          trade transport     other      food     mining
stressor       compartment
emission_type1 air            0.00822  0.030033  0.047107  0.001025  23.995315
emission_type2 water          0.00032  0.001693  0.001882  0.000068   1.061891

region                       ...      reg5                reg6                           \
sector                       ... transport     other      food     mining manufactoring
stressor       compartment   ...
emission_type1 air           ...  0.076524  0.020483  0.010557  17.377205      0.025397
emission_type2 water         ...  0.011976  0.003882  0.001871   0.667438      0.002734

region
sector                       electricity construction     trade transport     other
stressor       compartment
emission_type1 air             15.093929     0.210979  0.048656  0.029646  0.029898
emission_type2 water            0.333868     0.023223  0.005806  0.003279  0.003230

[2 rows x 48 columns]

See the math section of the documentation for further details on the Ghosh calculations.
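
As a rough illustration of the Ghosh (supply-side) model, the following NumPy sketch builds the allocation coefficients B and the Ghosh inverse G for a hypothetical two-sector system. The numbers are invented and the code only mirrors the textbook definitions, not pymrio's implementation.

```python
import numpy as np

# Hypothetical two-sector system
Z = np.array([[10.0, 5.0],
              [4.0, 20.0]])  # inter-industry flows (rows sell to columns)
y = np.array([45.0, 56.0])   # final demand

x = Z.sum(axis=1) + y        # gross output
v = x - Z.sum(axis=0)        # primary inputs (value added) per sector

B = Z / x[:, None]                # allocation coefficients: row i divided by x[i]
G = np.linalg.inv(np.eye(2) - B)  # Ghosh inverse

# Supply-side identity: output follows from primary inputs
x_from_supply = v @ G

# The Ghosh and Leontief inverses are related through the output vector
L = np.linalg.inv(np.eye(2) - Z / x)
G_from_L = np.diag(1 / x) @ L @ np.diag(x)

print(np.allclose(x_from_supply, x), np.allclose(G, G_from_L))
```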

Search and Extract Functionality

Extracting Specific Accounts

We can also extract the consumption-based accounts for a specific stressor:

[13]:
cba_emission1 = test_mrio.emissions.D_cba.loc[["emission_type1"]]
print("CBA emissions by region for emission_type1:")
print(cba_emission1)
CBA emissions by region for emission_type1:
region                              reg1                               \
sector                              food         mining manufactoring
stressor       compartment
emission_type1 air          2.056183e+06  179423.535893  9.749300e+07

region                                                                \
sector                       electricity  construction         trade
stressor       compartment
emission_type1 air          1.188759e+07  3.342906e+06  3.885884e+06

region                                                          reg2  \
sector                         transport         other          food
stressor       compartment
emission_type1 air          1.075027e+07  1.582152e+07  1.793338e+06

region                                    ...          reg5                \
sector                            mining  ...     transport         other
stressor       compartment                ...
emission_type1 air          19145.604911  ...  4.209505e+07  1.138661e+07

region                              reg6                              \
sector                              food        mining manufactoring
stressor       compartment
emission_type1 air          1.517235e+07  1.345318e+06  7.145075e+07

region                                                                \
sector                       electricity  construction         trade
stressor       compartment
emission_type1 air          3.683167e+07  1.836696e+06  4.241568e+07

region
sector                         transport         other
stressor       compartment
emission_type1 air          4.805409e+07  3.602298e+07

[1 rows x 48 columns]

And extract data for specific regions:

[14]:
reg1_data = test_mrio.emissions.D_cba_reg[["reg1", "reg3"]]
print("\nTotal CBA emissions for the selected regions:")
print(reg1_data)

Total CBA emissions for the selected regions:
region                              reg1          reg3
stressor       compartment
emission_type1 air          2.077521e+08  3.457988e+08
emission_type2 water        8.642744e+07  3.753335e+08

Besides the direct access to the DataFrames explained above, one can also extract data into dictionaries for alternative access:

[15]:
emission_type1_data = test_mrio.emissions.get_row_data("emission_type1")

This extracts all data available for >emission_type1<:

[16]:
emission_type1_data.keys()
[16]:
dict_keys(['F', 'F_Y', 'S', 'S_Y', 'M', 'M_down', 'D_cba', 'D_pba', 'D_imp', 'D_exp', 'unit', 'D_cba_reg', 'D_pba_reg', 'D_imp_reg', 'D_exp_reg', 'D_cba_cap', 'D_pba_cap', 'D_imp_cap', 'D_exp_cap'])
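
Conceptually, such a dictionary collects the same row from every table of an extension. A minimal pandas sketch of that idea, with invented tables and a hypothetical helper name (rows_from_all_tables), could look like this:

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("emission_type1", "air"), ("emission_type2", "water")],
    names=["stressor", "compartment"],
)

# Hypothetical stand-ins for the tables of an extension
tables = {
    "F": pd.DataFrame({"reg1": [1.0, 2.0], "reg2": [3.0, 4.0]}, index=index),
    "S": pd.DataFrame({"reg1": [0.1, 0.2], "reg2": [0.3, 0.4]}, index=index),
}


def rows_from_all_tables(tables, row):
    """Select the rows with the given first-level label from every table."""
    return {name: df.loc[[row]] for name, df in tables.items()}


emission_type1_tables = rows_from_all_tables(tables, "emission_type1")
print(list(emission_type1_tables))
print(emission_type1_tables["F"])
```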

Advanced Search Patterns

Use regular expressions for more complex searches:

[17]:
emis_search = test_mrio.find("emission.*")
print("Emission... occurrences:", emis_search)

all_extension_search = test_mrio.extension_contains("typ+")
print("Extensions containing 'type':", all_extension_search)
Emission... occurrences: {'emissions_index': MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])}
Extensions containing 'type': {'Factor Inputs': Index([], dtype='object', name='inputtype'), 'Emissions': MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])}
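
To see what the patterns used above select, they can be tested with Python's re module on a few labels. The labels "emission_type3" and "prototype" are hypothetical additions for contrast, and the character-class pattern is an extra example.

```python
import re

labels = ["emission_type1", "emission_type2", "emission_type3", "prototype"]

# "emission.*": the literal text "emission" followed by anything
emission_hits = [s for s in labels if re.search("emission.*", s)]

# "typ+": "ty" followed by one or more "p"
typ_hits = [s for s in labels if re.search("typ+", s)]

# "emission_type[12]": a character class that accepts only the digits 1 and 2
numbered_hits = [s for s in labels if re.search("emission_type[12]", s)]

print(emission_hits)
print(typ_hits)
print(numbered_hits)
```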

Using Functions from iomath

Pymrio’s iomath module provides low-level functions for specific calculations:

[18]:
import numpy as np

from pymrio.tools import iomath

# Calculate specific matrices manually
A_manual = iomath.calc_A(test_mrio.Z, test_mrio.x)
print("Manual A matrix calculation matches:", np.allclose(A_manual, test_mrio.A))

# Calculate Leontief matrix
L_manual = iomath.calc_L(test_mrio.A)
print("Manual L matrix calculation matches:", np.allclose(L_manual, test_mrio.L))

# Calculate multipliers
S = test_mrio.emissions.S
M_manual = iomath.calc_M(S, test_mrio.L)
print(
    "Manual multiplier calculation matches:",
    np.allclose(M_manual, test_mrio.emissions.M),
)
Manual A matrix calculation matches: True
Manual L matrix calculation matches: True
Manual multiplier calculation matches: True
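
For readers who want to see the underlying matrix algebra, here is a plain NumPy sketch of the three steps on a hypothetical three-sector system. The function names ending in _sketch are illustrative and not part of pymrio; the iomath functions work on labelled DataFrames and handle more edge cases.

```python
import numpy as np


def calc_A_sketch(Z, x):
    """Technical coefficients: column j of Z divided by the output x[j]."""
    return Z / x


def calc_L_sketch(A):
    """Leontief inverse (I - A)^-1."""
    return np.linalg.inv(np.eye(A.shape[0]) - A)


def calc_M_sketch(S, L):
    """Multipliers: direct intensities propagated through the Leontief inverse."""
    return S @ L


# Hypothetical data: three sectors, one stressor
Z = np.array([[20.0, 30.0, 10.0],
              [10.0, 40.0, 20.0],
              [5.0, 10.0, 15.0]])
y = np.array([40.0, 30.0, 70.0])
F = np.array([[50.0, 20.0, 10.0]])

x = Z.sum(axis=1) + y  # gross output, 100 for each sector
A = calc_A_sketch(Z, x)
L = calc_L_sketch(A)
S = F / x              # direct intensities
M = calc_M_sketch(S, L)

print("L @ y reproduces x:", np.allclose(L @ y, x))
print("M @ y reproduces total F:", np.isclose((M @ y).item(), F.sum()))
```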

Gross Trade Analysis

The get_gross_trade method provides insights into bilateral trade flows:

[19]:
gross_trade = test_mrio.get_gross_trade()

This gives the total trade flows from each region/sector to the other regions:

[20]:
gross_trade.bilat_flows.head()
[20]:
region                 reg1          reg2          reg3          reg4          reg5          reg6
region sector
reg1   food             0.0  9.874311e+03  3.772336e+03  2.327343e+02  1.231784e+03  4.615724e+03
       mining           0.0  2.905520e+03  3.657874e+03  4.020829e+02  6.660429e+02  9.281742e+02
       manufactoring    0.0  6.027532e+07  5.111218e+07  2.709138e+07  3.349291e+07  3.814142e+07
       electricity      0.0  3.775794e+03  3.629075e+02  2.492309e+00  2.222702e+03  9.412412e+02
       construction     0.0  6.629450e+02  2.530807e+02  2.995250e+02  1.537160e+03  1.401676e+02

It also gives the total exports and imports for each region/sector:

[21]:
gross_trade.totals.head()
[21]:
                           exports       imports
region sector
reg1   food           1.972689e+04  1.504225e+05
       mining         8.559694e+03  1.418970e+05
       manufactoring  2.101132e+08  3.888102e+08
       electricity    7.305137e+03  7.365582e+03
       construction   2.892878e+03  5.157738e+03
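
The idea behind these tables can be sketched with pandas on a hypothetical two-region, two-sector system: sum all deliveries (intermediate and final) by destination region, then drop the domestic part. This is a simplified illustration with invented numbers; the exact definitions used by pymrio may differ in detail.

```python
import numpy as np
import pandas as pd

regions = ["reg1", "reg2"]
sectors = ["food", "mining"]
idx = pd.MultiIndex.from_product([regions, sectors], names=["region", "sector"])

# Hypothetical flows: rows are the selling region/sector
Z = pd.DataFrame(np.arange(1.0, 17.0).reshape(4, 4), index=idx, columns=idx)
Y = pd.DataFrame(
    np.arange(1.0, 9.0).reshape(4, 2),
    index=idx,
    columns=pd.Index(regions, name="region"),
)

# Sum intermediate deliveries over the buying sectors, then add final demand
Z_by_region = Z.T.groupby(level="region").sum().T
flows = Z_by_region + Y

# Domestic deliveries are not trade: zero the cells where seller and buyer region coincide
seller_region = flows.index.get_level_values("region").to_numpy()[:, None]
buyer_region = flows.columns.to_numpy()[None, :]
bilat_flows = flows.mask(seller_region == buyer_region, 0.0)

# Exports per selling region/sector; imports per buying region and product
exports = bilat_flows.sum(axis=1)
imports = bilat_flows.groupby(level="sector").sum().unstack()
totals = pd.DataFrame({"exports": exports, "imports": imports})

print(bilat_flows)
print(totals)
```

At the global level, total exports equal total imports, which is a handy sanity check.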

Extension Methods: Concatenate, Convert, and Characterize

Extension Concatenation

The extension_concate function combines multiple extensions into one.

[22]:
# Create a copy for demonstration
ext_emis2 = test_mrio.emissions.copy()
# Combine two extensions with same index structure
new_ext = pymrio.extension_concate(
    test_mrio.emissions, ext_emis2, new_extension_name="emissions_combined"
)
new_ext.rows
[22]:
MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water'),
            ('emission_type1',   'air'),
            ('emission_type2', 'water')],
           names=['stressor', 'compartment'])

Combining extensions with different indices results in a new index called >indicator<. Any index level not available in one of the extensions is set to NaN.

[23]:
all_ext = pymrio.extension_concate(
    test_mrio.emissions, test_mrio.factor_inputs, new_extension_name="All"
)
print(all_ext)
all_ext.rows
Extension All with parameters: name, F, F_Y, S, S_Y, M, M_down, D_cba, D_pba, D_imp, D_exp, unit, D_cba_cap, D_imp_reg, D_pba_reg, D_cba_reg, D_exp_reg, D_exp_cap, D_pba_cap, D_imp_cap
[23]:
MultiIndex([('emission_type1',   'air'),
            ('emission_type2', 'water'),
            (   'Value Added',     nan)],
           names=['indicator', 'compartment'])
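
The NaN entries come from aligning tables whose indices have different levels. The pandas sketch below reproduces that effect on two invented tables; the helper name to_indicator and the renaming approach are assumptions for illustration, not pymrio's actual implementation.

```python
import pandas as pd

emissions_F = pd.DataFrame(
    {"reg1": [1.0, 2.0], "reg2": [3.0, 4.0]},
    index=pd.MultiIndex.from_tuples(
        [("emission_type1", "air"), ("emission_type2", "water")],
        names=["stressor", "compartment"],
    ),
)
factor_inputs_F = pd.DataFrame(
    {"reg1": [10.0], "reg2": [20.0]},
    index=pd.Index(["Value Added"], name="inputtype"),
)


def to_indicator(df):
    """Move the index into columns and rename the first level to 'indicator'."""
    out = df.reset_index()
    return out.rename(columns={out.columns[0]: "indicator"})


# Index levels missing in one of the tables are filled with NaN by concat
combined = pd.concat(
    [to_indicator(emissions_F), to_indicator(factor_inputs_F)],
    ignore_index=True,
).set_index(["indicator", "compartment"])

print(combined)
```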

In any case, the extension can be attached to the mrio object and used alongside the others.

[24]:
test_mrio.all_ext = all_ext
print(test_mrio.extensions)
print(test_mrio.extensions_instance_names)
['Factor Inputs', 'Emissions', 'All']
['factor_inputs', 'emissions', 'all_ext']

Extension Conversion

The convert method and the extension_convert function transform extensions based on a mapping table:

[25]:
import pandas as pd

conversion_factors = pd.DataFrame(
    columns=[
        "stressor",
        "compartment",
        "total__stressor",
        "factor",
        "unit_orig",
        "unit_new",
    ],
    data=[
        ["emis.*", "air|water", "total_sum_tonnes", 1e-3, "kg", "t"],
        ["emission_type[1|2]", ".*", "total_sum", 1, "kg", "kg"],
        ["emission_type1", ".*", "air_emissions", 1e-3, "kg", "t"],
        ["emission_type2", ".*", "water_emissions", 1000, "kg", "g"],
        ["emission_type1", ".*", "char_emissions", 2, "kg", "kg_eq"],
        ["emission_type2", ".*", "char_emissions", 10, "kg", "kg_eq"],
    ],
)

Importantly, the column names >stressor< and >compartment< match the index names of the extension to be converted. Bridge columns are columns with ‘__’ in the name. They define the name of the new index (the part before ‘__’, here >total<) and the original index it is based on (the part after ‘__’, here >stressor<). The column >factor< holds the conversion factor. The last two columns handle the units: >unit_orig< is checked against the original units, and >unit_new< defines the units of the converted rows.

[26]:
new_emis = test_mrio.emissions.convert(
    conversion_factors, new_extension_name="converted_emissions"
)
new_emis.F
[26]:
region                    reg1                                                        \
sector                    food        mining manufactoring   electricity  construction
total
air_emissions     1.848065e+03  9.864481e+02  2.361379e+04  2.813910e+04  2.584142e+03
char_emissions    5.088634e+06  2.196329e+06  5.486327e+07  5.901802e+07  8.342249e+06
total_sum         1.987315e+06  1.008791e+06  2.437736e+07  2.841308e+07  2.901538e+06
total_sum_tonnes  1.987315e+03  1.008791e+03  2.437736e+04  2.841308e+04  2.901538e+03
water_emissions   1.392505e+08  2.234330e+07  7.635692e+08  2.739816e+08  3.173965e+08

region                                                            reg2                \
sector                   trade     transport         other        food        mining
total
air_emissions     4.132656e+03  2.176699e+04  7.842091e+03  1.697937e+03  3.473782e+02
char_emissions    2.081009e+07  5.366396e+07  4.017596e+07  5.444229e+06  9.893957e+05
total_sum         5.387134e+06  2.277999e+07  1.029127e+07  1.902773e+06  3.768421e+05
total_sum_tonnes  5.387134e+03  2.277999e+04  1.029127e+04  1.902773e+03  3.768421e+02
water_emissions   1.254478e+09  1.012999e+09  2.449178e+09  2.048354e+08  2.946394e+07

region            ...          reg5                        reg6                              \
sector            ...     transport         other          food        mining manufactoring
total             ...
air_emissions     ...  4.229932e+04  1.077383e+04  1.577800e+04  6.420956e+03  1.131724e+05
char_emissions    ...  1.265970e+08  9.345772e+07  7.981707e+07  3.149816e+07  3.533468e+08
total_sum         ...  4.649916e+07  1.796483e+07  2.060410e+07  8.286581e+06  1.258726e+08
total_sum_tonnes  ...  4.649916e+04  1.796483e+04  2.060410e+04  8.286581e+03  1.258726e+05
water_emissions   ...  4.199841e+09  7.191006e+09  4.826108e+09  1.865625e+09  1.270019e+10

region
sector             electricity  construction         trade     transport         other
total
air_emissions     5.602253e+04  4.861838e+03  1.819562e+04  4.704654e+04  2.163287e+04
char_emissions    1.195772e+08  3.671656e+07  1.753144e+08  1.817509e+08  2.110913e+08
total_sum         5.677575e+07  7.561127e+06  3.208793e+07  5.581233e+07  3.841542e+07
total_sum_tonnes  5.677575e+04  7.561127e+03  3.208793e+04  5.581233e+04  3.841542e+04
water_emissions   7.532137e+08  2.699288e+09  1.389231e+10  8.765784e+09  1.678255e+10

[5 rows x 48 columns]

Tip: Thanks to its regular expression support, this function is quite powerful but also rather slow. Use it before doing the full analysis, and use the characterization function for “standard” characterization tasks (see below).
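
To illustrate how such a mapping table can drive a conversion, here is a small pandas sketch: each rule selects rows by regular expression, scales them with its factor, and adds the result to the target row. It assumes whole-label (fullmatch) matching, uses invented numbers, and omits the unit columns, so it is only an approximation of what convert does.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("emission_type1", "air"), ("emission_type2", "water")],
    names=["stressor", "compartment"],
)
F = pd.DataFrame({"reg1": [100.0, 50.0], "reg2": [200.0, 10.0]}, index=index)

mapping = pd.DataFrame(
    columns=["stressor", "compartment", "total__stressor", "factor"],
    data=[
        ["emis.*", "air|water", "total_sum_tonnes", 1e-3],
        ["emission_type1", ".*", "air_emissions", 1e-3],
        ["emission_type1", ".*", "char_emissions", 2],
        ["emission_type2", ".*", "char_emissions", 10],
    ],
)

labels = F.index.to_frame(index=False)
converted = {}
for _, rule in mapping.iterrows():
    # Rows of F whose stressor and compartment both match the rule's patterns
    hit = labels["stressor"].str.fullmatch(rule["stressor"]) & labels[
        "compartment"
    ].str.fullmatch(rule["compartment"])
    contribution = F.loc[hit.to_numpy()].sum() * rule["factor"]
    # Rules sharing a target name are added up
    name = rule["total__stressor"]
    converted[name] = converted.get(name, 0.0) + contribution

result = pd.DataFrame(converted).T.rename_axis("total")
print(result)
```

Rules that share a target name, such as the two char_emissions rows, are summed, which is how a weighted characterization arises from the same mechanism.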

Characterization of stressors

Pymrio uses an innovative string-matching approach to characterize stressors. This method matches stressors in the characterization table (in long format) with those in the MRIO system, ensuring consistent stressor mapping, automatic unit verification, and flexibility regardless of entry order. It also handles characterization factors for stressors not present in the satellite account, efficiently manages region- and sector-specific factors, and supports characterization across different extensions.

Unlike traditional matrix multiplication methods, which require strict 1:1 correspondence and precise ordering, this approach is more flexible. Characterization can be performed using either an extension object method or a top-level function that accepts MRIO objects or extension collections.

To start, we first define a table of characterization factors.

[27]:
char_factors = pd.DataFrame(
    {
        "stressor": ["emission_type1", "emission_type2", "emission_type3"],
        "compartment": ["air", "water", "land"],
        "impact": ["climate_change", "acidification", "eutrophication"],
        "factor": [25.0, 1.5, 0.8],  # kg CO2-eq, SO2-eq, PO4-eq
        "impact_unit": ["kg CO2-eq", "kg SO2-eq", "kg PO4-eq"],
        "stressor_unit": ["kg", "kg", "kg"],
    }
)

This can be used to characterize the emissions extension of the test MRIO system.

[28]:
characterization_result = test_mrio.emissions.characterize(
    factors=char_factors,
    characterized_name_column="impact",
    characterization_factors_column="factor",
    characterized_unit_column="impact_unit",
    orig_unit_column="stressor_unit",
)

The result contains a validation table, which flags the missing stressor (>emission_type3<).

[29]:
characterization_result.validation
[29]:
         stressor compartment          impact  factor impact_unit stressor_unit  error_unit_impact  error_unit_stressor  error_missing_stressor
0  emission_type1         air  climate_change    25.0   kg CO2-eq            kg              False                False                   False
1  emission_type2       water   acidification     1.5   kg SO2-eq            kg              False                False                   False
2  emission_type3        land  eutrophication     0.8   kg PO4-eq            kg              False                False                    True

Tip: Always check the validation table. It is also returned when the characterization cannot be performed (e.g. due to unit errors).

The characterized extension is available as the second attribute of the result:

[30]:
characterization_result.extension
[30]:
<pymrio.core.mriosystem.Extension at 0x7fe339b9d9d0>
[31]:
characterization_result.extension.F
[31]:
region                  reg1                                                        \
sector                  food        mining manufactoring   electricity  construction
impact
acidification   2.088757e+05  3.351494e+04  1.145354e+06  4.109723e+05  4.760948e+05
climate_change  4.620162e+07  2.466120e+07  5.903447e+08  7.034775e+08  6.460354e+07
eutrophication  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00

region                                                          reg2                \
sector                 trade     transport         other        food        mining
impact
acidification      1881716.7  1.519499e+06     3673767.0    307253.16     44195.916
climate_change   103316407.5  5.441747e+08   196052265.0  42448432.50   8684453.750
eutrophication           0.0  0.000000e+00           0.0         0.00         0.000

region          ...          reg5                        reg6                              \
sector          ...     transport         other          food        mining manufactoring
impact          ...
acidification   ...  6.299762e+06  1.078651e+07  7.239162e+06  2.798438e+06  1.905029e+07
climate_change  ...  1.057483e+09  2.693456e+08  3.944499e+08  1.605239e+08  2.829311e+09
eutrophication  ...  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00

region
sector           electricity  construction         trade     transport         other
impact
acidification   1.129821e+06  4.048932e+06    20838469.5  1.314868e+07    25173829.5
climate_change  1.400563e+09  1.215460e+08   454890525.0  1.176164e+09   540821700.0
eutrophication  0.000000e+00  0.000000e+00           0.0  0.000000e+00           0.0

[3 rows x 48 columns]

For more details on region-specific characterization and characterization across multiple extensions, see the notebook stressor_characterization.
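
The matching idea described in this section can be sketched with pandas merges on a long-format factor table. The data are invented and the code is only a simplified illustration (no unit checks, no region-specific factors), not the implementation behind characterize.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("emission_type1", "air"), ("emission_type2", "water")],
    names=["stressor", "compartment"],
)
F = pd.DataFrame({"reg1": [2.0, 4.0], "reg2": [3.0, 5.0]}, index=index)

factors = pd.DataFrame(
    {
        "stressor": ["emission_type1", "emission_type2", "emission_type3"],
        "compartment": ["air", "water", "land"],
        "impact": ["climate_change", "acidification", "eutrophication"],
        "factor": [25.0, 1.5, 0.8],
    }
)

# Validation: flag factor rows whose stressor is not present in F
present = F.index.to_frame(index=False)
validation = factors.merge(
    present, on=["stressor", "compartment"], how="left", indicator=True
)
validation["error_missing_stressor"] = validation["_merge"] == "left_only"
validation = validation.drop(columns="_merge")

# Characterization: match by label in long format, scale, and sum per impact
long_F = F.reset_index().melt(
    id_vars=["stressor", "compartment"], var_name="region", value_name="value"
)
merged = long_F.merge(factors, on=["stressor", "compartment"], how="inner")
merged["value"] = merged["value"] * merged["factor"]
impacts = merged.pivot_table(
    index="impact", columns="region", values="value", aggfunc="sum"
)
# Keep impacts without matching stressors as zero rows
impacts = impacts.reindex(factors["impact"].unique()).fillna(0.0)

print(validation)
print(impacts)
```

Because the match is done on labels, the order of rows in either table does not matter.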

Parsing, saving and loading MRIOs

Parsing MRIOs

Pymrio supports any symmetric MRIO table and provides automatic downloading and parsing for several common datasets. For details, see the sections “Automatic MRIO download” and “Handling MRIO data”.

Saving processed MRIOs

You can save your MRIO after you’ve parsed and analysed it. Pymrio lets you save in text, pickle or parquet formats. Parquet works well if your dataset is on the larger side.

[32]:
import os
import tempfile
from pathlib import Path

# Create temporary directory for demonstration
temp_dir = Path(tempfile.mkdtemp())

The output format is selected with the >table_format< argument of the >save_all< method:

[33]:
# Save to text format
txt_path = temp_dir / "test_mrio_txt"
test_mrio.save_all(txt_path, table_format="txt")
list(txt_path.glob("**/*"))
[33]:
[PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/A.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/unit.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/B.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/x.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/L.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/metadata.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/G.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/population.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/Z.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/F.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/S.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/unit.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_cba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/M_down.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_exp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/M.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_imp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_cba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_imp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_pba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_imp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_pba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/F_Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_pba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_exp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_cba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/S_Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/all_ext/D_exp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/F.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/S.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/unit.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_cba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/M_down.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_exp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/M.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_imp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_cba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_imp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_pba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_imp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_pba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/F_Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_pba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_exp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_cba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/S_Y.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/emissions/D_exp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/F.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/S.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/unit.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_cba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/M_down.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_exp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/M.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_imp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_cba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_imp_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_pba.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_imp_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_pba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_pba_cap.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_exp.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_cba_reg.txt'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_txt/factor_inputs/D_exp_cap.txt')]
[34]:
# Save to parquet format
parquet_path = temp_dir / "test_mrio_parquet"
test_mrio.save_all(parquet_path, table_format="parquet")
list(parquet_path.glob("**/*"))
[34]:
[PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/x.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/Z.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/metadata.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/population.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/L.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/unit.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/G.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/A.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/B.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/F.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_imp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_exp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_cba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/F_Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/M.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_imp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/S_Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/S.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_pba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_cba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_imp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_cba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_exp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_pba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/unit.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/M_down.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_pba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/all_ext/D_exp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/F.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_imp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_exp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_cba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/F_Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/M.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_imp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/S_Y.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/S.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_pba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_cba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_imp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_cba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_exp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_pba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/unit.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/M_down.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_pba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/emissions/D_exp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/F.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_imp_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_exp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_cba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/M.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/file_parameters.json'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_imp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/S.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_pba_cap.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_cba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_imp.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_cba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_exp_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_pba_reg.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/unit.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/M_down.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_pba.parquet'),
 PosixPath('/tmp/tmph4w3mja7/test_mrio_parquet/factor_inputs/D_exp_cap.parquet')]

In both cases, each DataFrame (account) is stored as a separate file, with satellite accounts stored in subfolders. The file >file_parameters.json< stores the definition of the index and columns so that the files can be read back in correctly.

[35]:
mrio_reload_txt = pymrio.load_all(txt_path)
mrio_reload_parquet = pymrio.load_all(parquet_path)
print(test_mrio)
print(mrio_reload_txt)
print(mrio_reload_parquet)
IO System with parameters: Z, Y, x, A, B, L, G, unit, population, meta, factor_inputs, emissions, all_ext
IO System with parameters: Z, Y, x, A, B, L, G, unit, population, meta, all_ext, emissions, factor_inputs
IO System with parameters: Z, Y, x, A, B, L, G, unit, population, meta, all_ext, emissions, factor_inputs
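
The role of a parameter file can be illustrated with a small round trip using pandas and the standard library: the table is written as text, and a JSON file records how many leading columns form the index. The JSON key nr_index_col and the layout are assumptions made for this sketch; the files written by pymrio may be structured differently.

```python
import json
import tempfile
from pathlib import Path

import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_product(
    [["reg1", "reg2"], ["food", "mining"]], names=["region", "sector"]
)
# Hypothetical final demand table
Y = pd.DataFrame(
    {"reg1": [1.0, 2.0, 3.0, 4.0], "reg2": [5.0, 6.0, 7.0, 8.0]}, index=idx
)

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)

    # Save the table and record how many columns belong to the index
    Y.to_csv(folder / "Y.txt", sep="\t")
    params = {"Y": {"nr_index_col": Y.index.nlevels}}
    (folder / "file_parameters.json").write_text(json.dumps(params))

    # Reload: the stored parameters tell read_csv how to rebuild the index
    stored = json.loads((folder / "file_parameters.json").read_text())
    index_cols = list(range(stored["Y"]["nr_index_col"]))
    Y_back = pd.read_csv(folder / "Y.txt", sep="\t", index_col=index_cols)

print(np.allclose(Y_back.to_numpy(), Y.to_numpy()))
print(list(Y_back.index) == list(Y.index))
```

Without the stored index information, the reloaded table would treat the region and sector labels as ordinary data columns.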

Clean up temporary directory

[36]:
import shutil

shutil.rmtree(temp_dir)
print("Temporary files cleaned up")
Temporary files cleaned up

Conclusion

We covered the main functionality of pymrio in this tutorial.

Pymrio has many more features, including aggregation, renaming and restructuring, and analysing the source of stressors. These topics, as well as working with specific MRIO databases, are covered in more detail in the other example notebooks of the documentation.

You can also check the API Reference for a full overview of available functions and classes.

If you have questions or need help, please open an issue on our GitHub page. We’re happy to help!

Thank you for following the tutorial, and good luck with your MRIO analyses!