Exploring MRIOs with Pymrio
The first step when working with a new MRIO data set is to familiarize yourself with the data. This notebook shows how to use the pymrio package to explore it. We use the test data set that is included in the pymrio package. This is a completely artificial, very small MRIO. It is not meant to be realistic, but it is useful for developing, testing and learning.
First we import the required packages:
[1]:
import pymrio
We can now load the test data set with the load_test function. We can call the MRIO whatever we want; here we use mrio.
[2]:
mrio = pymrio.load_test()
We can get some first information about the MRIO by printing it.
[3]:
print(mrio)
IO System with parameters: Z, Y, unit, population, meta, factor_inputs, emissions
This tells us what the MRIO we just loaded contains: a Z and a Y matrix, unit and population information, metadata (meta), and two satellite accounts, factor_inputs and emissions.
To get more specific data we can ask pymrio for regions, sectors, products, etc.
[4]:
mrio.name
[4]:
'testmrio'
[5]:
mrio.get_regions()
[5]:
Index(['reg1', 'reg2', 'reg3', 'reg4', 'reg5', 'reg6'], dtype='object', name='region')
[6]:
mrio.get_sectors()
[6]:
Index(['food', 'mining', 'manufactoring', 'electricity', 'construction',
'trade', 'transport', 'other'],
dtype='object', name='sector')
[7]:
mrio.get_Y_categories()
[7]:
Index(['Final consumption expenditure by households',
'Final consumption expenditure by non-profit organisations serving households (NPISH)',
'Final consumption expenditure by government',
'Gross fixed capital formation', 'Changes in inventories',
'Changes in valuables', 'Export'],
dtype='object', name='category')
The same methods can be used to explore one of the satellite accounts.
[8]:
print(mrio.emissions)
Extension Emissions with parameters: name, F, F_Y, unit
[9]:
mrio.emissions.name
[9]:
'Emissions'
[10]:
mrio.emissions.get_regions()
[10]:
Index(['reg1', 'reg2', 'reg3', 'reg4', 'reg5', 'reg6'], dtype='object', name='region')
The satellite accounts also have a special method to get the index (rows) of the accounts.
[11]:
mrio.emissions.get_rows()
[11]:
MultiIndex([('emission_type1', 'air'),
('emission_type2', 'water')],
names=['stressor', 'compartment'])
Searching through the MRIO
Several methods are available to search through the whole MRIO. These generally accept regular expressions as search terms.
The most general method is ‘find’. It gives a quick overview of where a specific term appears in the MRIO.
[12]:
mrio.find('air')
[12]:
{'emissions_index': MultiIndex([('emission_type1', 'air')],
names=['stressor', 'compartment'])}
[13]:
mrio.find("trade")
[13]:
{'index': MultiIndex([('reg1', 'trade'),
('reg2', 'trade'),
('reg3', 'trade'),
('reg4', 'trade'),
('reg5', 'trade'),
('reg6', 'trade')],
names=['region', 'sector']),
'sectors': Index(['trade'], dtype='object', name='sector')}
Note that ‘find’ (and all other search methods) are case sensitive. To make a case-insensitive search, add the regular expression flag (?i) to the search term.
[14]:
mrio.find('value')
[14]:
{}
[15]:
mrio.find('(?i)value')
[15]:
{'factor_inputs_index': Index(['Value Added'], dtype='object', name='inputtype')}
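To make the effect of the inline flag concrete, the following minimal sketch mimics the idea behind ‘find’ with Python's standard re module. It is not pymrio's implementation: the helper find_sketch and the small indexes dictionary are hypothetical stand-ins for the MRIO indexes, assumed here only for illustration.

```python
import re

# Hypothetical stand-in for some of the index labels of the test MRIO
indexes = {
    "sectors": ["food", "mining", "trade", "transport"],
    "factor_inputs_index": ["Value Added"],
}


def find_sketch(indexes, term):
    """Return, per index, the labels in which the regular expression `term` occurs."""
    hits = {}
    for name, labels in indexes.items():
        matched = [label for label in labels if re.search(term, label)]
        if matched:  # indexes without any match are left out of the result
            hits[name] = matched
    return hits


case_sensitive = find_sketch(indexes, "value")
case_insensitive = find_sketch(indexes, "(?i)value")
print(case_sensitive)
print(case_insensitive)
```

As in the cells above, the lower-case term only matches ‘Value Added’ once the (?i) flag is part of the pattern.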
Specific search methods: contains, match, fullmatch
The MRIO class also contains a set of specific regular expression search methods, mirroring the ‘contains’, ‘match’ and ‘fullmatch’ methods of the pandas str accessor for string columns. See the pandas documentation for details. In short:
- ‘contains’ looks for a match anywhere in the string
- ‘match’ looks for a match at the beginning of the string
- ‘fullmatch’ looks for a match of the whole string
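These three behaviours correspond to re.search, re.match and re.fullmatch in Python's standard library. The sketch below applies them to the plain string 'trade' rather than to an MRIO index, so it only illustrates the matching rules.

```python
import re

label = "trade"

# 'contains' semantics: the pattern may occur anywhere in the string
contains_ad = re.search("ad", label) is not None  # True

# 'match' semantics: the pattern must start at the beginning of the string
match_ad = re.match("ad", label) is not None  # False
match_trad = re.match("trad", label) is not None  # True

# 'fullmatch' semantics: the pattern must cover the whole string
fullmatch_trad = re.fullmatch("trad", label) is not None  # False
fullmatch_trade = re.fullmatch("trade", label) is not None  # True

# A case-insensitive full match with wildcards behaves like a 'contains' search
fullmatch_wild = re.fullmatch("(?i).*AD.*", label) is not None  # True

print(contains_ad, match_ad, match_trad,
      fullmatch_trad, fullmatch_trade, fullmatch_wild)
```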
These methods are available for all index columns of the MRIO and share a similar signature:
- As with ‘find’, the search term is case sensitive. To make it case insensitive, add the regular expression flag (?i) to the search term.
- The search term can be passed to the keyword argument ‘find_all’ or as the first positional argument to search in all index levels.
- Alternatively, the search term can be passed to the keyword argument with the level name to search only in that index level.
The following examples show how to use these methods.
[16]:
# Both calls are equivalent; the notebook displays the result of the last one
mrio.contains(find_all='ad')
mrio.contains('ad')
[16]:
MultiIndex([('reg1', 'trade'),
('reg2', 'trade'),
('reg3', 'trade'),
('reg4', 'trade'),
('reg5', 'trade'),
('reg6', 'trade')],
names=['region', 'sector'])
[17]:
mrio.match('ad')
[17]:
MultiIndex([], names=['region', 'sector'])
[18]:
mrio.match('trad')
[18]:
MultiIndex([('reg1', 'trade'),
('reg2', 'trade'),
('reg3', 'trade'),
('reg4', 'trade'),
('reg5', 'trade'),
('reg6', 'trade')],
names=['region', 'sector'])
[19]:
mrio.fullmatch('trad')
[19]:
MultiIndex([], names=['region', 'sector'])
[20]:
mrio.fullmatch('trade')
[20]:
MultiIndex([('reg1', 'trade'),
('reg2', 'trade'),
('reg3', 'trade'),
('reg4', 'trade'),
('reg5', 'trade'),
('reg6', 'trade')],
names=['region', 'sector'])
[21]:
mrio.fullmatch('(?i).*AD.*')
[21]:
MultiIndex([('reg1', 'trade'),
('reg2', 'trade'),
('reg3', 'trade'),
('reg4', 'trade'),
('reg5', 'trade'),
('reg6', 'trade')],
names=['region', 'sector'])
For the rest of the notebook, we will do the examples with the ‘contains’ method, but the same applies to the other methods.
To search only at one specific level, pass the search term to the keyword argument with the level name.
[22]:
mrio.contains(region='trade')
[22]:
MultiIndex([], names=['region', 'sector'])
[23]:
mrio.contains(sector='trade')
[23]:
MultiIndex([('reg1', 'trade'),
('reg2', 'trade'),
('reg3', 'trade'),
('reg4', 'trade'),
('reg5', 'trade'),
('reg6', 'trade')],
names=['region', 'sector'])
And of course, the methods are also available for the satellite accounts.
[24]:
mrio.emissions.contains(compartment='air')
[24]:
MultiIndex([('emission_type1', 'air')],
names=['stressor', 'compartment'])
A keyword argument that names a non-existing level is silently ignored.
[25]:
mrio.factor_inputs.contains(compartment='trade')
[25]:
Index([], dtype='object')
This allows searching for terms that are present only in some index levels. Logically, this is an ‘or’ search.
[26]:
mrio.factor_inputs.contains(compartment='air', inputtype="Value")
[26]:
Index(['Value Added'], dtype='object', name='inputtype')
But note that if both levels exist, both must match (so it becomes a logical ‘and’).
[27]:
mrio.emissions.contains(stressor='emission', compartment='air')
[27]:
MultiIndex([('emission_type1', 'air')],
names=['stressor', 'compartment'])
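The level handling described above can be summarised in a small pure-Python sketch. It is only a model of the observed behaviour, not pymrio's code: contains_sketch is a hypothetical helper, and the row tuples are copied from the outputs shown earlier.

```python
import re


def contains_sketch(rows, level_names, **patterns):
    """Return the rows in which every pattern for an *existing* level is found."""
    # Keywords that do not name an existing level are silently dropped
    active = {
        level_names.index(level): pattern
        for level, pattern in patterns.items()
        if level in level_names
    }
    if not active:
        return []  # nothing left to search for
    # All remaining levels must match: a logical 'and'
    return [
        row for row in rows
        if all(re.search(pattern, row[pos]) for pos, pattern in active.items())
    ]


emissions = [("emission_type1", "air"), ("emission_type2", "water")]
factor_inputs = [("Value Added",)]

both_levels = contains_sketch(
    emissions, ["stressor", "compartment"], stressor="emission", compartment="air"
)
one_level = contains_sketch(
    factor_inputs, ["inputtype"], compartment="air", inputtype="Value"
)
no_level = contains_sketch(factor_inputs, ["inputtype"], compartment="trade")

print(both_levels)
print(one_level)
print(no_level)
```

The same keyword set can therefore be applied to accounts with different index levels, which is what makes the search across all extensions in the next section possible.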
Search through all extensions
All three search methods are also available to loop through all extensions of the MRIO.
[28]:
mrio.extension_contains(stressor='emission', compartment='air')
[28]:
{'Factor Inputs': Index([], dtype='object'),
'Emissions': MultiIndex([('emission_type1', 'air')],
names=['stressor', 'compartment'])}
If only a subset of extensions should be searched, pass the extension names to the keyword argument ‘extensions’.
Generic search method for any dataframe index
Internally, the class methods ‘contains’, ‘match’ and ‘fullmatch’ call the ‘index_contains’, ‘index_match’ and ‘index_fullmatch’ functions of the ioutil module. These functions can be used to search through the index of any pandas DataFrame.
[29]:
df = mrio.Y
Depending on whether a dataframe or an index is passed, the return value is either the filtered dataframe or the filtered index.
[30]:
pymrio.index_contains(df, 'trade')
[30]:
region | | reg1 | reg1 | reg1 | reg1 | reg1 | reg1 | reg1 | reg2 | reg2 | reg2 | ... | reg5 | reg5 | reg5 | reg6 | reg6 | reg6 | reg6 | reg6 | reg6 | reg6 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
category | | Final consumption expenditure by households | Final consumption expenditure by non-profit organisations serving households (NPISH) | Final consumption expenditure by government | Gross fixed capital formation | Changes in inventories | Changes in valuables | Export | Final consumption expenditure by households | Final consumption expenditure by non-profit organisations serving households (NPISH) | Final consumption expenditure by government | ... | Changes in inventories | Changes in valuables | Export | Final consumption expenditure by households | Final consumption expenditure by non-profit organisations serving households (NPISH) | Final consumption expenditure by government | Gross fixed capital formation | Changes in inventories | Changes in valuables | Export |
region | sector | |||||||||||||||||||||
reg1 | trade | 769535.93000 | 16.638920 | 2.408807e+07 | 6.727345e+07 | 1230.218200 | 216.211080 | 0 | 8063.52380 | 12.738233 | 163.205380 | ... | 204.383170 | 1.684295e+00 | 0 | 49782414.00 | 0.224933 | 14.445660 | 16739029.00 | 12.145465 | 0.013888 | 0 |
reg2 | trade | 5678.26740 | 0.075424 | 2.312962e+02 | 6.339521e+02 | 35.607157 | 3.192694 | 0 | 385664.01000 | 178.501140 | 8160.881200 | ... | 358.249320 | 2.271962e+01 | 0 | 26592464.00 | 0.139745 | 10.623962 | 11775351.00 | 20.572534 | 0.005433 | 0 |
reg3 | trade | 2753.86080 | 0.111540 | 1.956911e+00 | 3.598675e+02 | 23.391120 | 0.000455 | 0 | 2072.24890 | 0.044811 | 1.242613 | ... | 309.984270 | 6.283278e+00 | 0 | 114505.89 | 0.630098 | 37.095549 | 31317361.00 | 212.707510 | 0.014929 | 0 |
reg4 | trade | 373.28393 | 0.009382 | 3.585011e-01 | 2.514957e-02 | 0.002016 | 0.000144 | 0 | 192.21539 | 0.019666 | 0.537107 | ... | 73.859706 | 7.199126e-02 | 0 | 40152651.00 | 0.255523 | 17.253634 | 14011134.00 | 4.052444 | 0.001935 | 0 |
reg5 | trade | 4287.40670 | 0.038941 | 7.014679e+00 | 1.955479e+02 | 6.675656 | 0.524015 | 0 | 3633.68750 | 2.536312 | 50.624916 | ... | 9177.081800 | 1.330591e+06 | 0 | 60992225.00 | 0.823823 | 34.208026 | 27870911.00 | 85.191511 | 0.008929 | 0 |
reg6 | trade | 4772.75750 | 0.113112 | 2.321101e+01 | 2.417571e+02 | 16.267049 | 1.488818 | 0 | 2031.49640 | 1.864492 | 18.787893 | ... | 91.040319 | 2.122217e+00 | 0 | 851864.04 | 23.371306 | 1966.030900 | 131182.13 | 1549.410400 | 0.266033 | 0 |
6 rows × 42 columns
[31]:
pymrio.index_contains(df.index, 'trade')
[31]:
MultiIndex([('reg1', 'trade'),
('reg2', 'trade'),
('reg3', 'trade'),
('reg4', 'trade'),
('reg5', 'trade'),
('reg6', 'trade')],
names=['region', 'sector'])
[32]:
pymrio.index_fullmatch(df, region='reg[2,4]', sector='m.*')
[32]:
region | | reg1 | reg1 | reg1 | reg1 | reg1 | reg1 | reg1 | reg2 | reg2 | reg2 | ... | reg5 | reg5 | reg5 | reg6 | reg6 | reg6 | reg6 | reg6 | reg6 | reg6 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
category | | Final consumption expenditure by households | Final consumption expenditure by non-profit organisations serving households (NPISH) | Final consumption expenditure by government | Gross fixed capital formation | Changes in inventories | Changes in valuables | Export | Final consumption expenditure by households | Final consumption expenditure by non-profit organisations serving households (NPISH) | Final consumption expenditure by government | ... | Changes in inventories | Changes in valuables | Export | Final consumption expenditure by households | Final consumption expenditure by non-profit organisations serving households (NPISH) | Final consumption expenditure by government | Gross fixed capital formation | Changes in inventories | Changes in valuables | Export |
region | sector | |||||||||||||||||||||
reg2 | mining | 1.653997e+02 | 1.817989e-05 | 0.334824 | 3.283238e+01 | 2.910648e+01 | 3.970468e-05 | 0 | 1.091126e+03 | 2.751312 | 1.544777e+01 | ... | 0.040217 | 0.000420 | 0 | 6.127299e-01 | 0.008212 | 0.013061 | 1.167170e+01 | 166.539580 | 0.000002 | 0 |
reg2 | manufactoring | 9.928459e+07 | 4.187143e+00 | 1373.370200 | 4.237878e+07 | 4.415752e+03 | 1.637658e+02 | 0 | 3.210316e+05 | 125.729110 | 1.603833e+07 | ... | 951.809210 | 21.641280 | 0 | 1.074192e+07 | 62.832488 | 6363.192800 | 1.170497e+07 | 1060.321500 | 0.145903 | 0 |
reg4 | mining | 1.072728e+02 | 9.421644e-09 | 0.209851 | 1.055704e+00 | 2.697312e+01 | 3.112643e-09 | 0 | 2.940734e+02 | 0.000001 | 1.061796e-01 | ... | 45.388457 | 0.000015 | 0 | 7.705800e+04 | 0.348367 | 2.326858 | 9.917552e-02 | 68.266815 | 0.000789 | 0 |
reg4 | manufactoring | 4.086352e+07 | 1.170611e+00 | 377.322810 | 3.032655e+07 | 2.263532e+06 | 4.696135e+01 | 0 | 1.510445e+07 | 7.598440 | 1.285316e+02 | ... | 1004.074200 | 1.476200 | 0 | 2.190604e+07 | 94.282910 | 6712.972800 | 3.167496e+07 | 825.256940 | 0.036510 | 0 |
4 rows × 42 columns
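A note on the pattern used for the region level: inside square brackets a comma is an ordinary character, so reg[2,4] matches reg2 and reg4 but would also match the literal string 'reg,'. For the labels of the test MRIO the shorter class reg[24] selects the same regions. The following sketch checks this with the standard re module on plain label lists copied from the outputs above; it does not call pymrio, and the helper fullmatching is a hypothetical name.

```python
import re

regions = ["reg1", "reg2", "reg3", "reg4", "reg5", "reg6"]
sectors = ["food", "mining", "manufactoring", "electricity",
           "construction", "trade", "transport", "other"]


def fullmatching(labels, pattern):
    """Return the labels that the pattern matches completely."""
    return [label for label in labels if re.fullmatch(pattern, label)]


with_comma = fullmatching(regions, "reg[2,4]")
without_comma = fullmatching(regions, "reg[24]")
m_sectors = fullmatching(sectors, "m.*")

print(with_comma)     # ['reg2', 'reg4']
print(without_comma)  # ['reg2', 'reg4']
print(m_sectors)      # ['mining', 'manufactoring']
```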