Re: #rdf #issues #exiobase

Stefano Merciai

Hi Chris, trade data is only in the USE tables. The SUPPLY tables just show the production of activities, and the location of the outputs is the same as that of the activities which produce them. Roughly speaking, the supply table is a diagonalized vector.

On 08/11/2019 10:59, Chris Mutel wrote:
Thanks Bo, this is clear and (at least in my opinion) completely consistent with all group decisions. Maybe I just missed it - entirely possible! - but the issue discussions did not seem to have this clear framework. I will quote from this liberally when updating the software documentation.

If I understand correctly, we take the classic activity (maybe transforming activity :) and market approach, in that:
- Activities consume from and produce to markets
- All trade is between markets

The EXIOBASE importer will then need to create triples:
For each activity:
  In each place:
    For each flow object:
      A supply flow to the national market, if non-zero
      A use flow from the national market, if non-zero

For each place alpha
  For each other place beta
    For each flow object
      A trade flow from the national market in alpha to the national market in beta (trade volume is the sum of data in EXIOBASE)
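
The two loops above can be sketched in Python as follows. This is only an illustration under assumed inputs, not the actual importer: the function name build_flows, the dictionary layouts and the toy activity, place and product names are all hypothetical. Supply records are keyed by (activity, place, flow object), and use records carry an extra origin place, which is how the EXIOBASE use data is described elsewhere in this thread. Summing use over origins gives the flow from the national market, and summing over consuming activities gives the trade volume between two markets.

```python
from collections import defaultdict


def build_flows(supply, use):
    """Turn toy supply/use records into supply, use and trade flows.

    supply: {(activity, place, flow_object): amount}
    use:    {(activity, place, flow_object, origin_place): amount}
    Both layouts are assumptions made for this sketch.
    """
    flows = []

    # Supply flows: activity -> national market of its own place.
    for (activity, place, obj), amount in supply.items():
        if amount != 0:
            flows.append(("supply", activity, place, obj, amount))

    # Use flows: national market -> activity, summed over all origins.
    use_totals = defaultdict(float)
    # Trade flows: market in alpha -> market in beta, summed over activities.
    trade_totals = defaultdict(float)
    for (activity, place, obj, origin), amount in use.items():
        use_totals[(activity, place, obj)] += amount
        if origin != place:
            trade_totals[(origin, place, obj)] += amount

    for (activity, place, obj), amount in use_totals.items():
        if amount != 0:
            flows.append(("use", activity, place, obj, amount))
    for (alpha, beta, obj), amount in trade_totals.items():
        if amount != 0:
            flows.append(("trade", alpha, beta, obj, amount))
    return flows


# Hypothetical toy data: German steel used by two French activities.
supply = {("steel_making", "DE", "steel"): 10.0}
use = {
    ("car_making", "FR", "steel", "DE"): 3.0,
    ("car_making", "FR", "steel", "FR"): 1.0,
    ("ship_building", "FR", "steel", "DE"): 2.0,
}
flows = build_flows(supply, use)
```

In this toy case the two German-origin entries collapse into a single DE-to-FR trade flow, while each French activity keeps one use flow from its national market.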

The EXIOBASE RDF URI creator will need to create:

For each activity
  In each place
    An activity
    A market activity (currently missing, AFAICT)
For each flow object
  A flow object
For each place
  A place
  For every other place
    A trade activity (does trade need to be flow-object specific? In the future, does it need to be specific to transport mode?) (currently missing, AFAICT)
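
A rough sketch of that enumeration, assuming nothing about the real URI schema: the base URI http://example.org/ is a placeholder, and make_uri, create_uris and the path segments are invented for illustration. The sketch follows the list above literally (one market activity per activity and place) and creates one trade activity per ordered pair of places, which leaves open the question of whether trade should be specific to a flow object or transport mode.

```python
from itertools import permutations
from urllib.parse import quote

BASE = "http://example.org/"  # placeholder, not the real BONSAI namespace


def make_uri(kind, *parts):
    """Build a URI from a kind and name parts (hypothetical scheme)."""
    slug = "_".join(quote(p.replace(" ", "_"), safe="") for p in parts)
    return f"{BASE}{kind}/{slug}"


def create_uris(activities, places, flow_objects):
    uris = []
    for activity in activities:
        for place in places:
            uris.append(make_uri("activity", activity, place))
            uris.append(make_uri("market", activity, place))
    for obj in flow_objects:
        uris.append(make_uri("flowobject", obj))
    for place in places:
        uris.append(make_uri("place", place))
    # One trade activity per ordered pair of distinct places.
    for alpha, beta in permutations(places, 2):
        uris.append(make_uri("trade", alpha, beta))
    return uris


uris = create_uris(
    activities=["steel making", "car making"],
    places=["DE", "FR", "IT"],
    flow_objects=["steel", "cars"],
)
```

With 2 activities, 3 places and 2 flow objects this gives 6 activities, 6 market activities, 2 flow objects, 3 places and 6 trade activities.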

One minor technical question - do we take trade data from the supply or use table? From my uninformed perspective, I would expect this data to be the same in both tables, but I don't underestimate the ability of data providers to surprise me anymore!

Comments and clarifications welcome!


On Thu, 7 Nov 2019 at 15:01, Bo Weidema <bo.weidema@...> wrote:

Dear Chris,

The issue is not about interpreting or deleting data, but simply to make the import of it consistent with our ontology. The Exiobase SUT data includes two kinds of data 1) inputs and outputs of industries ("production activities") 2) inputs and outputs of bi-lateral trade ("market activities"). Both kinds of flow data being SUT data, they do not have location as a property, i.e. the input flows are not linked to an origin, and the output flows are not linked to a destination. This linking is what happens when we produce the Direct Requirement Matrix using the product system algorithms on the SUT.

Since the Exiobase SUT has a convention of integrating the bi-lateral market data in the data for each industry, our import algorithm needs to separate these two kinds of data, to make them consistent with the ontology, but more importantly to make them useful for the later linking.

This is done by:

1) Placing the information on bi-lateral trade flows in their respective market activities (for each of the 169 products, there are 49*49 bi-lateral markets, many of which will be empty (having no flows), and may therefore be ignored). This takes care of the disaggregated import data of the industries.

2) Aggregating the disaggregated import data of the industries, so that each industry only has 169 imported products, not the current 49*169 (since that information is already present in the above bi-lateral trade data).

This way of organising the Exiobase import preserves all data intact, and now in a more meaningful format that allows use by any relevant product system algorithm.

Of course, this transformation is completely transparent and an alternative could be to make an Exiobase ontology term for this "exiobase:import origin" and use this for importing the Exiobase data to RDF with this term attached and then do the "stripping" to BONSAI ontology in RDF. However, this would create a precedent for making RDF ontologies for all other strange data formats that people wish to provide and making RDF converters for these. I do not think that is a road that we would want to go down. The whole purpose of the BONSAI ontology is to be lean and nevertheless complete enough to allow loss-less import of all the different kinds of data people may wish to provide.

I hope this clarifies the reason for staying with the BONSAI ontology on this point and for adapting the import so that loss-less import of Exiobase nevertheless is possible.

Best regards


Den 2019-11-07 kl. 12.28 skrev Chris Mutel:
Sorry, I don't want to be a pain in the ass, but I think we are going
a bit down the road of Stalin's ideological purity here... In an ideal
world, we would have separate sources of trade data, and could ignore
the EXIOBASE trade "assumptions"; but in this ideal world, we could
also take the SU tables directly from each country. EXIOBASE has the
trade information, and it is balanced. We need this trade data, and
don't really want to start a whole new project to get it from another
source (and clean it!). Bo says that "In a true SUT, the flows enter
and leave an activity but do not yet have information on their origin
and destination," but EXIOBASE is not just a SUT, it is also trade

"The EXIOBASE SUT is overspecified in this sense that it already has
interpreted the information in the trade statistics in a specific
(attributional) way. This error should not be imported into the BONSAI
implementation, which should leave the user free to link SUT
activities with different linking algorithms." But we are free to
(re-)link SUT activities with different linking algorithms, even if we
import this data! All data in BONSAI are factual claims that we can
use or ignore as we wish.

We go here to a fundamental decision for the entire project, namely:
Should we let our collective or individual biases lead to data
modification **before** it enters the system? It was my impression
that our consensus decision from the hackathon was that we do not
alter or delete data before it enters the system, unless such
modification would never be controversial in any way (i.e. unit
conversions or changing labels in cases where there is zero
ambiguity). Did this change? I don't accept that it changed in a
comment in a Github issue where two people reported that they
discussed something offline.

On Wed, 6 Nov 2019 at 13:27, Matteo Lissandrini (AAU) <matteo@...> wrote:
Will this require writing the Exiobase RDF converter from scratch? Do I understand correctly, or is this about some other data?
But we need to update this software anyway to a) make it a proper
installable package, b) follow the URI schema that we are now using,
and c) fill in all the "TODO"s in the code.
Ok, I had the task to update this script for the USE table, as per other email, but then I'm blocked until I have the new output.

This means that for now, the published data is only for the supply table.

We will need to re-sync for completing this work.



Chris Mutel
Technology Assessment Group, LEA
Paul Scherrer Institut
5232 Villigen PSI
Telefon: +41 56 310 5787

