Re: #rdf #issues #exiobase


Matteo Lissandrini (AAU)
 

Hi all,


I am not in a position to take sides.

I just want to provide one clarification here.


Both options (keep the data as is in Exiobase, or re-aggregate) are compatible with the ontology. I would say we should be quite happy with this result in itself; it is no minor feat! :)


Both options require an extra step when importing the data:

 - Keeping the data as is requires instantiating the "activity" that links the flow with the location, so that the flow is an output of that activity. I say "instantiating" because this is the same thing we do when we instantiate the URI for Paddy-rice or for Rest of Europe, so it is not a change in the ontology.

 - Re-aggregating requires the conversion to take all the split flows and sum them up into a single flow.
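
To make the two extra steps concrete, here is a minimal Python sketch on toy data. Everything in it is an assumption for illustration: the row layout, the example.org namespace, the URI pattern and the function names are hypothetical and are not taken from the actual Exiobase converter or the BONSAI URI schema.

```python
from collections import defaultdict

# Toy use-table rows (hypothetical layout):
# (consuming industry, its region, product, origin region, amount)
rows = [
    ("Milling", "DK", "Paddy rice", "IT", 3.0),
    ("Milling", "DK", "Paddy rice", "WE", 2.0),  # "WE" stands in for Rest of Europe
    ("Milling", "DK", "Wheat", "DE", 4.0),
]

BASE = "http://example.org/activity/"  # placeholder namespace, not a BONSAI URI


def keep_as_is(rows):
    """Option 1: instantiate one activity per (product, origin) and make
    each flow an output of that activity, so the origin stays attached."""
    flows = []
    for industry, region, product, origin, amount in rows:
        activity = BASE + f"{product}-{origin}".replace(" ", "_")
        flows.append({
            "output_of": activity,
            "input_of": f"{industry}-{region}",
            "product": product,
            "amount": amount,
        })
    return flows


def re_aggregate(rows):
    """Option 2: sum all flows split by origin into a single flow
    per (industry, region, product)."""
    totals = defaultdict(float)
    for industry, region, product, _origin, amount in rows:
        totals[(industry, region, product)] += amount
    return dict(totals)
```

In this sketch neither function touches the ontology: option 1 only instantiates more activities, option 2 only produces fewer flows.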



Cheers,

Matteo







---
Matteo Lissandrini

Department of Computer Science
Aalborg University

http://people.cs.aau.dk/~matteo






From: main@bonsai.groups.io <main@bonsai.groups.io> on behalf of Bo Weidema via Groups.Io <bo.weidema@...>
Sent: Thursday, November 7, 2019 3:01:32 PM
To: main@bonsai.groups.io
Subject: Re: [bonsai] #rdf #issues #exiobase
 

Dear Chris,

The issue is not about interpreting or deleting data, but simply about making its import consistent with our ontology. The Exiobase SUT data includes two kinds of data: 1) inputs and outputs of industries ("production activities"), and 2) inputs and outputs of bi-lateral trade ("market activities"). Since both kinds of flow data are SUT data, they do not have location as a property, i.e. the input flows are not linked to an origin, and the output flows are not linked to a destination. This linking is what happens when we produce the Direct Requirement Matrix using the product system algorithms on the SUT.

Since the Exiobase SUT has a convention of integrating the bi-lateral market data in the data for each industry, our import algorithm needs to separate these two kinds of data, to make them consistent with the ontology, but more importantly to make them useful for the later linking.

This is done by:

1) Placing the information on bi-lateral trade flows in their respective market activities (for each of the 169 products, there are 49*49 bi-lateral markets, many of which will be empty (having no flows), and may therefore be ignored). This takes care of the disaggregated import data of the industries.

2) Aggregating the disaggregated import data of the industries, so that each industry only has 169 imported products, not the current 49*169 (since that information is already present in the above bi-lateral trade data).

This way of organising the Exiobase import preserves all data intact, and now in a more meaningful format that allows use by any relevant product system algorithm.
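
The two steps above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual import algorithm: the row layout, the function name `split_import` and the toy values are hypothetical; only the counts 169 and 49 come from the text.

```python
from collections import defaultdict

N_PRODUCTS, N_REGIONS = 169, 49  # counts as stated in the text

# Toy disaggregated use rows (hypothetical layout):
# (industry, destination region, product, origin region, amount)
rows = [
    ("Milling", "DK", "Paddy rice", "IT", 3.0),
    ("Milling", "DK", "Paddy rice", "CN", 2.0),
    ("Baking", "DK", "Paddy rice", "IT", 1.0),
    ("Baking", "DK", "Wheat", "DE", 4.0),
    ("Baking", "DK", "Wheat", "FR", 0.0),  # empty flow, ignored below
]


def split_import(rows):
    """Step 1: put origin-specific amounts into bi-lateral markets.
    Step 2: aggregate each industry's inputs to one entry per product."""
    markets = defaultdict(float)          # (product, origin, destination) -> amount
    industry_inputs = defaultdict(float)  # (industry, destination, product) -> amount
    for industry, dest, product, origin, amount in rows:
        if amount == 0:
            continue  # empty markets may be ignored
        markets[(product, origin, dest)] += amount
        industry_inputs[(industry, dest, product)] += amount
    return dict(markets), dict(industry_inputs)


markets, industry_inputs = split_import(rows)

# Totals are conserved on both sides of the split.
total = sum(r[4] for r in rows)
assert sum(markets.values()) == total
assert sum(industry_inputs.values()) == total

# Sizes implied by the text.
markets_per_product = N_REGIONS * N_REGIONS          # 49*49 potential markets per product
inputs_before = N_REGIONS * N_PRODUCTS               # 49*169 entries per industry
inputs_after = N_PRODUCTS                            # 169 entries per industry
```

The check at the end only confirms that the summed amounts agree between the original rows, the markets and the aggregated industry inputs on this toy data.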

Of course, this transformation is completely transparent. An alternative could be to make an Exiobase ontology term for this ("exiobase:import origin"), use it for importing the Exiobase data to RDF with this term attached, and then do the "stripping" to the BONSAI ontology in RDF. However, this would set a precedent for making RDF ontologies for all the other strange data formats that people wish to provide, and for making RDF converters for these. I do not think that is a road that we would want to go down. The whole purpose of the BONSAI ontology is to be lean and nevertheless complete enough to allow loss-less import of all the different kinds of data people may wish to provide.

I hope this clarifies the reason for staying with the BONSAI ontology on this point and for adapting the import so that loss-less import of Exiobase is nevertheless possible.

Best regards

Bo

Den 2019-11-07 kl. 12.28 skrev Chris Mutel:
Sorry, I don't want to be a pain in the ass, but I think we are going
a bit down the road of Stalin's ideological purity here... In an ideal
world, we would have separate sources of trade data, and could ignore
the EXIOBASE trade "assumptions"; but in this ideal world, we could
also take the SU tables directly from each country. EXIOBASE has the
trade information, and it is balanced. We need this trade data, and
don't really want to start a whole new project to get it from another
source (and clean it!). Bo says that "In a true SUT, the flows enter
and leave an activity but do not yet have information on their origin
and destination," but EXIOBASE is not just a SUT, it is also trade
data.

"The EXIOBASE SUT is overspecified in this sense that it already has
interpreted the information in the trade statistics in a specific
(attributional) way. This error should not be imported into the BONSAI
implementation, which should leave the user free to link SUT
activities with different linking algorithms." But we are free to
(re-)link SUT activities with different linking algorithms, even if we
import this data! All data in BONSAI are factual claims that we can
use or ignore as we wish.

We go here to a fundamental decision for the entire project, namely:
Should we let our collective or individual biases lead to data
modification **before** it enters the system? It was my impression
that our consensus decision from the hackathon was that we do not
alter or delete data before it enters the system, unless such
modification would never be controversial in any way (i.e. unit
conversions or changing labels in cases where there is zero
ambiguity). Did this change? I don't accept that it changed in a
comment in a Github issue where two people reported that they
discussed something offline.

On Wed, 6 Nov 2019 at 13:27, Matteo Lissandrini (AAU) <matteo@...> wrote:

Will this require writing the Exiobase RDF converter from scratch? Do I understand correctly, or is this about some other data?

But we need to update this software anyway to a) make it a proper
installable package, b) follow the URI schema that we are now using,
and c) fill in all the "TODO"s in the code.
Ok, I had the task to update this script for the USE table, as per other email, but then I'm blocked until I have the new output.

This means that for now, the published data is only for the supply table.

We will need to re-sync for completing this work.

Thanks,
Matteo


