The issue is not about interpreting or deleting data, but simply
about making its import consistent with our ontology. The
Exiobase SUT data includes two kinds of data: 1) inputs and outputs
of industries ("production activities") and 2) inputs and outputs of
bi-lateral trade ("market activities"). Since both kinds of flow data
are SUT data, they do not have location as a property, i.e. the
input flows are not linked to an origin, and the output flows are
not linked to a destination. This linking is what happens when we
produce the Direct Requirement Matrix using the product system
algorithms on the SUT.
Since the Exiobase SUT has a convention of integrating the
bi-lateral market data in the data for each industry, our import
algorithm needs to separate these two kinds of data, to make them
consistent with the ontology, but more importantly to make them
useful for the later linking.
This is done by:
1) Placing the information on bi-lateral trade flows in their
respective market activities (for each of the 169 products, there
are 49*49 bi-lateral markets, many of which will be empty (having
no flows), and may therefore be ignored). This takes care of the
disaggregated import data of the industries.
2) Aggregating the disaggregated import data of the industries,
so that each industry has only 169 imported products, not the
current 49*169 (since that information is already present in the
above bi-lateral trade data).
This way of organising the Exiobase import keeps all data intact,
now in a more meaningful format that can be used by any relevant
product system algorithm.
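The two steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual import script: the record layout (product, origin region, destination region, industry, value), the function name split_use_records and the sample figures are all hypothetical.

```python
from collections import defaultdict


def split_use_records(records):
    """Split origin-specific industry inputs into two views.

    Each record is a hypothetical tuple:
    (product, origin_region, destination_region, industry, value).
    """
    # Step 1: bi-lateral market activities, keyed by (product, origin, destination).
    markets = defaultdict(float)
    # Step 2: industry inputs with the origin dimension summed away,
    # keyed by (region of the industry, industry, product).
    industry_inputs = defaultdict(float)

    for product, origin, destination, industry, value in records:
        if value == 0:
            # Empty flows create no market and add nothing to the industry.
            continue
        markets[(product, origin, destination)] += value
        industry_inputs[(destination, industry, product)] += value

    return dict(markets), dict(industry_inputs)


# Made-up sample data, for illustration only.
records = [
    ("steel", "DE", "DK", "construction", 4.0),
    ("steel", "SE", "DK", "construction", 6.0),
    ("steel", "FR", "DK", "construction", 0.0),  # empty flow, ignored
    ("steel", "DE", "DK", "machinery", 1.5),
]

markets, industry_inputs = split_use_records(records)

# The same total quantity appears in both views, so nothing is lost.
total_in_markets = sum(markets.values())
total_in_industries = sum(industry_inputs.values())
```

In this sketch the origin information lives only in the market keys, while each industry keeps a single entry per product, which is the separation described in steps 1) and 2).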
Of course, this transformation is completely transparent. An
alternative could be to create an Exiobase ontology term for this,
"exiobase:import origin", use it when importing the Exiobase data to
RDF, and then do the "stripping" to the BONSAI ontology in RDF.
However, this would set a precedent for making RDF ontologies, and
RDF converters, for all the other strange data formats that people
wish to provide. I do not think that is a road we want to go down.
The whole purpose of the BONSAI ontology is to be lean and
nevertheless complete enough to allow loss-less import of all the
different kinds of data people may wish to provide.
I hope this clarifies the reason for staying with the BONSAI
ontology on this point and for adapting the import so that loss-less
import of Exiobase is nevertheless possible.
On 2019-11-07 at 12:28, Chris Mutel wrote:
Sorry, I don't want to be a pain in the ass, but I think we are going
a bit down the road of Stalin's ideological purity here... In an ideal
world, we would have separate sources of trade data, and could ignore
the EXIOBASE trade "assumptions"; but in this ideal world, we could
also take the SU tables directly from each country. EXIOBASE has the
trade information, and it is balanced. We need this trade data, and
don't really want to start a whole new project to get it from another
source (and clean it!). Bo says that "In a true SUT, the flows enter
and leave an activity but do not yet have information on their origin
and destination," but EXIOBASE is not just a SUT, it is also trade data.
"The EXIOBASE SUT is overspecified in this sense that it already has
interpreted the information in the trade statistics in a specific
(attributional) way. This error should not be imported into the BONSAI
implementation, which should leave the user free to link SUT
activities with different linking algorithms." But we are free to
(re-)link SUT activities with different linking algorithms, even if we
import this data! All data in BONSAI are factual claims that we can
use or ignore as we wish.
We go here to a fundamental decision for the entire project, namely:
Should we let our collective or individual biases lead to data
modification **before** it enters the system? It was my impression
that our consensus decision from the hackathon was that we do not
alter or delete data before it enters the system, unless such
modification would never be controversial in any way (i.e. unit
conversions or changing labels in cases where there is zero
ambiguity). Did this change? I don't accept that it changed in a
comment in a Github issue where two people reported that they
discussed something offline.
On Wed, 6 Nov 2019 at 13:27, Matteo Lissandrini (AAU) <matteo@...> wrote:
Will this require writing the Exiobase RDF converter from scratch? Do I understand correctly, or is this about some other data?
But we need to update this software anyway to a) make it a proper
installable package, b) follow the URI schema that we are now using,
and c) fill in all the "TODO"s in the code.
OK, I had the task of updating this script for the USE table, as per the other email, but I'm then blocked until I have the new output.
This means that for now, the published data is only for the supply table.
We will need to re-sync for completing this work.