The issue is not about interpreting or deleting data, but
simply about making its import consistent with our
ontology. The Exiobase SUT data includes two kinds of data:
1) inputs and outputs of industries ("production
activities"), and 2) inputs and outputs of bi-lateral trade
("market activities"). Both kinds of flow data being SUT
data, they do not have location as a property, i.e. the
input flows are not linked to an origin, and the output
flows are not linked to a destination. This linking is
what happens when we produce the Direct Requirement Matrix
using the product system algorithms on the SUT.
Since the Exiobase SUT has a convention of integrating
the bi-lateral market data in the data for each industry,
our import algorithm needs to separate these two kinds of
data, to make them consistent with the ontology, but more
importantly to make them useful for the later linking.
This is done by:
1) Placing the information on bi-lateral trade flows in
their respective market activities (for each of the 169
products, there are 49*49 bi-lateral markets, many of
which will be empty (having no flows), and may therefore
be ignored). This takes care of the disaggregated import
data of the industries.
2) Aggregating the disaggregated import data of the
industries, so that each industry only has 169 imported
products, not the current 49*169 (since that information
is already present in the above bi-lateral trade data).
This way of organising the Exiobase import preserves all
data intact, and now in a more meaningful format that
allows use by any relevant product system algorithm.
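As a rough illustration of the two steps above (a minimal
sketch, not the actual import code): I assume here that each
use flow is keyed by (origin, product, destination, industry).
The function name, the tuple layout and the region, product
and industry labels are all made up for the example.

```python
from collections import defaultdict


def split_use_flows(use_flows):
    """Split origin-specific use flows into market flows and industry inputs.

    use_flows maps (origin, product, destination, industry) -> amount.
    This key layout is an assumption made for illustration only.
    """
    markets = defaultdict(float)          # (product, origin, destination) -> amount
    industry_inputs = defaultdict(float)  # (destination, industry, product) -> amount
    for (origin, product, destination, industry), amount in use_flows.items():
        if amount == 0:
            continue  # empty markets carry no flows and are ignored
        # Step 1: the origin information goes to the bi-lateral market activity.
        markets[(product, origin, destination)] += amount
        # Step 2: the industry keeps one input per product, summed over origins.
        industry_inputs[(destination, industry, product)] += amount
    return dict(markets), dict(industry_inputs)


# Hypothetical toy data: wheat used by two German industries.
use_flows = {
    ("DK", "wheat", "DE", "bakery"): 2.0,
    ("FR", "wheat", "DE", "bakery"): 3.0,
    ("FR", "wheat", "DE", "brewery"): 1.0,
    ("SE", "wheat", "DE", "bakery"): 0.0,
}
markets, industry_inputs = split_use_flows(use_flows)
```

In this toy case the bakery ends up with a single wheat input
of 5.0, the origins are kept only in the market activities,
and the total flow is the same in both views.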
Of course, this transformation is completely transparent
and an alternative could be to make an Exiobase ontology
term for this "exiobase:import origin" and use this for
importing the Exiobase data to RDF with this term attached
and then do the "stripping" to BONSAI ontology in RDF.
However, this would set a precedent for making RDF
ontologies for all the other strange data formats that people
wish to provide, and for making RDF converters for these. I do
not think that is a road that we would want to go down.
The whole purpose of the BONSAI ontology is to be lean and
nevertheless complete enough to allow loss-less import of
all the different kinds of data people may wish to provide.
I hope this clarifies the reason for staying with the
BONSAI ontology on this point and adapting the import so
that loss-less import of Exiobase is nevertheless possible.
On 2019-11-07 at 12:28, Chris Mutel wrote:
Sorry, I don't want to be a pain in the ass, but I think we are going
a bit down the road of Stalin's ideological purity here... In an ideal
world, we would have separate sources of trade data, and could ignore
the EXIOBASE trade "assumptions"; but in this ideal world, we could
also take the SU tables directly from each country. EXIOBASE has the
trade information, and it is balanced. We need this trade data, and
don't really want to start a whole new project to get it from another
source (and clean it!). Bo says that "In a true SUT, the flows enter
and leave an activity but do not yet have information on their origin
and destination," but EXIOBASE is not just a SUT, it is also trade data.
"The EXIOBASE SUT is overspecified in this sense that it already has
interpreted the information in the trade statistics in a specific
(attributional) way. This error should not be imported into the BONSAI
implementation, which should leave the user free to link SUT
activities with different linking algorithms." But we are free to
(re-)link SUT activities with different linking algorithms, even if we
import this data! All data in BONSAI are factual claims that we can
use or ignore as we wish.
We go here to a fundamental decision for the entire project, namely:
Should we let our collective or individual biases lead to data
modification **before** it enters the system? It was my impression
that our consensus decision from the hackathon was that we do not
alter or delete data before it enters the system, unless such
modification would never be controversial in any way (i.e. unit
conversions or changing labels in cases where there is zero
ambiguity). Did this change? I don't accept that it changed in a
comment in a Github issue where two people reported that they
discussed something offline.
On Wed, 6 Nov 2019 at 13:27, Matteo Lissandrini (AAU) <matteo@...> wrote:
Will this require writing the Exiobase RDF converter from scratch? Do I understand correctly, or is this about some other data?
But we need to update this software anyway to a) make it a proper
installable package, b) follow the URI schema that we are now using,
and c) fill in all the "TODO"s in the code.
Ok, I had the task to update this script for the USE table, as per other email, but then I'm blocked until I have the new output.
This means that for now, the published data is only for the supply table.
We will need to re-sync for completing this work.