The issue is not about interpreting or deleting data, but simply about making the import consistent with our ontology. The Exiobase SUT data includes two kinds of data: 1) inputs and outputs of industries ("production activities") and 2) inputs and outputs of bi-lateral trade ("market activities"). Because both kinds of flow data are SUT data, they do not have location as a property, i.e. the input flows are not linked to an origin, and the output flows are not linked to a destination. This linking is what happens when we produce the Direct Requirement Matrix using the product system algorithms on the SUT.
Since the Exiobase SUT has a convention of integrating the
bi-lateral market data in the data for each industry, our import
algorithm needs to separate these two kinds of data, to make them
consistent with the ontology, but more importantly to make them
useful for the later linking.
This is done by:
1) Placing the information on bi-lateral trade flows in their
respective market activities (for each of the 169 products, there
are 49*49 bi-lateral markets, many of which will be empty (having
no flows), and may therefore be ignored). This takes care of the
disaggregated import data of the industries.
2) Aggregating the disaggregated import data of the industries, so that each industry only has 169 imported products, not the current 49*169 (since that information is already present in the above bi-lateral trade data).
This way of organising the Exiobase import preserves all data
intact, and now in a more meaningful format that allows use by any
relevant product system algorithm.
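
As a rough illustration of the two steps above, here is a minimal Python sketch. It is not the actual BONSAI importer: the record layout `(origin, product, destination, industry, amount)`, the function name `split_use_records` and the toy numbers are all assumptions made for the example. It only shows that the same input records can feed both the bi-lateral market activities and the aggregated industry inputs, and that the totals match afterwards.

```python
from collections import defaultdict

# Hypothetical toy records in an Exiobase-style layout (not real data):
# (origin_region, product, destination_region, industry, amount)
use_records = [
    ("DK", "wheat", "DE", "bakery", 5.0),
    ("FR", "wheat", "DE", "bakery", 3.0),
    ("DE", "wheat", "DE", "bakery", 2.0),
    ("FR", "wheat", "DE", "brewery", 1.0),
]


def split_use_records(records):
    """Split origin-specific industry inputs into two views.

    Returns a pair of dicts:
      market_flows:    (product, origin, destination) -> amount   (step 1)
      industry_inputs: (destination, industry, product) -> amount (step 2)
    """
    market_flows = defaultdict(float)
    industry_inputs = defaultdict(float)
    for origin, product, destination, industry, amount in records:
        if amount == 0:
            # Empty markets carry no flows and can be ignored.
            continue
        # Step 1: the trade information goes to the bi-lateral market.
        market_flows[(product, origin, destination)] += amount
        # Step 2: the industry keeps one input per product, summed over origins.
        industry_inputs[(destination, industry, product)] += amount
    return dict(market_flows), dict(industry_inputs)


market_flows, industry_inputs = split_use_records(use_records)

# Both views must account for the same total amount as the input records.
total_in = sum(record[4] for record in use_records)
assert sum(market_flows.values()) == total_in
assert sum(industry_inputs.values()) == total_in
```

In this toy case the bakery in DE ends up with a single wheat input, while the three origins of that wheat are recorded on the wheat markets into DE. The sketch checks totals only; whether every detail of the original table can be reconstructed from the two views depends on how the source data were built.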
Of course, this transformation is completely transparent. An alternative could be to create an Exiobase ontology term for this, "exiobase:import origin", import the Exiobase data to RDF with this term attached, and then do the "stripping" to the BONSAI ontology in RDF. However, this would set a precedent for making RDF ontologies for all the other strange data formats that people wish to provide, and for making RDF converters for these. I do not think that is a road that we would want to go down. The whole purpose of the BONSAI ontology is to be lean and nevertheless complete enough to allow loss-less import of all the different kinds of data people may wish to provide.
I hope this clarifies the reason for staying with the BONSAI ontology on this point and adapting the import so that loss-less import of Exiobase is nevertheless possible.
Sorry, I don't want to be a pain in the ass, but I think we are going a bit down the road of Stalin's ideological purity here... In an ideal world, we would have separate sources of trade data, and could ignore the EXIOBASE trade "assumptions"; but in this ideal world, we could also take the SU tables directly from each country. EXIOBASE has the trade information, and it is balanced. We need this trade data, and don't really want to start a whole new project to get it from another source (and clean it!).
Bo says that "In a true SUT, the flows enter and leave an activity but do not yet have information on their origin and destination," but EXIOBASE is not just a SUT, it is also trade data.
"The EXIOBASE SUT is overspecified in this sense that it already has interpreted the information in the trade statistics in a specific (attributional) way. This error should not be imported into the BONSAI implementation, which should leave the user free to link SUT activities with different linking algorithms."
But we are free to (re-)link SUT activities with different linking algorithms, even if we import this data! All data in BONSAI are factual claims that we can use or ignore as we wish.
This brings us to a fundamental decision for the entire project, namely: Should we let our collective or individual biases lead to data modification **before** it enters the system? It was my impression that our consensus decision from the hackathon was that we do not alter or delete data before it enters the system, unless such modification would never be controversial in any way (i.e. unit conversions or changing labels in cases where there is zero ambiguity). Did this change? I don't accept that it changed in a comment in a GitHub issue where two people reported that they discussed something offline.
On Wed, 6 Nov 2019 at 13:27, Matteo Lissandrini (AAU) <matteo@...> wrote:
Will this require writing the Exiobase RDF converter from scratch? Do I understand correctly, or is this about some other data?
But we need to update this software anyway to a) make it a proper installable package, b) follow the URI schema that we are now using, and c) fill in all the "TODO"s in the code.
Ok, I had the task to update this script for the USE table, as per the other email, but then I'm blocked until I have the new output. This means that for now, the published data is only for the supply table. We will need to re-sync to complete this work.
Thanks,
Matteo