Re: Start of the #ontology sub-group #ontology


Massimo Pizzol
 

Dear Ontology/RDF group

 

We have a meeting Friday and I would like to share some points for discussion.

 

I am thinking a lot about our ontology and there are two pressing issues that I hope we can clarify.

 

  1. The use of “input” and “output” subclasses

 

Bo has suggested the principles below as arguments for NOT introducing subclasses like “product”, “emission”, and “waste”.

 

>>> Principle: We try to avoid making fixed choices, like sign nomenclatures, that are only useful in specific contexts.

>>> Principle: It is a good practice for a model to stay as close to reality as possible

>>> Principle: Do not introduce unnecessary (obligatory) classifications

 

I agree with these principles and I think it makes sense not to have the subclasses product/waste/etc. My problem is that I don’t see how the choice of using the “input” and “output” subclasses fits with these principles. It is a sign convention, useful only in specific contexts, and it is an obligatory classification. I don’t know whether classifying things as “input” and “output” is any closer to reality than classifying them as “product”, “emission”, or “waste”. Thus, my preference is to remove the “input” and “output” subclasses and keep only the “isInputof” and “isOutputof” predicates.

 

So far the arguments for using the input and output subclasses have been:

 

>>> at the ontology level to restrict the domain of the input and output relationships.

I am not totally clear on what this means. My concern is whether the use of subclasses unnecessarily increases the complexity of the model because, assuming I have understood things correctly, there would be two instances of e.g. a “coal” flow. One is the “coal input flow” and the other is the “coal output flow”, each with a different URI. So if you are looking at the instance “electricity production” you will find it is related to a specific URI for coal input, and if you are looking at the instance “coal production” you will find it is related to another, different URI for coal output. So the same thing (coal) in physical reality is now described by two different codes.
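
The concern can be sketched in plain Python, with triples written as tuples. This is only an illustration under assumptions: every `ex:` identifier, the class names `ex:Input` and `ex:Output`, and the exact spelling of the predicates are made up here and are not taken from the actual ontology.

```python
# Triples as (subject, predicate, object). All ex: names are hypothetical.

# Option A: input/output subclasses, so coal appears as two flow instances.
with_subclasses = {
    ("ex:coal_input_flow", "rdf:type", "ex:Input"),
    ("ex:coal_input_flow", "ex:isInputOf", "ex:electricity_production"),
    ("ex:coal_output_flow", "rdf:type", "ex:Output"),
    ("ex:coal_output_flow", "ex:isOutputOf", "ex:coal_production"),
}

# Option B: predicates only, so a single flow instance carries both relations.
predicates_only = {
    ("ex:coal_flow", "ex:isInputOf", "ex:electricity_production"),
    ("ex:coal_flow", "ex:isOutputOf", "ex:coal_production"),
}


def flow_uris(triples):
    """Return the distinct subjects that take part in an input/output relation."""
    return {s for s, p, o in triples if p in ("ex:isInputOf", "ex:isOutputOf")}


print(sorted(flow_uris(with_subclasses)))   # two URIs for the same coal
print(sorted(flow_uris(predicates_only)))   # one URI
```

Under these assumptions, option A needs two identifiers for the same physical coal, while option B needs one.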

>>>Assume you are looking at a specific instance of 10 tonnes of coal in your database, then you ask yourself “is this an input for something or an output of something?”

My view is that in physical reality 10 kg of coal is not the output or the input of something in absolute terms. It is just coal, i.e. an object. Whether it is an input or an output is determined only in relative terms, i.e. in relation to another object (an activity). Coal is an output of coal production. Coal is an input to electricity production. I would instead ask this type of question: “What is this 10 kg of coal associated with?” And what I would expect to find out is that it is the output of a coal production activity and the input of an electricity production activity.

 

>>>for sure you could find the answer by checking "is this the source of inputOf relationships?",

This sounds really nice IMO! I was thinking this was actually how we should find out about things. I also guess that this is a competency question? I would like to better understand why this is not sufficiently “operational”.
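
As a rough illustration of how such a question could be answered from the predicates alone, here is a minimal sketch in plain Python. The `ex:` identifiers, the predicate spellings and the helper names `associations` and `is_input` are all assumptions made for this example, not part of the ontology.

```python
# One coal flow, described only through predicates (hypothetical names).
triples = {
    ("ex:coal_flow", "ex:isOutputOf", "ex:coal_production"),
    ("ex:coal_flow", "ex:isInputOf", "ex:electricity_production"),
}


def associations(flow, triples):
    """Answer "what is this flow associated with?" using predicates only."""
    roles = {"ex:isInputOf": "input of", "ex:isOutputOf": "output of"}
    return sorted((roles[p], o) for s, p, o in triples if s == flow and p in roles)


def is_input(flow, triples):
    """Competency question: is this flow the source of any isInputOf relation?"""
    return any(s == flow and p == "ex:isInputOf" for s, p, o in triples)


for role, activity in associations("ex:coal_flow", triples):
    print("ex:coal_flow is", role, activity)
print(is_input("ex:coal_flow", triples))
```

In this sketch the role of the flow is always reported relative to an activity, and no input or output class is needed to answer the question.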

 

>>>but operationally you can ask "what is the type of this?" And in my view this would be the correct way to do this because something is either input or output

I argued above that in physical reality something is not either input or output in absolute terms. Anyway, we could certainly ask the question “what is the type of this?” with reference to whether something is classified as input or output. But if we start asking this type of question for the input vs output classification, then, to be consistent, why not ask the same type of question for every other possible classification? For example, whether something is a product exchange or an environmental exchange. I could ask “Is CO2 an emission or a product?” But Bo has argued, based on the principles above, that this is not a relevant question. So why is the question relevant for input and output?

 

  2. In general, I am unclear on how closely we should adhere to existing LCA frameworks.

 

>>> LCA people all have their own mental model.

On the one hand, I agree we should keep an open mind and not be constrained by specific mental models. On the other hand, I also understand that we are doing this for the use of “LCA people” too. I thought one of our purposes was to create an infrastructure to support LCAs (e.g. because by making specific queries one can get LCA datasets). If our purpose is to make an ontology that is valid for all models in all disciplines, from economics to environmental sciences, then perhaps the terms “input” and “output” are the most generic ones (they can apply to anything from a tree to a whole country's economy) and this might be sufficient (preferably as predicates, as I argued above). However, in order to use the linked data to create some LCIs, we would need some way of separating what belongs in the A matrix (products), what belongs in the B matrix (substances, costs, or many other things), and what is the reference flow, because this is what LCA people are used to working with. So perhaps we have to allow for the possibility of identifying this LCA-specific information. With the current ontology the “only” information we can obtain from e.g. the graph of steel production is a list of inputs and outputs. So how do I tell that steel, rather than CO2, is the reference flow of steel production?
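
The gap can be illustrated with a small sketch in plain Python. Everything in it is an assumption made for illustration: the `ex:` identifiers, the predicate `ex:isReferenceFlowOf`, the classes `ex:Product` and `ex:Emission`, and the idea of keeping the LCA-specific statements in a separate set are not part of the current ontology.

```python
# Hypothetical graph for steel production, using only input/output predicates.
triples = {
    ("ex:steel", "ex:isOutputOf", "ex:steel_production"),
    ("ex:co2", "ex:isOutputOf", "ex:steel_production"),
    ("ex:coal", "ex:isInputOf", "ex:steel_production"),
}


def outputs(activity, triples):
    """List the outputs of an activity; steel and CO2 look alike here."""
    return sorted(s for s, p, o in triples if p == "ex:isOutputOf" and o == activity)


# Assumed extra, LCA-specific statements kept apart from the core graph.
lca_view = {
    ("ex:steel", "ex:isReferenceFlowOf", "ex:steel_production"),
    ("ex:steel", "rdf:type", "ex:Product"),
    ("ex:coal", "rdf:type", "ex:Product"),
    ("ex:co2", "rdf:type", "ex:Emission"),
}


def split_flows(activity, triples, lca_view):
    """Split an activity's flows into A-matrix rows, B-matrix rows and reference flow."""
    products = {s for s, p, o in lca_view if p == "rdf:type" and o == "ex:Product"}
    flows = {s for s, p, o in triples
             if o == activity and p in ("ex:isInputOf", "ex:isOutputOf")}
    a_rows = sorted(flows & products)   # product flows -> A matrix
    b_rows = sorted(flows - products)   # everything else -> B matrix
    reference = sorted(s for s, p, o in lca_view
                       if p == "ex:isReferenceFlowOf" and o == activity)
    return a_rows, b_rows, reference


print(outputs("ex:steel_production", triples))
print(split_flows("ex:steel_production", triples, lca_view))
```

With `triples` alone there is nothing to tell steel from CO2. Only the additional, optional statements in `lca_view` make the A/B split and the reference flow recoverable, which is one possible way of allowing LCA-specific information without making it obligatory.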

 

 

I hope this was useful, and I am looking forward to a good discussion on Friday.
Massimo

 

 
