Start of the #ontology sub-group #ontology
Bo Weidema
This message is for those who signed up for the ontology sub-group (current members: Bo, Elias, Massimo); if this isn't you, you can mute the #ontology hashtag.

Based on the description by Chris and Matteo, I think the tasks for the ontology sub-group, to be distributed among us, are:

1) Upload a task description for the sub-group (do we need a project board?)

2) Coordinate tasks with the sub-group on RDF (Agneta, Matteo), as it is not completely clear where the borderline between these two groups is

3) Decide on namespacing for the part of the ontology that is BONSAI-specific (i.e., which does not reside elsewhere): Suggestion to use purl.org - or bonsai.org - any other suggestions or arguments for one or the other?

3a) Create a "proper" ontology definition, based on the content of
https://github.com/BONSAMURAIS/bonsai/wiki/Data-Storage#specify-minimum-core-data-and-metadata-formats
- in the form of a picture like the one in the "minimal ontology pattern". Matteo suggests using the vocabs QB and/or QB4OLAP for this. This also implies the creation of initial classifications for activities and flow-objects (including biosphere flows) as per
https://github.com/BONSAMURAIS/bonsai/wiki/Data%20Integration#classifications-and-correspondence-tables
- ensuring that we also include metadata for our testbench EXIOBASE dataset.

3b) Complement 3a) with an RDF schema, like in http://tcga.deri.ie/ or http://qweb.cs.aau.dk/qboairbase/

3c) Place the output of 3a) and 3b) in a new GitHub repo.

3d) Link from the wiki to 3c) and also provide there any arguments for choices made or alternatives considered.

Bo
|
|
I would suggest taking the US EPA flow list as the basis for the BONSAI flow list - they have put in quite a lot of work in basing their names and metadata on accepted standards and ontologies:
https://github.com/USEPA/Federal-LCA-Commons-Elementary-Flow-List
The processing scripts aren't complete, but you can download the current output here:
https://github.com/USEPA/Federal-LCA-Commons-Elementary-Flow-List/tree/master/fedelemflowlist/output
Wes said that they should have a complete 1.0 release in a few months.
On Tue, 5 Mar 2019 at 12:46, Bo Weidema <bo.weidema@bonsai.uno> wrote:
--
############################ Chris Mutel Technology Assessment Group, LEA Paul Scherrer Institut OHSA D22 5232 Villigen PSI Switzerland http://chris.mutel.org Telefon: +41 56 310 5787 ############################
|
|
Hi,
Thanks for this message, Bo. Actually, I am currently working on points 3a and 3b, using an existing LCA ontology (based on the BONSAI wiki and Kuczenski et al. (2016)) as a starting point to develop an RDF schema. So I am not entirely sure whether there is much difference between the two groups, BONSAI ontology and RDF framework.
Agneta
|
|
Bo Weidema
Dear Agneta,
Yes, I think a joint meeting on Friday would be good to clarify the boundary between the tasks of the two groups.
Bo
On 2019-03-05 at 13:12, Agneta wrote:
--
|
|
Elias Sebastian Azzi
>>> 3) Decide on namespacing for the part of the ontology that is BONSAI-specific (i.e., which does not reside elsewhere): Suggestion to use purl.org - or bonsai.org - any other suggestions or arguments for one or the other?

I don't have any good pros and cons. Purl seems to be the safest option in the long term? http://purl.org/linked-data/bonsai# is available.

>>> 3a) Create a "proper" ontology definition, based on the content of https://github.com/BONSAMURAIS/bonsai/wiki/Data-Storage#specify-minimum-core-data-and-metadata-formats - in the form of a picture like the one in the "minimal ontology pattern". Matteo suggests to use the vocabs QB and/or QB4OLAP for this. This also implies the creation of initial classifications for activities and flow-objects (including biosphere flows) as per https://github.com/BONSAMURAIS/bonsai/wiki/Data%20Integration#classifications-and-correspondence-tables - ensuring that we also include metadata for our testbench EXIOBASE dataset.
|
|
Bo Weidema
On 2019-03-07 at 02:19, Elias Sebastian Azzi wrote:
>>> Makes sense. I will create an account. Do we need to include in the ontology some of the thoughts you brought up in Weidema, B. P. et al. (2018) ‘On the boundary between economy and environment in life cycle assessment’, The International Journal of Life Cycle Assessment, 23(9), pp. 1839–1846. https://doi.org/10.1007/s11367-017-1398-4

I think the definitions of "activities" and "flow-objects" are broad enough to cover both economy and environment, so this rather comes down to a question of making the actual classification within each of these two dimensions complete enough.

Best regards
Bo
|
|
Sorry, I missed it. I had two concurrent meetings this morning.
|
|
Matteo Lissandrini (AAU)
Hi all,
after the call and the discussion I've drafted a possible schema with an example (please see [1]). It is based on the following works:
A) A minimal ontology pattern for life cycle assessment data, by Janowicz et al.
B) An Ontology For Specifying Spatiotemporal Scopes in Life Cycle Assessment, by Yan et al.

The draft is currently incomplete and naturally all choices are up for discussion. Current feedback includes:
- Use the OM ontology instead of QUDT
- Using xsd:dateTime requires adding hh:mm to the date
- Explain the role of Literal in the figure
- Keep track of, and limit as much as possible, the introduction of new terms in the vocabulary
- Limit as much as possible the number of ontologies we refer to
- Use the PROV-O ontology [6]
- "inInputOf" does not roll off the tongue
- Replace schema:Place with GeoNames. The way Place is defined (https://schema.org/Place) is weird and not helpful for our use case
- Consider the ontology for time: https://www.w3.org/TR/owl-time/
- Consider https://www.w3.org/TR/2017/NOTE-eo-qb-20170928/ for raster data

Here are my comments on a couple of points:
- Regarding the ontology of units, I went for QUDT because it seems more standard (is W3C member), but I leave the final decision to the domain experts. Here is some relevant material: 1) the paper on OM [2] and the reviews of the paper [3]; 2) the official websites [4,5]
- It is correct that dateTime requires a time; we can use xsd:date instead. Do we need/want time?
- Literal here is the generic RDF concept of literal values. It has been added just to clarify a doubt that I noted during our last call.
- Regarding PROV-O, I have the impression that it is not appropriate. The ontology, as I understand it, is designed to describe 'provenance' in the sense of Data Lineage [7], so I think there would be a semantic mismatch.
- I agree to use GeoNames URIs for places, but as for the ontology, I've investigated the definition. In GeoNames, places are of type <http://www.geonames.org/ontology#Feature>, which is defined as

    <http://www.geonames.org/ontology#Feature> a owl:Class ;
        rdfs:comment "A geographical feature"@en ;
        rdfs:subClassOf schema:Place, geo:SpatialThing ;
        owl:equivalentClass <http://www.mindswap.org/2003/owl/geo/geoFeatures20040307.owl#GeographicFeature>,
            <http://geovocab.org/spatial#Feature> .

  So it is rdfs:subClassOf schema:Place, which is why adopting schema:Place allows for the widest interoperability.

Please let me know your thoughts/feedback on these.

Note also that there are example use cases that are currently not well addressed by my draft, e.g., storage of goods.
I'm also not sure about the class name "ReferenceFlow"; the way I understood it, it is probably better described by "ReferenceOutput"?

Cheers,
Matteo

[1] https://docs.google.com/presentation/d/10Kd3zQEFPMEl7qB29xP65JGsNa9IKF8DvEg4SeiTKno/edit?usp=sharing
[2] http://semantic-web-journal.org/sites/default/files/swj177_3.pdf
[3] http://www.semantic-web-journal.net/content/ontology-units-measure-and-related-concepts
[4] http://www.qudt.org/
[5] http://www.ontology-of-units-of-measure.org/
[6] https://www.w3.org/TR/prov-o/#description-starting-point-terms
[7] https://en.wikipedia.org/wiki/Data_lineage
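
To illustrate the interoperability argument above, here is a minimal Python sketch (plain tuples, no RDF library) of how the rdfs:subClassOf link lets anything typed as a GeoNames Feature also be treated as a schema:Place. The instance `ex:somePlace`, the abbreviated prefixes, and the helper `types_of` are made-up names for illustration and are not part of the draft schema.

```python
# Triples as plain (subject, predicate, object) tuples with abbreviated prefixes.
TRIPLES = {
    ("gn:Feature", "rdfs:subClassOf", "schema:Place"),
    ("gn:Feature", "rdfs:subClassOf", "geo:SpatialThing"),
    # Hypothetical instance, for illustration only.
    ("ex:somePlace", "rdf:type", "gn:Feature"),
}


def types_of(instance, triples):
    """Return the asserted types of `instance` plus all their superclasses."""
    found = {o for s, p, o in triples if s == instance and p == "rdf:type"}
    frontier = set(found)
    while frontier:
        cls = frontier.pop()
        for s, p, o in triples:
            if s == cls and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                frontier.add(o)
    return found


place_types = types_of("ex:somePlace", TRIPLES)
```

A query for schema:Place would therefore also match GeoNames features, while the reverse does not hold.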
|
|
Massimo Pizzol
Dear Matteo and all
Looks good to me but I am a bit puzzled about two things:
Hope these are not too stupid questions and thanks for the good work done so far.
BR
|
|
Thanks Matteo for bringing the discussion to the platform. My questions in relation to the schema and our discussions on Friday are:

1. I personally find the terms ‘Flow’ and ‘Flow object’ interchangeable and hence confusing. Is there a consensus on the use of ‘Flow object’ instead of ‘Flow’? I would need some clarity on this.

2. I don’t understand why Input and Output should be separate subclasses of a flow. Instead I would think of this as a flow property, in which the flow instances are input or output. After all, every flow is either an output or an input. Similarly, reference flow is also a flow property, not a subclass.

3. To follow up on the question of adding an ‘Agent’ to the existing schema: an ‘Agent’ is defined as ‘one who performs the activity’. For example: coal power plant (agent) generates electricity (activity). This adds the advantage of being able to define stocks, as an agent usually invests in stocks such as infrastructure or machinery. So in my understanding we need to add Agent as another class. Of course the agent has properties such as location. So I envision the simple schema to have 3 main classes: Agent (performs activities), Activity (has flows) and Flow. Each of these classes has one or many properties.

4. Then again, I feel that our schema is pretty similar to the schema published by Janowicz et al., of course without some objects such as intermediate or elementary flows. The novelty of the new schema is not entirely clear to me.
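
The three-class layout proposed in point 3, with input/output treated as a property as in point 2, could be sketched roughly as follows. This is only one possible reading of the proposal: the class and field names, and the placeholder location string, are assumptions made for illustration and not agreed vocabulary.

```python
from dataclasses import dataclass, field
from enum import Enum


class Direction(Enum):
    INPUT = "input"
    OUTPUT = "output"


@dataclass
class Flow:
    flow_object: str            # the thing that flows, e.g. "electricity"
    direction: Direction        # input/output as a property, not a subclass
    is_reference: bool = False  # reference flow as a property as well


@dataclass
class Activity:
    name: str
    flows: list = field(default_factory=list)


@dataclass
class Agent:
    name: str
    location: str
    activities: list = field(default_factory=list)


# The coal power plant example from point 3.
generation = Activity(
    "electricity generation",
    [
        Flow("coal", Direction.INPUT),
        Flow("electricity", Direction.OUTPUT, is_reference=True),
    ],
)
plant = Agent("coal power plant", location="(placeholder)", activities=[generation])

outputs = [f.flow_object for f in generation.flows if f.direction is Direction.OUTPUT]
```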
|
|
On Mon, 11 Mar 2019 at 14:14, Agneta <agneta.20@gmail.com> wrote:
>>> 1. I personally find the terms ‘Flow’ and ‘Flow object’ interchangeable and hence confusing. Is there a consensus on the use of ‘Flow object’ instead of ‘Flow’? I would need some clarity on this.

The problem is that "flow" can be a verb or a noun - we want to make sure people use it as a noun, hence "object."

>>> 2. I don’t understand why Input and Output should be separate subclasses of a flow? Instead I would think of this as a flow property, in which the flow instances are input and output. After all every flow is either an output or an input. Similarly reference flow is also a flow property, not a subClass.

+1

>>> 3. To follow up the question on adding an ‘Agent’ to the existing schema. An ‘Agent’ is defined as ‘one who performs the activity’. For example – Coal power plant (agent) generates electricity (activity). This adds the advantage for defining stocks, as an agent usually invests on stocks such as infrastructure or machinery. So in my understanding we need to add Agent as another class. Ofcourse the agent has properties like – location

The working example should have a model of how we can store data related to the coal plant (e.g. capacity, year built), as well as the operator of the coal plant, which is what I would have thought of when I hear the term "agent."
|
|
Bo Weidema
On 2019-03-11 at 14:14, Agneta wrote:
The flow-object is the "thing" that can flow, e.g. "steel". The
flow is the instance of a flow-object flowing in or out of an
activity, e.g. "23 kg of steel output from steel mill X".
Correct!
Fine!
Some properties can also be classified. Especially important are
the balanceable properties (dry mass, water mass, monetary value,
time, ...)
It does not have to be new! As long as it is precise :-) and does what we
want it to! Cheers Bo
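
As a small illustration of the flow-object/flow distinction and of why balanceable properties matter, here is a sketch in Python. Only the "23 kg of steel output from steel mill X" figure comes from the example above; the scrap and slag rows, the field names, and the `balance` helper are invented for illustration.

```python
from collections import namedtuple

# A flow-object is the "thing" (e.g. "steel"); a flow is one instance of that
# thing entering or leaving a specific activity, with an amount attached.
Flow = namedtuple("Flow", "flow_object direction activity dry_mass_kg")

flows = [
    Flow("steel", "output", "steel mill X", 23),  # example from the thread
    Flow("scrap", "input", "steel mill X", 25),   # made-up number
    Flow("slag", "output", "steel mill X", 2),    # made-up number
]


def balance(flows, activity):
    """Inputs minus outputs of dry mass for one activity; 0 means balanced."""
    total = 0
    for f in flows:
        if f.activity != activity:
            continue
        total += f.dry_mass_kg if f.direction == "input" else -f.dry_mass_kg
    return total


imbalance = balance(flows, "steel mill X")
flow_objects = sorted({f.flow_object for f in flows})
```

The same check could be repeated for any other balanceable property (water mass, monetary value, and so on) by swapping the field that is summed.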
|
|
Bo Weidema
On 2019-03-11 at 14:09, Massimo Pizzol wrote:
Good point. No need for separate subclasses, as also Agneta
pointed out. It is just an input relation or an output relation of
a flow-object in a specific activity context.
Environmental exchanges are also just flows (or flow-objects). CO2 can be both an emission to the environment and a (by-)product, but as a chemical it is the same. For this reason, it may be reasonable to distinguish between "carbon dioxide (product)" and "carbon dioxide (emission)". Whether that is best done by creating "product" and "emission" as sub-classes of flow-objects or just as properties, I am not completely sure. The general idea is not to be too specific if it can be avoided, which would be an argument for the latter (property) option, but on the other hand the sub-class option makes it clear that the "carbon dioxide" is placed in one or the other class. Any views on this?
Bo
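
The two options for the carbon dioxide example can be put side by side in a small sketch, written here as plain Python tuples rather than real RDF. All `ex:` names, including the `ex:role` property, are hypothetical and only serve to show the shape of each option.

```python
# Option A: "product" and "emission" as sub-classes of flow-object.
option_a = {
    ("ex:Emission", "rdfs:subClassOf", "ex:FlowObject"),
    ("ex:Product", "rdfs:subClassOf", "ex:FlowObject"),
    ("ex:co2_emission", "rdf:type", "ex:Emission"),
    ("ex:co2_product", "rdf:type", "ex:Product"),
}

# Option B: a single flow-object class, with the role stored as a property.
option_b = {
    ("ex:co2_emission", "rdf:type", "ex:FlowObject"),
    ("ex:co2_emission", "ex:role", "emission"),
    ("ex:co2_product", "rdf:type", "ex:FlowObject"),
    ("ex:co2_product", "ex:role", "product"),
}


def emissions_by_class(triples):
    """Option A lookup: everything typed as ex:Emission."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == "ex:Emission")


def emissions_by_property(triples):
    """Option B lookup: everything whose ex:role is 'emission'."""
    return sorted(s for s, p, o in triples if p == "ex:role" and o == "emission")
```

Both lookups return the same resource; the difference is whether the distinction lives in the class hierarchy or in the data.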
|
|
Bo Weidema
On 2019-03-11 at 12:42, Matteo Lissandrini (AAU) wrote:
>>> - Regarding PROV-O i have the impression that this is not appropriate. The ontology, as I understand it, is designed to describe 'provenance' in the sense of Data Lineage [7], so I think there would be a semantic mismatch.

Is there a semantic difference between the input of data and the input of a physical "thing"?

>>> Note also that there are example use cases that are currently not well addressed by my draft, e.g., storage of goods.

We may need to distinguish between "stock" as an activity (having inputs and outputs over a specified period), for use in a transactions matrix, and "stock" as the amount stored at a specific point in time, for use in a balance sheet.

>>> I'm also not sure about the class name "ReferenceFlow", the way I understood it is probably better described by "ReferenceOutput" ?

A reference flow does not need to be an output. It can also be an input, e.g. a waste or by-product for disposal or recycling.

Best regards
Bo
|
|
Massimo Pizzol
>>>Environmental exchanges are also just flows (or flow-objects). CO2 can both be an emission to the environment and a (by-)product, but as a chemical it is the same. For this reason, it may be reasonable to distinguish between "carbon dioxide (product)" and "carbon dioxide (emission)". Whether that is best done by creating "product" and "emission" as sub-classes of flow-objects or just as properties, I am not completely sure. The general idea is not to be too specific if it can be avoided which would be an argument for the latter (property) option, but on the other hand the sub-class option makes it clear that the "carbon dioxide" is placed in one or the other class. Any views on this?
Not sure how subclasses work, honestly.
BR
|
|
On the topic, but also as a general comment: I would suggest distinguishing them, not only to make clear that the "Carbon dioxide" is an emission (in the example made) but also to make it more intuitive for users (LCA and beyond). The Contra of the "Classic solution" may also be easier to address.
Michele
|
|
Matteo Lissandrini (AAU)
About why do we separate between two subclass "input" and "output", here is my reasoning.
On the one hand, this is at the ontology level, to restrict the domain of the input and output relationships. Assume you are looking at a specific instance of 10 tonnes of coal in your database, and you ask yourself "is this an input for something or an output of something?" You could certainly find the answer by checking "is this the source of inputOf relationships?", but operationally you can ask "what is the type of this?", and in my view this would be the correct way to do it, because something is either an input or an output.

It was also because my understanding was that only an output would be a "reference flow"; I now understand this is not true. What is the utility and the actual definition of a "reference flow"? The possibility for the reference flow to be either an input or an output makes things a little more complex.

Regarding provenance: yes, the concept is heavily loaded with semantics in the field, and I'm really sure we do not want to use it here. We will use that ontology instead when we describe, for instance, that the data produced comes from the EXIOBASE dataset. There we really want to use that ontology.

The naming: the names of the classes may not be appropriate. Here the instances of flow describe the tangible things that get consumed/produced by the execution of an activity, and the instances of an activity are exactly that specific happening of the process in that place/time. In general, we do not want, at any cost, to distinguish flows/activity types by using the "label". Labels are hints for humans, but URIs, classes and instances are what provide an identity to things.
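
The point about labels versus URIs can be shown with a very small Python sketch. It reuses the carbon dioxide case discussed earlier in the thread; the two `ex:` URIs and the dictionary layout are invented for illustration.

```python
# Two distinct resources that happen to share the same human-readable label.
flow_objects = {
    "ex:co2_emission": {"label": "carbon dioxide", "type": "ex:Emission"},
    "ex:co2_product": {"label": "carbon dioxide", "type": "ex:Product"},
}

# Grouping by label shows why a label cannot serve as an identifier.
by_label = {}
for uri, info in flow_objects.items():
    by_label.setdefault(info["label"], []).append(uri)

ambiguous = {label: sorted(uris) for label, uris in by_label.items() if len(uris) > 1}
```

Looking things up by URI always yields exactly one resource, whereas the label "carbon dioxide" matches two.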
|
|
Bo Weidema
On 2019-03-11 at 16:22, Matteo Lissandrini (AAU) wrote:
>>> What is the utility and the actual definition of a "reference flow"?

The more semantically precise term is actually "determining flow". The definition is: "Flow of an activity for which a change will affect the production volume of the activity". The utility is to be able to distinguish the flow that drives (causes) the activity from flows that are caused by the activity.

>>> Regarding provenance, yes, the concept is extremely loaded of semantics in the field, I'm really sure we do not want to use that.

OK. Argument accepted.
Bo
|
|
Matteo Lissandrini (AAU)
>>> What is the utility and the actual definition of a "reference flow"?
>>> The more semantically precise term is actually "determining flow". The definition is: "Flow of an activity for which a change will affect the production volume of the activity" The utility is to be able to distinguish the flow that drives (causes) the activity from flows that are caused by the activity.

Now I see. So, as you suggested earlier, a waste can be the determining flow as input of a waste disposal activity that produces something else, because we need to dispose of this waste. Is this correct?

Thanks a lot for the clarification.
Matteo
---
Matteo Lissandrini
Department of Computer Science
Aalborg University
http://people.cs.aau.dk/~matteo
|
|
Massimo Pizzol
(Disclaimer: I am simplifying things a bit here I hope the LCA people will forgive me)
Dear Matteo

I believe you have understood how it works, but there are some other details that perhaps you should know.

Examples: If an activity produces electricity from burning coal, the determining flow is the product output flow ‘electricity’. If an activity simultaneously produces electricity AND heat from coal, then either electricity OR heat will be the determining flow.

Waste example: You described an activity that converts waste into something else. For example, this could be incinerating municipal solid waste to generate electricity. In this case the ‘treatment of municipal solid waste’ is the determining product output flow of the ‘waste incineration’ activity, which also has another product output flow, ‘electricity’. Electricity is not the determining flow here because, as you rightly concluded, we burn waste because we want to get rid of it (or in other words: we don’t produce more waste just because we want more electricity...).
Summing up:
- the determining flow is always a product output flow
- activities can have multiple product output flows, but
- there is only one determining flow per each activity
- ‘product’ is a generic term that includes both ‘goods’ (e.g. coal) and ‘services’ (e.g. treatment of waste)
Now the confusing thing here is that ‘waste treatment’ is a product flow (a service in fact) but *sounds* like an activity. Same with ‘transport’. So the next question for the LCA people in this thread is: how are we going to represent waste flows in the schema? My only reference for names is ecoinvent, but I don’t think that is really super understandable (my students generally have a hard time understanding them, for example).
Massimo
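
The rule summarised above (several product output flows are allowed, but exactly one determining flow per activity) can be checked mechanically. The following Python sketch uses the waste incineration example from this message; the dictionary layout, the `determining` key and the function name are assumptions made for illustration.

```python
def determining_flow(activity):
    """Return the flow-object of the single determining flow of an activity.

    Raises ValueError if the activity has zero or more than one such flow.
    """
    marked = [f for f in activity["flows"] if f.get("determining", False)]
    if len(marked) != 1:
        raise ValueError(
            f"{activity['name']}: expected exactly one determining flow, "
            f"found {len(marked)}"
        )
    return marked[0]["flow_object"]


# Waste incineration: the treatment service, not electricity, is determining.
incineration = {
    "name": "waste incineration",
    "flows": [
        {
            "flow_object": "treatment of municipal solid waste",
            "direction": "output",
            "determining": True,
        },
        {"flow_object": "electricity", "direction": "output"},
    ],
}

result = determining_flow(incineration)
```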
On 11 Mar 2019, at 16.51, Matteo Lissandrini (AAU) via Groups.Io <matteo@...> wrote:
|
|