Start of the #ontology sub-group #ontology
Brandon Kuczenski
Massimo,
Let me weigh in on the input/output question. In my view, a flow is not an input or an output; it has to be both. It has to be an output from the process that created it and an input to the process that consumed it. The flow is the same in both cases; therefore it is an error to call it one or the other.
I haven't seen the term 'exchange' used very much, but in my view a flow is simply a product/substance/material/service and a quantity of measurement (say, 'mass'). (This has to be fixed in order for the use of many different databases to be stable.) I think of an exchange as a 4-tuple: an activity that defines the exchange (which I call the parent), a flow that is being exchanged, a direction with respect to the parent, and a termination, which is the other activity (or compartment or stock or market) that is the partner to the exchange. If the termination is null, then it's a cutoff flow; auditing these flows is part of reviewing a model. This view is pretty consistent with your discussion about "Who is this 10 kg of coal associated with?"
A characteristic of this definition is that it is non-numeric, i.e. there is no quantitative information, only adjacency. This helps to define the model without getting hung up on what the exchange value or uncertainty is. Obviously there could be uncertainty in the termination (from where / what supplier / what time of day / etc.?), but that is not quantitative uncertainty. When the parent activity is invoked as part of a query, it would be "responsible" for "figuring out" the exchange value given the query it is answering, and the termination could / would have to be figured out by the software that is doing the query.
But it's the exchange that is directional, not the flow.
--
Brandon Kuczenski, Ph.D.
Associate Researcher
University of California at Santa Barbara
Institute for Social, Behavioral, and Economic Research
Santa Barbara, CA 93106-5131
email: bkuczenski@...
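[Editorial sketch] The 4-tuple view of an exchange described above could be written down roughly as follows. This is only an illustration: the class names, field names, and the example activities are assumptions, not an agreed BONSAI schema. Note that the records carry no amounts, only adjacency.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Direction(Enum):
    INPUT = "input"    # flow enters the parent activity
    OUTPUT = "output"  # flow leaves the parent activity


@dataclass(frozen=True)
class Flow:
    name: str      # product / substance / material / service
    quantity: str  # quantity of measurement, e.g. "mass"


@dataclass(frozen=True)
class Exchange:
    parent: str                 # activity that defines the exchange
    flow: Flow                  # what is being exchanged
    direction: Direction        # direction relative to the parent
    termination: Optional[str]  # partner activity/compartment; None means cutoff


def cutoff_flows(exchanges):
    """Return the exchanges whose termination is null, for model review."""
    return [x for x in exchanges if x.termination is None]


# Hypothetical example data: the same coal flow appears in two exchanges.
coal = Flow("hard coal", "mass")
exchanges = [
    Exchange("coal mining", coal, Direction.OUTPUT, "electricity production"),
    Exchange("electricity production", coal, Direction.INPUT, "coal mining"),
    Exchange("electricity production", Flow("fly ash", "mass"), Direction.OUTPUT, None),
]
```

Here the coal flow is identical in both exchanges; only the exchange carries a direction, and `cutoff_flows(exchanges)` lists the fly ash exchange as the one to audit.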
|
|
Massimo Pizzol
Thanks Brandon
>>> a flow is not an input or an output- it has to be both.
I completely agree, and this is what I was trying to write as well. In my understanding a “flow” object is not an input or output in absolute terms but only in relative terms, i.e. in relation to an “activity” object. Therefore, using the predicates “IsInputof” and “IsOutputof” seems to me an appropriate and sufficient way to express this relationship, while I don’t think we should use the “Input” and “Output” subclasses, for the reasons previously outlined (not fully correct, redundant, inconsistent).
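[Editorial sketch] The predicate approach can be illustrated with plain (subject, predicate, object) triples. The identifiers below are made up, and the predicate spelling isInputOf / isOutputOf is an assumption taken from other messages in this thread; no RDF library is used.

```python
# Triples are (subject, predicate, object); all identifiers are hypothetical.
triples = {
    ("flow:coal_001", "isOutputOf", "activity:coal_mining"),
    ("flow:coal_001", "isInputOf", "activity:electricity_production"),
}


def role_of(flow, activity, triples):
    """Return the role a flow plays relative to one activity, or None."""
    if (flow, "isInputOf", activity) in triples:
        return "input"
    if (flow, "isOutputOf", activity) in triples:
        return "output"
    return None


role_for_mine = role_of("flow:coal_001", "activity:coal_mining", triples)
role_for_plant = role_of("flow:coal_001", "activity:electricity_production", triples)
```

The same flow object is an output for one activity and an input for the other, so no Input/Output subclass is needed to recover the role.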
BR
|
|
Hello,
Finally have time for some inputs to your extensive discussions and nice summaries.
ALPHA/ Arguments to NOT introduce subclasses like “product”, “emission”, and “waste”.
I add to the list of arguments: products (goods or services), emissions, and waste are terms that embody value judgements. Examples:
(i) CO2 is mostly seen as an emission to the atmosphere, but some describe it as a waste of our industrial activities (for which we could also provide treatment, direct air capture, or carbon capture and storage).
(ii) In the circular economy, waste becomes a resource; zero-waste people would say there is no waste, only resources.
(iii) The boundary between waste (paying for a treatment service) and by-product (getting paid for the material) can vary with changes in markets, supply, and demand.
So, the physical raw fact is that things go in and out of an activity (i.e. metabolism); and we (or I) think that this is what must be stored in the database (pure accounting). The value judgment can come as a second layer, when using the database for life cycle impact assessments. Basically, an impact assessment method is a set of value judgments that gives us characterization factors. The Bonsai implementation of GWP100 will take all flow instances of (fossil) CO2 from any activity in the life cycle to the atmosphere and sum them up.
References for the BEP: Weidema, B. P.; Schmidt, J.; Fantke, P.; Pauliuk, S. On the Boundary between Economy and Environment in Life Cycle Assessment. Int. J. Life Cycle Assess. 2018, 23 (9), 1839–1846; DOI 10.1007/s11367-017-1398-4.
BRAVO/ Use of subclasses for input and output VS use of predicates isInputOf and isOutputOf
From the previous meeting, I understood that the discussion raised by Massimo depends on which database we are talking about: unlinked or linked database? I wrote this after our meeting, to clarify my understanding:
In a linked database, a flow-instance is output of a single activity and is input to a single activity.
In an unlinked database, a flow-instance is either input or output of an activity.
[I feel I could be heavily wrong on that statement; and also feel that predicates are interesting]
CHARLY/ Validation, agency and social LCA in the ontology
Bo and Chris mentioned how external data can be used for validation of the BONSAI database, e.g. with GDP data from the World Bank. This echoes https://chris.mutel.org/next-steps.html#id1 but also extends to other options for validation (e.g. remote sensing data for land use change; anthropogenic emissions).
Conceptual difference [do you agree with it?]
>> GDP value by World Bank & GDP value by Bonsai are somehow based on the same raw data (though not easily accessible)
>> Remote sensing data for NOx emissions vs Bonsai data for NOx emissions are not of the same type of raw data; different measurement techniques/reporting frameworks
This said, for the purpose of the hackathon, GDP validation is enough to implement.
The GDP example raised the question of how to include agents in the ontology. This also sounds important for social LCA (is the social LCA database on our list of data sources?). Agents are complex. The first terms I think of are: Companies; Governments; Individuals; Employees; Households; Multinationals; Teams. I have failed (tonight) to find an existing ontology of agents, but I am sure one exists.
With our focus on "activities", the first predicate I can think of is "isPerformedBy" an agent or set of agents. But then, it gets blurry / not easy to generalise.
Example:
Activity = "Electricity production" #isPerformedBy Agent = "Coal power plant nb 1234"
Agent = "Coal power plant nb 1234" #isLocatedIn Country = "Germany", #isOwnedBy Entity = "InternationalPowerCompany" (at 60%), #isOwnedBy Entity = "City of Dusseldorf" (at 40%)
Agent = "Coal power plant nb 1234" #hasWorkers literal = "70"
Company = "InternationalPowerCompany" #hasHighSkilledEmployees literal = "2000"
Company = "InternationalPowerCompany" #hasMediumSkilledEmployees literal = "5000"
...
Issues:
- ownership of a plant by several entities
- entities, companies, being multinational, much larger than the plant
- workers not the same as employee; working somewhere; employed by someone? Can be employed by a company but work in several places/plants.
Simplification: only have population/agent data for "super classes" that aggregate at a sector level or country level, as in Exiobase? Issues: how to deal with multinational entities; companies; workers?
DELTA/ How do we make the ontology/database usable for LCA-people if it does not have LCA-specific information in it?
Massimo asked this question. If not directly included, I am guessing that the ontology/database becomes usable for LCA-people (or other types of people) via some additional layers. For impact assessments, see the reply to ALPHA. For knowing the reference flow of an activity, I think that this is solved in the linked databases (linked as in BRAVO, not LinkedData); but if you work with the raw unlinked data, you have to make the assumption anyway.
Besides, my (very) long-term vision with Bonsai is to advance further in merging Industrial Ecology methods: LCA, MFA, and even IOA, IAM, all forms of socioeconomic metabolism analysis. Including bridges with dynamic system modelling and complex system modelling.
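[Editorial sketch] The agent example above (CHARLY) can be written as simple statements to see how shared ownership might be queried. The tuple layout with an optional share is purely an assumption for illustration; the names and predicates are those of the example.

```python
# (subject, predicate, object, optional share). Names come from the example
# above; the data structure itself is an assumption, not a proposed schema.
statements = [
    ("Electricity production", "isPerformedBy", "Coal power plant nb 1234", None),
    ("Coal power plant nb 1234", "isLocatedIn", "Germany", None),
    ("Coal power plant nb 1234", "isOwnedBy", "InternationalPowerCompany", 0.60),
    ("Coal power plant nb 1234", "isOwnedBy", "City of Dusseldorf", 0.40),
    ("Coal power plant nb 1234", "hasWorkers", "70", None),
]


def owners(agent, statements):
    """Map each owning entity of an agent to its ownership share."""
    return {
        obj: share
        for subj, pred, obj, share in statements
        if subj == agent and pred == "isOwnedBy"
    }


shares = owners("Coal power plant nb 1234", statements)
# The shares of one agent should add up to 1 (checked with a tolerance).
shares_are_complete = abs(sum(shares.values()) - 1.0) < 1e-9
```

Attaching the share to the statement is one way to handle ownership by several entities; it does not address the worker/employee distinction raised in the issues list.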
|
|
Massimo Pizzol
Thanks Elias for the interesting reflections. I believe all your points are related. My impression is that we are converging towards an ontology that is operational with a minimal number of elements and can potentially be expanded with additional layers for specific uses (e.g. LCA).
After my short chat with Matteo, I understand that even if redundant these subclasses are a more elegant way (semantically speaking) to structure our ontology because they allow us to “get the answer we want by asking the right question”. We can get the same answer indirectly but this approach is less elegant (and since I am Italian, for me elegance is everything…). So it might actually be advantageous to keep them.
>>> “product”, “emission”, etc. are subjective.
Agree, and formalizing them limits our flexibility. But indeed some of those might be useful when working in an LCA context. I think that the only two pieces of information we actually need for doing LCA are: whether a flow belongs to the technosphere (all the rest is the B matrix) and whether a flow is a reference flow (diagonal of the tech matrix). Right now I can’t think of any automatic way of determining this information from a raw list of inputs and outputs. So we have to include this info in the ontology because we can’t use an algorithm or write code to figure this out. But perhaps I am wrong and somebody in the group has a solution for this, and then we can skip these classifications altogether; that would be perfect. I also recognize that this means introducing some subjective elements in the model, because who decides what is technosphere? But as I wrote before, if we want to use the linked data for LCA we have to accept that there is an LCA framework.
Looking forward to the meeting on Friday. BR p.s. +1 for “ALPHA”, ”BRAVO”, “CHARLY”. I am having some good laughs thinking about “Hot shots!” right now
|
|
On Wed, 20 Mar 2019 at 12:30, Massimo Pizzol <massimo@plan.aau.dk> wrote:
>>> “product”, “emission”, etc. are subjective.
This is a great comment, and is to me a perfect example of how people's experience leads them to accept restraints without even realizing it.
1. Mathematically, we don't need to distinguish between technosphere and biosphere; this can be one big matrix. In practical terms, our biosphere will be a different set of names; or, they will be flows for which there is no associated producing activity.
2. We don't need the concept of a reference flow to make a technosphere matrix, and there isn't anything special about positive numbers on the diagonal. Production amounts can be randomly ordered, and in any case everything produced is positive, everything consumed is negative, regardless of whether it is a reference product, co-product, or whatever. The notion of reference product is helpful for humans trying to understand the reason a particular dataset was modelled, but irrelevant for the computer doing the math.
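[Editorial sketch] The "one big matrix" idea can be shown with a few signed records: produced amounts are positive, consumed amounts negative, and no flow is marked as product, emission, or reference flow. Names and numbers are illustrative assumptions only.

```python
# (activity, flow, amount): produced amounts positive, consumed negative.
# All names and numbers are made up for illustration.
records = [
    ("electricity production", "electricity", 1.0),
    ("electricity production", "coal", -0.4),
    ("electricity production", "CO2", 1.0),
    ("coal mining", "coal", 1.0),
    ("coal mining", "electricity", -0.05),
]

# Any consistent ordering works; sorting is used only to make it reproducible.
activities = sorted({activity for activity, _, _ in records})
flows = sorted({flow for _, flow, _ in records})

# One matrix for everything: rows are flows, columns are activities.
matrix = [[0.0] * len(activities) for _ in flows]
for activity, flow, amount in records:
    matrix[flows.index(flow)][activities.index(activity)] += amount
```

Nothing in the construction depends on which output is the reference product; the sign of each entry is the only classification used.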
|
|
Massimo Pizzol
Ok thanks so this is the solution I was asking for. We can separate between technosphere and rest by using an external list of names. And yes you are right about the matrix operation it will work even if the order of columns and rows is not the same. So we neither need the ref flow predicate nor any product subclass in the ontology.
Massimo
|
|
Stefano Merciai
Hi,
Thank you for the nice exchange of ideas. I just want to add a few small inputs.
How do we distinguish between CO2 emitted from the chimney and the CO2 used for soft drinks?
Then, I think that the value of "waste/residues" is determined by more factors, homogeneity of materials for example. The same material, if mixed with other waste flows, may have a lower value (even negative) because a waste separation service is needed. So the final price of the waste flow may be the value of the material, which can be somehow fixed, minus the cost of the separation. This is to say that there could be properties, such as “sorted” and “unsorted”, that could indicate whether it is a waste flow or not.
As for the reference flow, I think that the classification of the activity gives an idea of the reference flow. A coal mining activity will have coal as reference flow (or perhaps it is the other way around: if coal is the output of an activity, then that activity is coal mining). If we intend to insert economic values, such as prices, the determining products could be those that result in more revenue for the activity. However, by doing that, the classification of the activity may change. I think Elias mentioned the CHP plants, where both heat and electricity may be determining flows depending on the period of the year/day.
Last thing, there are values that are important when building a database, such as combustion coefficients (emissions produced in the activity act123 when burning fuel123). Are these properties of products?
Best, Stefano
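[Editorial sketch] Two of the points above lend themselves to a small worked example: the net value of a waste flow as material value minus separation cost, and the determining flow chosen by revenue. All quantities and prices below are invented for illustration; they are not data from any database.

```python
def net_waste_value(material_value, separation_cost=0.0):
    """Value of a waste flow: material value minus the cost of sorting it."""
    return material_value - separation_cost


def determining_flow(outputs, prices):
    """Return the output with the highest revenue (amount * price)."""
    return max(outputs, key=lambda name: outputs[name] * prices[name])


# Hypothetical values per tonne of the same material.
sorted_value = net_waste_value(20.0)                          # already sorted
unsorted_value = net_waste_value(20.0, separation_cost=35.0)  # needs sorting

# Hypothetical CHP plant: outputs in MWh, prices in EUR/MWh by season.
outputs = {"heat": 60.0, "electricity": 30.0}
winter_prices = {"heat": 50.0, "electricity": 60.0}
summer_prices = {"heat": 10.0, "electricity": 60.0}

winter_choice = determining_flow(outputs, winter_prices)
summer_choice = determining_flow(outputs, summer_prices)
```

With these made-up numbers the unsorted material has a negative value, and the determining flow of the plant switches from heat in winter to electricity in summer, which is the instability in classification mentioned above.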
|
|
Hi Elias (and others),
Just a few comments on some of the points you have mentioned:
Now let's say that, as an LCA research group, we are interested in structuring our data in a traditional way (e.g. product, by-product, emission / impact methods and characterization factors); we can develop a secondary ontology which continues to use BONSAI as the primary ontology and builds on top of it. E.g. all my segregations (product, by-product, emission) can be sub-classes of the Bonsai: Flow object. But we don't want to do this now, as adding complexity to the ontology will be a barrier to its uptake among different IE groups.
Thanks again for your comments; we will bring some discussion on these issues to our meeting this Friday.
Agneta
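[Editorial sketch] The idea of a secondary ontology layered on top of the base one can be pictured as subclass links that all lead back to the base Flow class. The "bonsai:" and "lca:" names below are placeholders invented for this illustration, not terms from the actual ontology.

```python
# child class -> parent class. "bonsai:Flow" stands for the base class;
# the "lca:" classes are a hypothetical extension layer.
SUBCLASS_OF = {
    "lca:Product": "bonsai:Flow",
    "lca:ByProduct": "lca:Product",
    "lca:Emission": "bonsai:Flow",
}


def is_a(cls, ancestor):
    """Follow subclass links upward until the ancestor is found or the chain ends."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


by_product_is_flow = is_a("lca:ByProduct", "bonsai:Flow")
flow_is_product = is_a("bonsai:Flow", "lca:Product")
```

Anything typed with an extension class is still a base Flow, so tools that only know the base ontology keep working, while the base ontology itself stays free of the LCA-specific classes.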
|
|
Massimo Pizzol
>>>2. We don't need the concept of a reference flow to make a technosphere matrix,
But we need that to make a square matrix. The question is whether it needs to be specified in the ontology or can be done externally like in the case of the biosphere. Massimo
|
|
Stefano Merciai
I think that a square matrix can be obtained simply by aggregation.
Best, Stefano
|
|
Bo Weidema
On 2019-03-20 at 16:14, Stefano Merciai wrote:
In the wiki, we have about the observation (datapoint): Number: Floating point numbers xsd:float. (...) Note also that ecoSpold2 has the option of providing a @mathematicalRelation that defines a mathematical formula which can also contain variables and which will fill the value of the @amount if @isCalculatedAmount is TRUE. This is a very explicit and recommendable option for providing provenance information. In ecoSpold2 the formulas are defined by a sub-set of the OpenFormula standard. Other RDF-related formula standards are described on the Wikipedia page for MathML. The @mathematicalRelation allows one to express, for example, an emission as an input number multiplied by a fixed combustion coefficient.
Re. the distinction between CO2 emissions and CO2 as a product, one option can be to use different names (such as "carbon dioxide, purified" or "carbon dioxide, under pressure" for the product).
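[Editorial sketch] The combustion-coefficient case can be illustrated with a record whose amount is filled from a formula when it is flagged as calculated. This is only a sketch: the dictionary keys mimic the ecoSpold2 attribute names mentioned above, the formula uses plain Python arithmetic syntax rather than OpenFormula, and the variable names and values are made up.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(formula, variables):
    """Evaluate a formula made of numbers, variable names and + - * /."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        raise ValueError("unsupported expression")
    return walk(ast.parse(formula, mode="eval"))


# Hypothetical exchange record; keys mimic the ecoSpold2 attributes.
exchange = {
    "name": "carbon dioxide, fossil",
    "isCalculatedAmount": True,
    "mathematicalRelation": "fuel_input * combustion_coefficient",
    "amount": None,
}
variables = {"fuel_input": 10.0, "combustion_coefficient": 2.5}  # made-up values

if exchange["isCalculatedAmount"]:
    exchange["amount"] = evaluate(exchange["mathematicalRelation"], variables)
```

Keeping the formula next to the amount is what provides the provenance: the stored number can always be traced back to the fuel input and the coefficient it was derived from.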
|
|
Massimo Pizzol
Within yesterday’s correspondence on biosphere flows and reference flows Agneta wrote that:
>> [for doing LCA…] we can develop a secondary ontology which continues to use BONSAI as the primary ontology and build on top of it.
My understanding right now is that a RDF-type database built from raw data linked using our ontology in its current format can be queried to automatically generate a list of the input and output flows of a number of activities, with the main advantage being that although the raw data might be from different sources, the activities and flows in the list will be uniquely and consistently identified.
But how to convert the list into a specific format and for a specific purpose (e.g. organize the activities and flows in a B matrix + square A matrix to be used in the g = B A^{-1} f calculation for LCA) remains outside the scope of this ontology. This conversion should be regarded as a separate step. One would have to develop an additional ontology on top of the base one, or use some additional information external to the ontology, like a list specifying what the biosphere flows are, or make other additional assumptions and choices, like deciding which of two output flows is the determining one and applying one or another method to solve multifunctionality.
I have been wrong before so…is my understanding correct? I hope this helps in aligning expectations.
Massimo
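[Editorial sketch] For readers less familiar with the matrix formulation, here is a tiny worked instance of g = B A^{-1} f in plain Python. The two-activity system and all numbers are invented; rows of A are technosphere flows, columns are activities, and B holds biosphere flows per unit of activity. Instead of forming the inverse, the sketch solves A s = f and then computes g = B s, which is equivalent.

```python
def solve(A, f):
    """Solve A s = f by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, f)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [v - factor * p for v, p in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]


# Columns: electricity production, coal mining. Rows: electricity, coal.
A = [[1.0, -0.05],
     [-0.4, 1.0]]
B = [[1.0, 0.1]]   # CO2 emitted per unit of each activity (made-up values)
f = [1.0, 0.0]     # demand: one unit of electricity

s = solve(A, f)                                        # scaling of activities
g = [sum(b * x for b, x in zip(row, s)) for row in B]  # inventory result
```

With these numbers the electricity activity has to run slightly more than once (about 1.02) to cover the electricity used in coal mining, and g comes out at roughly 1.06 units of CO2.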
|
|
>>> But how to convert the list in a specific format and for a specific purpose (e.g. organize the activities and flows in B matrix + square A matrix to be used in the g = B A^{-1} f calculation for LCA) remains outside the scope of this ontology. This conversion should be regarded as a separate step. One would have to develop another additional ontology on top of the base one, or using some additional information external to the ontology like a list specifying what are the biosphere flows, or make other additional assumptions and choices like deciding which of two output flows is the determining one and applying one or another method to solve multifunctionality.
We will be able to talk about this soon in person, which should be helpful as we might be going a bit in circles. In any case, here are my contributions to the dance:
1. Identifying potential biosphere flows is relatively easy: they are either only consumed and never produced, or vice versa. We can debate if this is enough, or even fits into our logical sense of how we want to build the model, but if we want to make a solvable linear algebra problem it is enough. Think about the *purpose* of separating the A and B matrices.
2. A square technosphere matrix is an output of one possible system model, but is not required. Some system models will not produce square matrices, and other approaches will use graph traversal and not use matrices at all.
3. Our database is not only used for matrix-based LCA, but also e.g. MFA, so it should not be too specific to what you read in an LCA textbook. This has already been said before by multiple people, but is worth repeating :)
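[Editorial sketch] Point 1 above (a flow is a biosphere candidate if it is only consumed and never produced, or vice versa) can be checked mechanically on signed records. The record layout, with positive amounts for production and negative for consumption, and all names are assumptions for illustration.

```python
def biosphere_candidates(records):
    """Flows that are only produced or only consumed across all activities."""
    produced = {flow for _, flow, amount in records if amount > 0}
    consumed = {flow for _, flow, amount in records if amount < 0}
    return produced ^ consumed  # symmetric difference: in one set but not both


# (activity, flow, amount): positive = produced, negative = consumed.
records = [
    ("electricity production", "electricity", 1.0),
    ("electricity production", "coal", -0.4),
    ("electricity production", "CO2", 1.0),
    ("electricity production", "water, from river", -2.0),
    ("coal mining", "coal", 1.0),
    ("coal mining", "electricity", -0.05),
]

candidates = biosphere_candidates(records)
```

Note that a cutoff product (something consumed with no producer in the data) would also show up as a candidate, so the result is a list to review rather than a definitive classification.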
|
|
Massimo Pizzol
>>> as we might be going a bit in circles
Thanks. I think we agree on all points. You confirm that we are going for a general solution and that if I want to upgrade to a specific solution (one that looks like the textbook) then it’s a separate step. Concretely: some info on biosphere, ref flows, etc. does not strictly need to be in the ontology, either because it can be obtained from external sources or because it belongs to a specific mental model only.
Massimo
|
|
In my understanding the next steps would be (correct me if I am wrong):
1. To explore interoperability: how unlinked data (from a potential data provider, e.g. an LCA practitioner, national statistics, etc.) can be connected with our existing database.
Agneta
|
|
Massimo Pizzol
Updated version of “PEP 0003 ontology” document here, revised after our meeting on Friday and discussions during the week. I didn’t have much time to do this so I hope to have captured the main issues, but please check it and change/comment if this is not the case.
BR
|
|