
Re: Start of the #ontology sub-group #ontology

Massimo Pizzol
 

Updated version of “PEP 0003 ontology” document here, revised after our meeting on Friday and discussions during the week.

I didn’t have much time to do this so I hope to have captured the main issues, but please check it and change/comment if this is not the case.

 

BR
Massimo


Re: How does the ontology group support the correspondence table group #rdf #correspondencetables #ontology

Elias Sebastian Azzi
 

We have lists of activities and flows in different classifications.

- Each classification has its URI / is a super-class for all its elements?
- Each activity or flow has its URI and a property saying belongsTo "this classification" (redundant with being part of a class, but same as what was discussed for flows in #ontology)?

>> If this is the case I still find it quite challenging, as many of the columns in the correspondence tables just refer to codes, which are just the codes of certain activities, the labels of which are provided in another column.

I think we need to rewrite the CSV files with the name literals that we plan to use in the ontology/RDF for the activities and flows.

>> So are these just qualitative flow properties?

I think we need to define some vocabulary (as Brandon already started in another thread) to describe the different types of correspondence. Brandon referred to the SKOS vocabulary: https://www.w3.org/2009/08/skos-reference/skos.html and https://www.w3.org/TR/skos-reference/#mapping
- 1-1: skos:exactMatch
- N-1: skos:broadMatch (the second item is broader than the first one)
- 1-N and M-N: is there no option in the SKOS vocabulary to add weights?
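
To make the mapping idea concrete, here is a minimal sketch (not an agreed design) that reads correspondence rows and picks a SKOS mapping property from the cardinality of each row. The codes and the two example.org namespaces are made up. The use of skos:narrowMatch for 1-N and skos:relatedMatch for M-N is an assumption for illustration. The SKOS mapping properties themselves carry no weights, so weighted 1-N or M-N splits would need an extra construct that is not shown here.

```python
from collections import Counter

# Hypothetical correspondence rows: (source code, target code).
# The codes and both namespaces are invented for illustration.
ROWS = [
    ("A01", "X1"),  # 1-1
    ("A02", "X2"),  # N-1: A02 and A03 both map to X2
    ("A03", "X2"),
    ("A04", "X3"),  # 1-N: A04 maps to X3 and X4
    ("A04", "X4"),
]

SRC = "http://example.org/classification-a/"
TGT = "http://example.org/classification-b/"


def skos_property(n_targets_for_source, n_sources_for_target):
    """Pick a SKOS mapping property from the cardinality of one row."""
    if n_targets_for_source == 1 and n_sources_for_target == 1:
        return "skos:exactMatch"    # 1-1
    if n_targets_for_source == 1:
        return "skos:broadMatch"    # N-1: the target is broader
    if n_sources_for_target == 1:
        return "skos:narrowMatch"   # 1-N: the target is narrower (assumption)
    return "skos:relatedMatch"      # M-N: no hierarchy inferred (assumption)


def to_turtle(rows):
    """Serialise the rows as Turtle triples, one mapping per row."""
    targets_per_source = Counter(src for src, _ in rows)
    sources_per_target = Counter(tgt for _, tgt in rows)
    lines = ["@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."]
    for src, tgt in rows:
        prop = skos_property(targets_per_source[src], sources_per_target[tgt])
        lines.append(f"<{SRC}{src}> {prop} <{TGT}{tgt}> .")
    return "\n".join(lines)


print(to_turtle(ROWS))
```

The cardinality is derived from the table itself, so the same function could be run over any two-column correspondence file once the codes have URIs.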


Re: #correspondencetables - what needs to be done? #correspondencetables

Brandon Kuczenski
 

I can work on that, but it occurred to me to suggest the Protege software (https://protege.stanford.edu), which I have seen used for this sort of thing. It only now occurred to me; otherwise I would have mentioned it sooner.




-------- Original message --------
From: Bo Weidema <bo.weidema@...>
Date: 3/22/19 3:01 AM (GMT-08:00)
To: hackathon2019@bonsai.groups.io
Subject: Re: [hackathon2019] #correspondencetables - what needs to be done?

I would strongly urge Brandon to provide an example of how this information could be (better) provided in an RDF format.

Thanks for considering.

Bo



How does the ontology group support the correspondence table group #rdf #correspondencetables #ontology

Agneta
 

I am not very clear on how we could support each other. Perhaps you can provide a clearer description.

What I understand is that the correspondence tables (currently in xls format) should be converted into an RDF format. So we have to see how the current xls files containing correspondence tables fit into our ontology? If this is the case I still find it quite challenging, as many of the columns in the correspondence tables just refer to codes, which are just the codes of certain activities, the labels of which are provided in another column. So are these just qualitative flow properties?


Regards

Agneta


#softwaremethods Call for volunteers to test Python library skeleton #softwaremethods

 

Dear all-

The Python library skeleton is nearing its first release, and we need people to test it. You don't need to write a real library, just a very small fake one; try to follow the instructions and see what needs to be better explained or is missing altogether. You can give Tomas and me your feedback at any point in this process; that is completely up to you. But please do let us know who is volunteering!


#bentso Bentso pre-hackathon deliverable finished (version 0.1) #bentso

 

Version 0.1 is released on PyPI and conda, and the hackathon project board is updated.


Re: #correspondencetables - what needs to be done? #correspondencetables

Bo Weidema
 

I would strongly urge Brandon to provide an example of how this information could be (better) provided in an RDF format.

Thanks for considering.

Bo

On 2019-03-22 at 10:28, michele.derosa@... wrote:

[Edited Message Follows]

Thanks a lot guys for your contribution so far! This is great! A few comments and answers to the questions above:

@ALL (IMPORTANT):

  1. Please remember to update the file _Overview_of_available_correspondence_files.csv with all the required metadata when uploading and / or modifying the correspondence tables (as explained in the readme file).
  2. I have added a field ("Author") to this file. I can see that there are already a few new correspondence files that have not been registered in _Overview_of_available_correspondence_files.csv. This is required to track what correspondence tables we have and their status (and maybe also to collect the metadata for conversion to the Frictionless Data format?).
  3. If you move or replace the file _Overview_of_available_correspondence_files.csv, remember to update the link to it in the readme file. I found that the link was already broken.
  4. I added the (previously missing) file names to the respective rows in _Overview_of_available_correspondence_files.csv (thanks Miguel A. for letting me know).

@Stefano: The status of the correspondence table is incomplete. Remember to check the file _Overview_of_available_correspondence_files.csv

@Arthur: please send me your GitHub name. All contributors should have access to the GitHub repository.

@Miguel A.: you have created a new folder distinguishing raw and final (correspondence) tables. That's fine, but could you add to the readme a description of what these files contain, how they differ and how to use them? Also, should the file _Overview_of_available_correspondence_files.csv now include the files in both the raw and the final tables folders? (I'd say so, maybe specifying in the status why we have two, if it is necessary to keep both.)



Examples of property-references #ontology #rdfframework

Bo Weidema
 

General format in everyday language: Flow F is given in proportion to X amount of [Time OR Flow G] of activity A (this last information may be omitted if we require that Flow F and G are always from the same activity, but the more general option is to specify the activity)

Alternative formalisation: Flow F hasPropertyRelation of amount X of [Time OR Flow G] of activity A


Example 1: 614790.55 tonne of Rice output hasPropertyRelation of 1 year of Time of Australian rice production

Example 2: 5.5 tonne of carbon dioxide output hasPropertyRelation of 1000 kg of steel output of Australian steel production

Example 3: 444 tonne of carbon dioxide output hasPropertyRelation of 614790.55 tonne of Rice output of Australian rice production

(the numbers in the examples do not necessarily reflect the real size of any flows)


Possible RDF formalisation of Example 1:

    "measure": {
        "@type": "om:Measure",
        "om:hasNumericalValue": "614790.55",
        "om:hasUnit": "om:tonne",
        "propertyRelation": {
            "activity": "A_PARI",
            "om:hasNumericalValue": "1",
            "om:hasUnit": "om:year"
        }
    }
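
As a rough illustration only, the same structure can be generated programmatically, which also shows how Example 3 (a flow given relative to another flow rather than to time) might look. The keys "measure", "propertyRelation" and "activity" follow the example above. The "flow" key and the identifier "rice" are assumptions and not part of the proposal, and "A_PARI" is reused for Example 3 on the assumption that it denotes Australian rice production.

```python
import json


def property_relation(amount, unit, ref_amount, ref_unit, activity, ref_flow=None):
    """Build the nested measure structure for one flow.

    ref_flow is optional: leave it out when the reference is Time,
    set it when the reference is another flow (Flow G).
    """
    relation = {
        "activity": activity,
        "om:hasNumericalValue": str(ref_amount),
        "om:hasUnit": ref_unit,
    }
    if ref_flow is not None:
        relation["flow"] = ref_flow  # hypothetical key, not in the original example
    return {
        "measure": {
            "@type": "om:Measure",
            "om:hasNumericalValue": str(amount),
            "om:hasUnit": unit,
            "propertyRelation": relation,
        }
    }


# Example 1: rice output per year of Australian rice production.
example_1 = property_relation(614790.55, "om:tonne", 1, "om:year", "A_PARI")

# Example 3: carbon dioxide output per rice output of the same activity.
example_3 = property_relation(
    444, "om:tonne", 614790.55, "om:tonne", "A_PARI", ref_flow="rice"
)

print(json.dumps(example_1, indent=4))
```

Dividing the two amounts of Example 3 gives the emission intensity per tonne of rice, which is the kind of derived value such a property relation is meant to make explicit.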


Re: #correspondencetables - what needs to be done? #correspondencetables

Michele De Rosa
 
Edited

Thanks a lot guys for your contribution so far! This is great! A few comments and answers to the questions above:

@ALL (IMPORTANT):

  1. Please remember to update the file _Overview_of_available_correspondence_files.csv with all the required metadata when uploading and / or modifying the correspondence tables (as explained in the readme file).
  2. I have added a field ("Author") to this file. I can see that there are already a few new correspondence files that have not been registered in _Overview_of_available_correspondence_files.csv. This is required to track what correspondence tables we have and their status (and maybe also to collect the metadata for conversion to the Frictionless Data format?).
  3. If you move or replace the file _Overview_of_available_correspondence_files.csv, remember to update the link to it in the readme file. I found that the link was already broken.
  4. I added the (previously missing) file names to the respective rows in _Overview_of_available_correspondence_files.csv (thanks Miguel A. for letting me know).

@Stefano: The status of the correspondence table is incomplete. Remember to check the file _Overview_of_available_correspondence_files.csv

@Arthur: please send me your GitHub name. All contributors should have access to the GitHub repository.

@Miguel A.: you have created a new folder distinguishing raw and final (correspondence) tables. That's fine, but could you add to the readme a description of what these files contain, how they differ and how to use them? Also, should the file _Overview_of_available_correspondence_files.csv now include the files in both the raw and the final tables folders? (I'd say so, maybe specifying in the status why we have two, if it is necessary to keep both.)


Re: #correspondencetables - what needs to be done? #correspondencetables

arthur.jakobs@...
 

Hi,

I finished the correspondence between the elementary flows of ecoinvent and exiobase, but I also do not have write permission on GitHub.
Could someone add the attached files? I also updated the _Overview...csv file.

Cheers,
Arthur


Re: Start of the #ontology sub-group #ontology

Agneta
 

In my understanding the next steps would be (correct me if I am wrong):

1. Explore interoperability: how can unlinked data (something from a potential data provider, e.g. an LCA practitioner, national statistics, etc.) be connected with our existing database?
2. Develop query templates to extract data from BONSAI that could be useful for our analysis (whether it is LCA or MFA).
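
A query template in the sense of point 2 could be as simple as a parameterised SPARQL string. The sketch below is only an illustration: the ex: prefix, the properties ex:isInputOf and ex:isOutputOf, and the activity URI are placeholders, not terms from the BONSAI ontology.

```python
from string import Template

# Placeholder vocabulary: ex:isInputOf and ex:isOutputOf are NOT BONSAI terms.
FLOWS_OF_ACTIVITY = Template("""\
PREFIX ex: <http://example.org/ontology#>
SELECT ?flow ?direction
WHERE {
  { ?flow ex:isInputOf <$activity> . BIND("input" AS ?direction) }
  UNION
  { ?flow ex:isOutputOf <$activity> . BIND("output" AS ?direction) }
}
""")

# Fill the template for one (made-up) activity URI.
query = FLOWS_OF_ACTIVITY.substitute(
    activity="http://example.org/activity/rice-production"
)
print(query)
```

The resulting string could then be sent to whatever SPARQL endpoint the database ends up exposing.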

Agneta


Re: Start of the #ontology sub-group #ontology

Massimo Pizzol
 

>>> as we might be going a bit in circles

Thanks. I think we agree on all points. You confirm that we are going for a general solution and that if I want to upgrade to a specific solution (one that looks like the textbook) then it’s a separate step.

Concretely: some info on biosphere, ref flows, etc. does not strictly need to be in the ontology, either because it can be obtained from external sources or because it belongs to a specific mental model only.

 

Massimo

 


Re: Start of the #ontology sub-group #ontology

 

>> But how to convert the list into a specific format for a specific purpose (e.g. organizing the activities and flows into a B matrix and a square A matrix to be used in the g = B A^{-1} f calculation for LCA) remains outside the scope of this ontology. This conversion should be regarded as a separate step. One would have to develop an additional ontology on top of the base one, or use information external to the ontology, such as a list specifying which flows are biosphere flows, or make further assumptions and choices, such as deciding which of two output flows is the determining one and applying one method or another to solve multifunctionality.

We will be able to talk about this soon in person, which should be helpful as we might be going a bit in circles. In any case, here are my contributions to the dance:

1. Identifying potential biosphere flows is relatively easy - they are either only consumed and never produced, or vice versa. We can debate if this is enough, or even fits into our logical sense of how we want to build the model, but if we want to make a solvable linear algebra problem it is enough. Think about the *purpose* of separating the A and B matrices.
2. A square technosphere matrix is an output of one possible system model, but is not required. Some system models will not produce square matrices, and other approaches will use graph traversal and not use matrices at all.
3. Our database is not only used for matrix-based LCA, but also e.g. MFA, so should not be too specific to what you read in an LCA textbook. This has already been said before by multiple people, but is worth repeating :)
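
The rule in point 1 can be sketched in a few lines. This is only an illustration under simple assumptions: the exchange records are made up, and each record is a plain (activity, flow, direction) tuple rather than anything from the ontology.

```python
# Hypothetical exchange records: (activity, flow, direction).
EXCHANGES = [
    ("steel production", "iron ore", "input"),
    ("steel production", "steel", "output"),
    ("steel production", "carbon dioxide", "output"),
    ("iron ore mining", "iron ore", "output"),
    ("iron ore mining", "ore in ground", "input"),
    ("car manufacturing", "steel", "input"),
    ("car manufacturing", "car", "output"),
]


def potential_biosphere_flows(exchanges):
    """Return flows that are only consumed or only produced, never both."""
    consumed = {flow for _, flow, direction in exchanges if direction == "input"}
    produced = {flow for _, flow, direction in exchanges if direction == "output"}
    # Symmetric difference: flows that appear on exactly one side.
    return consumed ^ produced


print(sorted(potential_biosphere_flows(EXCHANGES)))
```

In this toy data the result also contains "car", a final product that nothing else consumes, which is one reason to call these flows "potential" biosphere flows and to debate whether the rule is enough on its own.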


Re: #correspondencetables - what needs to be done? #correspondencetables

arthur.jakobs@...
 

Hi Stefano,

Thanks for the file.

Cheers,
Arthur


Re: Start of the #ontology sub-group #ontology

Massimo Pizzol
 

Within yesterday’s correspondence on biosphere flows and reference flows Agneta wrote that:

 

>> [for doing LCA…] we can develop a secondary ontology which continues to use BONSAI as the primary ontology and build on top of it.

 

My understanding right now is that an RDF-type database built from raw data linked using our ontology in its current format can be queried to automatically generate a list of the input and output flows of a number of activities. The main advantage is that, although the raw data might come from different sources, the activities and flows in the list will be uniquely and consistently identified.

 

But how to convert the list into a specific format for a specific purpose (e.g. organizing the activities and flows into a B matrix and a square A matrix to be used in the g = B A^{-1} f calculation for LCA) remains outside the scope of this ontology. This conversion should be regarded as a separate step. One would have to develop an additional ontology on top of the base one, or use information external to the ontology, such as a list specifying which flows are biosphere flows, or make further assumptions and choices, such as deciding which of two output flows is the determining one and applying one method or another to solve multifunctionality.
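
For readers less familiar with the matrix calculation mentioned above, here is a minimal worked sketch on a made-up 2x2 system. All numbers are invented, and the tiny Cramer's-rule solver stands in for a proper linear algebra library. It computes the scaling vector s = A^{-1} f and then the inventory g = B s.

```python
from fractions import Fraction as F

# Made-up system: rows are products, columns are activities.
A = [[F(1), F(0)],    # product 1: one unit made by activity 1
     [F(-2), F(1)]]   # product 2: activity 1 uses 2 units, activity 2 makes 1
B = [[F(3), F(1)]]    # one biosphere flow, per unit of each activity
f = [F(1), F(0)]      # final demand: one unit of product 1


def solve_2x2(a, rhs):
    """Solve the 2x2 system a * s = rhs with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    if det == 0:
        raise ValueError("technosphere matrix is singular")
    s1 = (rhs[0] * a[1][1] - a[0][1] * rhs[1]) / det
    s2 = (a[0][0] * rhs[1] - a[1][0] * rhs[0]) / det
    return [s1, s2]


s = solve_2x2(A, f)                                    # s = A^{-1} f
g = [sum(b * x for b, x in zip(row, s)) for row in B]  # g = B s
print(s, g)
```

Everything that had to be decided before these matrices could be written down (which flows go into B, which product sits on which diagonal entry) is exactly the separate step discussed in this thread.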

 

I have been wrong before so…is my understanding correct? I hope this helps in aligning expectations.

 

Massimo


Re: #correspondencetables - what needs to be done? #correspondencetables

Stefano Merciai
 

Hi all,

I just checked the correspondence between Exiobase v2 and NACE Rev. 2 and noticed that some sectors of NACE Rev. 2 were not included in the table. This means that the table was not exhaustive. Is there any reason for that?

However, I have uploaded a new version with the suffix v2, but I did not check all the sectors. I think we will need some cross-checking later on.

Best,

Stefano


On 20/03/2019 17:53, Stefano Merciai via Groups.Io wrote:

Hi Jacob,

I have this old file that I used to convert the Exiobase HSUTs extension into SimaPro format.

Best,

Stefano




Re: #correspondencetables - what needs to be done? #correspondencetables

Stefano Merciai
 

Hi Jacob,

I have this old file that I used to convert the Exiobase HSUTs extension into SimaPro format.

Best,

Stefano


On 20/03/2019 17:04, arthur.jakobs@... wrote:
Hi,

I started on the correspondence between the elementary flows of ecoinvent and exiobase. Is anyone aware of any classification scheme for the exiobase resources and emissions?
Or should they be mapped manually by name?

@Stefano: are there such concordances available for exiobase? I got a bunch of concordances from Konstantin Stadler but none on the environmental flows.

@Tiago: How did you go about this?

Thanks,
Arthur



Re: #correspondencetables - what needs to be done? #correspondencetables

arthur.jakobs@...
 

Hi,

I started on the correspondence between the elementary flows of ecoinvent and exiobase. Is anyone aware of any classification scheme for the exiobase resources and emissions?
Or should they be mapped manually by name?

@Stefano: are there such concordances available for exiobase? I got a bunch of concordances from Konstantin Stadler but none on the environmental flows.

@Tiago: How did you go about this?

Thanks,
Arthur


Re: Start of the #ontology sub-group #ontology

Bo Weidema
 

On 2019-03-20 at 16:14, Stefano Merciai wrote:

Last thing, there are values that are important when building a database, such as combustion coefficients (emissions produced in the activity act123 when burning fuel123). Are these properties of products?

The wiki says the following about the observation (datapoint):

Number: Floating point numbers xsd:float. (...) Note also that ecoSpold2 has the option of providing a @mathematicalRelation that defines a mathematical formula which can also contain variables and which will fill the value of the @amount if @isCalculatedAmount is TRUE. This is a very explicit and recommendable option for providing provenance information. In ecoSpold2 the formulas are defined by a subset of the OpenFormula standard. Other RDF-related formula standards are described on the Wikipedia page for MathML.

The @mathematicalRelation makes it possible to express, for example, an emission as an input amount multiplied by a fixed combustion coefficient.
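
A minimal sketch of that mechanism, under stated assumptions: the field names mirror the ecoSpold2 attributes named above, but the dictionary layout, the variable names and the numbers are made up, and the formula uses plain Python arithmetic syntax as a stand-in for the OpenFormula subset.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
        ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(formula, variables):
    """Evaluate a formula containing only numbers, names and + - * /."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return variables[node.id]
        raise ValueError("unsupported expression")
    return walk(ast.parse(formula, mode="eval"))


def resolve_amount(exchange, variables):
    """Use the formula when the amount is calculated, else the stored amount."""
    if exchange["isCalculatedAmount"]:
        return evaluate(exchange["mathematicalRelation"], variables)
    return exchange["amount"]


# Hypothetical emission exchange whose amount is derived from a fuel input.
co2 = {
    "amount": None,
    "isCalculatedAmount": True,
    "mathematicalRelation": "fuel_input * combustion_coefficient",
}
amount = resolve_amount(co2, {"fuel_input": 100.0, "combustion_coefficient": 2.5})
print(amount)
```

Keeping the formula next to the value is what preserves the provenance: the coefficient stays visible instead of being folded into a single number.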

Re. the distinction between CO2 emissions and CO2 as a product, one option is to use different names (such as "carbon dioxide, purified" or "carbon dioxide, under pressure" for the product).


Re: Determining Flow = Reference Flow #ontology #rdfframework

Bo Weidema
 

Hi,

The purpose of the determining flow is to be able to distinguish 1) the flow that makes an activity occur from 2) the flows that occur as a consequence of the activity occurring.

It will allow us to distinguish two otherwise identical activities, such as sheep farming for wool and sheep farming for meat.

If we do not have a place to indicate this, we would lose information that is essential for the linking of the flows between activities. The identification of determining flows is based on empirical observation and algorithms, i.e. it is not a value choice or something that will depend on the application of the data in different contexts.

It does not have anything to do with square matrices as such, but in a square matrix there is often the convention to place the determining flow on the diagonal.
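
The sheep farming case can be illustrated with a small data structure. This is a sketch only: the class and field names are invented and do not come from the ontology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    outputs: frozenset
    determining_flow: str  # the output that makes the activity occur

    def dependent_flows(self):
        """Outputs that occur as a consequence of the activity occurring."""
        return self.outputs - {self.determining_flow}


# Same name and same outputs; only the determining flow differs.
wool_farm = Activity("sheep farming", frozenset({"wool", "meat"}), "wool")
meat_farm = Activity("sheep farming", frozenset({"wool", "meat"}), "meat")

print(wool_farm == meat_farm)
print(sorted(wool_farm.dependent_flows()), sorted(meat_farm.dependent_flows()))
```

Without the determining_flow field the two records would compare equal, which is the loss of information described above.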

Best regards

Bo