
Re: Adding provenance #ontology #intro #provenance

Emil Riis Hansen
 

Hi everyone,

I have prepared a pull request for the initial implementation of provenance.

The request extends the work of the Arborist working group with an initial, minimal provenance implementation. It adds lineage information between entity instances and the EXIOBASE dataset, as well as provenance information about the Arborist script itself.
Further improvements will be needed and will follow.

The request also fixes the issue of "missing dataset declarations" from the RDF repository.

Matteo helped review the request before submission, but I look forward to your feedback.

Pull request: https://github.com/BONSAMURAIS/arborist/pull/14
Issue: https://github.com/BONSAMURAIS/rdf/issues/3

Best Regards,
Emil


Re: Base nomenclature - theory into practice

Miguel Fernández Astudillo
 

Hi

Looking at the state of the Open Energy Platform wiki (https://github.com/OpenEnergyPlatform/ontology/wiki), it does not seem that the ontology has advanced much, but it would be worth double-checking with Ludwig Hulk and/or Martin Glauer; I think both are involved in developing an ontology to represent energy scenarios. I would also contact Daniel Huppmann (IIASA) to see how they plan to get around this issue in the openentrance (https://openentrance.eu/) project.

If there is nothing that we can use, for energy products I think it makes sense to use the UN names as much as possible (https://unstats.un.org/unsd/energy/ESCM_Whitecover_170323.pdf)

For activities, NACE is very vague and, as you point out, it does not distinguish between electricity producers. NAICS provides somewhat more detail: https://classcodes.com/lookup/naics-5-digit-industry-22111/

It is quite incredible that, given the importance of the electricity sector, there are no more standard names for technologies...

Best, Miguel

-----Original Message-----
From: main@bonsai.groups.io <main@bonsai.groups.io> On Behalf Of Chris Mutel
Sent: 28 November 2019 10:26
To: main@bonsai.groups.io
Subject: [bonsai] Base nomenclature - theory into practice

Dear all-

PSI and LIST are part of a consortium of research groups trying to make prospective LCA better. To accomplish this, we have decided to create a standard implementation merging LCI databases with scenarios from energy system models or integrated assessment models. The software would be in Python and based on https://github.com/IndEcol/wurst.

As our aim is to support multiple systems, we need a reference base nomenclature for products and activities. As far as I can tell, we have invested a lot in building up various correspondence tables, but do not have a lot of development on a base nomenclature with scientific justification. (On the other hand, I think we have made a lot of progress in how to describe any such system.) I assume that such a system will start with a standard classification (e.g. GS1, NACE, etc.), and then use industry-specific classifications when more detail is needed (e.g. https://en.wikipedia.org/wiki/SAE_steel_grades). One important exception here is electricity, where I don't find a standard system that captures the level of detail we need (e.g. utility versus roof-top solar).

As Tomas N. and I are responsible for the technical foundation of this consortium, we have the chance to build it BONSAI-compatible from the beginning - but we need to make decisions soon on this critical issue.
Any input from the list would be greatly appreciated.

-Chris


Re: Base nomenclature - theory into practice

Bo Weidema
 

Hi Chris,

Would be happy to help, but what is your question? In BONSAI parlance, I think the "base classification" is the one that at any point in time is most detailed.

Bo



Base nomenclature - theory into practice

 

Dear all-

PSI and LIST are part of a consortium of research groups trying to
make prospective LCA better. To accomplish this, we have decided to
create a standard implementation merging LCI databases with scenarios
from energy system models or integrated assessment models. The
software would be in Python and based on
https://github.com/IndEcol/wurst.

As our aim is to support multiple systems, we need a reference base
nomenclature for products and activities. As far as I can tell, we
have invested a lot in building up various correspondence tables, but
do not have a lot of development on a base nomenclature with
scientific justification. (On the other hand, I think we have made a
lot of progress in how to describe any such system.) I assume that
such a system will start with a standard classification (e.g. GS1,
NACE, etc.), and then use industry-specific classifications when more
detail is needed (e.g.
https://en.wikipedia.org/wiki/SAE_steel_grades). One important
exception here is electricity, where I don't find a standard system
that captures the level of detail we need (e.g. utility versus
roof-top solar).
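
A minimal sketch of the layered lookup described above, written in Python since that is the planned implementation language. All codes and labels here are hypothetical placeholders, not entries from GS1, NACE, or any real sector-specific extension; the only idea shown is "use the most detailed code available, otherwise fall back to the base classification".

```python
from typing import Optional

# Hypothetical base layer: a coarse standard classification.
BASE = {
    "electricity": "BASE:electricity-production",
    "steel": "BASE:basic-iron-and-steel",
}

# Hypothetical sector-specific extensions, keyed by (base key, detail).
EXTENSIONS = {
    ("electricity", "solar, utility-scale"): "EXT:elec/solar/utility",
    ("electricity", "solar, roof-top"): "EXT:elec/solar/rooftop",
}


def classify(product: str, detail: Optional[str] = None) -> str:
    """Return the most detailed code available, falling back to the base layer."""
    if detail is not None and (product, detail) in EXTENSIONS:
        return EXTENSIONS[(product, detail)]
    return BASE[product]


print(classify("electricity", "solar, roof-top"))  # EXT:elec/solar/rooftop
print(classify("steel", "some grade"))             # BASE:basic-iron-and-steel
```

Keeping the extensions in a separate table means a sector such as electricity can gain detail without touching the base classification.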

As Tomas N. and I are responsible for the technical foundation of this
consortium, we have the chance to build it BONSAI-compatible from the
beginning - but we need to make decisions soon on this critical issue.
Any input from the list would be greatly appreciated.

-Chris


Re: Presentation by Wes Ingwersen, US EPA - Friday, Nov. 22 - 15:00 CET/9:00 EST

Ingwersen, Wesley
 

Hello All,

Thank you to those who could attend the presentation today. My slides are attached for everyone, with some annotations. Links to most of the tools are embedded. I look forward to more conversations about our collaboration.

Wes


Wesley W. Ingwersen, Ph.D.
Environmental Decision Analytics Branch/Land Remediation and Technology Division
Center for Environmental Solutions and Emergency Response (CESER)
US EPA Office of Research and Development
61 Forsyth Street, SW
Atlanta, GA 30303

-----Original Message-----
From: main@bonsai.groups.io <main@bonsai.groups.io> On Behalf Of Chris Mutel
Sent: Friday, November 22, 2019 8:00 AM
To: main@bonsai.groups.io
Subject: Re: [bonsai] Presentation by Wes Ingwersen, US EPA - Friday, Nov. 22 - 15:00 CET/9:00 EST

Reminder that Wes will be presenting today in one hour.

Here are the connection data:

Time: Nov 22, 2019 03:00 PM Copenhagen

Join Zoom Meeting
https://zoom.us/j/799298250

Meeting ID: 799 298 250
Find your local number: https://zoom.us/u/acOELCujEO

On Mon, 11 Nov 2019 at 23:26, Christopher Mutel <@cmutel> wrote:

Dear all-

I would like to invite Wes Ingwersen to present the work that the US
EPA is doing on open data infrastructure and associated efforts to the
BONSAI group. We have had a "hang out" on Friday afternoons before,
and that time works well for him, so those who are interested and
available are welcome to join.

Bo, can you send out a Zoom link?

-Chris

--
############################
Chris Mutel
Technology Assessment Group, LEA
Paul Scherrer Institut
OHSA D22
5232 Villigen PSI
Switzerland
http://chris.mutel.org
Telefon: +41 56 310 5787
############################




Re: Presentation by Wes Ingwersen, US EPA - Friday, Nov. 22 - 15:00 CET/9:00 EST

Matteo Lissandrini (AAU)
 

Hi Chris,


unfortunately I will not be able to join due to other meetings, but if there is any material (e.g., slides or notes) I will be happy to take a look.


Thanks,

Matteo


---
Matteo Lissandrini

Department of Computer Science
Aalborg University

http://people.cs.aau.dk/~matteo










Re: Presentation by Wes Ingwersen, US EPA - Friday, Nov. 22 - 15:00 CET/9:00 EST

 

Reminder that Wes will be presenting today in one hour.

Here are the connection data:

Time: Nov 22, 2019 03:00 PM Copenhagen

Join Zoom Meeting
https://zoom.us/j/799298250

Meeting ID: 799 298 250
Find your local number: https://zoom.us/u/acOELCujEO



Re: Adding provenance #ontology #intro #provenance

Emil Riis Hansen
 

Hi Michele


Username: IKnowLogic

I will share some more details soon before creating the PR

Thank you, Michele


Best Regards

Emil Riis Hansen


From: main@bonsai.groups.io <main@bonsai.groups.io> on behalf of Michele De Rosa via Groups.Io <michele.derosa@...>
Sent: Friday, November 15, 2019 9:46:47 AM
To: main@bonsai.groups.io
Subject: Re: [bonsai] Adding provenance #ontology #intro #provenance
 
Hi Emil, 

send me your GitHub username.

Mic


Re: Adding provenance #ontology #intro #provenance

Michele De Rosa
 

Hi Emil, 

send me your GitHub username.

Mic


Re: Adding provenance #ontology #intro #provenance

Emil Riis Hansen
 

Hello everyone,
I have made a proposed first implementation of provenance in the BONSAI project. It includes lineage information for flows, activityTypes, and locations, as well as versioning of the activity (the Arborist script) used to extract the data. I believe the implementation satisfies our initial requirements, or at least the requirements needed to write a resource paper on the BONSAI database. The proposal is very flexible and can easily be extended or changed.

I would like to share the proposal via a pull request. How do I get permission to do this?

Best regards
Emil


Re: Adding provenance #ontology #intro #provenance

Matteo Lissandrini (AAU)
 

Hi all,


so what Bo lists as case 1) is what Emil's proposal is about.


The idea is to annotate provenance of the named graphs we have.

This is a necessary first step, because without it the data in each named graph is "orphaned": it lacks the basic information that provenance requires.
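
To make the idea concrete, here is a minimal sketch of annotating one named graph with provenance statements. It assumes the W3C PROV-O vocabulary (the thread does not say which vocabulary the proposal uses), and every example.org URI is a hypothetical placeholder. It uses only the Python standard library and emits plain N-Triples lines; it is not the Arborist implementation.

```python
from datetime import datetime, timezone

PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_DATETIME = "http://www.w3.org/2001/XMLSchema#dateTime"


def graph_provenance(graph_uri, source_uri, activity_uri, generated_at):
    """Return N-Triples lines that describe one named graph as a prov:Entity."""
    stamp = generated_at.isoformat()
    return [
        f"<{graph_uri}> <{RDF_TYPE}> <{PROV}Entity> .",
        f"<{graph_uri}> <{PROV}wasDerivedFrom> <{source_uri}> .",
        f"<{graph_uri}> <{PROV}wasGeneratedBy> <{activity_uri}> .",
        f'<{graph_uri}> <{PROV}generatedAtTime> "{stamp}"^^<{XSD_DATETIME}> .',
    ]


lines = graph_provenance(
    graph_uri="http://example.org/graph/flows",                  # hypothetical
    source_uri="http://example.org/dataset/exiobase",            # hypothetical
    activity_uri="http://example.org/activity/arborist-run-1",   # hypothetical
    generated_at=datetime(2019, 11, 14, 12, 0, tzinfo=timezone.utc),
)
print("\n".join(lines))
```

Because the statements are about the graph URI itself, every triple inside that graph inherits the same source and generating activity without being annotated individually.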


Emil will come up with a proposal for what we need to extend in our scripts to capture this information. I think it will be a few additions to the Arborist code.


Thanks,

Matteo



---
Matteo Lissandrini

Department of Computer Science
Aalborg University

http://people.cs.aau.dk/~matteo






From: main@bonsai.groups.io <main@bonsai.groups.io> on behalf of loekke via Groups.Io <loekke@...>
Sent: 14 November 2019 13:17:14
To: main@bonsai.groups.io
Subject: Re: [bonsai] Adding provenance #ontology #intro #provenance
 
Yes, this is also in line with the talk we had with Emil yesterday


Re: Adding provenance #ontology #intro #provenance

Søren
 

Yes, this is also in line with the talk we had with Emil yesterday


Re: Intro Mail

Agneta
 

Hi Matthias

Welcome to the group. 

BONSAI is a platform that aims to provide open data for sustainability assessment. You can read more about BONSAI here.
It's an open community, with members from different domains contributing to it.

We are currently in the process of integrating one of the commonly used datasets into the semantic web. The database we have integrated so far is Exiobase. It is a global multi-regional input-output database used to analyse the environmental impacts associated with the final consumption of multiple product categories (including textiles).
This data can be queried if you have some experience with SPARQL. One of our members, Romain Sacchi, has developed a web application that should enable us to query the data easily, i.e. without knowledge of SPARQL. However, when he developed the web application our dataset was not ready. Based on your introduction, I think you might be interested in supporting us with this.

I have attached to this message all the links that might be useful for you. If you need any other information, please don't hesitate to write back.

Best wishes

Team Bonsai 


Re: Adding provenance #ontology #intro #provenance

Bo Weidema
 

Dear Agneta,

First, it is important to distinguish between:

1) What is "raw data" in a BONSAI context, namely the data as they are received from elsewhere. These data may be either direct measurements (very rarely) or previously more or less processed (in the case of Exiobase definitely more so), with or without explicit previous provenance. For these data, it is obviously sufficient to report the direct source, as it is received (example: Exiobase version NNh, downloaded from URL at Time), which is then applicable to all datapoints within that dataset.

2) Data that are corrected or otherwise manipulated after receipt, in which case it is relevant to add the nature of the correction or calculation, and a timestamp for the changed dataset (but not for the parts unchanged). In this way, one can always trace the origin of any datum to the form in which it was originally provided to BONSAI.
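
The two cases above can be sketched as a small data model. This is an illustration only: the class names, fields, the placeholder URL, and the example correction are all hypothetical and do not describe how BONSAI stores provenance.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Source:
    """Case 1: the direct source, recorded once for the whole dataset."""
    name: str
    version: str
    url: str
    retrieved_at: datetime


@dataclass(frozen=True)
class Change:
    """Case 2: a correction or calculation applied after receipt."""
    description: str
    changed_at: datetime


@dataclass
class Datum:
    value: float
    source: Source
    changes: List[Change] = field(default_factory=list)

    def trace(self) -> List[str]:
        """List the datum's history, from original receipt to latest change."""
        steps = [f"{self.source.name} {self.source.version} from {self.source.url}"]
        steps += [c.description for c in self.changes]
        return steps


# One Source object is shared by every datum in the dataset.
src = Source("Exiobase", "NNh", "https://example.org/exiobase",  # placeholder URL
             datetime(2019, 11, 13, tzinfo=timezone.utc))
untouched = Datum(1.0, src)
corrected = Datum(2.5, src, [Change("unit conversion (hypothetical)",
                                    datetime(2019, 11, 14, tzinfo=timezone.utc))])
print(untouched.trace())
print(corrected.trace())
```

Only the corrected datum carries a change record with its own timestamp; unchanged data points keep nothing beyond the shared source.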

As ambitions and resources increase, someone may later want to add further upstream provenance to the data in BONSAI, which is of course always possible and desirable.

Best regards

Bo

On 2019-11-13 at 14:42, Agneta wrote:

Thanks for the document Bo

The document recommends timestamping of the datapoints and query outputs. However, I am unsure to what degree we will be able to add provenance to each value in Exiobase. Although Exiobase uses data from multiple sources, it applies some algorithms to provide a balanced dataset. In other words, it's a secondary dataset (primary datasets are those which contain raw data).

If some values are changed in future, this leads to the publication of a new version of the dataset. So the provenance for all values in Exiobase is given as Exiobase + (specific version).
My question is: do we need provenance for individual values in a secondary dataset? It's different when we have minute-by-minute information on temperature change in a region (raw data). Here the timestamping of individual values might be more relevant.

What do you think?
Agneta

--


Re: Adding provenance #ontology #intro #provenance

Agneta
 

Thanks for the document Bo

The document recommends timestamping of the datapoints and query outputs. However, I am unsure to what degree we will be able to add provenance to each value in Exiobase. Although Exiobase uses data from multiple sources, it applies some algorithms to provide a balanced dataset. In other words, it's a secondary dataset (primary datasets are those which contain raw data).

If some values are changed in future, this leads to the publication of a new version of the dataset. So the provenance for all values in Exiobase is given as Exiobase + (specific version).
My question is: do we need provenance for individual values in a secondary dataset? It's different when we have minute-by-minute information on temperature change in a region (raw data). Here the timestamping of individual values might be more relevant.

What do you think?
Agneta


Re: Adding provenance #ontology #intro #provenance

Bo Weidema
 

Dear Emil and Agneta,

A warm welcome to Emil.

Re. provenance of the individual numbers and calculations, there is a good description in the section "Versioning and citation" in this document, relating to the recommendations from RDA. This is an elegant and efficient way of handling this issue, I think. I thought I had added that to the wiki, but right now I cannot find it (?).
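
As a rough illustration of timestamping query outputs, as discussed in this thread, the sketch below builds a citation record for one query. It is an interpretation rather than a summary of the referenced document, which is not reproduced here: the assumption is that a citable query needs the query text, the dataset version, an execution timestamp, and a checksum of the result. The function name, field names, and sample rows are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def cite_query(query, rows, dataset_version, executed_at=None):
    """Build a citation record: what was asked, of which version, when, and a result fingerprint."""
    executed_at = executed_at or datetime.now(timezone.utc)
    # Sort rows so the fingerprint does not depend on result ordering.
    canonical = json.dumps(sorted(rows), separators=(",", ":"))
    return {
        "query": query,
        "dataset_version": dataset_version,
        "executed_at": executed_at.isoformat(),
        "result_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
    }


rows = [["flow-b", 2.0], ["flow-a", 1.0]]  # hypothetical query output
record = cite_query(
    "SELECT * WHERE { ?s ?p ?o } LIMIT 2",
    rows,
    "exiobase-NNh",
    datetime(2019, 11, 13, 14, 42, tzinfo=timezone.utc),
)
print(record["executed_at"], record["result_sha256"][:12])
```

With such a record, provenance is attached to the query and the dataset version rather than to each individual value.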

Best regards

Bo

On 2019-11-13 at 12:42, Agneta wrote:

Dear all

I would like to introduce Emil Riis Hansen to the Bonsai community. He was recently employed as a research assistant in the computer science department at Aalborg University.

Emil is interested in working on adding provenance to our current BONSAI ontology.

Provenance helps us add information on the origin of data, i.e. where the data comes from, who generated it, its licence, etc. We discussed this issue during the hackathon but haven't developed it since.
Emil has currently proposed a high-level provenance, which is limited to determining the origin of the dataset and not of individual values in it. For example, anyone who queries data from BONSAI will get the information that the data is sourced from Exiobase; if other datasets are integrated into the semantic web using the Bonsai ontology, they will likewise find information on the origin of those datasets. Provenance of individual values in a dataset is harder to determine, as they may be calculated, estimated, or raw data from the data provider.

Emil is also preparing a conference paper on how he plans to add provenance to the current ontology. For the purposes of this paper, it would be useful to upload the provenance information to the current RDF data we have in the Jena database. This will help the reviewers query the information as presented in the paper.

If anyone here has been working with provenance or is interested, please feel free to write to me.
Kind regards

Agneta

--


Re: Adding provenance #ontology #intro #provenance

Bo Weidema
 

Dear Emil and Agneta,

A warm welcome to Emil. Re. provenance of the individual numbers, there is a good description on the wiki, relating to the recommendations from RDA. This is an elegant and efficient way of handling this issue, I think.

Best regards

Bo



Adding provenance #ontology #intro #provenance

Agneta
 

Dear all

I would like to introduce Emil Riis Hansen to the Bonsai community. He was recently employed as a research assistant in the computer science department at Aalborg University.

Emil is interested in working on adding provenance to our current BONSAI ontology.

Provenance helps us add information on the origin of data, i.e. where the data comes from, who generated it, its licence, etc. We discussed this issue during the hackathon but haven't developed it since.
Emil has currently proposed a high-level provenance, which is limited to determining the origin of the dataset and not of individual values in it. For example, anyone who queries data from BONSAI will get the information that the data is sourced from Exiobase; if other datasets are integrated into the semantic web using the Bonsai ontology, they will likewise find information on the origin of those datasets. Provenance of individual values in a dataset is harder to determine, as they may be calculated, estimated, or raw data from the data provider.

Emil is also preparing a conference paper on how he plans to add provenance to the current ontology. For the purposes of this paper, it would be useful to upload the provenance information to the current RDF data we have in the Jena database. This will help the reviewers query the information as presented in the paper.

If anyone here has been working with provenance or is interested, please feel free to write to me.
Kind regards

Agneta


Re: Presentation by Wes Ingwersen, US EPA - Friday, Nov. 22 - 15:00 CET/9:00 EST

Bo Weidema
 

Bo Weidema is inviting you to a scheduled Zoom meeting.

Time: Nov 22, 2019 03:00 PM Copenhagen

Join Zoom Meeting
https://zoom.us/j/799298250

Meeting ID: 799 298 250

One tap mobile
+16468769923,,799298250# US (New York)
+14086380968,,799298250# US (San Jose)

Dial by your location
        +1 646 876 9923 US (New York)
        +1 408 638 0968 US (San Jose)
        +1 669 900 6833 US (San Jose)
Meeting ID: 799 298 250
Find your local number: https://zoom.us/u/acOELCujEO



Presentation by Wes Ingwersen, US EPA - Friday, Nov. 22 - 15:00 CET/9:00 EST

 

Dear all-

I would like to invite Wes Ingwersen to present the work that the US
EPA is doing on open data infrastructure and associated efforts to the
BONSAI group. We have had a "hang out" on Friday afternoons before,
and that time works well for him, so those who are interested and
available are welcome to join.

Bo, can you send out a Zoom link?

-Chris
