
Re: #infrastructure New working group and practice guidelines #infrastructure

 

On Thu, 4 Apr 2019 at 13:20, Matteo Lissandrini (AAU) <matteo@...> wrote:

> Hi Chris,
>
> I can imagine that some of my tests may have deleted some data from the database. Sorry for that. To be honest, I was under the impression that the data in jken
> I would expect the database to be wiped out regularly until we reach a stable status.
> But this should not be an issue.
> I would like to help in establishing the (automatic) workflow that collects the data (in non-RDF formats) and parses and merges it with the ontology and the contents of the /rdf repo so that we can easily wipe and redeploy the Jena instance at will.
> I believe this will require coordination between the arborist repo, the rdf repo, the importer and probably some others?
No problem, this is to be expected as we are still evolving the
schema, and making sure our RDF is valid and implemented properly.
However, at some point soon we should get to a point where the Aalborg
server is considered stable, while db.b.u is still for playing.

It actually isn't that easy to restore everything, as we need a
relatively large amount of data currently (on the order of 3 GB for
EXIOBASE, and 300 MB for the electricity stuff). The metadata is easy
- arborist can rewrite the data in https://github.com/BONSAMURAIS/rdf,
which can in turn be the foundation of the triple store. It would be
nice to have a function that would take all these small turtle files
and merge them into one file (which could then be uploaded to the
triple store).

In the medium-term, I don't think that it makes sense to store
metadata for specific databases like exiobase in arborist - this can
just as easily be part of the file including the actual data as well.
We only evolved this code pathway because we were learning as we were
going. Indeed, it is probably more clever in the long term to have
https://github.com/BONSAMURAIS/rdf generated from the database itself.

I think the small importer you wrote will work fine for smaller
datasets, but we will need to do file uploads for larger ones, as they
won't fit into memory (to be loaded by RDFLib). This should be easy to
do, though there may be some Jena configuration bugs to work out
still.

So everything is in a bit of a flux, and it would be great if you
could take charge of this little bit of it! Please document the hell
out of stuff, so we don't have to bug you too much.

> Probably in the triplestore repo?
>
> Thanks,
> Matteo






________________________________
From: main@bonsai.groups.io [main@bonsai.groups.io] on behalf of Chris Mutel via Groups.Io [cmutel=gmail.com@groups.io]
Sent: Thursday, April 04, 2019 10:34 AM
To: main@bonsai.groups.io
Subject: [bonsai] #infrastructure New working group and practice guidelines

Dear all-

As many of you have already realized, we need to organize and document our infrastructure a bit better. Specifically, I see a need for:

- Standard practice guidelines on maintaining the RDF database. For example, it looks like https://github.com/BONSAMURAIS/bontofrom/issues/9 is fixed, but I am not sure by whom or when. Also, I think someone (or more than one person :) has wiped this database since the hackathon, as the electricity data is missing.
- A small guide to help everyday people know which named graphs to use, and how to use them.
- Backup and restore procedures for the RDF database. We need to be dumping stuff anyway to make the downloads available.
- A private repository with server configs and passwords. I have applied for https://github.com/nonprofit status, but we could also run a private instance of gitlab.
- A list of all bonsai.uno websites, virtual servers, etc.

Tomas, would you coordinate this? It doesn't mean you have to do all the work.


--
############################
Chris Mutel
Technology Assessment Group, LEA
Paul Scherrer Institut
OHSA D22
5232 Villigen PSI
Switzerland
http://chris.mutel.org
Telefon: +41 56 310 5787
############################


Re: #infrastructure New working group and practice guidelines #infrastructure

Matteo Lissandrini (AAU)
 

Hi Chris,

My importer is actually doing the file upload; this is the command I ran last night:

```bash
for f in `find ../rdf -name '*.ttl'`; do bseeder -i $f; done
```

So you do not need to merge the files in the /rdf repo. In fact, if you do that
you end up with a big problem: you lose track of which triples go in which named graph.
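
To illustrate why per-file uploads keep the graph assignment intact, here is a minimal Python sketch that pairs each Turtle file with its own named graph and prepares a SPARQL 1.1 Graph Store Protocol `PUT` request for it. This is not how `bseeder` works internally (its source is not shown in this thread); the endpoint URL, the base URI, and the `graph_uri_for` naming rule are all assumptions made up for the example. It also reads each file fully into memory, so it only suits the small files discussed here.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical values, chosen only for this sketch.
ENDPOINT = "http://localhost:3030/bonsai/data"   # Graph Store Protocol endpoint
GRAPH_BASE = "http://rdf.bonsai.uno/"            # base for named-graph URIs


def graph_uri_for(ttl_path: Path, root: Path) -> str:
    """Derive a named-graph URI from a file's path relative to the repo root."""
    relative = ttl_path.relative_to(root).with_suffix("")
    return GRAPH_BASE + relative.as_posix()


def build_upload_request(ttl_path: Path, root: Path) -> urllib.request.Request:
    """Prepare (but do not send) a PUT that replaces exactly one named graph."""
    query = urllib.parse.urlencode({"graph": graph_uri_for(ttl_path, root)})
    return urllib.request.Request(
        url=f"{ENDPOINT}?{query}",
        data=ttl_path.read_bytes(),
        method="PUT",
        headers={"Content-Type": "text/turtle"},
    )


def build_all(root: Path) -> list:
    """One request per Turtle file, so no file is merged with another."""
    return [build_upload_request(p, root) for p in sorted(root.rglob("*.ttl"))]
```

Each request could then be sent with `urllib.request.urlopen(request)`. Because the graph name travels with each file, a merged file would have no place to carry that information in plain Turtle.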


In my view the RDF repo is for the instances of the taxonomies: small datasets that change slowly (e.g., flow objects/items or activity types),
while the actual data would remain outside it.

For very big files what we can do is:
1) upload them via scp/rsync to a dedicated directory on the server,
2) use the file importer utility provided by jena itself


I understand that restoring is not easy, but we need to have it for reproducibility and for reliability (if bad things happen we may need to restore the database from scratch)



Cheers,
Matteo



Serializing large LD datasets

 

Our approach to serializing large graphs may not be that great. You can see the current code here - basically, we convert Python to JSON line by line, with some text mangling. It sounds (and looks) a bit crazy; the idea behind this decision was that RDFLib can't really handle large datasets, such as BONSAI.
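
For readers who have not seen the code, the general idea of writing JSON one item at a time can be sketched as follows. This is not the project's actual serializer (which is only linked above); it is a minimal illustration, using just the standard library, of emitting a JSON-LD document with an `@graph` array without ever holding the whole array in memory. The function name and the sample nodes are hypothetical.

```python
import io
import json


def write_jsonld_stream(context, nodes, out):
    """Write {"@context": ..., "@graph": [...]} one node at a time."""
    out.write('{"@context": ')
    out.write(json.dumps(context))
    out.write(', "@graph": [\n')
    first = True
    for node in nodes:              # `nodes` may be a generator over many items
        if not first:
            out.write(",\n")        # separator goes before every node but the first
        out.write(json.dumps(node))
        first = False
    out.write("\n]}\n")


# Usage with made-up nodes; a real run would write to an open file instead.
buffer = io.StringIO()
write_jsonld_stream(
    {"title": {"@id": "http://purl.org/dc/elements/1.1/title"}},
    ({"@id": f"urn:example:{i}", "title": f"item {i}"} for i in range(3)),
    buffer,
)
```

The brackets and commas have to be written by hand, which is where the "text mangling" feeling comes from.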

The last straw was realizing that we need to declare a `dataset` for the actual data (not just metadata). In turtle, this is (for example):

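@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix ns1: <http://creativecommons.org/ns#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix ns2: <http://purl.org/vocab/vann/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dtype: <http://purl.org/dc/dcmitype/> .
@prefix brdfat: <http://rdf.bonsai.uno/activitytype/exiobase3_3_17/> .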

brdfat: a dtype:Dataset ;
    ns1:license <http://creativecommons.org/licenses/by/3.0/> ;
    dc:contributor "BONSAI team" ;
    dc:creator <http://bonsai.uno/foaf/bonsai.rdf#bonsai> ;
    dc:description "ActivityType instances needed for BONSAI modelling of EXIOBASE version 3.3.17" ;
    dc:modified "2019-04-02"^^xsd:date ;
    dc:publisher "bonsai.uno" ;
    dc:title "EXIOBASE 3.3.17 activity types" ;
    ns2:preferredNamespaceUri <http://rdf.bonsai.uno/activitytype/exiobase3_3_17/#> ;
    owl:versionInfo "0.3" ;
    foaf:homepage brdfat:documentation.html .

In JSON-LD, it is... more involved:


{
  "@graph" : [ {
    "@id" : "http://rdf.bonsai.uno/activitytype/exiobase3_3_17/",
    "@type" : "dtype:Dataset",
    "license" : "http://creativecommons.org/licenses/by/3.0/",
    "contributor" : "BONSAI team",
    "creator" : "http://bonsai.uno/foaf/bonsai.rdf#bonsai",
    "description" : "ActivityType instances needed for BONSAI modelling of EXIOBASE version 3.3.17",
    "modified" : "2019-04-02",
    "publisher" : "bonsai.uno",
    "title" : "EXIOBASE 3.3.17 activity types",
    "preferredNamespaceUri" : "brdfat:#",
    "versionInfo" : "0.3",
    "homepage" : "brdfat:documentation.html"
  } ],
  "@context" : {
    "label" : {
      "@id" : "http://www.w3.org/2000/01/rdf-schema#label"
    },
    "versionInfo" : {
      "@id" : "http://www.w3.org/2002/07/owl#versionInfo"
    },
    "homepage" : {
      "@id" : "http://xmlns.com/foaf/0.1/homepage",
      "@type" : "@id"
    },
    "title" : {
      "@id" : "http://purl.org/dc/elements/1.1/title"
    },
    "publisher" : {
      "@id" : "http://purl.org/dc/elements/1.1/publisher"
    },
    "description" : {
      "@id" : "http://purl.org/dc/elements/1.1/description"
    },
    "preferredNamespaceUri" : {
      "@id" : "http://purl.org/vocab/vann/preferredNamespaceUri",
      "@type" : "@id"
    },
    "creator" : {
      "@id" : "http://purl.org/dc/elements/1.1/creator",
      "@type" : "@id"
    },
    "license" : {
      "@id" : "http://creativecommons.org/ns#license",
      "@type" : "@id"
    },
    "contributor" : {
      "@id" : "http://purl.org/dc/elements/1.1/contributor"
    },
    "modified" : {
      "@id" : "http://purl.org/dc/elements/1.1/modified",
      "@type" : "http://www.w3.org/2001/XMLSchema#date"
    },
    "dtype" : "http://purl.org/dc/dcmitype/",
    "brdfat" : "http://rdf.bonsai.uno/activitytype/exiobase3_3_17/"
  }
}

Moreover, it is difficult for me to reason about why the JSON-LD is formatted the way that it is. On the other hand, the Turtle file is much nicer to read and predict.
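
One way to reason about the JSON-LD above: every short key or `prefix:suffix` value is looked up in `@context` and expanded to a full IRI, and `"@type" : "@id"` in a term definition says that the string value is itself an IRI rather than a literal. The sketch below is a deliberately simplified lookup, not the full JSON-LD expansion algorithm (it ignores `@vocab`, `@base`, nested contexts, and type coercion); the function name `expand_iri` is hypothetical, and the context entries are copied from the example.

```python
def expand_iri(value, context):
    """Expand a term or prefix:suffix form to a full IRI using `context`."""
    if value in context:                      # plain term such as "title"
        definition = context[value]
        return definition["@id"] if isinstance(definition, dict) else definition
    prefix, sep, suffix = value.partition(":")
    if sep and not suffix.startswith("//") and prefix in context:
        base = context[prefix]
        base = base["@id"] if isinstance(base, dict) else base
        return base + suffix                  # compact IRI such as "dtype:Dataset"
    return value                              # already absolute, or unknown


context = {
    "title": {"@id": "http://purl.org/dc/elements/1.1/title"},
    "homepage": {"@id": "http://xmlns.com/foaf/0.1/homepage", "@type": "@id"},
    "dtype": "http://purl.org/dc/dcmitype/",
    "brdfat": "http://rdf.bonsai.uno/activitytype/exiobase3_3_17/",
}

print(expand_iri("dtype:Dataset", context))
print(expand_iri("brdfat:documentation.html", context))
```

In Turtle the same information sits in the `@prefix` lines and the angle brackets, which is arguably why it is easier to predict.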

We had said earlier (though without a formal decision) that we want to use JSON-LD for data interchange, but it would make life a lot easier to use Turtle, if people were OK with that. Let me know what you think!
 


Re: Serializing large LD datasets

Miguel Fernández Astudillo
 

Hi!

 

In the correspondence table group we struggled a bit when we had to move from Turtle to JSON-LD. We spent some time trying to figure out how to do it in JSON and ended up writing Turtle. We found it easier to write and read, and we were told there was an automatic tool to translate one to the other. I prefer Turtle, but I am not aware of the advantages of JSON-LD.

 

Best,

 

Miguel

 

 

 


 


Re: Serializing large LD datasets

Agneta
 

+1 for turtle format

Much easier to read and write. 


Re: Serializing large LD datasets

Massimo Pizzol
 

No opinion here, I trust those who have already worked hands-on on this, and their choice.

BR
Massimo

 



Re: #ontology Can we come up with a better term than "Flow Object"? #ontology

 

I added a table with what I could make of the existing systems, and the possible alternatives we have discussed, here: https://github.com/BONSAMURAIS/BONSAI-ontology-RDF-framework/blob/master/Terminology-discussion.md. Feel free to edit this if you think I have made a mistake.

> To re-iterate: Flow is a verb

Flow can be a verb or a noun, and there is something to be said for having all the core terms be nouns (I think everything else is).


Re: BEP-0004 BONSAI knowledge management and communication strategy | open for discussion / seeking editor

 

I have created a bonsai.uno repo, which we need to fill out, to eventually replace the existing content of the website (this is included in BEP 4). The current website structure looks like:

Homepage
    Challenge and vision
    Organization
        Static downloads
    Strategy
        Many working group pages
    Archive
        Static downloads
    Become a member
        Contributions

Here is the beginning of a new layout which emphasizes our concepts and work methods. I really think that the web page will be better for documentation than the wiki, as we can control the presentation more, and add a little white space so we don't have the "wall of text" effect. See the proposed BEP4 for a discussion of how best to use the different communication media.

Homepage
    Vision (short)
        -> Common ontology for LCA, MFA, and IE
        -> Open data pipeline
    By the community, for the community
        -> Getting started guide
        -> GH projects repo

    Common ontology

    Data pipeline

    Getting started guide
        Basic technologies

        -> Contribute data
        -> Build web apps
        -> Using the API

    Community management

    Data reconciliation

    NPO (BONSAI non-profit organization)
        Become a member
        Archive of official documents

One possible way to separate the content from the presentation is to store the text with some simple markup (e.g. Markdown) in a separate directory.

@agneta and @romain, let's discuss how we can each participate. Perhaps we could start by better planning an outline, and writing down what we want to accomplish. Feel free to provide your thoughts and concerns.


Two votes - please participate!

 

Dear all-

1. If you haven't voted for or against BEP 1, please do it now! If not enough people participate, the proposal will automatically fail.

2. We have had a lively discussion on the terminology used in the ontology, and have several different options before us. It would be nice to get a sense of the broader group's preferences through an indicative, though not necessarily binding, vote. When multiple options are present, ranked choice voting (in this case in the form of instant runoff) is a decent polling choice. So please visit the list of candidates: https://github.com/BONSAMURAIS/BONSAI-ontology-RDF-framework/blob/master/Terminology-discussion.md, and reply to this email with your preferences in order by letter, from first to last. For example, here are my personal preferences:

BDACFE

Please rank all six possibilities, so we can get complete statistics.
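
For anyone unfamiliar with instant runoff, the counting can be sketched in a few lines of Python. This is only an illustration of the general method, not an official counting script for this poll. Ballots are strings of letters as in the example above; the tie rule (drop every candidate tied for fewest votes, and declare no winner if all remaining candidates tie) is an assumption, since the email does not specify one.

```python
from collections import Counter


def instant_runoff(ballots):
    """Return the winning letter, or None if all remaining candidates tie."""
    remaining = set("".join(ballots))
    while remaining:
        # Each ballot counts for its highest-ranked candidate still in the race.
        tally = Counter({candidate: 0 for candidate in remaining})
        for ballot in ballots:
            for choice in ballot:
                if choice in remaining:
                    tally[choice] += 1
                    break
        leader, votes = tally.most_common(1)[0]
        if votes * 2 > sum(tally.values()):     # strict majority
            return leader
        fewest = min(tally.values())
        losers = {c for c in remaining if tally[c] == fewest}
        if losers == remaining:                 # complete tie, nobody to eliminate
            return None
        remaining -= losers                     # assumed tie rule: drop all tied losers
    return None


# Hypothetical three-candidate ballots, not votes from this list.
print(instant_runoff(["ABC", "BAC", "CBA", "CBA", "BCA"]))  # B
```

In the hypothetical ballots, A is eliminated first and its single ballot transfers to B, which then holds three of five votes.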


#bonsamurai.github.io

romain
 

Hey, I am starting a discussion here on the new bonsai.uno webpage.

Here is the structure suggested by Chris.

  1. Vision (short)
    1. Common ontology for LCA, MFA, and IE
    2. Open data pipeline
  2. By the community, for the community
    1. Getting started guide
      1. Basic technologies
        1. Contribute with data
        2. Build web apps
        3. Using the API
    2. GitHub projects repo
  3. Community management
  4. Data reconciliation
  5. NPO (BONSAI non-profit organization)
    1. Become a member
    2. Archive of official documents

Did I get the hierarchy right?


Re: Two votes - please participate!

Massimo Pizzol
 

DCAFBE


Re: Two votes - please participate!

Matteo Lissandrini (AAU)
 

AFDCEB




Re: #ontology Can we come up with a better term than "Flow Object"? #ontology

Elias Sebastian Azzi
 

Hello,

Having read through that long email thread, I wrote a summary of the different views expressed. I also summarise an article that describes another ontology for IE, with a rather different vocabulary, hoping it will help us see the ontology from a different perspective.

 

 

Alpha / Summary

 

Issue - Human vocabulary for BONSAI's core ontology

While there seems to be an agreement among the participants around the three core classes of the ontology (i.e. on their conceptual meaning), there is not yet a consensus on how these classes should be named in human readable language. There is however an agreement on the fact that the vocabulary used during the hackathon 2019 is not ideal. Most of the controversy lies in the term "flow object". This issue seems of high importance because it affects how people perceive the ontology, understand it and decide whether to take it up or not.

 

Below, we summarise the different views/suggestions on that issue, pros, cons and remarks.

 

V1. [Chris] "Flow object" is not consistent with the other terms of the ontology and is hard to relate to. The alternative "item" is suggested.

Pro: definition of item is "an individual article or unit, especially one that is part of a list, collection, or set" which fits in the concept.

Pro: it echoes the fields of computer science and mathematics

Remark: activities are also part of a list/collection/set, according to that definition activities are also items of a collection of activities.

 

V2. [Chris] "Flow" is good but has no natural counterpart. An alternative for "flow" could be "exchange".

 

V3. [Agneta] Return to the published LCA ontology (Kuczenski et al. 2016), with the three terms Activity (a thing that happens), Flow (a thing in the world that exists because of some instance of an Activity), and Exchange (an established relationship between an activity instance and a flow instance).

Pro: (to verify) coherence with the vocabulary used by most industrial ecologists / (disagreement) in (1) the authors argue that terminology is not consistent between industrial ecologists, even for basic definitions.

(1) Pauliuk, S.; Majeau-Bettez, G.; Müller, D. B.; Hertwich, E. G. Toward a Practical Ontology for Socioeconomic Metabolism. J. Ind. Ecol. 2016, 20 (6), 1260–1272; DOI 10.1111/jiec.12386.

 

V4. [Rutger] In ecospold1, only exchanges are defined. In ILCD data formats, both exchanges and flows (i.e. flow objects) are defined. Environmental compartments are specified. In the SimaPro platform, flows do not include compartments, as in the Bonsai hackathon version. Exchange is not yet used, but is considered. At PRé, flow-objects are of two types, substances and products, but this is not perfect.

Con: flow and exchange are both dynamic terms

 

V5. [Matteo] Flow and Flow-object in the post-hackathon ontology are clear and well defined: they relate the Flow and the Object of the Flow (aka the Flow Object). In other words, by keeping the word "flow" in both definitions their link and subtle difference is kept explicit and forces the new-comer to think twice about these definitions.

Pro: all terms can be confusing; the advantage of Flow and Flow-object is that the difficulty is not hidden behind different terms, which leaves less room for misunderstanding.

 

V6. [Bo] The vocabulary we use needs to distinguish between "the observation of a specific flow (22 kg input of steel) and the abstract flow-object (steel)".

 

V7. [Agneta] "hackathon vocabulary" -> "new vocabulary"

Flow-object => Flow

Flow => Exchange

 

Long List of Terms:

Flow object, entity, object, flux, item, thing, element, substance, component, Noumenon, Flow-item, commodity

Flow, Exchange, Phenomenon

Activity
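
To make the distinction in V6 concrete, here is a minimal Python sketch of the three core classes using the hackathon vocabulary. The class and field names are illustrative assumptions, not the ontology's actual definitions, and the activity name is a placeholder. Under the renaming proposed in V7, `FlowObject` would become `Flow` and `Flow` would become `Exchange`; the structure would not change.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowObject:
    """The abstract thing that flows, e.g. steel."""
    name: str


@dataclass(frozen=True)
class Activity:
    """Something that happens and consumes or produces flow objects."""
    name: str


@dataclass(frozen=True)
class Flow:
    """One observation: a quantity of a flow object tied to an activity."""
    activity: Activity
    flow_object: FlowObject
    amount: float
    unit: str
    is_input: bool


steel = FlowObject("steel")                              # the abstract flow-object
activity = Activity("hypothetical steel-using activity")
observation = Flow(activity, steel, 22.0, "kg", is_input=True)  # "22 kg input of steel"
```

Many `Flow` observations can point at the same `FlowObject`, which is the relationship the naming debate is trying to express.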

 

-------

 

 

Bravo / Looking at it from a different angle

 

This being said, I would like to add to the discussion the following points:

- Matteo has a point: by using the word “flow” twice (in flow and flow-object) we keep the complexity explicit.

- We seem to agree on the structure, but finding the right words for human communication is tricky: do we have to choose? In the end, examples speak for themselves. We will choose a term now, but we can keep the list of alternatives: the list helps clarify things!

- Do we actually agree on the structure? Your discussions forced me to re-open that article by Pauliuk and co: they have the same goal as Bonsai, performed a review of all IE fields, and (wait for it) came up with a totally different wording. I would say that it is one level of abstraction higher than the current Bonsai ontology, and rather stimulating to read. Here are some highlights:

    o Many inconsistencies of vocabulary and definitions exist within IE and even within certain fields, e.g. LCA

    o Industrial ecologists describe socioeconomic metabolism by a bipartite directed graph (i.e. SUTs) or directed graph

    o Five key definitions:

        Definition 1, Sets: A set is a collection of distinct objects

        Definition 2, Hierarchical, mutually exclusive and collectively exhaustive (H-MECE) object classification: An H-MECE object classification is a grouping of a given set of objects into an H-MECE collection of sets.

        Definition 3, Stock: A stock is a set of objects of interest.

        Definition 4, Process: A process is a set-based description of one or several events of interest, expressed in terms of the objects of interest that are involved in these events during their course.

        Definition 5, Flow: A flow is a description of a particular type of event, where objects are preserved and move from one set a to another set b.

    o It sounds very different, but when you read the article in detail, all the issues we face are somehow discussed, including how to handle the properties of objects of interest (see Figure 2)

    o Definition 2 is of interest for the correspondence table group
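
Since Definition 2 matters for correspondence tables, here is a small Python sketch of what checking the H-MECE property could look like. It is my own reading of the definition, not code from the article: a classification is represented as a nested dict whose leaves are sets of objects, and the example classification is made up.

```python
def is_mece(objects, groups):
    """True if `groups` partition `objects`: no overlap, nothing left out."""
    seen = set()
    for group in groups:
        group = set(group)
        if seen & group:            # an object sits in two groups
            return False
        seen |= group
    return seen == set(objects)     # every object covered, no unknown objects


def members(node):
    """All objects below a node (a dict of sub-classes or a leaf set)."""
    if isinstance(node, dict):
        return set().union(*(members(child) for child in node.values()))
    return set(node)


def is_hmece(objects, node):
    """Check the MECE property at every level of a nested classification."""
    if not isinstance(node, dict):
        return set(node) == set(objects)
    children = list(node.values())
    if not is_mece(objects, [members(child) for child in children]):
        return False
    return all(is_hmece(members(child), child) for child in children)


# Hypothetical classification of four flow objects.
objects = {"steel", "aluminium", "wheat", "rice"}
classification = {
    "metals": {"ferrous": {"steel"}, "non-ferrous": {"aluminium"}},
    "crops": {"wheat", "rice"},
}
print(is_hmece(objects, classification))  # True
```

A correspondence table between two classifications is easier to build when both pass this kind of check, because every object then has exactly one class at each level.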

 

 

mvh

Elias

 



Re: Two votes - please participate!

Elias Sebastian Azzi
 

ADCFBE  is my current preference.

 

mvh

Elias

 



Re: #ontology Can we come up with a better term than "Flow Object"? #ontology

Andreas Ciroth
 

Dear all,

interesting. As part of the discussion you may want to consider also the JSON-LD format names:

http://greendelta.github.io/olca-schema/

In my view, process, flow, exchange is most commonly used (used in “our” JSON-LD format and in ILCD) and it is not too bad (meaning: short, not misleading; it is good to distinguish flows from exchanges). Yes, process, flow, and also exchange can each be a noun and a verb, but this is common in the English language. So, given that there is really a lot to do on LCA, data availability, and LCA ontologies, it may be good to stick with this. Or invent something really different: point, line, square, for example, would be different, for flow, exchange, process.

All the best!

Andreas

 

Von: main@bonsai.groups.io <main@bonsai.groups.io> Im Auftrag von Elias Sebastian Azzi
Gesendet: Sonntag, 7. April 2019 23:18
An: main@bonsai.groups.io
Betreff: Re: [bonsai] #ontology Can we come up with a better term than "Flow Object"?

 

Hello,

Reading up that long email thread I wrote a summary of the different views expressed. I also summarise an article that describes another ontology for IE, rather different vocabulary, hoping it will help us see the ontology from a different perspective.

 

 

Alpha / Summary

 

Issue - Human vocabulary for BONSAI's core ontology

While there seems to be an agreement among the participants around the three core classes of the ontology (i.e. on their conceptual meaning), there is not yet a consensus on how these classes should be named in human readable language. There is however an agreement on the fact that the vocabulary used during the hackathon 2019 is not ideal. Most of the controversy lies in the term "flow object". This issue seems of high importance because it affects how people perceive the ontology, understand it and decide whether to take it up or not.

 

Below, we summarise the different views/suggestions on that issue, pros, cons and remarks.

 

V1. [Chris] "Flow object" is not consistent with the other terms of the ontology and is hard to related to. The alternative "item" is suggested.

Pro: definition of item is "an individual article or unit, especially one that is part of a list, collection, or set" which fits in the concept.

Pro: it echoes to fields of computer science and mathematics

Remark: activities are also part of a list/collection/set, according to that definition activities are also items of a collection of activities.

 

V2. [Chris] "Flow" is good but has no natural counterpart. An alternative for "flow" could be "exchange".

 

V3. [Agneta] Return to the published LCA ontology (Kuczenski et al. 2016), with the three terms Activity (a thing that happens), Flow (a thing in the world that exists because of some instance of an Activity), and Exchange (an established relationship between an activity instance and a flow instance).

Pro: (to verify) coherence with the vocabulary used by most industrial ecologist / (disagreement) in (1) the authors argue that terminology is not consistent between industrial ecologist, even for basic definitions.

(1) Pauliuk, S.; Majeau-Bettez, G.; Müller, D. B.; Hertwich, E. G. Toward a Practical Ontology for Socioeconomic Metabolism. J. Ind. Ecol. 2016, 20 (6), 1260–1272; DOI 10.1111/jiec.12386.

 

V4. [Rutger] In ecospold1, only exchanges are defined. In ILCD data formats, both exchanges and flows (i.e. flow objects) are defined. Environmental compartments are specified. In the SimaPro platform, flows do not include compartments, as in the Bonsai hackathon version. Exchange is not yet used, but is being considered. At PRé, flow-objects are of two types, substances and products, but this split is not perfect.

Con: flow and exchange are both dynamic terms

 

V5. [Matteo] Flow and Flow-object in the post-hackathon ontology are clear and well defined: they relate the Flow and the Object of the Flow (aka the Flow Object). In other words, by keeping the word "flow" in both definitions, their link and subtle difference are kept explicit, which forces the newcomer to think twice about these definitions.

Pro: all terms can be confusing; the advantage of Flow and Flow-object is that the difficulty is not hidden behind different terms, which prevents misunderstandings.

 

V6. [Bo] The vocabulary we use needs to distinguish between "the observation of a specific flow (22 kg input of steel) and the abstract flow-object (steel)".
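
As a minimal sketch of the distinction Bo describes, the abstract flow-object and the observed flow could be modelled as two separate types. This is Python, and the class names, field names and activity names are illustrative assumptions only, not terms from the BONSAI ontology:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowObject:
    """The abstract thing that can be exchanged, e.g. steel."""
    name: str


@dataclass(frozen=True)
class Flow:
    """One observation: an amount of a flow-object entering or leaving an activity."""
    flow_object: FlowObject
    amount: float
    unit: str
    direction: str  # "input" or "output"
    activity: str   # hypothetical activity label


steel = FlowObject("steel")

# The specific observation from the example: a 22 kg input of steel.
observed = Flow(steel, 22.0, "kg", "input", "hypothetical activity A")

# A second, made-up observation that references the same abstract flow-object.
another = Flow(steel, 5.0, "kg", "output", "hypothetical activity B")

print(observed.flow_object is another.flow_object)
```

Whatever names are eventually chosen, the point is that many observations can share one abstract object.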

 

V7. [Agneta] "hackathon vocabulary" -> "new vocabulary"

Flow-object => Flow

Flow => Exchange

 

Long List of Terms:

Flow object, entity, object, flux, item, thing, element, substance, component, Noumenon, Flow-item, commodity

Flow, Exchange, Phenomenon

Activity

 

-------

 

 

Bravo / Looking at it from a different angle

 

This being said, I would like to add to the discussion the following points:

  • Matteo has a point: by using the word “flow” twice (in flow and flow-object) we keep the complexity explicit.

 

  • We seem to agree on the structure, but finding the right words for human communication is tricky: do we have to choose? In the end, examples speak for themselves. We will choose a term now, but we can keep the list of alternatives: the list helps clarify things!

 

  • Do we actually agree on the structure? Your discussions forced me to re-open that article by Pauliuk and co: they have the same goal as Bonsai, performed a review of all IE fields, and (wait for it) came up with a totally different wording. I would say that it is one level of abstraction higher than the current Bonsai ontology, and rather stimulating to read. Here are some highlights:
    • Many inconsistencies of vocabulary and definitions exist within IE and even within certain fields e.g. LCA
    • Industrial ecologists describe socioeconomic metabolism by a bipartite directed graph (i.e. SUTs) or directed graph
    • Five key definitions:

Definition 1, Sets: A set is a collection of distinct objects

Definition 2, Hierarchical, mutually exclusive and collectively exhaustive (H-MECE) object classification: An H-MECE object classification is a grouping of a given set of objects into an H-MECE collection of sets.

Definition 3, Stock: A stock is a set of objects of interest.

Definition 4, Process: A process is a set-based description of one or several events of interest, expressed in terms of the objects of interest that are involved in these events during their course.

Definition 5, Flow: A flow is a description of a particular type of event, where objects are preserved and move from one set a to another set b.

    • It sounds very different, but when you read the article in detail, all the issues we face are somehow discussed, including how to handle the properties of objects of interest (see Figure 2)
    • Definition 2 is of interest for the correspondence table group
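
For the correspondence table group, the "mutually exclusive and collectively exhaustive" part of Definition 2 can be sketched as a simple check on one level of a classification. This is Python, and the function name and the example objects are assumptions made for illustration, not something taken from the article:

```python
from itertools import combinations


def is_mece(universe, groups):
    """Return True if `groups` partition `universe`: no overlap, nothing left out."""
    groups = [set(g) for g in groups]
    mutually_exclusive = all(a.isdisjoint(b) for a, b in combinations(groups, 2))
    collectively_exhaustive = set().union(*groups) == set(universe)
    return mutually_exclusive and collectively_exhaustive


# Made-up objects and groupings, for illustration only.
objects = {"steel", "copper", "coal", "CO2"}

good = [{"steel", "copper"}, {"coal"}, {"CO2"}]
overlapping = [{"steel", "copper"}, {"copper", "coal"}, {"CO2"}]
incomplete = [{"steel", "copper"}, {"coal"}]

print(is_mece(objects, good))         # True
print(is_mece(objects, overlapping))  # False: copper appears twice
print(is_mece(objects, incomplete))   # False: CO2 is not classified
```

A hierarchical (H-MECE) classification would apply the same check at every level, with each group acting as the universe for its own sub-groups.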

 

 

Best regards

Elias

 

From: main@bonsai.groups.io <main@bonsai.groups.io> On Behalf Of Chris Mutel
Sent: 5 April 2019 12:46
To: main@bonsai.groups.io
Subject: Re: [bonsai] #ontology Can we come up with a better term than "Flow Object"?

 

I added a table with what I could make of the existing systems, and the possible alternatives we have discussed, here: https://github.com/BONSAMURAIS/BONSAI-ontology-RDF-framework/blob/master/Terminology-discussion.md. Feel free to edit this if you think I have made a mistake.

> To re-iterate: Flow is a verb

Flow can be a verb or a noun, and there is something to be said for having all the core terms be nouns (I think everything else is).


Re: #ontology Can we come up with a better term than "Flow Object"? #ontology

mmremolona@...
 

Hi all,

My philosophy on naming in ontologies revolves not around the simplicity of the terms used but around how they sound when you talk about them in normal conversation. Does it sound awkward or normal? On the terms of the ontology:

Flow -> Right now this term is used to refer to the transfer of material or objects from an activity (as an output) to another activity (as an input), thereby connecting these two activities. In my opinion, changing this term to exchange does not affect the overall understanding of the ontology. Either would work. I can have a material flow from one activity to another activity.

Flow-object -> This is defined as an object that is referenced in a flow. Many flows can reference a single instance of a flow-object. I think this is where confusion may set in, as a flow-object can be imagined as an instance of flow. And I agree with Chris that this doesn’t sound right when talking about it. It just doesn’t seem natural to mention a flow object.

I don’t think flow itself works here as the word flow doesn’t equate to any object or material.

Regarding the idea of using the term “thing”: every class in an OWL ontology is a subclass of owl:Thing, at least according to the W3C specifications, so this is redundant and may lead to confusion.
Regarding flow-item, while this seems like a good idea, I generally associate the term item with something that I can itemize or count. Steel, copper, coal, and all the other things used don’t have a problem. However, for CO2, water, steam, etc., this doesn’t seem like a good term to use.

My initial idea to fix this is to make flow an adjective, as in Flowing-Object. However, this doesn’t sound right either. My previous argument for the flow-item would then be reversed: coal, steel, and other solid objects do not necessarily flow.

My secondary idea involves using the term Exchanged-Object. This is not necessarily related to the first term flow, but both can be adapted so that it sounds more congruent overall. This also sounds better as the question that arises from it sounds better in English (e.g. What’s the exchanged-object between the two activities you mentioned? In this flow, what’s the exchanged-object?)

TLDR:
Flow -> “Exchange” or retain “Flow”
Flow-object -> “Exchanged-Object”

 

Best,

 

Miguel Remolona


5.4.19 Catch-up meeting minutes and next meeting planning

 

Next catch-up meeting

We will have another catch-up meeting on 12.4.19 at 15:00 CEST, and then skip the next week (19.4.19) due to Easter holidays.

5.4.19 Catch-up meeting minutes

Correspondence tables
 
Started https://github.com/BONSAMURAIS/grafter tool to change 1-1 CSVs to RDF with actual predicates
Added some new tables, and metadata to existing tables
Priority is EXIOBASE - ENTSO-E, as we need this for first proof of concept deliverable
Data cleaning/conversion now is laborious and manual, need a better way. See an example here: https://github.com/BONSAMURAIS/Correspondence-tables/blob/master/scripts/from_raw_to_clean_tables.ipynb
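
To illustrate what turning a 1-1 CSV into RDF with an actual predicate could look like, here is a minimal sketch using only the Python standard library. It is not the grafter implementation. The function name, the example.org base URIs, the sample codes and the choice of skos:exactMatch as the predicate are all assumptions made for illustration:

```python
import csv
import io
from urllib.parse import quote

# A real SKOS predicate; whether it is the right one for a given table is an assumption.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"


def csv_to_ntriples(csv_text, source_base, target_base, predicate=SKOS_EXACT_MATCH):
    """Turn a two-column 1-1 correspondence CSV into N-Triples lines."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    triples = []
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed rows
        # Percent-encode the codes so they are safe inside a URI.
        source_code, target_code = (quote(cell.strip(), safe="") for cell in row)
        triples.append(
            f"<{source_base}{source_code}> <{predicate}> <{target_base}{target_code}> ."
        )
    return "\n".join(triples)


# Hypothetical table: the codes and base URIs are placeholders.
sample = "source,target\nA1,B1\nA2,B2\n"
result = csv_to_ntriples(
    sample, "http://example.org/source/", "http://example.org/target/"
)
print(result)
```

The output is N-Triples, which is also valid Turtle, so it could be loaded into a triple store as-is.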
 
Communication
 
Need a clean and prominent place to summarize existing repos, their functions, and their interdependencies (one possible overview from Tom Millross is attached)
Could be on wiki or bonsai README
bonsai.uno website rework is starting, repo here: https://github.com/BONSAMURAIS/bonsai.uno
 
Ontology
 
Discussion on nomenclature is ongoing, with several creative solutions proposed
Adaptation of existing probability ontology is difficult due to all examples being XML; volunteers to help adapt this ontology please contact Agneta
Move away from JSON-LD and towards Turtle as default exchange format for RDF data, due to readability and ease of programming
 
System model / calculation interface
 
Work and documentation are proceeding after the hackathon, such as procedures for dis/aggregation (e.g. aggregating different types of gas with different calorific values)
REST endpoints to be defined and documented
 
Outreach
 
Miguel A. will attend https://forum.openmod-initiative.org/t/aarhus-2019-workshop/1126
Those attending LCM 2019 will hold an outreach event, with organizing and content support from others in the BONSAI team
 


Re: 5.4.19 Catch-up meeting minutes and next meeting planning

 

Repo overview attachment


Re: #bonsamurai.github.io

 

Maybe easier to split it up into actual URLs:

Note that the following is just one possibility, and will be changed now and in the future. Our aim is to make such changes easy.

bonsai.uno
  • Homepage
  • Vision (short)
    • Common ontology for LCA, MFA, and IE
    • Open data pipeline
  • By the community, for the community
    • Getting started guide
    • GH projects repo
  • Should be short, more of an appetizer than a meal, with links to more documentation

bonsai.uno/ontology
  • Introduction to core concepts of the ontology, starting with a gentle introduction to linked data
  • Ends with links to other docs/visualization for complete ontology
  • Target audience is people who have never heard "RDF" before

bonsai.uno/data-pipeline
  • Subway-style map with the different data processing steps, and the accompanying repositories / web resources
  • Target audience is people who are used to using the "Excel hammer"

bonsai.uno/getting-started
  • Brief page with links to more specific getting-started guides. Help people decide which getting-started guide is right for them.
  • Could also contain a toolkit, like http://toolbox.schoolofdata.ch/

bonsai.uno/getting-started/contribute-data

bonsai.uno/getting-started/our-api

bonsai.uno/getting-started/others as we develop

bonsai.uno/community
  • Community management philosophy
  • Links to BEPs

bonsai.uno/FAQs
  • FAQs to be populated. 
    • How is BONSAI different than other LCA databases?
    • How can I contribute?
    • Who is behind BONSAI?
    • What is the relationship between the project and the NPO?
    • Is anyone paid to work on BONSAI?

bonsai.uno/NPO
  • Archive of official documents
  • Become a member

To do:
  • Look into CSS classes used (everything necessary is in the repo already), decide if we want to keep using SASS as CSS preprocessor, create some more classes (and maybe more meaningful labels for common layouts). Write up brief notes on using the CSS to get what you want.
  • Write some sample content for 1-2 pages, esp. data flow, homepage, and ontology
    • Then do some layout with bright, colorful, and simple graphs (e.g. for links between ontology concepts).


Re: Two votes - please participate!

 

FYI:

1. The vote on BEP 1 is trending towards acceptance; the voting will stop if two more people participate and approve.

2. We currently have 5 votes in our nomenclature discussion. Here are the average ranks:
  • A 2.0
  • B 4.2
  • C 2.8
  • D 2.6
  • E 5.2
  • F 4.2
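
For anyone wondering how an average rank is obtained, here is a minimal Python sketch. The ballots below are made up, use only three options, and do not reproduce the actual votes; the function name is also just an assumption:

```python
from statistics import mean


def average_ranks(ballots):
    """Average the rank each option received across all ballots."""
    options = sorted(ballots[0])
    return {option: mean(ballot[option] for ballot in ballots) for option in options}


# Hypothetical ballots: each voter assigns a rank to every option.
ballots = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 2, "B": 3, "C": 1},
]

averages = average_ranks(ballots)
for option, rank in averages.items():
    print(option, rank)
```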

I have updated the table with the alternatives suggested by Miguel R. You are, of course, allowed to alter your votes if you want.