Hackathon concept and deliverables - request for comment


 

Dear Hackathon participants:

It is time to start preparing in earnest for our joint work at the end of March.

The following deliverables are required at the end of our time together:

a) Matching of EXIOBASE to the BONSAI ontology set, and import into an RDF database
b) Standards and implementation for updatable (transparent, executable) LCI data models
c) Matching of output from b to the same ontology, and import into an RDF database
d) Export of a reconciled database containing both input sources

This is already quite a lot, so I don't think we need to add more tasks, at least for now!

To actually accomplish these objectives in a large, diverse, and dispersed team, I propose that we follow the UNIX philosophy (https://en.wikipedia.org/wiki/Unix_philosophy), namely that we have a set of tools that each do one thing well. The BONSAI ontology can form the foundation for how these individual tools can transfer data with each other - in particular, the standards developed in task b become much easier if we just require the interface language spec to be the set of ontologies that BONSAI will use.

The model in task b will be either European electricity generation, distribution, trade, and losses/conversion; or, a model of automobile transportation where one can specify the model, number of passengers, etc. The groundwork for both of these models has already been done at PSI.

Before going into detailed planning, I would like your first reactions to the concept of the hackathon. If you are not familiar with these ideas, I have written a bit about my specific vision here: https://chris.mutel.org/next-steps.html

Yours,
Chris


Brandon Kuczenski
 

You just wanted to show off that princess photo!

Thanks for conceiving and enacting the hackathon project, and for sending out this introduction. I have some comments that I wanted to send out right away as a "hot take" (acknowledging that one purpose of such a missive is to provoke responses), and I will also have some (perhaps more reasoned) comments later.

The task you have conceived of is exciting and useful. And probably achievable.

My first critique is: none of those is a deliverable; those are tasks. The deliverables should be things that we can look at and send around. Perhaps the deliverables would be the outputs resulting from those tasks executing successfully. But we should be precise in specifying them, e.g. "An RDF database containing EXIOBASE in terms of the BONSAI ontology", and there should be some standard by which we decide whether the deliverable has been met (e.g. querying "x" against the database should return "y" from EXIOBASE). Maybe writing these tests is itself in scope for the hackathon (in which case, perhaps "operating unit tests for EXIOBASE import" is a reasonable deliverable?).
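
To make the "querying x should return y" idea concrete, here is a minimal sketch of such an acceptance test. It is an illustration under stated assumptions, not a proposal for the actual test suite: a plain Python set stands in for the RDF database, the `query` helper stands in for a real query language such as SPARQL, and the namespace, predicate names, and the value 42.0 are invented placeholders rather than EXIOBASE data.

```python
# Sketch of a deliverable check: "querying x against the database
# should return y from EXIOBASE". The set below stands in for a real
# RDF database; every URI and number is a made-up placeholder.

EX = "http://example.org/bonsai/"  # hypothetical namespace

store = {
    (EX + "flow/1", EX + "activity", "Electricity production"),
    (EX + "flow/1", EX + "region", "DE"),
    (EX + "flow/1", EX + "amount", 42.0),
}


def query(store, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in store
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def test_amount_matches_source():
    """The value in the database must equal the value in the source table."""
    source_value = 42.0  # placeholder for the number read from EXIOBASE
    result = query(store, subject=EX + "flow/1", predicate=EX + "amount")
    assert len(result) == 1
    assert result[0][2] == source_value


test_amount_matches_source()
```

A real version would run the same kind of assertion against the imported database for a handful of known EXIOBASE entries.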

Second: well, (b) is a kind of deliverable. I suspect what you mean is more like "a document that reports the group's consensus on" those standards. But I want to observe that one such standard has already been proposed (https://doi.org/10.1111/jiec.12810). The work in that paper was intended to achieve this goal. I humbly offer it for consideration (and possibly rejection on grounds) prior to the hackathon, since it might help us move the ball down the field. The key utilities of this framework are:
(1) it provides a precise statement of a narrowly scoped process-flow model which can be easily revised and parameterized, 
(2) it explicitly requires linkages to remote data sources, and
(3) it explicitly excludes elementary flow matching, assuming the solution to that problem to be located in (2).
A product system model disclosure includes: three entity lists (of foreground flows, background flows, and emissions), and three sparse matrices that connect the entities to one another. One could easily imagine the entries in the entity lists to simply be references to the RDF database.
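
As a rough illustration of that structure, the sketch below holds three entity lists and three sparse matrices in one Python object. The class name, field names, and example URIs are hypothetical, and the choice of which lists each matrix connects (each one maps an entity list onto the foreground) is an assumption made for the example; the paper is the authority on the actual layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# A sparse matrix as a mapping (row, column) -> value.
SparseMatrix = Dict[Tuple[int, int], float]


@dataclass
class Disclosure:
    """Three entity lists plus three sparse matrices linking them.

    Entity entries are plain strings here; they could equally be
    URIs pointing into an RDF database.
    """
    foreground: List[str]
    background: List[str]
    emissions: List[str]
    fg_to_fg: SparseMatrix = field(default_factory=dict)  # foreground x foreground
    bg_to_fg: SparseMatrix = field(default_factory=dict)  # background x foreground
    em_to_fg: SparseMatrix = field(default_factory=dict)  # emissions x foreground

    def validate(self) -> None:
        """Check that every matrix index points at an existing entity."""
        n_fg = len(self.foreground)
        shapes = [
            (self.fg_to_fg, n_fg),
            (self.bg_to_fg, len(self.background)),
            (self.em_to_fg, len(self.emissions)),
        ]
        for matrix, n_rows in shapes:
            for (row, col) in matrix:
                if not (0 <= row < n_rows and 0 <= col < n_fg):
                    raise ValueError(f"index {(row, col)} out of range")


EX = "http://example.org/bonsai/"  # hypothetical namespace
d = Disclosure(
    foreground=[EX + "flow/car-transport", EX + "flow/car-manufacture"],
    background=[EX + "flow/electricity"],
    emissions=[EX + "flow/co2"],
    fg_to_fg={(1, 0): 0.001},   # invented numbers, for shape only
    bg_to_fg={(0, 1): 1500.0},
    em_to_fg={(0, 0): 0.12},
)
d.validate()
```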

There is also a (semi-)working brightway implementation: (https://github.com/pjamesjoyce/lca_disclosures) that casts a BW2 database to a JSON document according to the framework; writing an RDF output would not be hard.

Really, I am just fishing for critical feedback on my disclosures proposal. But I do think it's relevant.

-Brandon







--
Brandon Kuczenski, Ph.D.
Associate Researcher

University of California at Santa Barbara
Institute for Social, Behavioral, and Economic Research
Santa Barbara, CA 93106-5131

email: bkuczenski@...


tomas Navarrete
 

Following a test/behavior-driven approach, as Brandon suggests at the end of his message, has my full support.

Some of the artefacts that we would need to achieve the tasks are already available, for example pymrio to manipulate EXIOBASE.
I suggest we start gathering a list of artefacts that could be used to produce the deliverables of each task.

After Brandon's mail, here is how I see things for task a:

a) EXIOBASE meets BONSAI-ontology
a.1 Use pymrio to parse a subset (one year?) of EXIOBASE3. Output called: exiobase3-dataset
a.2 Design a tool capable of taking an element from exiobase3-dataset and transforming it to a BONSAI-ontology element. Output: specification + test design of `exiobase32bo`.
a.3 Implement exiobase32bo (happy coding)
a.4 Apply exiobase32bo to exiobase3-dataset. Output called: bonsai-exaio3
a.5 Use RDF tools (e.g. Apache Jena) to create a database from bonsai-exaio3

Of course, names are only examples and I have no particular attachment to them.
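
A minimal sketch of steps a.2 to a.4, to show the shape of the transformation. Everything specific in it is an assumption: the record layout does not reflect what pymrio actually returns, the namespace and predicate names are placeholders rather than the BONSAI ontology, and the values are invented. The output is N-Triples text, which RDF tools such as Apache Jena can load (step a.5).

```python
# Sketch of steps a.2-a.4: map one table entry to RDF triples and
# serialise them as N-Triples. Record layout, namespace, and predicate
# names are hypothetical.

BASE = "http://example.org/bonsai/"  # placeholder, not the real BONSAI namespace
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"


def literal(text):
    """Quote a plain string literal, escaping backslashes, quotes, newlines."""
    escaped = (
        text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


def exiobase32bo(index, record):
    """Map one record to a list of (subject, predicate, object) triples."""
    subject = f"<{BASE}flow/{index}>"
    return [
        (subject, f"<{BASE}activity>", literal(record["activity"])),
        (subject, f"<{BASE}region>", literal(record["region"])),
        (subject, f"<{BASE}product>", literal(record["product"])),
        (subject, f"<{BASE}amount>", f'"{record["amount"]}"^^<{XSD_DOUBLE}>'),
    ]


def to_ntriples(triples):
    """Serialise triples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


# Stand-in for the exiobase3-dataset produced in step a.1 (values invented).
exiobase3_dataset = [
    {"activity": "Production of electricity by coal", "region": "DE",
     "product": "Electricity by coal", "amount": 1.5},
]

bonsai_exaio3 = to_ntriples(
    triple
    for i, record in enumerate(exiobase3_dataset)
    for triple in exiobase32bo(i, record)
)
print(bonsai_exaio3)
```

Keeping `exiobase32bo` as a pure function of one record makes it straightforward to write the tests from a.2 before implementing a.3.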

Before diving into b: where is the BONSAI ontology?

Anyway, I assume we should pay more attention to "b) Standards and implementation for updatable (transparent, executable) LCI data models"

@Chris, how does b fit into the scheme you describe at https://chris.mutel.org/next-steps.html?


Bo Weidema
 

On 2019-02-22 at 08:57, tomas Navarrete wrote:

before diving into b Where is the BONSAI ontology ?

https://github.com/BONSAMURAIS/bonsai/wiki/Data-Storage#specify-minimum-core-data-and-metadata-formats

(this is not provided in any formal ontology-language, but I assume someone more well versed in such languages should be able to translate from the text?)


Anyway, I assume we should pay more attention to "b) Standards and implementation for updatable (transparent, executable) LCI data models"


I think there may be some confusion here. As I understood Chris's request, this is what we in ecoinvent have informally called "LCI modules", not to be confused with full "system models", which is how I understand Brandon's interpretation. But "LCI modules" (or, maybe better, "activity modules") have not yet been defined as part of the BONSAI ontology (nor have they been implemented in ecoinvent). So the first task here would be to define more precisely what an "activity module" is.

Best regards

Bo


Massimo Pizzol
 

Dear all

 

Seems like it would be useful to have a sort of “meta” reflection and formalize the “why and how and what” of this hackathon, so that we can have clear goals and optimize the work. I have prepared a draft (attached), hoping this can be of help. Not sure how/where I should upload it (github or google docs…the latter probably best for comments?). Let me know.


Thanks, Brandon, for sharing your paper. I think what you have formalised there is quite reasonable (I wrote something along these lines here some time ago), so it is OK for me to follow these suggestions.

 

BR
Massimo

--
Massimo Pizzol
DCEA | Department of Planning
Aalborg University (DK)
Mobile: +45 2067 5275



Elias Sebastian Azzi
 

Hello,

I read your vision a couple of months ago. It is great to see you diving into action with this hackathon.

While I am reading - on a 15-hour-long train ride -  the “assigned literature” about RDF, SPARQL, Linked Data, your email made me realise that I also should re-read and get familiar with the BONSAI ontology set (https://github.com/BONSAMURAIS/bonsai/wiki/Data-Storage & https://github.com/BONSAMURAIS/bonsai/wiki/Data-Integration; anything more specific?)

/Elias

 


Matteo Lissandrini (AAU)
 

Hi all,

I thought it would be appropriate for me to help with the definition of the deliverables.
I think task "a) Matching of EXIOBASE to the BONSAI ontology set, and import into an RDF database" would be the crucial one.

For that to be successful, that is, for the RDF data to be accessible, understandable, maintainable, expandable, and (very importantly) interoperable, we would need to move from the description of the ontology here

https://github.com/BONSAMURAIS/bonsai/wiki/Data-Storage#specify-minimum-core-data-and-metadata-formats
to a proper ontology definition + an RDF Schema.
These should be mapped and matched with existing standards and vocabularies. In particular, I suggest QB [1] and QB4OLAP [2] as vocabularies, and linking to established ontologies like GeoNames [6].

Hence, on the side of the domain experts, I would imagine the first deliverables to be pictures like the one here [3] for the ontology and here [4,5] for the schema.
These are to be accompanied by a set of URIs and RDF predicates which will constitute the BONSAI vocabulary.
With such a pictorial representation and the provided URIs/predicates, it would be easy for any programmer to provide the required RDF specification and translation code for the data, and for anyone else to understand and re-use or link to the data you are publishing.
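
To give a feel for what data expressed with such a vocabulary might look like, here is a small sketch that writes one value as an RDF Data Cube observation in Turtle. The `qb:` and `sdmx-measure:` terms are taken from the Data Cube vocabulary [1]; everything under the `b:` prefix (the namespace, `b:activity`, `b:region`, the dataset and observation names) is a hypothetical placeholder for a BONSAI vocabulary that does not exist yet, and the value is invented.

```python
# Sketch: one EXIOBASE-style value as a qb:Observation in Turtle.
# Terms under the b: prefix are hypothetical placeholders.

PREFIXES = """\
@prefix qb: <http://purl.org/linked-data/cube#> .
@prefix sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix b: <http://example.org/bonsai/> .
"""


def observation(obs_id, dataset, activity, region, value):
    """Return a Turtle snippet describing a single qb:Observation."""
    return (
        f"b:{obs_id} a qb:Observation ;\n"
        f"    qb:dataSet b:{dataset} ;\n"
        f"    b:activity b:{activity} ;\n"
        f"    b:region b:{region} ;\n"
        f'    sdmx-measure:obsValue "{value}"^^xsd:double .\n'
    )


print(PREFIXES + "\n" + observation(
    "obs1", "exiobase3", "electricity_by_coal", "DE", 1.5))
```

Linking to an established ontology would mean replacing a local term such as `b:DE` with the corresponding external URI, for example one from GeoNames [6].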

The decisions about how to structure those are the crucial point where those familiar with the data and the domain can clearly provide the "added-value".

I hope the above is useful to you, and please let me know whether I should elaborate in more detail about anything.
I'll be happy to help you navigate the technicalities of the specifications of course.

Best,
Matteo




[1] https://www.w3.org/TR/vocab-data-cube/
[2] https://github.com/lorenae/qb4olap/wiki
[3] http://kbpedia.org/knowledge-graph/
[4] http://tcga.deri.ie/
[5] http://qweb.cs.aau.dk/qboairbase/
[6] http://www.geonames.org/ontology/documentation.html