Re: Opening a second work track for BONSAI #dataliberation


I have started a repo with the Bonsai ontology (to the best of my understanding) as a relational database schema here: (just a beginning, there are still issues!). I have also tried to label the GitHub repos with the topics "relational" and "semantic" to distinguish these two work tracks.

I think there are a number of people who want to start doing things with data, and this would be a "quick and somewhat dirty" way to get some data into people's hands. In particular, we still have quite a lot of work to do on preparing data for linking, and on finding a consensus system model.

Data that could be added now

Storing data

We need a policy here. I don't really think it makes sense to import EXIOBASE for each commit, or at least not yet, but we still need to know that our data won't disappear. Storing a copy in e.g. Zenodo would perhaps be sensible - discuss.

Preparing data for linking

To link, we need to choose the correct temporal, spatial, and activity scale. We can't just pick randomly (this methodology already exists :), so we should be creative. Finding where differences matter is always nice. My expectation is that this processed data would be entered into a new database, though this could change. Plus of course we need data reconciliation! This is non-trivial; multiple people are writing PhD theses on it.

System modelling

We need software to implement system constructs. We can choose existing IO ones for now; we just need something. Someone should check whether it is possible to adapt mojo or whether it would make more sense to start over. Mojo is very table-focused; perhaps the ocelot approach, where data is stored as lists of dictionaries instead of in tables/arrays, is more sensible.
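
To make the table versus list-of-dictionaries distinction concrete, here is a rough Python sketch. Everything in it is made up for illustration: the activity and product names, the numbers, the dictionary keys, and the two helper functions. It is not the actual data format of mojo or ocelot, just the general idea.

```python
# Hypothetical toy data: names and numbers are invented for illustration.
activities = ["steel production", "electricity production"]
products = ["steel", "electricity"]

# Table-focused form: rows are products, columns are activities.
use_table = [
    [0.0, 0.1],  # steel used by each activity
    [2.0, 0.0],  # electricity used by each activity
]


def table_to_records(table, row_labels, column_labels):
    """Convert a dense table into a list of dictionaries, skipping zeros."""
    records = []
    for i, product in enumerate(row_labels):
        for j, activity in enumerate(column_labels):
            amount = table[i][j]
            if amount != 0:
                records.append(
                    {"activity": activity, "product": product, "amount": amount}
                )
    return records


def records_to_table(records, row_labels, column_labels):
    """Rebuild the dense table from the records (missing entries become 0.0)."""
    table = [[0.0] * len(column_labels) for _ in row_labels]
    for record in records:
        i = row_labels.index(record["product"])
        j = column_labels.index(record["activity"])
        table[i][j] += record["amount"]
    return table


records = table_to_records(use_table, products, activities)
for record in records:
    print(record)
```

The record form only stores what exists, and each dictionary can carry extra fields (location, year, uncertainty, source) without changing the shape of anything else. The table form is what most IO system constructs expect as input, so a conversion step like `records_to_table` would probably still be needed somewhere.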
