Re: #correspondencetables : from raw to triplets

Miguel Fernández Astudillo
 

Interesting, I will have a deeper look when possible.

I was updating the group readme. Should I move the references to the Hackathon somewhere else? It seems that this repo will survive and will have a function in the workflow.

Miguel

-----Original Message-----
From: hackathon2019@bonsai.groups.io <hackathon2019@bonsai.groups.io> On Behalf Of Chris Mutel
Sent: 09 April 2019 13:44
To: hackathon2019@bonsai.groups.io
Subject: Re: [hackathon2019] #correspondencetables : from raw to triplets

As we are not the only people thinking about these topics, there has already been a lot of work in this area. It is relatively easy to find some half-baked implementations in RDF, e.g. on datahub.io and joinedupdata.org, and the unstats web page Miguel linked is great.
However, the best resource I have found is here:
http://semstats.org/2016/challenge/classifications, with the actual data available here:
http://semstats.org/2016/challenge/challenge-data. The repo to generate these correspondences is https://github.com/FranckCo/Stamina,
with documentation here:
https://github.com/FranckCo/Stamina/blob/master/doc/content.md.

This data was produced by a project whose website is currently down (stamina-project.org); the easiest alternative would be to work with the original creator, but it doesn't look like he is responding to issues (I am also writing to the creator). There are a few other things to clean up in this data, see e.g.
https://github.com/FranckCo/Stamina/issues/11 (and others).

Not sure about the next steps, except that I don't think we can create a better wheel than professionals already have. Maybe we can polish their wheel a bit, and use it?

On Mon, 8 Apr 2019 at 17:26, <miguel.astudillo@...> wrote:

Hello hello

Let's see if I am getting this right.

Chris, when you say "put metadata systems in their native form into arborist (e.g. ISIC 3, ISIC 4, HS1, NACE, NAICS, CPC)", does that mean "as downloaded"? Are we talking about the "list of possible names" (e.g. the files under "codes and descriptions" at https://unstats.un.org/unsd/classifications/business-trade/correspondence.asp#correspondence-head, such as "ISIC_Rev_4_english_structure.txt")? If so, I would put only the needed ones. I don't think we need "HS1988".

Would it be to create the URIs? E.g.
<http://rdf.bonsai.uno/activitytype/isic_v4section/Manufacturing> a bont:ActivityType .

and later move to the "official" (= ready to use?) correspondence tables, specifying predicates?
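A minimal sketch of the "create the URIs" step, to make the question concrete. It assumes the downloaded structure file is a quoted CSV with "Code" and "Description" columns, that URIs are built from the code rather than the name, and that bont: expands to http://ontology.bonsai.uno/core#. The base URI and the use of rdfs:label are illustrative assumptions too, not something agreed in this thread.

```python
import csv
import io

# Hypothetical base URI and assumed namespace for the bont: prefix.
BASE = "http://rdf.bonsai.uno/activitytype/isic_v4/"
BONT = "http://ontology.bonsai.uno/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text):
    """Escape backslashes and double quotes for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def classification_to_ntriples(handle):
    """Turn rows of (Code, Description) into N-Triples lines."""
    lines = []
    for row in csv.DictReader(handle):
        subject = f"<{BASE}{row['Code'].strip()}>"
        label = escape_literal(row["Description"].strip())
        lines.append(f"{subject} <{RDF_TYPE}> <{BONT}ActivityType> .")
        lines.append(f'{subject} <{RDFS_LABEL}> "{label}"@en .')
    return lines


# Two rows in the assumed layout of the structure file.
sample = io.StringIO(
    '"Code","Description"\n'
    '"A","Agriculture, forestry and fishing"\n'
    '"C","Manufacturing"\n'
)
triples = classification_to_ntriples(sample)
print("\n".join(triples))
```

Using the code instead of the name in the URI avoids spaces and commas in identifiers; the name is kept as a label.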

To make use of the existing correspondence tables I think we would need "exiobase2 to exiobase3", otherwise they are completely disconnected from the (core?) of the database.
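One possible reading of "correspondence tables specifying predicates", sketched under assumptions: each row of a correspondence table is a (source code, target code) pair, and the predicate is chosen from the SKOS mapping properties according to how many partners each code has. The choice of SKOS, the scheme names and the toy codes below are all made up for illustration; they are not real classification codes.

```python
from collections import Counter

SKOS = "http://www.w3.org/2004/02/skos/core#"
# Hypothetical base URIs for two classification schemes.
SRC_BASE = "http://rdf.bonsai.uno/activitytype/scheme_a/"
TGT_BASE = "http://rdf.bonsai.uno/activitytype/scheme_b/"


def pick_predicate(n_targets, n_sources):
    """Choose a SKOS mapping property from the cardinality of a pair."""
    if n_targets == 1 and n_sources == 1:
        return "exactMatch"    # one-to-one
    if n_targets > 1 and n_sources == 1:
        return "narrowMatch"   # source is split: target is narrower
    if n_targets == 1 and n_sources > 1:
        return "broadMatch"    # sources are merged: target is broader
    return "relatedMatch"      # many-to-many: only a partial overlap


def correspondence_to_ntriples(pairs):
    """Turn unique (source, target) pairs into N-Triples lines."""
    targets_per_source = Counter(source for source, _ in pairs)
    sources_per_target = Counter(target for _, target in pairs)
    lines = []
    for source, target in pairs:
        predicate = pick_predicate(
            targets_per_source[source], sources_per_target[target]
        )
        lines.append(
            f"<{SRC_BASE}{source}> <{SKOS}{predicate}> <{TGT_BASE}{target}> ."
        )
    return lines


# Toy pairs: a1 maps one-to-one, a2 is split, a3 and a4 are merged.
pairs = [("a1", "b1"), ("a2", "b2"), ("a2", "b3"), ("a3", "b4"), ("a4", "b4")]
triples = correspondence_to_ntriples(pairs)
print("\n".join(triples))
```

The sketch assumes the pairs are already de-duplicated; repeated rows would inflate the counts and change the chosen predicate.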

best, Miguel

PS: I think a getting-started guide is urgently needed; I am getting lost already!


--
############################
Chris Mutel
Technology Assessment Group, LEA
Paul Scherrer Institut
OHSA D22
5232 Villigen PSI
Switzerland
http://chris.mutel.org
Telefon: +41 56 310 5787
############################
