Re: Nutrition / Linked data from Thomas FRANCART on 2009-01-11 (public-lod@w3.org from January 2009)

From: Thomas FRANCART <thomas.francart@mondeca.com>
Date: Sun, 11 Jan 2009 12:01:00 +0100
To: "Daniel O'Connor" <daniel.oconnor@gmail.com>
Cc: public-lod@w3.org
Message-ID: <2d799a410901110301v29980f18r2c4ad046c639c94d@mail.gmail.com>
Hi Daniel

Regarding the SPARQL endpoint, you might want to have a look at Joseki (
http://www.joseki.org/), which provides the infrastructure for a SPARQL
endpoint. All you need to do, I think, is re-implement a Java class; the
web service infrastructure is provided for you.
You might also want to look at the Bibliographic Ontology (
http://bibliontology.com/), which models things related to scientific
papers.
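
On the SPARQL endpoint point: once an endpoint is running, clients talk to it over plain HTTP by passing the query text in a `query` parameter, as the SPARQL Protocol defines. Below is a minimal sketch of building such a request in Python. The endpoint URL is an assumption (a local test install, not a real service), and the query only assumes that foods carry a dc:title, as the original message suggests.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: a locally running endpoint; replace with the real endpoint URL.
ENDPOINT = "http://localhost:2020/sparql"

# Assumption: each food resource has a dc:title literal.
QUERY = """
SELECT ?food ?title
WHERE { ?food <http://purl.org/dc/elements/1.1/title> ?title }
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Return a GET URL that carries the query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def run_query(endpoint, query):
    """Send the query and return the raw response body (needs a live endpoint)."""
    request = Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+xml"},
    )
    with urlopen(request) as response:
        return response.read()


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

Only `build_query_url` runs here; `run_query` is defined but not called, because it needs a reachable endpoint.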

Best,
Thomas

Thomas Francart
CTO
Mondeca
 3, cité Nollez 75018 Paris France <thomas.francart@mondeca.com>
Website: www.mondeca.com
Blog: Leçons de choses <http://mondeca.wordpress.com>



2009/1/10 Daniel O'Connor <daniel.oconnor@gmail.com>

> Hey all,
> I'm Daniel O'Connor, a software engineer from Australia.
>
> At the moment I'm trying to get a lot of food nutrition data together from
> a whole bunch of different sources, create a bit of an ontology, publish
> it as RDF, and make sure it's chock full of linked data goodness. I could
> use your help, advice, pointers and encouragement.
>
> Use cases include things like shopping, diet / fitness applications,
> cooking, and much more.
>
>    - what did you eat today? -> hey, that's only 75% of your recommended
>    daily energy intake
>    - what is the approximate food energy in this recipe?
>    - tell me the fattiest food I'm eating and replace it with one with
>    more protein (but the same energy content)
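
The first two use cases above come down to simple arithmetic over per-food energy values. Here is a rough sketch in Python. All figures are placeholders chosen for illustration, not values from USDA, NUTTAB or CNF, and the function names are hypothetical.

```python
# Placeholder reference value in kilojoules; real targets vary by person.
RECOMMENDED_DAILY_KJ = 8700.0


def percent_of_daily_intake(eaten_kj, recommended_kj=RECOMMENDED_DAILY_KJ):
    """Return the energy eaten as a percentage of the recommended intake."""
    return 100.0 * sum(eaten_kj) / recommended_kj


def recipe_energy_kj(ingredients):
    """Sum the energy of a recipe given (grams, kJ per 100 g) pairs."""
    return sum(grams * kj_per_100g / 100.0 for grams, kj_per_100g in ingredients)


# Illustrative inputs only.
meals_kj = [2175.0, 2175.0, 2175.0]
recipe = [(200, 1500), (50, 3000)]

print(percent_of_daily_intake(meals_kj))  # 75.0
print(recipe_energy_kj(recipe))           # 4500.0
```

The third use case (swapping a fatty food for a higher-protein one with similar energy) would need per-food fat and protein values as well, so it is left out of this sketch.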
>
>
> The data sources I've got on my list so far are:
>
>    - USDA's SR21 food nutrients data (public domain)
>    - Australia's NUTTAB 06 data (not so public domain)
>    - Canada's CNF data (haven't delved into it in depth)
>
> The typical format provided is CSV, so I'm going through and mapping those
> CSV exports back into an RDBMS (PHP + MySQL / pgsql / etc.), then providing
> tools to generate RDF from it, and publishing the static results.
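
For illustration, the core of the CSV-to-RDF step can be sketched in a few lines of Python. This is not the PHP code linked below, and it skips the intermediate database. The base URI, the energy property URI, the column names and the sample row are all made up for the example. Only dc:title and "Cheese, blue" come from the message itself.

```python
import csv
import io

# Hypothetical URIs and column names, chosen only for this sketch.
BASE = "http://example.org/nutrition/usda/"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
ENERGY = "http://example.org/nutrition/schema#energyKcal"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(row):
    """Turn one CSV row (a dict) into a list of N-Triples lines."""
    subject = "<%s%s>" % (BASE, row["id"])
    return [
        '%s <%s> "%s" .' % (subject, DC_TITLE, escape_literal(row["name"])),
        '%s <%s> "%s"^^<%s> .' % (subject, ENERGY, row["energy_kcal"], XSD_DECIMAL),
    ]


# Made-up sample row; the id and energy value are not real data.
SAMPLE_CSV = 'id,name,energy_kcal\n0001,"Cheese, blue",100\n'

triples = []
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    triples.extend(row_to_ntriples(row))

print("\n".join(triples))
```

N-Triples is used because it can be written line by line without an RDF library. Producing RDF/XML like the samples below would need a proper serializer.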
>
>
> You can see (and get) the code from:
> http://freebase-owl.googlecode.com/svn/trunk/nutrition/
>
> and read a bit more about installing from:
>
> http://clockwerx.blogspot.com/2009/01/generating-nutritional-data-rdf-from.html
>
>
> and view samples of the output:
> USDA:
> http://lauken.com/doconnor/nutrition/usda/1006.rdf
>
> NUTTAB:
> http://lauken.com/doconnor/nutrition/nuttab/01A10027.rdf
>
> Ontology (draft!):
> http://www.lauken.com/doconnor/nutrition/0.1/schema.rdf
>
>
>
> There's a lot of work for me here, and if anyone here has knowledge or a
> helping hand, I'd love to hear from you, especially regarding the ones in
> bold.
>
>    - Resolve licensing agreements with Aust. government for rights to
>    reproduce data (in progress)
>    - Model Canadian data
>    - *Find or create a suitable ontology for Nutrition data* (I would have
>    expected some common terms from the bio-rdf community, but I don't have the
>    background to know what I'm looking for)
>    - Model the USDA, NUTTAB and Canadian extensions as appropriate
>    - Find or create (ick, hope not) an ontology for measurements in
>    relation to typical nutrition measurements (again, there are no semantic
>    web concepts for milligrams, kilocalories, etc. - not even in DBpedia.
>    timbl did some very high level concepts of what a Gram / etc. is, but
>    it's not quite the same)
>    - Find or create a list of terms used in nutrition data
>    (shorthand/abbreviations) - e.g. CBODF = "Carbohydrate by difference" - but
>    I can't seem to find a good list of these outside of the USDA data itself.
>    - Find or create a *journal publications ontology* (Dublin Core might do
>    it though; or some other bibliographic ontology) - suggestions?
>    - Find or create *science terms ontology* (Paper, Subject, Experiment,
>    Samples, etc) - anyone?
>    - Create *owl:sameAs links to DBpedia* topics in some automated fashion.
>    This is tricky, because a lot of the data is written as "Cheese, blue" and
>    is much more granular than Wikipedia articles about Cheese.
>    - Create *owl:sameAs links to Freebase* topics in some automated
>    fashion - ditto
>    - *Interlink Canadian, NUTTAB, USDA data* in some automated fashion.
>    Similar problem: different naming schemes make using dc:title as an IFP
>    a bit annoying.
>    - Render full sets of RDF for each
>    - Publish these somewhere - http://lauken.com/doconnor/ is not suitable
>    for anything more than a sandbox
>    - Provide human interfaces as appropriate - if anyone wanted to create
>    *shiny XSLT -> XHTML* perhaps; or PHP glue...
>    - *Set up a SPARQL endpoint* (I'm having a hell of a time doing this in
>    my development environment, so this might not happen) - HELP!
>    - Provide unit test coverage for all generator tools
>    - Refactor lots
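
For the interlinking items in the list above, one possible starting point is to normalise inverted names such as "Cheese, blue" before comparing them across datasets. This Python sketch is an assumption about how that could be done, not an established method, and the identifiers in the sample data are hypothetical. It only proposes candidate pairs, which would still need review before asserting owl:sameAs.

```python
def candidate_label(food_name):
    """Turn an inverted name such as 'Cheese, blue' into 'blue cheese'."""
    parts = [part.strip().lower() for part in food_name.split(",") if part.strip()]
    return " ".join(reversed(parts))


def match_key(label):
    """Build an order-insensitive key from the lowercased words of a label."""
    return " ".join(sorted(label.lower().replace(",", " ").split()))


def link_candidates(left, right):
    """Pair ids from two {id: name} dicts whose names share a match key."""
    index = {}
    for right_id, name in right.items():
        index.setdefault(match_key(name), []).append(right_id)
    return [
        (left_id, right_id)
        for left_id, name in sorted(left.items())
        for right_id in index.get(match_key(name), [])
    ]


# Hypothetical identifiers, for illustration only.
foods = {"usda:A": "Cheese, blue"}
topics = {"dbpedia:Blue_cheese": "Blue cheese", "dbpedia:Cheese": "Cheese"}

print(candidate_label("Cheese, blue"))  # blue cheese
print(link_candidates(foods, topics))
```

Exact key matching will miss many pairs (plurals, brand names, extra qualifiers), so it is best treated as a first pass.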
>
>
>
>
>
Received on Sunday, 11 January 2009 11:01:40 UTC