Re: Nutrition / Linked data

From: Semantics-ProjectParadigm <metadataportals@yahoo.com>
Date: Sat, 10 Jan 2009 08:38:21 -0800 (PST)
To: Daniel O'Connor <daniel.oconnor@gmail.com>
Cc: public-lod@w3.org
Message-ID: <464421.23509.qm@web45515.mail.sp1.yahoo.com>
Maybe AGROVOC and related programs are not suited for the narrow focus you have set, but you will want to look at www.health-plus.eu and in particular at the project deliverables to see if you can find some suitable ideas for interfacing with intended end users of your application, e.g. healthy lifestyle choices, diets etc.

The European Union has run projects in this area in the past, and because agriculture, food, nutrition and health have always been priority issues in the European Union, a lot of work has gone into amassing databases of information, possibly in multiple languages. I would therefore expect this cluster of themes to appear under both the Science and Technology and the ICT headings in past and present Framework Programmes (which carry all the funding for European R&D programs).

Milton Ponson
GSM: +297 747 8280
Rainbow Warriors Core Foundation
PO Box 1154, Oranjestad
Aruba, Dutch Caribbean
www.rainbowwarriors.net
Project Paradigm: A structured approach to bringing the tools for sustainable development to all stakeholders worldwide
www.projectparadigm.info
NGO-Opensource: Creating ICT tools for NGOs worldwide for Project Paradigm
www.ngo-opensource.org
MetaPortal: providing online access to web sites and repositories of data and information for sustainable development
www.metaportal.info
SemanticWebSoftware, part of NGO-Opensource to enable SW technologies in the Metaportal project
www.semanticwebsoftware.info


--- On Sat, 1/10/09, Daniel O'Connor <daniel.oconnor@gmail.com> wrote:
From: Daniel O'Connor <daniel.oconnor@gmail.com>
Subject: Re: Nutrition / Linked data
To: metadataportals@yahoo.com
Cc: public-lod@w3.org
Date: Saturday, January 10, 2009, 3:46 PM

For the moment, I've decided to KISS - model food name => energy content broken down into fat, carbohydrates, protein; get the terms and relationships / measurements right; and expand out later to food groups, nutrients, and other detail.
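As a rough illustration of that minimal model, here is a sketch in Python (not the PHP tooling mentioned elsewhere in this thread). It assumes the general Atwater factors of 9 kcal/g for fat and 4 kcal/g for carbohydrate and protein; the source datasets may use food-specific factors. The class name, field names and nutrient values are made up for the example.

```python
from dataclasses import dataclass

# General Atwater factors in kcal per gram. Assumption for illustration:
# the USDA / NUTTAB / CNF data may apply food-specific factors instead.
KCAL_PER_G = {"fat": 9.0, "carbohydrate": 4.0, "protein": 4.0}


@dataclass(frozen=True)
class Food:
    """Hypothetical record: food name plus macronutrients per 100 g."""
    name: str
    fat_g: float
    carbohydrate_g: float
    protein_g: float

    def energy_breakdown(self) -> dict:
        """Estimated kcal per 100 g contributed by each macronutrient."""
        return {
            "fat": self.fat_g * KCAL_PER_G["fat"],
            "carbohydrate": self.carbohydrate_g * KCAL_PER_G["carbohydrate"],
            "protein": self.protein_g * KCAL_PER_G["protein"],
        }

    def energy_kcal(self) -> float:
        """Total estimated kcal per 100 g."""
        return sum(self.energy_breakdown().values())


# Made-up values, not taken from any of the datasets discussed here.
example = Food("Example food", fat_g=10.0, carbohydrate_g=5.0, protein_g=20.0)
print(example.energy_breakdown())
print(example.energy_kcal())
```

Keeping the breakdown separate from the total makes it easy to add further nutrients later without changing how the total is derived.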


At least that way I can get something published and build a few applications on top of it; and expand it out if there is interest.

Dan Brickley mentioned FAO's AGROVOC, but it's not exactly the easiest website to use and he suggests there's more happening behind the scenes than on the public website; particularly in the way of vocabularies and the like.


I also came across FAO's INFOODS content, which provides plain text abbreviation definitions - a bit of a pain and a bit out of scope to deal with.







This is a subject that really allows the full potential of semantic technologies to be unleashed.

I recommend you also look at the WHO, FAO, and European Union programs on agriculture, food and nutrition which deal specifically with application of semantic web technologies.


Of these three, you may want to look first at the FAO (Food and Agriculture Organization of the UN).

Milton Ponson



--- On Sat, 1/10/09, Daniel O'Connor <daniel.oconnor@gmail.com> wrote:
From: Daniel O'Connor <daniel.oconnor@gmail.com>

Subject: Nutrition / Linked data
To: public-lod@w3.org
Date: Saturday, January 10, 2009, 10:14 AM

Hey all,
I'm Daniel O'Connor, a software engineer from Australia.


At the moment I'm trying to get a lot of food nutrition data together from a whole bunch of different sources and create a bit of an ontology; publish it as RDF; and make sure it's chock full of linked data goodness; and I could use your help, advice, pointers and encouragement.



Use cases include things like shopping, diet / fitness applications, cooking, and much more:

what did you eat today? -> hey, that's only 75% of your recommended daily energy intake
what is the approximate food energy in this recipe?
tell me the fattiest food I'm eating and replace it with one with more protein (but the same energy content)
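
Two of these use cases can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Serving` record, the 2200 kcal daily target, the 5% tolerance used to approximate "the same energy content", and all nutrient values are hypothetical.

```python
from typing import NamedTuple


class Serving(NamedTuple):
    """Hypothetical per-serving record; values below are made up."""
    name: str
    kcal: float
    fat_g: float
    protein_g: float


def percent_of_daily(eaten, daily_kcal):
    """Share of a daily energy target covered by what was eaten, in percent."""
    return 100.0 * sum(s.kcal for s in eaten) / daily_kcal


def suggest_swap(eaten, candidates, tolerance=0.05):
    """Find the fattiest item eaten and a higher-protein replacement.

    "Same energy content" is approximated as within `tolerance`
    (a fraction) of the fattiest item's kcal. Returns (fattiest, replacement),
    where replacement is None if no candidate qualifies.
    """
    fattiest = max(eaten, key=lambda s: s.fat_g)
    similar = [
        c for c in candidates
        if abs(c.kcal - fattiest.kcal) <= tolerance * fattiest.kcal
        and c.protein_g > fattiest.protein_g
    ]
    replacement = max(similar, key=lambda c: c.protein_g, default=None)
    return fattiest, replacement


eaten = [Serving("toast", 150.0, 2.0, 5.0), Serving("fried snack", 400.0, 30.0, 4.0)]
candidates = [
    Serving("lean option", 390.0, 8.0, 35.0),
    Serving("dessert", 410.0, 25.0, 3.0),
    Serving("salad", 120.0, 5.0, 6.0),
]

print(percent_of_daily(eaten, daily_kcal=2200.0))
print(suggest_swap(eaten, candidates))
```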

The data sources I've got on my list so far are:


USDA's SR21 food nutrients data (public domain)
Australia's NUTTAB 06 data (not so public domain)
Canada's CNF data (haven't delved into it in depth)

The typical format provided is CSV, so I'm going through and mapping those CSV exports back into an RDBMS (php + mysql / pgsql / etc), then providing tools to generate RDF out, and publishing the static results.
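
The CSV-to-RDF step can be illustrated with a minimal Python sketch that skips the RDBMS stage and writes N-Triples directly. It is not the PHP code linked in this message. The `http://example.org/nutrition/` namespace, the `energyKcal` property, the CSV column names and the sample row (including its energy value) are all assumptions made for the example; the sketch also assumes that the `id` column is safe to embed in an IRI.

```python
import csv
import io

# Hypothetical namespace and property; not the schema published in this thread.
BASE = "http://example.org/nutrition/"
ENERGY_KCAL = BASE + "energyKcal"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def escape_literal(text):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n")
                .replace("\r", "\\r"))


def row_to_ntriples(row):
    """Turn one CSV row (a dict) into two N-Triples statements."""
    subject = "<" + BASE + "food/" + row["id"] + ">"
    label = escape_literal(row["name"])
    energy = row["energy_kcal"]
    return [
        f'{subject} <{RDFS_LABEL}> "{label}" .',
        f'{subject} <{ENERGY_KCAL}> "{energy}"^^<{XSD_DECIMAL}> .',
    ]


def csv_to_ntriples(text):
    """Convert CSV text with id, name, energy_kcal columns to N-Triples lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        lines.extend(row_to_ntriples(row))
    return lines


# Made-up sample row; the energy value is illustrative only.
SAMPLE_CSV = 'id,name,energy_kcal\n0001,"Cheese, blue",350\n'

for line in csv_to_ntriples(SAMPLE_CSV):
    print(line)
```

N-Triples is used here only because it can be written line by line without an RDF library; the same rows could equally be serialized as RDF/XML.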




You can see (and get) the code from:
http://freebase-owl.googlecode.com/svn/trunk/nutrition/

and read a bit more about installing from:


http://clockwerx.blogspot.com/2009/01/generating-nutritional-data-rdf-from.html


and view samples of the output:

USDA:

http://lauken.com/doconnor/nutrition/usda/1006.rdf

NUTTAB:
http://lauken.com/doconnor/nutrition/nuttab/01A10027.rdf



Ontology (draft!):
http://www.lauken.com/doconnor/nutrition/0.1/schema.rdf



There's a lot of work for me here, and if anyone here has knowledge or a helping hand, I'd love to hear from you, especially regarding the ones in bold.


Resolve licensing agreements with Aust. government for rights to reproduce data (in progress)

Model Canadian data
Find or create a suitable ontology for Nutrition data (I would have expected some common terms from the bio-rdf community, but I don't have the background to know what I'm looking for)


Model the USDA, NUTTAB and Canadian extensions as appropriate
Find or create (ick, hope not) an ontology for measurements in relation to typical nutrition measurements (again, there are no semantic web concepts for milligrams, kilocalories, etc - not even in DBpedia. timbl did some very high level concepts of what a Gram / etc is; but it's not quite the same)



Find or create a list of terms used in nutrition data (shorthand/abbreviations) - i.e. CBODF = "Carbohydrate by difference", but I can't seem to find a good list of these outside of the USDA data itself.


Find or create a journal publications ontology (Dublin Core might do it though; or some other bibliographic ontology) - suggestions?

Find or create science terms ontology (Paper, Subject, Experiment, Samples, etc) - anyone?
Create owl:sameAs links to DBpedia topics in some automated fashion - this is tricky, because a lot of the data is written as "Cheese, blue" and is much more granular than Wikipedia articles about Cheese.


Create owl:sameAs links to Freebase topics in some automated fashion - ditto

Interlink Canadian, NUTTAB, USDA data in some automated fashion - similar - different naming schemes make using dc:title as an IFP a bit annoying.
Render full sets of RDF for each 
Publish these somewhere - http://lauken.com/doconnor/ is not suitable for anything more than a sandbox


Provide human interfaces as appropriate - if anyone wanted to create shiny XSLT -> XHTML perhaps; or PHP glue...
Set up a SPARQL endpoint (I have a hell of a time doing this in my development environment, so this might not happen) - HELP!



Provide unit test coverage for all generator tools
Refactor lots
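
For the linking and interlinking items in the list above, one possible starting point is to normalize inverted names such as "Cheese, blue" before comparing them. The following Python sketch uses the standard library's `difflib` for fuzzy matching. The function names, the 0.8 cutoff and the label list are assumptions; any match it returns is only a candidate for manual review, not a confirmed owl:sameAs link.

```python
import difflib


def natural_order(name):
    """Reverse comma-separated parts and lowercase: 'Cheese, blue' -> 'blue cheese'."""
    parts = [p.strip().lower() for p in name.split(",") if p.strip()]
    return " ".join(reversed(parts))


def best_match(name, labels, cutoff=0.8):
    """Return the label closest to `name` after normalization, or None.

    `labels` stands in for article titles or names from another dataset.
    `cutoff` is the minimum difflib similarity ratio (0..1) to accept.
    """
    normalized = {natural_order(label): label for label in labels}
    hits = difflib.get_close_matches(
        natural_order(name), list(normalized), n=1, cutoff=cutoff
    )
    return normalized[hits[0]] if hits else None


# Made-up label list for illustration.
labels = ["Blue cheese", "Cheddar cheese", "Blueberry"]
print(natural_order("Cheese, blue"))
print(best_match("Cheese, blue", labels))
```

Simple reversal will not handle longer descriptions with several qualifiers well, so a real pipeline would likely need dataset-specific rules on top of this.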






      




      
Received on Saturday, 10 January 2009 16:39:06 UTC

This archive was generated by hypermail 2.3.1 : Sunday, 31 March 2013 14:24:19 UTC