- From: Jay Luker <lbjay@reallywow.com>
- Date: Sun, 11 Jan 2009 08:14:09 -0500
- To: public-lod@w3.org
- Message-ID: <30292b940901110514u2234b6b6l90115fbecba26388@mail.gmail.com>
Really interesting, Daniel. This hits kind of a sweet spot for me in that it intersects LOD & food. I've been toying with some ideas related more to recipes and cooking, but also with the thought of using the USDA data.

For the SPARQL endpoint, since your code is PHP I would think the ARC modules would be a natural fit. There's a good example of how easy it is here: http://inkdroid.org/journal/2008/07/07/lcshinfo-sparql-endpoint/.

--jay

On Sat, Jan 10, 2009 at 5:14 AM, Daniel O'Connor <daniel.oconnor@gmail.com> wrote:
> Hey all,
> I'm Daniel O'Connor, a software engineer from Australia.
>
> At the moment I'm trying to get a lot of food nutrition data together from a whole bunch of different sources and create a bit of an ontology; publish it as RDF; and make sure it's chock full of linked data goodness; and I could use your help, advice, pointers and encouragement.
>
> Use cases include things like shopping, diet / fitness applications, cooking, and much more.
>
> - what did you eat today? -> hey, that's only 75% of your recommended daily energy intake
> - what is the approximate food energy in this recipe?
> - tell me the fattiest food I'm eating and replace it with one with more protein (but the same energy content)
>
> The data sources I've got on my list so far are:
>
> - USDA's SR21 food nutrients data (public domain)
> - Australia's NUTTAB 06 data (not so public domain)
> - Canada's CNF data (haven't delved into it in depth)
>
> The typical format provided is CSV, so I'm going through and mapping those CSV exports back into an RDBMS (php + mysql / pgsql / etc), then providing tools to generate RDF out, and publishing the static results.
>
> You can see (and get) the code from:
> http://freebase-owl.googlecode.com/svn/trunk/nutrition/
>
> and read a bit more about installing from:
> http://clockwerx.blogspot.com/2009/01/generating-nutritional-data-rdf-from.html
>
> and view samples of the output:
> USDA:
> http://lauken.com/doconnor/nutrition/usda/1006.rdf
>
> NUTTAB:
> http://lauken.com/doconnor/nutrition/nuttab/01A10027.rdf
>
> Ontology (draft!):
> http://www.lauken.com/doconnor/nutrition/0.1/schema.rdf
>
> There's a lot of work for me here, and if anyone here has knowledge or a helping hand, I'd love to hear from you, especially regarding the ones in bold.
>
> - Resolve licensing agreements with Aust. government for rights to reproduce data (in progress)
> - Model Canadian data
> - *Find or create a suitable ontology for Nutrition data* (I would have expected some common terms from the bio-rdf community, but I don't have the background to know what I'm looking for)
> - Model the USDA, NUTTAB and Canadian extensions as appropriate
> - Find or create (ick, hope not) an ontology for measurements in relation to typical nutrition measurements (again, there are no semantic web concepts for milligrams, kilocalories, etc - not even in dbpedia. timbl did some very high level concepts of what a Gram / etc is; but it's not quite the same)
> - Find or create a list of terms used in nutrition data (shorthand/abbreviations) - i.e. CBODF = "Carbohydrate by difference", but I can't seem to find a good list of these outside of the USDA data itself.
> - Find or create a *journal publications ontology* (dublincore might do it though; or some other bibliographic ontology) - suggestions?
> - Find or create *science terms ontology* (Paper, Subject, Experiment, Samples, etc) - anyone?
> - Create *owl:sameAs links to DBPedia* topics in some automated fashion - this is tricky, because a lot of the data is written as "Cheese, blue" and is much more granular than wikipedia articles about Cheese.
> - Create *owl:sameAs links to Freebase* topics in some automated fashion - ditto
> - *Interlink Canadian, NUTTAB, USDA data* in some automated fashion - similar - different naming schemes make using dc:title as an IFP a bit annoying.
> - Render full sets of RDF for each
> - Publish these somewhere - http://lauken.com/doconnor/ is not suitable for anything more than a sandbox
> - Provide human interfaces as appropriate - if anyone wanted to create *shiny XSLT -> XHTML* perhaps; or PHP glue...
> - *Set up a SPARQL endpoint* (I have a hell of a time doing this in my development environment, so this might not happen) - HELP!
> - Provide unit test coverage for all generator tools
> - Refactor lots
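
For anyone following the thread, the CSV -> RDF step described in the quoted message can be sketched roughly as below. This is only an illustration in Python (the project's actual generator is PHP and is not shown in this thread). The example.org namespaces, the `energyKcal` property, the column names `id`, `name` and `energy_kcal`, and the sample values are all assumptions made up for the sketch, not the real SR21 / NUTTAB columns or the draft schema. It skips the RDBMS stage and writes N-Triples straight from CSV.

```python
import csv
import io

# Hypothetical namespaces, chosen for illustration only.
FOOD_BASE = "http://example.org/nutrition/usda/"
SCHEMA = "http://example.org/nutrition/schema#"
DC_TITLE = "http://purl.org/dc/elements/1.1/title"
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"


def escape_literal(text):
    """Escape the characters that N-Triples requires in string literals."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def row_to_ntriples(row):
    """Turn one CSV row (a dict) into a list of N-Triples lines.

    The column names 'id', 'name' and 'energy_kcal' are assumptions,
    not the real SR21 or NUTTAB column names. Ids are assumed to be
    URI-safe and energy values are assumed to be plain decimals.
    """
    subject = "<%s%s>" % (FOOD_BASE, row["id"])
    triples = [
        '%s <%s> "%s" .' % (subject, DC_TITLE, escape_literal(row["name"])),
    ]
    # Only emit the energy triple when the column has a value.
    if row.get("energy_kcal"):
        triples.append(
            '%s <%senergyKcal> "%s"^^<%s> .'
            % (subject, SCHEMA, row["energy_kcal"], XSD_DECIMAL)
        )
    return triples


def csv_to_ntriples(csv_text):
    """Convert CSV text with a header row into one N-Triples document."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        lines.extend(row_to_ntriples(row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up sample row; the id and energy value are not real data.
    sample = 'id,name,energy_kcal\nX1,"Cheese, blue",100\n'
    print(csv_to_ntriples(sample), end="")
```

N-Triples is used here only because it can be written line by line without an RDF library; the published samples linked above are RDF/XML.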
Received on Monday, 12 January 2009 09:25:59 UTC