- From: Thomas FRANCART <thomas.francart@mondeca.com>
- Date: Sun, 11 Jan 2009 12:01:00 +0100
- To: "Daniel O'Connor" <daniel.oconnor@gmail.com>
- Cc: public-lod@w3.org
- Message-ID: <2d799a410901110301v29980f18r2c4ad046c639c94d@mail.gmail.com>
Hi Daniel,

Regarding the SPARQL endpoint, you might want to have a look at Joseki (http://www.joseki.org/), which provides the infrastructure for a SPARQL endpoint; all you need to do, I think, is to re-implement a Java class, but the web service infrastructure is given to you.

You may also want to look at the Bibliographic Ontology (http://bibliontology.com/), which models things related to scientific papers.

Best,
Thomas

Thomas Francart
CTO
Mondeca
3, cité Nollez
75018 Paris
France
<thomas.francart@mondeca.com>
Website: www.mondeca.com
Blog: Leçons de choses <http://mondeca.wordpress.com>

2009/1/10 Daniel O'Connor <daniel.oconnor@gmail.com>

> Hey all,
> I'm Daniel O'Connor, a software engineer from Australia.
>
> At the moment I'm trying to get a lot of food nutrition data together from
> a whole bunch of different sources and create a bit of an ontology; publish
> it as RDF; and make sure it's chock full of linked data goodness; and I could
> use your help, advice, pointers and encouragement.
>
> Use cases include things like shopping, diet / fitness applications,
> cooking, and much more.
>
>    - what did you eat today? -> hey, that's only 75% of your recommended
>    daily energy intake
>    - what is the approximate food energy in this recipe?
>    - tell me the fattiest food I'm eating and replace it with one with
>    more protein (but the same energy content)
>
> The data sources I've got on my list so far are:
>
>    - USDA's SR21 food nutrients data (public domain)
>    - Australia's NUTTAB 06 data (not so public domain)
>    - Canada's CNF data (haven't delved into it in depth)
>
> The typical format provided is CSV, so I'm going through and mapping those
> CSV exports back into an RDBMS (php + mysql / pgsql / etc), then providing
> tools to generate RDF out, and publishing the static results.
>
> You can see (and get) the code from:
> http://freebase-owl.googlecode.com/svn/trunk/nutrition/
>
> and read a bit more about installing from:
> http://clockwerx.blogspot.com/2009/01/generating-nutritional-data-rdf-from.html
>
> and view samples of the output:
> USDA:
> http://lauken.com/doconnor/nutrition/usda/1006.rdf
>
> NUTTAB:
> http://lauken.com/doconnor/nutrition/nuttab/01A10027.rdf
>
> Ontology (draft!):
> http://www.lauken.com/doconnor/nutrition/0.1/schema.rdf
>
> There's a lot of work for me here, and if anyone here has knowledge or a
> helping hand, I'd love to hear from you, especially regarding the ones in
> bold.
>
>    - Resolve licensing agreements with Aust. government for rights to
>    reproduce data (in progress)
>    - Model Canadian data
>    - *Find or create a suitable ontology for Nutrition data* (I would have
>    expected some common terms from the bio-rdf community, but I don't have the
>    background to know what I'm looking for)
>    - Model the USDA, NUTTAB and Canadian extensions as appropriate
>    - Find or create (ick hope not) an ontology for measurements in
>    relation to typical nutrition measurements (again, there are no semantic web
>    concepts for milligrams, kilocalories, etc - not even in dbpedia. timbl did
>    some very high level concepts of what a Gram / etc is; but it's not quite the
>    same)
>    - Find or create a list of terms used in nutrition data
>    (shorthand/abbreviations) - i.e. CBODF = "Carbohydrate by difference", but I
>    can't seem to find a good list of these outside of the USDA data itself.
>    - Find or create a *journal publications ontology* (dublincore might do
>    it though; or some other bibliographic ontology) - suggestions?
>    - Find or create *science terms ontology* (Paper, Subject, Experiment,
>    Samples, etc) - anyone?
>    - Create *owl:sameAs links to DBPedia* topics in some automated fashion
>    - this is tricky, because a lot of the data is written as "Cheese, blue" and
>    is much more granular than wikipedia articles about Cheese.
>    - Create *owl:sameAs links to Freebase* topics in some automated
>    fashion - ditto
>    - *Interlink Canadian, NUTTAB, USDA data* in some automated fashion -
>    similar - different naming schemes make using dc:title as an IFP a bit
>    annoying.
>    - Render full sets of RDF for each
>    - Publish these somewhere - http://lauken.com/doconnor/ is not suitable
>    for anything more than a sandbox
>    - Provide human interfaces as appropriate - if anyone wanted to create
>    *shiny XSLT -> XHTML* perhaps; or PHP glue...
>    - *Set up a SPARQL endpoint* (I have a hell of a time doing this in my
>    development environment, so this might not happen) - HELP!
>    - Provide unit test coverage for all generator tools
>    - Refactor lots
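
For readers following the CSV-to-RDF step described in the quoted message, here is a minimal sketch of the idea in Python, not Daniel's actual PHP tooling. Everything specific in it is an assumption made for illustration: the example.org namespaces, the column names (id, name, energy_kcal, protein_g) and the sample row with its made-up values. Only the dc:title property URI is a real vocabulary term. It emits N-Triples rather than RDF/XML because that format is simple enough to write by hand.

```python
import csv
import io

# Hypothetical namespaces, for illustration only (not the project's real URIs).
FOOD_NS = "http://example.org/nutrition/usda/"
SCHEMA_NS = "http://example.org/nutrition/schema#"
# Real Dublin Core property URI for titles.
DC_TITLE = "http://purl.org/dc/elements/1.1/title"


def literal(value):
    """Escape a string as an N-Triples plain literal."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def row_to_triples(row):
    """Turn one CSV row (a dict keyed by column name) into N-Triples lines."""
    subject = "<" + FOOD_NS + row["id"] + ">"
    triples = [f"{subject} <{DC_TITLE}> {literal(row['name'])} ."]
    for column, value in row.items():
        # Skip the columns already used, and skip empty or missing cells.
        if column in ("id", "name") or value in ("", None):
            continue
        triples.append(f"{subject} <{SCHEMA_NS}{column}> {literal(value)} .")
    return triples


def csv_to_ntriples(text):
    """Convert CSV text with a header row into one N-Triples document."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        lines.extend(row_to_triples(row))
    return "\n".join(lines)


# Made-up sample row; the numbers are placeholders, not real nutrient data.
SAMPLE = 'id,name,energy_kcal,protein_g\nx1,"Cheese, blue",100,5.0\n'
print(csv_to_ntriples(SAMPLE))
```

Each nutrient column becomes one property on the food resource. A real generator would presumably type the values (for example as xsd:decimal) and attach units, which is exactly the measurement-ontology gap raised in the list above.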
Received on Sunday, 11 January 2009 11:01:40 UTC