Re: Nutrition / Linked data

Hi, Daniel --

On Jan 10, 2009, at 05:14 AM, Daniel O'Connor wrote:
 > The typical format provided is CSV, so I'm going through and mapping
 > those CSV exports back into a RDBMS (php + mysql / pgsql / etc), then
 > providing tools to generate RDF out, and publishing the static
 > results.
[...]
 >  • Create owl:sameAs links to DBPedia topics in some automated
 >    fashion - this is tricky, because a lot of the data is written
 >    as "Cheese, blue" and is much more granular than wikipedia
 >    articles about Cheese.
 >  • Create owl:sameAs links to Freebase topics in some automated
 >    fashion - ditto
 >  • Interlink Canadian, NUTTAB, USDA data in some automated fashion -
 >    similar - different naming schemes make using dc:title as a IFP
 >    a bit annoying.
 >  • Render full sets of RDF for each
 >  • Publish these somewhere - http://lauken.com/doconnor/ is not
 >    suitable for anything more than a sandbox
 >  • Provide human interfaces as appropriate - if anyone wanted to
 >    create shiny XSLT -> XHTML perhaps; or PHP glue...
 >  • Setup a SPARQL endpoint (I have a hell of a time doing this in
 >    my development environment, so this might not happen) - HELP!


I might suggest that you consider using Virtuoso [1] instead of (or,
if you are willing to consider the commercial version, in addition to)
MySQL/PostgreSQL/etc., because it will let you easily do a few things --

- generate dynamic RDF views of SQL data [2]
   - meaning you can easily load standard tables from the CSV, and
     automagically deliver RDF dynamically based on those tables --
     or export static RDF, if you prefer, in N3 or RDF/XML or whatever
     form you choose (the first 2 are built-in; others would need some
     XSLT or similar transformation work)

- provide a SPARQL endpoint [3]
   - it's built into Virtuoso, for all data in the quad-store, including
     any dynamic RDF views of SQL data

- programmatically insert data set links, once you have figured
   out the logic, using Virtuoso PL or any scripting language that
   supports standard data access mechanisms (e.g., PHP, Perl,
   Python, Ruby, etc.) [4]
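
As a rough illustration of what talking to a SPARQL endpoint looks like
from a scripting language, here is a minimal Python sketch that builds a
standard SPARQL Protocol GET request. The endpoint URL
(http://localhost:8890/sparql) and the helper name build_sparql_request
are assumptions for illustration, so substitute whatever your own
installation exposes. The sketch only constructs the request; the
commented lines show how it could be sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint URL for a local installation; adjust to your own server.
ENDPOINT = "http://localhost:8890/sparql"

# A generic query: list ten resources that carry an rdfs:label.
QUERY = """
SELECT ?food ?label
WHERE { ?food <http://www.w3.org/2000/01/rdf-schema#label> ?label }
LIMIT 10
"""


def build_sparql_request(endpoint, query,
                         accept="application/sparql-results+json"):
    """Build (but do not send) a SPARQL Protocol query as an HTTP GET."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": accept})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url)

# To run it against a live endpoint:
#   import json
#   from urllib.request import urlopen
#   with urlopen(request) as response:
#       results = json.load(response)
```

Because this is plain HTTP, the same request can be made from PHP, Perl,
Ruby, or a browser, which is much of the appeal of having an endpoint.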

Virtuoso can be run on your own machine, or on a colo-provider's server,
including Amazon EC2. [5]

It also seems worth noting -- you are not limited to sameAs for links
between data sets or servers.  subPropertyOf, subClassOf, and the
like are also (even *more*) useful -- and if you keep these links in
a separate graph from the data triples, they're easy to keep clean
and accurate -- even up to dropping and rebuilding from scratch.
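
To make the separate-graph idea concrete, here is a small Python sketch
that turns a list of links into two update statements: one that drops
the link graph, and one that re-inserts every link into it. The graph
URI, the example.org food URI, and the function names are hypothetical,
and the statements assume a store that accepts SPARQL 1.1 Update
syntax; the choice of rdfs:subClassOf for this particular pair is only a
placeholder for whatever relation your matching logic decides on.

```python
RDFS_SUBCLASSOF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Hypothetical named graph that holds only the cross-data-set links.
LINK_GRAPH = "http://example.org/graph/links"

# Hypothetical output of your matching logic: (subject, predicate, object).
LINKS = [
    ("http://example.org/food/cheese-blue",
     RDFS_SUBCLASSOF,
     "http://dbpedia.org/resource/Cheese"),
]


def to_ntriple(subject, predicate, obj):
    """Serialize one URI-to-URI link as an N-Triples line."""
    return "<%s> <%s> <%s> ." % (subject, predicate, obj)


def rebuild_statements(graph, links):
    """Return update statements that drop the link graph and refill it."""
    triples = "\n".join("    " + to_ntriple(*link) for link in links)
    return [
        "DROP SILENT GRAPH <%s>" % graph,
        "INSERT DATA { GRAPH <%s> {\n%s\n} }" % (graph, triples),
    ]


for statement in rebuild_statements(LINK_GRAPH, LINKS):
    print(statement)
    print()
```

Since the data triples live in other graphs, rerunning this after
improving the matching logic replaces the links without touching the
data itself.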

Be seeing you,

Ted

[1] http://virtuoso.openlinksw.com/wiki/main/Main/
[2] http://virtuoso.openlinksw.com/wiki/main/Main/VOSSQLRDF
[3] http://docs.openlinksw.com/virtuoso/rdfandsparql.html
[4] http://docs.openlinksw.com/virtuoso/rdfinsertmethods.html
[5] http://virtuoso.openlinksw.com/wiki/main/Main/VirtInstallationEC2


-- 
A: Yes.                      http://www.guckes.net/faq/attribution.html
| Q: Are you sure?
| | A: Because it reverses the logical flow of conversation.
| | | Q: Why is top posting frowned upon?

Ted Thibodeau, Jr.           //               voice +1-781-273-0900 x32
Evangelism & Support         //        mailto:tthibodeau@openlinksw.com
OpenLink Software, Inc.      //              http://www.openlinksw.com/
                                  http://www.openlinksw.com/weblogs/uda/
OpenLink Blogs              http://www.openlinksw.com/weblogs/virtuoso/
                                http://www.openlinksw.com/blog/~kidehen/
     Universal Data Access and Virtual Database Technology Providers

Received on Monday, 12 January 2009 15:50:22 UTC