Re: Granular dereferencing ( prop by prop ) using REST + LinkedData; Ideas?

From: Yves Raimond <yves.raimond@gmail.com>
Date: Fri, 2 Jan 2009 23:20:52 +0000
Message-ID: <82593ac00901021520k4ac07237kf88c6c09350426dc@mail.gmail.com>
To: "Richard Cyganiak" <richard@cyganiak.de>
Cc: "Aldo Bucchi" <aldo.bucchi@gmail.com>, "public-lod@w3.org" <public-lod@w3.org>

Hello, and happy new year!

> Happy new year to all LODers! 2009 will certainly be another interesting
> year around here!
> Aldo,
> The issue you describe below has been discussed in a long thread back in
> 2007. We mostly talked about a slightly different problem -- where a
> resource description would include a very large number of values for a
> particular property -- and therefore we want to move those triples into a
> separate document that is somehow reachable from the resource description.
> But a solution to that problem would likely also solve your problem (you
> have just a single value for the property but it's expensive to compute and
> should therefore be retrieved in a separate HTTP call).
> I proposed this solution:
> http://simile.mit.edu/mail/ReadMsg?listName=Linking%20Open%20Data&msgId=20926
> And some refinements here:
> http://simile.mit.edu/mail/ReadMsg?listName=Linking%20Open%20Data&msgId=20962
> This is actually based on some side comments that TimBL made in his original
> Linked Data document -- I think he saw this problem coming.
> I started a page on the ESW wiki back then:
> http://esw.w3.org/topic/SeparateDocumentsForLongPropertyLists
> There was no real conclusion to the thread, and no formal proposal was
> written up, just a problem statement. That's because we couldn't really
> reach consensus. Others were pushing for a more flexible and complex
> solution than the one I describe in the links above. See here -- it's the
> "arcs" proposal:
> http://simile.mit.edu/mail/ReadMsg?listName=Linking%20Open%20Data&msgId=20930
> http://simile.mit.edu/mail/ReadMsg?listName=Linking%20Open%20Data&msgId=21015
> I wonder what people think about this issue now, and if we can get closer to
> a workable proposal. I'm willing to spend some cycles on the writeup and
> preparing the vocabulary if we can reach some rough consensus.

This discussion was indeed interesting :)
I now tend to think that linking to a separate document is a cleaner
way to go, but I am still concerned about auto-discovery. When you see
something like:

:New_York :morePersonsBornHere <http://example.org/persons_nyc.rdf> .

in the representation of :New_York, you still need a way to describe
the fact that :morePersonsBornHere links to a document holding lots of
:birthPlace properties. Just saying that :morePersonsBornHere
rdfs:subPropertyOf rdfs:seeAlso won't do the job properly - how can
you tell what the document holds before retrieving the whole thing?

But perhaps the approach I proposed when we discussed the void:example
property could work, in exactly the same way as in [1].

In the representation of :New_York, we could write something like (in N3):

<http://example.org/persons_nyc.rdf> void:example { :al_pacino
:birthPlace :New_York }.

Then, a SPARQL query like the following could find the documents that
hold information about persons born in New York:

 SELECT ?doc WHERE {
  ?doc void:example ?g .
  GRAPH ?g {
   ?person :birthPlace :New_York .
  }
 }

One good thing about this approach is that the "patterns" of
information held in the target document can be arbitrarily complex -
and the only thing you have to do is to provide an example RDF graph,
holding something representative of what you put in that document.
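
To make the lookup idea concrete, here is a rough sketch in plain Python
of what a client could do with such example graphs. It is only an
illustration under assumptions: the in-memory EXAMPLES table, the second
document (bands_nyc.rdf) and the function names are all made up, and it
only matches a single triple pattern rather than running a real SPARQL
query.

```python
# Hypothetical in-memory model of void:example descriptions.
# All names here are illustrative, not part of any real library.
from typing import Dict, FrozenSet, List, Tuple

Triple = Tuple[str, str, str]

# Each document URI maps to the example graph advertised for it.
EXAMPLES: Dict[str, FrozenSet[Triple]] = {
    "http://example.org/persons_nyc.rdf": frozenset({
        (":al_pacino", ":birthPlace", ":New_York"),
    }),
    # Assumed second document, only here to show that non-matches are skipped.
    "http://example.org/bands_nyc.rdf": frozenset({
        (":ramones", ":origin", ":New_York"),
    }),
}


def matches(pattern: Triple, triple: Triple) -> bool:
    """A pattern term starting with '?' is a variable and matches any term."""
    return all(p.startswith("?") or p == t for p, t in zip(pattern, triple))


def documents_for(pattern: Triple) -> List[str]:
    """Return the documents whose example graph has a triple matching the pattern."""
    return sorted(
        doc
        for doc, graph in EXAMPLES.items()
        if any(matches(pattern, t) for t in graph)
    )


# Which documents are worth fetching for "persons born in New York"?
print(documents_for(("?person", ":birthPlace", ":New_York")))
```

The point is that the client decides which documents to dereference by
looking only at the small example graphs, without fetching the (possibly
very large) target documents first.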


[1] http://blog.dbtune.org/post/2008/06/12/Describing-the-content-of-RDF-datasets
