Re: Three solution designs to the first three Graphs use cases from Sandro Hawke on 2012-02-01 (public-rdf-wg@w3.org from February 2012)

From: Sandro Hawke <sandro@w3.org>
Date: Tue, 31 Jan 2012 19:23:57 -0500
To: Steve Harris <steve.harris@garlik.com>
Cc: Ivan Herman <ivan@w3.org>, Andy Seaborne <andy.seaborne@epimorphics.com>, public-rdf-wg@w3.org
Message-ID: <1328055837.2916.47.camel@waldron>

On Fri, 2012-01-27 at 12:27 +0000, Steve Harris wrote:
> On 2012-01-27, at 10:35, Ivan Herman wrote:
> > On Jan 27, 2012, at 10:33 , Andy Seaborne wrote:
> >> On 27/01/12 03:45, Sandro Hawke wrote:
> >>> On Thu, 2012-01-05 at 11:09 +0000, Andy Seaborne wrote:
> >>>> On 04/01/12 19:23, David Wood wrote:
> >>>>> Thanks, Sandro.  That's very helpful.
> >>>>> 
> >>>>> It might be useful to consider augmenting TriG syntax to support your third solution (explicitly naming relations). I'd be quite happy with that.
> >>>> 
> >>>> What would the data model be?
> >>> 
> >>> I think: an RDF graph which can have other RDF graphs as values of its
> >>> triples.  All these graphs would be subgraphs of some greater graph, so
> >>> they can share b-nodes.
> >>> 
> >>> (This is what cwm has had implemented since 2001, I think.)
> >> 
> >> I thought this WG wasn't going there (graph literals).
> >> 
> >> Personally, I see graph literals as the clean answer but it is RDF 2 (+).  RDF 1.1 is, to me, incremental improvements within the current abstract data model.  Datatyped literals  (e.g. "<s> <p> <o>"^^rdf:graphNTriples) are unwieldy and might block doing graph literals properly in RDF 2+.
> >> 
> > 
> > I am not convinced it is such a huge jump and, if this is the only way to have a clean way forward, we may have to do this. The datatyped literals may be a way forward and, after all, the trig version of using '{' may be considered as a syntactic sugar for a datatyped literal…
> 
> This makes me /extremely/ nervous.
> 
> From the perspective of the indexing/query engine there is an enormous difference, and I'm not aware of any commonly used systems that currently follow this model. So, there's a lack of experience in the community of how to deal with these structures efficiently.
> 
> I bought this kind of argument with RDF Lists (collections), and accessor functions - storing the lists natively, and also reflecting them into triples. Coming up with an implementation that was both correct and efficient turned out to be so hard that we gave up, and just elected not to use Lists in production.

I'm sad to hear about this experience with lists.  Sometime I'd like to
hear more about why that was so hard.  (Have you folks
written/presented about it?)

> If we had a critical mass of systems that worked this way I would be enthusiastic about it, but we don't.

I think it's possible to implement graph literals (like in N3, or my
third proposed solution) using a quad store, like the ones you already
use.  That's how at least one version of cwm did it.  The technique is
to map each graph literal to TriG/SameAs with a minted identifier.

So, to represent:

  <s> <p> { <a> <b> <c> }

you mint an identifier ( <g1> ) then store these quads:

  <s> <p> <g1> DEFAULT
  <a> <b> <c> <g1>

In this proposal, such a use of quads is a purely internal decision of
the implementer -- what's standard for interchange is the N3-like syntax
with the graph literals.  It's just that those documents are stored for easy
access/manipulation in quads using a SameAs relation.  Elsewhere, people
remain free to use quads, internally, however they want.
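
The mapping above can be sketched in a few lines of Python.  This is
only an illustration under my own assumptions, not cwm's actual code:
terms are plain strings, a graph literal is modelled as a Python list of
triples, and the names flatten, DEFAULT and the <gN> identifier scheme
are hypothetical.  It ignores shared b-nodes entirely.

```python
from itertools import count

DEFAULT = "DEFAULT"  # hypothetical label for the default graph


def flatten(triples, graph=DEFAULT, ids=None):
    """Turn triples whose objects may be nested graphs into flat quads.

    An object that is a list is treated as a graph literal: an
    identifier is minted for it, the outer triple points at that
    identifier, and the inner triples are stored in the graph so named.
    """
    if ids is None:
        ids = count(1)  # shared counter so minted names stay unique
    quads = []
    for s, p, o in triples:
        if isinstance(o, list):
            name = f"<g{next(ids)}>"             # mint an identifier
            quads.append((s, p, name, graph))    # outer triple refers to it
            quads.extend(flatten(o, name, ids))  # inner triples go in <gN>
        else:
            quads.append((s, p, o, graph))
    return quads


# <s> <p> { <a> <b> <c> }
doc = [("<s>", "<p>", [("<a>", "<b>", "<c>")])]
for quad in flatten(doc):
    print(" ".join(quad))
```

For the example document this yields the two quads listed above.  Going
the other way (quads back to the N3-like syntax) would need the store to
remember which identifiers were minted, which this sketch does not do.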

Wouldn't that solve the implementation burden?

   -- Sandro

Received on Wednesday, 1 February 2012 00:24:01 UTC