Literals as subjects in Turtle (but not in the RDF model) [was: Inverses of RDF and RDFS predicates] from David Booth on 2012-05-03 (public-rdf-comments@w3.org from May 2012)

From: David Booth <david@dbooth.org>
Date: Thu, 03 May 2012 11:53:44 -0400
To: David Wood <david@3roundstones.com>
Cc: Richard Cyganiak <richard@cyganiak.de>, public-rdf-comments@w3.org
Message-ID: <1336060424.2232.20582.camel@dbooth-laptop>

Hi David & Richard,

I like the line of thinking that you suggest, and I agree with the
practical arguments that you make (about not materializing inverse
triples and not maintaining extra vocabulary), but there is currently an
asymmetry in the RDF language that makes this not quite work in the
general case, though it can work in many specific cases.  

In essence, you are suggesting that there is no difference between an
inverse predicate and the expression of that predicate in the opposite
direction of the triple, and therefore the inverse predicate is
unnecessary, because it is redundant.  In other words, if IP is the
inverse of predicate P, then for any X and Y, if I want to express the
following fact in RDF

  X IP Y .

but without using the predicate IP, then instead I can merely represent
that fact in RDF as

  Y P X .

and the exact same information is captured.
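
To make that concrete, here is a minimal sketch in Python, with triples
as plain tuples of strings.  The names INVERSE_OF and normalize are
hypothetical, made up for this illustration, and rdf:isTypeOf is only
the inverse I proposed earlier in this thread, not a term that RDF
actually defines.

```python
# Triples are (subject, predicate, object) tuples of plain strings.
# INVERSE_OF maps a hypothetical inverse predicate IP to its base predicate P.
# rdf:isTypeOf is the proposed inverse from this thread, not a standard term.
INVERSE_OF = {"rdf:isTypeOf": "rdf:type"}


def normalize(triple):
    """Rewrite `X IP Y` as `Y P X`; leave every other triple unchanged."""
    s, p, o = triple
    if p in INVERSE_OF:
        return (o, INVERSE_OF[p], s)
    return triple


# `:C rdf:isTypeOf :x` carries the same information as `:x rdf:type :C`.
print(normalize((":C", "rdf:isTypeOf", ":x")))
```

Note that the swap only yields a legal RDF triple when the original
object is not a literal, which is exactly the asymmetry described next.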

I think this is a great way to look at it.  (And I advocated this line
of thinking at the RDF Next Steps workshop two years ago.)  But the
glitch is that for this to always work, RDF must allow literals in the
subject position, and it doesn't.  Furthermore, the RDF WG charter does
not allow this glitch to be fixed in the RDF model:
http://www.w3.org/2011/01/rdf-wg-charter
"3. Out of Scope.  Some features are explicitly out of scope for the
Working Group . . . Removing current restrictions in the RDF model
(e.g., literals not allowed as subjects, or blank nodes as predicates)"

On the other hand, just to toss an idea out there . . . even if the
charter does not allow this to be fixed in the RDF *model*, how about at
least fixing it in the Turtle *syntax*?  The following two simple
changes to the Turtle grammar would make it easy to express any triple
in the inverse direction.  First, allow literals as subjects:

[10]   subject   ::=   IRIref
                       | blank
                       | literal

Second, allow a predicate to be written in the inverse direction:

[11]   predicate ::=   IRIref
                       | "^" IRIref
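
As a rough sketch of what a parser might do with these two rules (an
illustration under my own assumptions, not an implementation of any
existing Turtle parser): the Literal type and the expand function below
are hypothetical names, and foaf:name is just an example predicate.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """A plain literal; IRIs are represented as ordinary strings here."""
    value: str


def expand(subject, predicate, obj):
    """Turn one written statement into an (s, p, o) triple.

    A predicate written as "^p" means the triple runs in the opposite
    direction, so subject and object are swapped and the "^" is dropped.
    """
    if predicate.startswith("^"):
        return (obj, predicate[1:], subject)
    return (subject, predicate, obj)


# Written with a literal subject and an inverse predicate:
#   "Alice" ^foaf:name :alice .
# which expands to the ordinary triple  :alice foaf:name "Alice" .
print(expand(Literal("Alice"), "^foaf:name", ":alice"))
```

In this example the written statement has a literal subject, yet the
triple it expands to is perfectly ordinary RDF.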

Granted, these changes would make it possible to write some things in
Turtle syntax that would not be valid RDF.  (Hmm ... would that be the
case anyway?)  So if one wanted to be certain that some Turtle is valid
RDF, one would have to run it through an RDF validator.
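
The check itself could be very small.  The sketch below assumes the
triples have already been parsed into tuples, and it tests only the
literal-subject restriction discussed here, not everything a real RDF
validator would check.  Literal, has_literal_subject and
invalid_triples are hypothetical names.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """A plain literal; IRIs are represented as ordinary strings here."""
    value: str


def has_literal_subject(triple):
    """True if the triple breaks the 'no literals as subjects' rule."""
    subject, _predicate, _obj = triple
    return isinstance(subject, Literal)


def invalid_triples(triples):
    """Return the triples that a strict RDF tool would have to reject."""
    return [t for t in triples if has_literal_subject(t)]


triples = [
    (":alice", "foaf:name", Literal("Alice")),   # ordinary RDF
    (Literal("Alice"), "ex:isNameOf", ":alice"), # literal subject
]
print(invalid_triples(triples))
```

A tool that keeps the restriction would reject a document if this list
is non-empty; a more permissive tool would simply skip the check.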

RDF/XML could still merrily reject literals as subjects.   

Tools that people do not want to update could still (rightfully) reject
RDF that had literals as subjects, while those of us who would rather
live without this restriction could use tools that allow them.  Freedom
of choice!  Rah rah rah!  :)

One could argue that this would represent an end run around the intent
of the charter.  But I personally find the restriction against literals
as subjects so silly and onerous that I think this approach could
represent a reasonable balance between those who don't want to modify
old tools and those who don't want this restriction.

Comments?

David


On Thu, 2012-05-03 at 07:27 -0400, David Wood wrote:
> Hi David,
> 
> This is also a personal response, since the WG has not yet discussed
> the issue.
> 
> We have implemented support for reverse link traversal in Callimachus
> specifically to avoid the need for materializing additional triples.
> Other systems have similar functionality.  Thus, I am in agreement
> with Richard that the *standardization* of inverse predicates may be
> detrimental in that it may constitute an encouragement for
> materialization over other traversal mechanisms.  In my opinion, those
> mechanisms are best left to implementors.
> 
> Regards,
> Dave
> 
> 
> 
> 
> On May 3, 2012, at 07:05, Richard Cyganiak wrote:
> 
> > Hi David,
> > 
> > (This is a personal response, not necessarily representing WG opinion.)
> > 
> > I think that introducing inverse properties would be a bad idea here, because it leads to an unnecessary proliferation of redundant vocabulary terms and makes querying and generally working with the data much harder.
> > 
> > Avoiding “incoming” arcs to an RDF node, and wanting to have only “outgoing” arcs, is a common reflex in the community. This is very unfortunate IMO. Both kinds of arcs are equally important and essential. We need *graphs* not trees.
> > 
> > I'm not convinced by the reasons you state for introducing inverses. See below.
> > 
> > On 30 Apr 2012, at 15:40, David Booth wrote:
> >> 1. It allows one to conveniently distinguish those statements from other
> >> statements in which :C appears in the object position of the triple.
> >> This is what is done in computing the Concise Bounded Description:
> >> http://www.w3.org/Submission/CBD/
> > 
> > The answer is right in this document: Symmetric CBDs.
> > 
> >> It is a common approach taken for DESCRIBE queries in SPARQL.
> > 
> > Generally the DESCRIBE behaviour can be configured in the RDF store to use SCBDs.
> > 
> >> One could reasonably argue that instead you should put those statements
> >> in a separate graph if you wish to distinguish them from statements in
> >> which :C appears as in the object position of the triple.  Indeed, one
> >> could, but that adds complexity.  And the fact is, it is convenient to
> >> be able to do it this way.
> > 
> > Why would you want to put them into a different graph? Just put them into the same graph.
> > 
> >> 2. It allows one to use certain optimizations that are asymmetric.  In
> >> particular, if I represent my RDF triples using a hash table for each
> >> subject, then I can very quickly and easily look up the members of
> >> class :C by using rdf:isTypeOf as the hash table index.  
> > 
> > Use two hash tables, one for incoming and one for outgoing triples. Now you can represent nodes in a graph, rather than just nodes in a tree.
> > 
> >> In essence, the ability to use the inverse property gives the author
> >> more flexibility in writing RDF.  
> > 
> > It gives flexibility for RDF authors, but creates headaches for users of the data, and asks vocabulary maintainers to do lots of redundant extra work.
> > 
> > All the best,
> > Richard
> > 
> > 
> > 
> > 
> >> This can be helpful both as a
> >> convenience for the author and to simplify downstream code that
> >> processes that RDF.
> >> 
> >> Let me know if further clarification would help.
> >> 
> >> Thanks!
> >> David
> >> 
> >> On Mon, 2012-04-30 at 08:20 -0400, David Wood wrote:
> >>> Hi David,
> >>> 
> >>> Can you please articulate one or more use cases to accompany this
> >>> feature request?  Thanks.
> >>> 
> >>> Regards,
> >>> Dave
> >>> 
> >>> 
> >>> 
> >>> 
> >>> On Apr 29, 2012, at 19:43, David Booth wrote:
> >>> 
> >>>> If this has already been considered and rejected by the WG then please
> >>>> ignore, but . . . 
> >>>> 
> >>>> It would be helpful if the RDF and RDFS specs defined inverses for the
> >>>> properties that they define.  For example, if
> >>>> 
> >>>> :x  rdf:type  :C .
> >>>> 
> >>>> then one might write:
> >>>> 
> >>>> :C rdf:isTypeOf :x .
> >>>> 
> >>>> and similarly for other properties.
> >>>> 
> >>>> I have resorted to defining my own inverse properties for some of these,
> >>>> but it seems silly to do so, rather than standardizing them, especially
> >>>> since it wouldn't add anything significant to the semantics.
> >>>> 
> >>>> 
> >>>> -- 
> >>>> David Booth, Ph.D.
> >>>> http://dbooth.org/
> >>>> 
> >>>> Opinions expressed herein are those of the author and do not necessarily
> >>>> reflect those of his employer.
> >>>> 
> >>>> 
> >>> 
> >>> 
> >>> 
> >> 
> >> -- 
> >> David Booth, Ph.D.
> >> http://dbooth.org/
> >> 
> >> Opinions expressed herein are those of the author and do not necessarily
> >> reflect those of his employer.
> >> 
> >> 
> > 
> 
> 
> 

-- 
David Booth, Ph.D.
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Thursday, 3 May 2012 15:54:15 UTC