Re: Literals as subjects in Turtle (but not in the RDF model) [was: Inverses of RDF and RDFS predicates] from Nathan on 2012-05-03 (public-rdf-comments@w3.org from May 2012)

From: Nathan <nathan@webr3.org>
Date: Thu, 03 May 2012 19:59:42 +0100
To: David Booth <david@dbooth.org>
CC: Ivan Herman <ivan@w3.org>, David Wood <david@3roundstones.com>, Richard Cyganiak <richard@cyganiak.de>, "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>
Message-ID: <4FA2D59E.2090303@webr3.org>

This all seems very circular, self-perpetuating, and arbitrary.

The reason "literals as subjects" do not appear in the serializations is 
because they are not handled in the model. The reason they are not 
handled in the model is because they do not appear in the 
serializations. Thus, circular.

Given a random property y and its inverse x, what factors are 
considered when deciding which of x & y to choose as the "primary" 
property, the one that gets named and documented by its creator? Well, if 
the domain of y is literal then defining that property is pretty much a 
wasted effort since neither the RDF model nor the serializations cater 
for it, thus the property creator will choose to name and document x 
with a range of literal. Thus, self-perpetuating.

When working with RDF computationally, or doing any reasoning and 
inference, the code one creates has to handle, at some point, literal 
subjects - this is because as we all know RDF is a graph, not a tree. 
Adding a constraint that one node in that graph can only be 
(un)named-node whilst another can be of any type is entirely arbitrary 
and not reflected by the core mathematics or the shape of the world. 
This is why, when Tim & Dan created N3, they naturally allowed literal 
subjects, and similarly why you'll find them in LBase from Pat, and why 
you'll find them in the depths of pretty much all our code. Arbitrary.
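
To make that concrete, here is a minimal sketch in Python of how a literal 
subject falls out of even a trivial inverse-property rule. The term classes, 
the function names and the example.org IRIs are illustrative assumptions, 
not any particular library's API or a real vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    lexical: str


def apply_inverse(triples, prop, inverse):
    """For every (s, prop, o) in triples, derive (o, inverse, s)."""
    return {(o, inverse, s) for (s, p, o) in triples if p == prop}


def is_standard_rdf(triple):
    """True if the triple obeys the RDF model's positional restrictions."""
    s, p, _ = triple
    return isinstance(s, (IRI, BNode)) and isinstance(p, IRI)


# Hypothetical vocabulary, for illustration only.
NAME = IRI("http://example.org/name")
IS_NAME_OF = IRI("http://example.org/isNameOf")
ALICE = IRI("http://example.org/alice")

graph = {(ALICE, NAME, Literal("Alice"))}

# The rule is perfectly ordinary, yet its output has a literal subject.
derived = apply_inverse(graph, NAME, IS_NAME_OF)
```

The input graph is ordinary RDF, but the derived triple is not, so the code 
has to either carry such triples internally or special-case them away.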

This restriction is confusing, leads to unexpected functionality, 
arbitrarily limits unexpected reuse, makes the model/semantics more complex 
than they need to be, leads to inconsistencies which rear their heads often, 
and is something for which not one sound technical or moral argument has 
been seen.

I'll duck back out now as I've wasted enough of everybody's time over the 
years on this issue, and it certainly isn't going to change.

Best,

Nathan

David Booth wrote:
> Drat.  Here I was, first trying to make lemonade from lemons, then
> trying to exchange the lemons for plums, and I get tomatoes thrown back
> both ways.  ;)
> 
> I guess the only real solution is to adopt a charter that allows this
> glitch to be fixed properly, in the RDF model.
> 
> David
> 
> On Thu, 2012-05-03 at 19:01 +0200, Ivan Herman wrote:
>> David,
>>
>> (As the others, my reaction is personal. Not an official WG or W3C
>> position...)
>>
>> Just to react to the literal as subject in Turtle issue: I would be
>> very much opposed to this. We already have a major issue in the Web
>> community at large to convey the difference between RDF as a model,
>> and the syntax expressing it. This confusion backfired big time in the
>> past. Creating an RDF serialization that would allow more than what
>> the RDF model allows would add to the confusion. Big time.
>>
>> Sorry...
>>
>> Ivan
>>
>>
>>
>> ---
>> Ivan Herman
>> Tel:+31 641044153
>> http://www.ivan-herman.net
>>
>> (Written on mobile, sorry for brevity and misspellings...)
>>
>>
>>
>> On 3 May 2012, at 17:53, David Booth <david@dbooth.org> wrote:
>>
>>> Hi David & Richard,
>>>
>>> I like the line of thinking that you suggest, and I agree with the
>>> practical arguments that you make (about not materializing inverse
>>> triples and not maintaining extra vocabulary), but there is currently an
>>> asymmetry in the RDF language that makes this not quite work in the
>>> general case, though it can work in many specific cases.  
>>>
>>> In essence, you are suggesting that there is no difference between an
>>> inverse predicate and the expression of that predicate in the opposite
>>> direction of the triple, and therefore the inverse predicate is
>>> unnecessary, because it is redundant.  In other words, if IP is the
>>> inverse of predicate P, then for any X and Y, if I want to express the
>>> following fact in RDF
>>>
>>>  X IP Y .
>>>
>>> but without using the predicate IP, then instead I can merely represent
>>> that fact in RDF as
>>>
>>>  Y P X .
>>>
>>> and the exact same information is captured.
>>>
>>> I think this is a great way to look at it.  (And I advocated this line
>>> of thinking at the RDF Next Steps workshop two years ago.)  But the
>>> glitch is that for this to always work, RDF must allow literals in the
>>> subject position, and it doesn't.  Furthermore, the RDF WG charter does
>>> not allow this glitch to be fixed in the RDF model:
>>> http://www.w3.org/2011/01/rdf-wg-charter
>>> "3. Out of Scope.  Some features are explicitly out of scope for the
>>> Working Group . . . Removing current restrictions in the RDF model
>>> (e.g., literals not allowed as subjects, or blank nodes as predicates)"
>>>
>>> On the other hand, just to toss an idea out there . . . even if the
>>> charter does not allow this to be fixed in the RDF *model*, how about at
>>> least fixing it in the Turtle *syntax*?  The following two simple
>>> changes to the Turtle grammar would make it easy to express any triple
>>> in the inverse direction.  First, allow literals as subjects:
>>>
>>> [10]   subject   ::=   IRIref
>>>                       | blank
>>>                       | literal
>>>
>>> Second, allow a predicate to be written in the inverse direction:
>>>
>>> [11]   predicate ::=   IRIref
>>>                       | "^" IRIref
>>>
>>> Granted, these changes would make it possible to write some things in
>>> Turtle syntax that would not be valid RDF.  (Hmm ... would that be the
>>> case anyway?)  So if one wanted to be certain that some Turtle is valid
>>> RDF, one would have to run it through an RDF validator.
>>>
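
A minimal sketch of the two grammar changes quoted above, written in Python. 
It assumes a hypothetical in-memory term representation rather than a real 
Turtle parser: `desugar` stands in for the proposed "^" predicate form, and 
`is_valid_rdf` stands in for the validator mentioned in the proposal. All 
names and example.org IRIs are illustrative assumptions.

```python
from typing import NamedTuple


class Term(NamedTuple):
    kind: str   # "iri", "bnode", or "literal"
    value: str


def desugar(subject, predicate, obj, inverse=False):
    """Return the triple asserted by `subject predicate obj`,
    or by `subject ^predicate obj` when inverse is True."""
    if inverse:
        return (obj, predicate, subject)
    return (subject, predicate, obj)


def is_valid_rdf(triple):
    """Check the RDF model's restrictions on subject and predicate."""
    s, p, _ = triple
    return s.kind in ("iri", "bnode") and p.kind == "iri"


# Hypothetical terms, for illustration only.
lit = Term("literal", "Alice")
alice = Term("iri", "http://example.org/alice")
name = Term("iri", "http://example.org/name")

# "Alice" ex:name ex:alice .   -> literal subject, rejected by the check
forward = desugar(lit, name, alice)

# "Alice" ^ex:name ex:alice .  -> same as: ex:alice ex:name "Alice" .
inverted = desugar(lit, name, alice, inverse=True)
```

In this sketch a literal written in the subject position only yields a 
triple outside the RDF model when the predicate is not inverted.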
>>> RDF/XML could still merrily reject literals as subjects.   
>>>
>>> Tools that people do not want to update could still (rightfully) reject
>>> RDF that had literals as subjects, while those of us who would rather
>>> live without this restriction could use tools that allow them.  Freedom
>>> of choice!  Rah rah rah!  :)
>>>
>>> One could argue that this would represent an end run around the intent
>>> of the charter.  But I personally find the restriction against literals
>>> as subjects so silly and onerous that I think this approach could
>>> represent a reasonable balance between those who don't want to modify
>>> old tools and those who don't want this restriction.
>>>
>>> Comments?
>>>
>>> David
>>>
>>>
>>> On Thu, 2012-05-03 at 07:27 -0400, David Wood wrote:
>>>> Hi David,
>>>>
>>>> This is also a personal response, since the WG has not yet discussed
>>>> the issue.
>>>>
>>>> We have implemented support for reverse link traversal in Callimachus
>>>> specifically to avoid the need for materializing additional triples.
>>>> Other systems have similar functionality.  Thus, I am in agreement
>>>> with Richard that the *standardization* of inverse predicates may be
>>>> detrimental in that it may constitute an encouragement for
>>>> materialization over other traversal mechanisms.  In my opinion, those
>>>> mechanisms are best left to implementors.
>>>>
>>>> Regards,
>>>> Dave
>>>>
>>>>
>>>>
>>>>
>>>> On May 3, 2012, at 07:05, Richard Cyganiak wrote:
>>>>
>>>>> Hi David,
>>>>>
>>>>> (This is a personal response, not necessarily representing WG opinion.)
>>>>>
>>>>> I think that introducing inverse properties would be a bad idea
>>>>> here, because it leads to an unnecessary proliferation of redundant
>>>>> vocabulary terms and makes querying and generally working with the
>>>>> data much harder.
>>>>>
>>>>> Avoiding “incoming” arcs to an RDF node, and wanting to have only
>>>>> “outgoing” arcs, is a common reflex in the community. This is very
>>>>> unfortunate IMO. Both kinds of arcs are equally important and
>>>>> essential. We need *graphs* not trees.
>>>>>
>>>>> I'm not convinced by the reasons you state for introducing inverses.
>>>>> See below.
>>>>>
>>>>> On 30 Apr 2012, at 15:40, David Booth wrote:
>>>>>> 1. It allows one to conveniently distinguish those statements from other
>>>>>> statements in which :C appears in the object position of the triple.
>>>>>> This is what is done in computing the Concise Bounded Description:
>>>>>> http://www.w3.org/Submission/CBD/
>>>>> The answer is right in this document: Symmetric CBDs.
>>>>>
>>>>>> It is a common approach taken for DESCRIBE queries in SPARQL.
>>>>> Generally the DESCRIBE behaviour can be configured in the RDF store
>>>>> to return SCBDs.
>>>>>> One could reasonably argue that instead you should put those statements
>>>>>> in a separate graph if you wish to distinguish them from statements in
>>>>>> which :C appears as in the object position of the triple.  Indeed, one
>>>>>> could, but that adds complexity.  And the fact is, it is convenient to
>>>>>> be able to do it this way.
>>>>> Why would you want to put them into a different graph? Just put them into the same graph.
>>>>>
>>>>>> 2. It allows one to use certain optimizations that are asymmetric.  In
>>>>>> particular, if I represent my RDF triples using a hash table for each
>>>>>> subject, then I can very quickly and easily look up the members of
>>>>>> class :C by using rdf:isTypeOf as the hash table index.  
>>>>> Use two hash tables, one for incoming and one for outgoing triples. Now you can represent nodes in a graph, rather than just nodes in a tree.
>>>>>
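
A minimal sketch of the "two hash tables" suggestion quoted above, in 
Python. The `Graph` class and its method names are hypothetical, and terms 
are plain strings for brevity; the point is only that indexing triples by 
object as well as by subject makes the members of :C a direct lookup without 
any rdf:isTypeOf triples being materialized.

```python
from collections import defaultdict


class Graph:
    """Triple store sketch with one index per direction."""

    def __init__(self):
        self.outgoing = defaultdict(set)  # subject -> {(predicate, object)}
        self.incoming = defaultdict(set)  # object  -> {(predicate, subject)}

    def add(self, s, p, o):
        # Each triple is recorded once in each index.
        self.outgoing[s].add((p, o))
        self.incoming[o].add((p, s))

    def objects(self, s, p):
        """Follow outgoing arcs labelled p from s."""
        return {o for (q, o) in self.outgoing.get(s, ()) if q == p}

    def subjects(self, p, o):
        """Follow incoming arcs labelled p into o."""
        return {s for (q, s) in self.incoming.get(o, ()) if q == p}


g = Graph()
g.add(":x", "rdf:type", ":C")
g.add(":y", "rdf:type", ":C")
g.add(":x", "rdfs:label", '"an x"')

members_of_c = g.subjects("rdf:type", ":C")
types_of_x = g.objects(":x", "rdf:type")
```

The cost is a second index entry per triple, which is the trade-off against 
storing an extra inverse triple for every statement.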
>>>>>> In essence, the ability to use the inverse property gives the author
>>>>>> more flexibility in writing RDF.  
>>>>> It gives flexibility for RDF authors, but creates headaches for users of the data, and asks vocabulary maintainers to do lots of redundant extra work.
>>>>>
>>>>> All the best,
>>>>> Richard
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> This can be helpful both as a
>>>>>> convenience for the author and to simplify downstream code that
>>>>>> processes that RDF.
>>>>>>
>>>>>> Let me know if further clarification would help.
>>>>>>
>>>>>> Thanks!
>>>>>> David
>>>>>>
>>>>>> On Mon, 2012-04-30 at 08:20 -0400, David Wood wrote:
>>>>>>> Hi David,
>>>>>>>
>>>>>>> Can you please articulate one or more use cases to accompany this
>>>>>>> feature request?  Thanks.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Dave
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Apr 29, 2012, at 19:43, David Booth wrote:
>>>>>>>
>>>>>>>> If this has already been considered and rejected by the WG then please
>>>>>>>> ignore, but . . . 
>>>>>>>>
>>>>>>>> It would be helpful if the RDF and RDFS specs defined inverses for the
>>>>>>>> properties that they define.  For example, if
>>>>>>>>
>>>>>>>> :x  rdf:type  :C .
>>>>>>>>
>>>>>>>> then one might write:
>>>>>>>>
>>>>>>>> :C rdf:isTypeOf :x .
>>>>>>>>
>>>>>>>> and similarly for other properties.
>>>>>>>>
>>>>>>>> I have resorted to defining my own inverse properties for some of these,
>>>>>>>> but it seems silly to do so, rather than standardizing them, especially
>>>>>>>> since it wouldn't add anything significant to the semantics.
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> David Booth, Ph.D.
>>>>>>>> http://dbooth.org/
>>>>>>>>
>>>>>>>> Opinions expressed herein are those of the author and do not necessarily
>>>>>>>> reflect those of his employer.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> -- 
>>>>>> David Booth, Ph.D.
>>>>>> http://dbooth.org/
>>>>>>
>>>>>> Opinions expressed herein are those of the author and do not necessarily
>>>>>> reflect those of his employer.
>>>>>>
>>>>>>
>>>>
>>>>
>>> -- 
>>> David Booth, Ph.D.
>>> http://dbooth.org/
>>>
>>> Opinions expressed herein are those of the author and do not necessarily
>>> reflect those of his employer.
>>>
>>>
>>
>
Received on Thursday, 3 May 2012 19:00:34 UTC