Re: An from Hugh Glaser on 2013-12-02 (public-lod@w3.org from December 2013)

From: Hugh Glaser <hugh@glasers.org>
Date: Mon, 2 Dec 2013 11:17:58 +0000
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-lod community <public-lod@w3.org>
Message-Id: <1703D089-7B0B-4064-98A2-9D83D78001DE@glasers.org>
Thanks Andy,
Sorry, I had a brain-fart (senior moment?), and forgot that we were dealing with RDF 1.1.
I guess I have suffered the pain of unknown presence of datatypes in the RDF terms for literals for so long it takes a while for me to accept that it has been fixed.
Thanks so much to the people that did it.
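
For anyone else who keeps forgetting, here is a rough sketch of what I understand the RDF 1.1 fix to be: every literal has a datatype, and a plain "abc" is just sugar for "abc"^^xsd:string. The Literal tuple and the make_literal helper are names I made up for illustration, not any library's API, and lowercasing the language tag is my own simplification.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
RDF_LANGSTRING = "http://www.w3.org/1999/02/22-rdf-syntax-ns#langString"


class Literal(NamedTuple):
    """Hypothetical model of an RDF 1.1 literal term."""
    lexical: str
    datatype: str               # always present in RDF 1.1
    lang: Optional[str] = None  # only set when datatype is rdf:langString


def make_literal(lexical, datatype=None, lang=None):
    """Hypothetical helper: build a literal with the datatype always filled in."""
    if lang is not None:
        # Language-tagged strings get rdf:langString as their datatype.
        # Lowercasing the tag is a simplification assumed for this sketch.
        return Literal(lexical, RDF_LANGSTRING, lang.lower())
    # No datatype given: treat it as xsd:string.
    return Literal(lexical, datatype or XSD_STRING)


plain = make_literal("abc")
typed = make_literal("abc", XSD_STRING)
tagged = make_literal("abc", lang="en")

print(plain == typed)   # one term, not two
print(plain == tagged)  # a language-tagged string is a different term
```

So a consumer only has one pattern to consider for strings, rather than the with-and-without-datatype pair.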

Using the bnode solution would be like bringing back the complexity of the optional datatype, which would bring back the pain!
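
To convince myself, here is a minimal sketch of Andy's counting and matching points quoted below. It models a graph as a Python set of triples and gives every [ ... ] a fresh blank node. The helper names (bnode, add_plain, add_bnode_form, share_object) are hypothetical and there is no real RDF library involved.

```python
from itertools import count

_fresh = count()


def bnode():
    """Return a fresh blank node label; each [ ... ] in Turtle creates one."""
    return f"_:b{next(_fresh)}"


def add_plain(graph, s, p, lexical):
    """Add  s p "lexical"  with the literal as an ordinary RDF term."""
    graph.add((s, p, ("literal", lexical)))


def add_bnode_form(graph, s, p, interp, lexical):
    """Add  s p [ interp "lexical" ] : two triples and one new blank node."""
    b = bnode()
    graph.add((s, p, b))
    graph.add((b, interp, ("literal", lexical)))


def share_object(graph, p):
    """Pairs of distinct subjects whose p-triples have the identical object term."""
    return {(s1, s2)
            for (s1, p1, o1) in graph
            for (s2, p2, o2) in graph
            if p1 == p2 == p and o1 == o2 and s1 < s2}


# Reading the same statement three times.
tidy, untidy = set(), set()
for _ in range(3):
    add_plain(tidy, ":x", ":p", "cat")
    add_bnode_form(untidy, ":x", ":p", "lang:en", "cat")
print(len(tidy), len(untidy))  # prints: 1 6

# "Do :x1 and :x2 have the same value for :p?"
plain, boxed = set(), set()
for s in (":x1", ":x2"):
    add_plain(plain, s, ":p", "123")
    add_bnode_form(boxed, s, ":p", "xsd:integer", "123")
print(share_object(plain, ":p"))  # the pair (':x1', ':x2')
print(share_object(boxed, ":p"))  # empty: the two blank nodes are distinct
```

With the bnode form, the simple shared-variable pattern stops matching unless the query reaches through the blank node, which is the burden Andy describes.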
Best
Hugh

On 2 Dec 2013, at 11:04, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:

> 
> 
> On 01/12/13 23:02, Hugh Glaser wrote:
>> Hi.
>> Thanks.
>> A bit of help please :-)
>> On 1 Dec 2013, at 17:36, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:
>> 
>>> 
>>> 
>>> On 01/12/13 12:25, Tim Berners-Lee wrote:
>>>> 
>>>> On 2013-11-23, at 12:21, Andy Seaborne wrote:
>>>> 
>>>>> 
>>>>> 
>>>>> On 23/11/13 17:01, David Booth wrote:
>>>>>> [...]
>>>>>> This would have been fixed if the RDF model had been changed to
>>>>>> represent the language tag as an additional triple, but whether this
>>>>>> would have been a net benefit to the community is still an open
>>>>>> question, as it would add the complexity of additional triples.
>>>>> 
>>>>> Different.  Maybe better, maybe worse.
>>>>> 
>>>>> 
>>>>> Do you want all your "abc" to be the same language?
>>>>> 
>>>>>   "abc" rdf:lang "en"
>>>>> 
>>>>> or multiple languages:
>>>>> 
>>>>>   "abc" rdf:lang "cy" .
>>>>>   "abc" rdf:lang "en" .
>>>>> 
>>>>> 
>>>>> ?
>>>>> 
>>>>> Unlikely - so it's bnode time ...
>>>>> 
>>>>> :x :p [ rdf:value "abc" ; rdf:lang "en" ] .
>>>> 
>>>> The nice thing about this in an n3rules-like system (where FILTER and WHERE clauses are not distinct and some properties are just builtins) is that rdf:value and rdf:lang can be made builtins, so a datatyped literal can behave just like a bnode with two properties if you want it to.
>>>> 
>>>> But I have always preferred it with not 2 extra triples, just one:
>>>> 
>>>> 	:x  :p [ lang:en "cat" ]
>>>> 
>>>> which allows you also to write things like
>>>> 
>>>> 	:x :p  [ lang:en "cat"] , [ lang:fr "chat" ].
>>>> 
>>>> or if you use the  ^  back-path syntax of N3 (which was not taken up in turtle),
>>>> 
>>>> 	:x :p "cat"^lang:en,  "chat"^lang:fr .
>>>> 
>>>> You can do the same with datatypes:
>>>> 
>>>> 	:x :q   "2013-11-25"^xsd:date .
>>>> 
>>>> instead of
>>>> 
>>>> 	:x :q   "2013-11-25"^^xsd:date .
>>> 
>>> This seems to bring its own issues.  These bnodes seem to be like untidy literals as considered in RDF-2004 WG.
>>> 
>>> :x  :p [ lang:en "cat" ]
>>> :x  :p [ lang:en "cat" ]
>>> :x  :p [ lang:en "cat" ]
>>> 
>>> is 6 triples.
>>> 
>>> :x :p :q .
>>> :x :p :q .
>>> :x :p :q .
>>> 
>>> is 1 triple.  Repeated read in same file - this already causes confusion.
>>> 
>>> :x :p "cat" .
>>> :x :p "cat" .
>>> :x :p "cat" .
>>> 
>>> is 1 triple or is it 3 triples because it's really
>> Is it not 1 triple if you take the first view or 6 triples if you take the second?
>> Or probably I don’t understand bnodes properly!?
>>> 
>>> :x :p [ xsd:string "cat" ].
>>> 
>>> :x :p 123 .
>>> :x :p 123 .
>>> :x :p 123 .
>>> 
>>> It makes it hard to ask "do X and Y have the same value for :p?" - it gets messy to consider all the cases of triple patterns that arise and I would not want to push that burden back onto the application writer. Why can't the app writer say "find me all things with a property value less than 45"?
>> I see it makes it hard, but I don’t see it as any harder than what we have now, with multiple patterns that do and don’t have ^^xsd:string
>> As I said before, with the ^^xsd you need to consider a bunch of patterns to do the query - again, it is messy, but is it messier?
>> 
>> Actually I find
>>  { ?s1 ?p [ xsd:string ?str ] . ?s2 ?p [ xsd:string ?str ] . }
>> with a possible also
>>  { ?s1 ?p ?str . ?s2 ?p ?str . }
> 
> Let's talk numbers (strings have a lexical form that looks like the value) and have 123 as shorthand for [ xsd:integer "123" ].  And let's ignore rdf:langString.
> 
> { ?s1 ?p ?x . ?s2 ?p ?x . }
> 
> does not care whether ?x is a URI or a literal at the moment.  Your example is a good one as it's "?p" so the engine does not know whether it's a datatype property or a object property.
> 
> With bnodes this may match, it probably doesn't.  It depends on the micro-detail of the data.
> 
> # No.
> :x1 :p 123 .
> :x2 :p 123 .
> 
> # Yes
> :s1 :p _:a .
> :s2 :p _:a .
> _:a xsd:string "abc" .
> 
> Sure, if you know it's an integer
>   ?s1 ?p [ xsd:integer ?str ]
> or even:
> { ?s1 ?p [ ?dt ?str ] . ?s2 ?p [ ?dt ?str ] . }
> 
> though I think this is shifting an unnecessary cognitive burden onto the app writer.
> 
> I didn't say the access language was SPARQL :-)  I meant how people think about accessing the data.  Datatype properties are really very bizarre in this world.
> 
> And this is at the fine grain level.  Now apply to real queries that are 10s of lines long.
> 
> 
> { ?s1 ?p [ xsd:integer "123" ] }
> { ?s1 ?p 123 }
> 
> It might be possible to make that bNode infer to the value 123, which would be a win.  Making literals value-centric, not appearance/structure based, would be very nice.
> 
> 
> And counting.  Counting matters to people (e.g. facetted browse)
> 
> 	Andy
> 
> PS I started my first email draft with the argument that it was better to have the more-triples form ... but the usability caused me to recreate the tidy literals thing, not that I was there at the time.
> 
>> much easier to work with than something that has this stuff optionally tacked on the end of literals, that isn’t really part of the string but isn’t part of RDF either.
>> Or maybe it is part of the literal but not the string? Surely that should be clear to me?
>> 
>> I just don’t see there is a difference in complexity for querying - it is just that the current situation is genuinely messier for consumers because there are two notations in play, whereas if RDF is so good we should have everything in RDF.
>> Not that I would say anything should change :-) it ain’t actually broken, but it could get fixed.
>> 
>> (Oh dear, Hugh showing his ignorance of the fancy stuff again)
>> 
>> Best
>> Hugh
>>> 
>>> To give that, if we add interpretation of bNodes used in this value form (datatype properties vs object properties ?), so you can ask about shared values, we have made them tidy again.  But then it is little different from structured literals with @lang and ^^datatype.
>>> 
>>> Having the data model and the access model different does not gain anything.  The data model should reflect the way the data is accessed.
>>> 
>>> Like RDF lists, or seq/alt/bag, encoding values in triples is attractive in its uniformity but the "triples" nature always shows through somewhere, making something else complicated.
>>> 
>>> 	Andy
>>> 
>>> PS Graph leaning does not help because you can't add data incrementally if leaning is applied at each addition.
>>> 
>>>> I suggested way back these properties as a way of putting the info into the graph
>>>> but my suggestion was not adopted.  I think it would have made the model
>>>> more complete which would have been a good thing, though
>>>> SPARQL would need to have language-independent query matching as a  special case -- but
>>>> it does now too really.
>>>> 
>>>> (These are interpretation properties.  I must really update
>>>> http://www.w3.org/DesignIssues/InterpretationProperties.html)
>>>> 
>>>> Units are fun as properties too. http://www.w3.org/2007/ont/unit
>>>> 
>>>> Tim
>>>> 
>>>>> 
>>>>> 	Andy
>> 
> 

-- 
Hugh Glaser
   20 Portchester Rise
   Eastleigh
   SO50 4QS
Mobile: +44 75 9533 4155, Home: +44 23 8061 5652
Received on Monday, 2 December 2013 11:18:25 UTC