Re: Lang and dt in the graph. Was: Dumb SPARQL query problem from Andy Seaborne on 2013-12-01 (public-lod@w3.org from December 2013)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sun, 01 Dec 2013 17:36:43 +0000
To: public-lod@w3.org
Message-ID: <529B73AB.2010604@epimorphics.com>
On 01/12/13 12:25, Tim Berners-Lee wrote:
>
> On 2013-11-23, at 12:21, Andy Seaborne wrote:
>
>>
>>
>> On 23/11/13 17:01, David Booth wrote:
>>> [...]
>>> This would have been fixed if the RDF model had been changed to
>>> represent the language tag as an additional triple, but whether this
>>> would have been a net benefit to the community is still an open
>>> question, as it would add the complexity of additional triples.
>>
>> Different.  Maybe better, maybe worse.
>>
>>
>> Do you want all your "abc" to be the same language?
>>
>>    "abc" rdf:lang "en"
>>
>> or multiple languages:
>>
>>    "abc" rdf:lang "cy" .
>>    "abc" rdf:lang "en" .
>>
>>
>> ?
>>
>> Unlikely - so it's bnode time ...
>>
>> :x :p [ rdf:value "abc" ; rdf:lang "en" ] .
>
> The nice thing about this in an n3rules-like system (where FILTER and WHERE clauses are not distinct and some properties are just builtins) is that rdf:value and rdf:lang can be made builtins, so a datatyped literal can behave just like a bnode with two properties if you want it to.
>
> But I have always preferred it with not 2 extra triples, just one:
>
> 	:x  :p [ lang:en "cat" ]
>
> which allows you also to write things like
>
> 	:x :p  [ lang:en "cat"] , [ lang:fr "chat" ].
>
> or if you use the ^ back-path syntax of N3 (which was not taken up in Turtle),
>
> 	:x :p "cat"^lang:en,  "chat"^lang:fr .
>
> You can do the same with datatypes:
>
> 	:x :q   "2013-11-25"^xsd:date .
>
> instead of
>
> 	:x :q   "2013-11-25"^^xsd:date .

This seems to bring its own issues.  These bnodes seem to be like
untidy literals, as considered in the RDF-2004 WG.

:x  :p [ lang:en "cat" ]
:x  :p [ lang:en "cat" ]
:x  :p [ lang:en "cat" ]

is 6 triples.

:x :p :q .
:x :p :q .
:x :p :q .

is 1 triple.  Repeated read in same file - this already causes confusion.
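
A rough sketch of the counting above, assuming a graph is modelled as a
plain Python set of (subject, predicate, object) tuples and that every
`[ ... ]` mints a fresh blank node when it is parsed. The helper names
(`new_bnode`, `parse_iri_form`, `parse_bnode_form`) are made up for
illustration and are not any parser's API.

```python
import itertools

_fresh = itertools.count()

def new_bnode():
    """Return a blank node label that has not been handed out before."""
    return f"_:b{next(_fresh)}"

def parse_iri_form(graph):
    # ':x :p :q .' names all three terms, so adding it again is a no-op on a set.
    graph.add((":x", ":p", ":q"))

def parse_bnode_form(graph):
    # ':x :p [ lang:en "cat" ]' yields two triples around a fresh blank node.
    b = new_bnode()
    graph.add((":x", ":p", b))
    graph.add((b, "lang:en", '"cat"'))

iri_graph, bnode_graph = set(), set()
for _ in range(3):  # each statement appears three times in the file
    parse_iri_form(iri_graph)
    parse_bnode_form(bnode_graph)

print(len(iri_graph), len(bnode_graph))  # 1 6
```

The set collapses the repeated IRI triple, but nothing collapses the
three blank nodes, because each one is a distinct term.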

:x :p "cat" .
:x :p "cat" .
:x :p "cat" .

is 1 triple or is it 3 triples because it's really

:x :p [ xsd:string "cat" ].

:x :p 123 .
:x :p 123 .
:x :p 123 .

It makes it hard to ask "do X and Y have the same value for :p?" - it
gets messy to consider all the cases of triple patterns that arise and I
would not want to push that burden back onto the application writer.
Why can't the app writer say "find me all things with a property value
less than 45"?
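
To illustrate the burden, here is a minimal sketch of the "less than 45"
question against both shapes of data. It assumes graphs are sets of
tuples, that numeric literals are held as Python numbers in the literal
model, and that the bnode model hangs a lexical form off a datatype
property. The `NUMERIC` table is a hypothetical and deliberately
incomplete list of the datatype properties the query writer would have
to know about.

```python
# Literal model: the value is the object of the triple.
literal_graph = {
    (":a", ":p", 12),
    (":b", ":p", 99),
    (":c", ":p", 44.5),
}

def less_than_literal(graph, prop, limit):
    # One triple pattern plus a filter on the object.
    return sorted(s for s, p, o in graph
                  if p == prop and isinstance(o, (int, float)) and o < limit)

# Bnode model: the value sits one hop away, behind a datatype property.
bnode_graph = {
    (":a", ":p", "_:b0"), ("_:b0", "xsd:integer", "12"),
    (":b", ":p", "_:b1"), ("_:b1", "xsd:integer", "99"),
    (":c", ":p", "_:b2"), ("_:b2", "xsd:decimal", "44.5"),
}

NUMERIC = {"xsd:integer": int, "xsd:decimal": float}  # assumption: incomplete

def less_than_bnode(graph, prop, limit):
    # Two joined patterns, and one case per datatype property to consider.
    hits = set()
    for s, p, node in graph:
        if p != prop:
            continue
        for s2, dt, lex in graph:
            if s2 == node and dt in NUMERIC and NUMERIC[dt](lex) < limit:
                hits.add(s)
    return sorted(hits)

print(less_than_literal(literal_graph, ":p", 45))  # [':a', ':c']
print(less_than_bnode(bnode_graph, ":p", 45))      # [':a', ':c']
```

Both give the same answer, but the second only does so for the datatype
properties that somebody remembered to list.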

To provide that, if we add an interpretation of bNodes used in this value
form (datatype properties vs object properties?) so that you can ask
about shared values, we have made them tidy again.  But then it is little
different from structured literals with @lang and ^^datatype.
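
A small sketch of that point, under the assumption that a "value form"
bnode is one with a single outgoing triple whose property is in a known
set (`VALUE_PROPS` here is hypothetical). Folding such bnodes into
(lexical form, interpretation) pairs is what makes shared values
comparable, and the result looks very much like a structured literal.

```python
VALUE_PROPS = {"lang:en", "lang:fr", "xsd:string"}  # assumption, for illustration

def fold(graph):
    """Replace each value-form blank node by a (lexical, interpretation) pair."""
    value_of = {s: (o, p) for s, p, o in graph
                if s.startswith("_:") and p in VALUE_PROPS}
    return {(s, p, value_of.get(o, o)) for s, p, o in graph
            if s not in value_of}

def values(graph, subject, prop):
    return {o for s, p, o in graph if s == subject and p == prop}

graph = {
    (":x", ":p", "_:b0"), ("_:b0", "lang:en", "cat"),
    (":y", ":p", "_:b1"), ("_:b1", "lang:en", "cat"),
    (":y", ":p", "_:b2"), ("_:b2", "lang:fr", "chat"),
}

# Untidy: :x and :y never share a blank node, so no shared value is found.
raw_shared = values(graph, ":x", ":p") & values(graph, ":y", ":p")

# Tidy again after folding: the pair ('cat', 'lang:en') is shared.
tidy = fold(graph)
tidy_shared = values(tidy, ":x", ":p") & values(tidy, ":y", ":p")

print(raw_shared, tidy_shared)
```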

Having the data model and the access model different does not gain 
anything.  The data model should reflect the way the data is accessed.

Like RDF lists, or seq/alt/bag, encoding values in triples is attractive 
in its uniformity but the "triples" nature always shows through 
somewhere, making something else complicated.

	Andy

PS Graph leaning does not help because you can't add data incrementally 
if leaning is applied at each addition.

> I suggested way back these properties as a way of putting the info into the graph
> but my suggestion was not adopted.  I think it would have made the model
> more complete, which would have been a good thing, though
> SPARQL would need to have language-independent query matching as a  special case -- but
> it does now too really.
>
> (These are interpretation properties.  I must really update
> http://www.w3.org/DesignIssues/InterpretationProperties.html)
>
> Units are fun as properties too. http://www.w3.org/2007/ont/unit
>
> Tim
>
>>
>> 	Andy
>>
>
>
Received on Sunday, 1 December 2013 17:37:14 UTC