Re: [TTL] Differences between SPARQL and Turtle. from Andy Seaborne on 2011-05-04 (public-rdf-wg@w3.org from May 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Wed, 04 May 2011 15:18:58 +0100
To: Eric Prud'hommeaux <eric@w3.org>
CC: Richard Cyganiak <richard@cyganiak.de>, public-rdf-wg@w3.org
Message-ID: <4DC16052.7080202@epimorphics.com>

Eric - can we split the problem space into the issues around prefix names
and the rest?

Aside from the prefix name issues, is Richard's wording about \u
escapes acceptable?

What about the other areas of difference between SPARQL and Turtle?

On 04/05/11 14:27, Eric Prud'hommeaux wrote:
> * Richard Cyganiak<richard@cyganiak.de>  [2011-05-03 00:07+0100]
>> On 2 May 2011, at 20:11, Andy Seaborne wrote:
>>> # 4 RDF Collections as triple patterns
>>>
>>> 3 choices:
>>>
>>> A/ Remove from SPARQL.
>>> B/ Add to Turtle
>>> C/ Leave as is.  Discourage use
>>
>> Happy to support whichever of A and B is easier for the editor.
>>
>>> # 8 Escape Processing
>>> Proposal: Adopt Turtle style / Change SPARQL.
>>>
>>> \u escapes can only appear in strings and IRIs
>
> Richard +1'd this on the basis that allowing \u in local names would
> confuse users. I'm not convinced and suspect that the RDB2RDF WG
> would want to give their users a way to algorithmically write
> shorthand like:
>
>    @prefix :<http://foo.example/DB/People/>  .
>    # triples for …People/ID=8 :
>    :ID\u003d8 :fname "Bob" ; :lname "Smith" .
>    # triples for …People/ID=9 :
>    :ID\u003d9 :fname "Sue" ; :lname "Jones" .
>
> Andy's counter proposal was to add allowable chars to the local name,
> but I believe that allowing escape chars would be less controversial.

There is a significant overlap in the membership of RDB2RDF WG and RDF WG.

In what way is it less controversial?  The goal has been to maximise the 
alignment of SPARQL and Turtle.  SPARQL (query, update) is now moving to 
last call.

I also have concerns that the mechanism of escaping makes it too easy to 
put spaces into IRIs.
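
To make the concern concrete, here is a minimal Python sketch of \u and \U
escape decoding. It is not taken from any spec text or implementation; the
function name decode_u_escapes and the second example IRI are made up for
illustration. Applied to the local name in Eric's example it gives ID=8, but
exactly the same mechanism applied to an IRI gives a space.

```python
import re

# Matches \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_u_escapes(text: str) -> str:
    """Replace each \\u / \\U escape with the character it names.

    Hypothetical helper for illustration only. It does no validation of
    the result, which is the point: nothing stops a space appearing.
    """
    def _sub(match):
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _U_ESCAPE.sub(_sub, text)


# The local name from the example above.
local = decode_u_escapes(r"ID\u003d8")

# An assumed IRI showing how easily an escape smuggles in a space.
iri = decode_u_escapes(r"http://foo.example/DB/People/a\u0020b")
```

A parser that wanted to guard against this would have to check the decoded
IRI for disallowed characters after escape processing, not before.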

>>> Strict \u-escape in strings (STRING_LITERAL1,2 STRING_LITERAL_LONG1,2) and IRI_REF
>>>
>>> \u do not appear in the grammar but are described separately as at present.
>>
>> +1 till here.
>
> What's the motivation for having this grammatical construct outside of
> the grammar? It's trivial to include:
>
I have no opinion as to how it is implemented - I was just following 
what is already in the WG working draft document.

>
>
>>> Their use is discouraged:
>>>
>>> "4.3. String Escapes"
>>>
>>> """
>>> \u and \U escapes should be avoided in UTF-8 charset formats. They are retained in the grammar for compatibility with N-triples formats currently deployed with charset US-ASCII.
>>> """
>>
>> Unicode escapes can be a helpful fallback when some piece of the toolchain messes up the encoding; in such situations, they can be the only way to make things interoperate.
>>
>> Suggested rephrasing that doesn't restrict acceptable uses to backwards compatibility, and uses the RFC2119 SHOULD to be precise:
>>
>> """
>> Unicode characters SHOULD be used directly instead of \u and \U escapes.
>> """
>>
>> And in the N-Triples spec (if/wherever we create such a thing):
>>
>> """
>> Note: Older versions of N-Triples required \u and \U escapes for all Unicode characters beyond the US-ASCII charset. Some older N-Triples parsers may still have that restriction and may not support UTF-8 encoded Unicode characters.
>> """
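
For reference, the US-ASCII style of output described in that note can be
sketched in a few lines of Python. This is an illustration under stated
assumptions, not a conforming N-Triples serializer: the function name
escape_non_ascii is made up, uppercase hex digits are this sketch's choice,
and it deliberately ignores the other string escapes (quotes, backslash,
control characters).

```python
def escape_non_ascii(text: str) -> str:
    """Rewrite every character beyond US-ASCII as a \\u or \\U escape.

    Hypothetical helper for illustration only; ASCII characters pass
    through untouched, including ones a real serializer would escape.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)                 # plain US-ASCII, keep as-is
        elif cp <= 0xFFFF:
            out.append("\\u%04X" % cp)     # 4 hex digits
        else:
            out.append("\\U%08X" % cp)     # 8 hex digits
    return "".join(out)


escaped = escape_non_ascii("caf\u00e9")
```

Output produced this way stays readable by parsers that assume US-ASCII,
which is the fallback case Richard describes.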

 Andy

Received on Wednesday, 4 May 2011 14:19:30 UTC