Re: [TTL] Differences between SPARQL and Turtle. from Richard Cyganiak on 2011-05-02 (public-rdf-wg@w3.org from May 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Tue, 3 May 2011 00:07:11 +0100
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-wg@w3.org
Message-Id: <E2C9F15C-F3E1-4772-9C65-C1747E0A6CC9@cyganiak.de>

On 2 May 2011, at 20:11, Andy Seaborne wrote:
> # 4 RDF Collections as triple patterns
> 
> 3 choices:
> 
> A/ Remove from SPARQL.
> B/ Add to Turtle
> C/ Leave as is.  Discourage use

Happy to support whichever of A and B is easier for the editor.

> # 8 Escape Processing
> Proposal: Adopt Turtle style / Change SPARQL.
> 
> \u escapes can only appear in strings and IRIs
> 
> Strict \u-escape in strings (STRING_LITERAL1,2 STRING_LITERAL_LONG1,2) and IRI_REF
> 
> \u escapes do not appear in the grammar but are described separately, as at present.

+1 up to here.

> Their use is discouraged:
> 
> "4.3. String Escapes"
> 
> """
> \u and \U escapes should be avoided in UTF-8 charset formats. They are retained in the grammar for compatibility with N-triples formats currently deployed with charset US-ASCII.
> """

Unicode escapes can be a helpful fallback when some piece of the toolchain messes up the encoding; in such situations, they can be the only way to make things interoperate.
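To illustrate the fallback case, here is a rough Python sketch (not from any spec; the function name escape_non_ascii is made up for this example). It rewrites every character outside US-ASCII as a \u or \U escape, so the result is pure ASCII and survives a toolchain that mangles the encoding. It assumes the input is the already-escaped content of a string literal; it does not handle quotes, backslashes or control characters.

```python
def escape_non_ascii(text: str) -> str:
    """Rewrite every non-ASCII character as a \\uXXXX or \\UXXXXXXXX escape.

    Illustrative helper only: quotes, backslashes and control characters
    are passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            # Plain US-ASCII: keep the character as it is.
            out.append(ch)
        elif cp <= 0xFFFF:
            # Basic Multilingual Plane: four hex digits.
            out.append(f"\\u{cp:04X}")
        else:
            # Beyond the BMP: eight hex digits.
            out.append(f"\\U{cp:08X}")
    return "".join(out)
```

The output contains only ASCII bytes, so it reads the same whether some component treats it as US-ASCII, Latin-1 or UTF-8.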

Suggested rephrasing that doesn't restrict acceptable uses to backwards compatibility, and uses the RFC 2119 SHOULD to be precise:

"""
Unicode characters SHOULD be used directly instead of \u and \U escapes.
"""

And in the N-Triples spec (if/wherever we create such a thing):

"""
Note: Older versions of N-Triples required \u and \U escapes for all Unicode characters beyond the US-ASCII charset. Some older N-Triples parsers may still have that restriction and may not support UTF-8 encoded Unicode characters.
"""

Richard

Received on Monday, 2 May 2011 23:07:58 UTC