Escaped characters in RDF-1.1 N-Triples literals for Canonical documents

From: Peter Ansell <ansell.peter@gmail.com>
Date: Mon, 18 Nov 2013 09:50:07 +1100
Message-ID: <CAGYFOCR+ESkg2OSZapLP3gOxTcXz8aod=3PDpXszCuU+oG1dmw@mail.gmail.com>
To: "public-rdf-comments@w3.org" <public-rdf-comments@w3.org>

The Conformance section (Section 4) of the RDF-1.1 N-Triples Candidate
Recommendation (05 November 2013) specifies that for a canonical
document [1]:

    "Characters not allowed directly in STRING_LITERAL_QUOTE (U+0022,
    U+005C, U+000A, U+000D) MUST use ECHAR not UCHAR."

However, the escape sequences in ECHAR do not seem to include U+005C "\" [2]:

    [153s] ECHAR ::= '\' [tbnrf"']

That is, ECHAR defines the escapes \t \b \n \r \f \" \' but, as far as
I can see, that grammar does not allow \\. A backslash could be
escaped using UCHAR as \u005C, but that would violate the canonical
rule, which names U+005C explicitly.
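
To make the mismatch concrete, here is a minimal Python sketch. It
assumes the ECHAR production exactly as quoted above and the code point
list from the canonical rule; the names ECHAR_AS_QUOTED,
CANONICAL_MUST_USE_ECHAR and missing_echar are mine for illustration
and do not come from the specification.

```python
# Escapes available through ECHAR, as the production is quoted above
# (assumption: '\' followed by one of t b n r f " ').
ECHAR_AS_QUOTED = {
    "\t": "\\t",
    "\b": "\\b",
    "\n": "\\n",
    "\r": "\\r",
    "\f": "\\f",
    '"': '\\"',
    "'": "\\'",
}

# Code points that the canonical rule says MUST use ECHAR:
# U+0022, U+005C, U+000A, U+000D.
CANONICAL_MUST_USE_ECHAR = ["\u0022", "\u005C", "\u000A", "\u000D"]


def missing_echar(chars, table):
    """List the code points in chars that have no entry in table."""
    return [f"U+{ord(c):04X}" for c in chars if c not in table]


print(missing_echar(CANONICAL_MUST_USE_ECHAR, ECHAR_AS_QUOTED))
# prints ['U+005C']
```

Under those assumptions the only code point the canonical rule names
that ECHAR cannot express is the backslash.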

In addition, is it intentional that the list of characters in the
canonical rule [1] does not include every character that has an escape
defined in ECHAR [2]? Should the characters that appear in ECHAR [2]
but not in the list in [1] (U+0009, U+0008, U+000C and U+0027) be
escaped using UCHAR in canonical documents, or be represented using
their raw UTF-8 values?
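
For reference, a small Python sketch of the characters this second
question concerns and the two candidate forms for each. It assumes the
ECHAR production and the canonical list exactly as quoted in this
message; ECHAR_CHARS, CANONICAL_LIST and candidate_forms are
illustrative names of my own, and the sketch does not say which form
the specification intends.

```python
# Characters with an escape in ECHAR, as the production is quoted in
# this message (assumption).
ECHAR_CHARS = ["\t", "\b", "\n", "\r", "\f", '"', "'"]

# Characters named in the canonical rule: U+0022, U+005C, U+000A, U+000D.
CANONICAL_LIST = ['"', "\\", "\n", "\r"]


def candidate_forms(ch):
    """Return (UCHAR form, raw character) for a single character."""
    return (f"\\u{ord(ch):04X}", ch)


# Characters that ECHAR can escape but the canonical rule does not mention.
unlisted = [c for c in ECHAR_CHARS if c not in CANONICAL_LIST]

for c in unlisted:
    uchar, raw = candidate_forms(c)
    print(f"U+{ord(c):04X}: UCHAR {uchar} or raw {raw!r}")
```

With those inputs the unlisted characters are the tab, backspace, form
feed and single quote.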

Cheers,

Peter

[1] http://www.w3.org/TR/2013/CR-n-triples-20131105/#conformance
[2] http://www.w3.org/TR/2013/CR-n-triples-20131105/#grammar-production-ECHAR
Received on Sunday, 17 November 2013 22:50:34 UTC