- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 2 Oct 2012 07:22:07 -0400
- To: RDF-WG WG <public-rdf-wg@w3.org>, Internationalization Core Working Group <www-international@w3.org>
- Cc: Gavin Carothers <gavin@carothers.name>
Proposal to NOT address I18N-ISSUE-193: define when escapes are evaluated
===============================================================

Issue:

  Section 6.4, both forms of Unicode escape sequence: The spec doesn't say
  at what stage the escape sequences are converted to their corresponding
  characters. Can \u0022 start or end a string literal (as it does in, for
  example, Java)? Appendix B implies that escapes are replaced with their
  character equivalents before document processing, but it doesn't appear
  to say that explicitly anywhere.

Don't [http://www.w3.org/TR/2012/WD-turtle-20120710/#sec-parsing-terms 7.2]
and [http://www.w3.org/TR/2012/WD-turtle-20120710/#sec-iri-references 6.3]
cover that?

The table in <http://www.w3.org/TR/2012/WD-turtle-20120710/#term2escape>
[[
Context where each kind of escape sequence can be used

                                    numeric   string    reserved character
                                    escapes   escapes   escapes
IRIs, used as RDF terms or as in
@prefix or @base declarations       yes       no        no
local names                         no        no        yes
Strings                             yes       yes       no
]]
provides an overview of where the different escapes may be used.

For excruciating detail, §7 RDF Term Constructors
<http://www.w3.org/TR/2012/WD-turtle-20120710/#sec-parsing-terms> provides
a mapping from grammatical productions to unicode strings, e.g. for IRIs:
[[
production  type  procedure
IRIREF      IRI   The characters between "<" and ">" are unescaped¹ to form
                  the unicode string of the IRI. Relative IRI resolution is
                  performed per section 6.3 IRI References.
                  The potentially empty unicode string
]]

The "unescaped¹" link refers to this text:
[[
¹ section 6.4 Escape Sequences defines a mapping from escaped unicode
strings to unicode strings. The following lexical tokens are unescaped to
produce unicode strings: IRIREF, STRING_LITERAL_SINGLE_QUOTE,
STRING_LITERAL_QUOTE, STRING_LITERAL_LONG_SINGLE_QUOTE and
STRING_LITERAL_LONG_QUOTE.
]]

I think this covers exactly what to do to map from a string of characters
in a Turtle document to the lexical form of either an IRI, RDF Literal or
Blank Node in the RDF abstract syntax.

Proposal: no change

Please indicate whether this addresses the stated issue.

--
-ericP
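
To make the ordering concrete, here is a minimal Python sketch of the reading described above: the STRING_LITERAL_QUOTE token is recognized on the raw characters first, and only then is its content unescaped. It is an illustration, not text from the spec. The names `parse_string_literal` and `unescape` are hypothetical, and the ECHAR character set used here is an assumption about the grammar rather than a quotation of it.

```python
import re

# Assumed token shapes, loosely following the Turtle grammar:
# UCHAR is \uXXXX or \UXXXXXXXX; ECHAR is a backslash plus one of tbnrf"'\
UCHAR = r"\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}"
ECHAR = r"""\\[tbnrf"'\\]"""

# Step 1: the token is matched on the raw source. Only a literal '"' can
# open or close it; an escape sequence is just content inside the quotes.
STRING_LITERAL_QUOTE = re.compile(
    r'"((?:[^"\\\n\r]|' + ECHAR + "|" + UCHAR + r')*)"'
)

ECHAR_MAP = {"t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
             '"': '"', "'": "'", "\\": "\\"}


def unescape(raw):
    """Step 2: map an escaped unicode string to a unicode string."""
    def replace(match):
        seq = match.group(0)
        if seq[1] in "uU":
            return chr(int(seq[2:], 16))  # numeric escape
        return ECHAR_MAP[seq[1]]          # string escape
    return re.sub(UCHAR + "|" + ECHAR, replace, raw)


def parse_string_literal(source):
    """Return (lexical form, remaining input) for a leading "..." token."""
    match = STRING_LITERAL_QUOTE.match(source)
    if match is None:
        raise ValueError("input does not start with a STRING_LITERAL_QUOTE")
    return unescape(match.group(1)), source[match.end():]


# \u0022 inside the quotes becomes a '"' in the lexical form,
# but it does not terminate the token.
value, rest = parse_string_literal(r'"a\u0022b" .')
print(repr(value), repr(rest))
```

Under this reading, input beginning with \u0022 instead of a literal quote is simply not a string token, so `parse_string_literal(r'\u0022abc"')` raises `ValueError`. That is the difference from Java, where Unicode escapes are translated before tokenization.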
Received on Tuesday, 2 October 2012 11:22:45 UTC