- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 2 Oct 2012 07:22:07 -0400
- To: RDF-WG WG <public-rdf-wg@w3.org>, Internationalization Core Working Group <www-international@w3.org>
- Cc: Gavin Carothers <gavin@carothers.name>
Proposal to NOT address I18N-ISSUE-193: define when escapes are evaluated
===============================================================
Issue: Section 6.4, both forms of Unicode escape sequence: The spec doesn't say at what stage the escape sequences are converted to their corresponding characters. Can \u0022 start or end a string literal (as it does in, for example, Java)? Appendix B implies that escapes are replaced with their character equivalents before document processing, but it doesn't appear to say that explicitly anywhere.
Don't sections [http://www.w3.org/TR/2012/WD-turtle-20120710/#sec-parsing-terms 7.2] and [http://www.w3.org/TR/2012/WD-turtle-20120710/#sec-iri-references 6.3] cover that?
The table in <http://www.w3.org/TR/2012/WD-turtle-20120710/#term2escape>
[[
Context where each kind of escape sequence can be used

                                  numeric   string    reserved character
                                  escapes   escapes   escapes
IRIs, used as RDF terms or as
in @prefix or @base declarations  yes       no        no
local names                       no        no        yes
Strings                           yes       yes       no
]]
provides an overview of where the different escapes may be used. For
excruciating detail, §7.2 RDF Term Constructors
<http://www.w3.org/TR/2012/WD-turtle-20120710/#sec-parsing-terms> provides a
mapping from grammatical productions to Unicode strings, e.g. for IRIs:
[[
production  type  procedure
IRIREF      IRI   The characters between "<" and ">" are unescaped¹ to form
                  the unicode string of the IRI. Relative IRI resolution is
                  performed per section 6.3 IRI References.
The potentially empty unicode string
]]
The "unescaped¹" link refers to this text:
[[
¹ section 6.4 Escape Sequences defines a mapping from escaped unicode strings
to unicode strings. The following lexical tokens are unescaped to produce
unicode strings: IRIREF, STRING_LITERAL_SINGLE_QUOTE, STRING_LITERAL_QUOTE,
STRING_LITERAL_LONG_SINGLE_QUOTE and STRING_LITERAL_LONG_QUOTE .
]]
I think this covers exactly how to map from a string of characters in a Turtle document to the lexical form of an IRI, RDF literal, or blank node in the RDF abstract syntax.
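
To make the ordering concrete, here is a rough Python sketch of how I read the text quoted above: the lexer first matches a whole token, and only then is the token's body unescaped. The names (STRING_LITERAL_QUOTE as a regex, unescape, lexical_form) are mine and purely illustrative, the regex covers only the plain double-quoted string form, and this is not meant as a reference implementation. Under that reading, \u0022 cannot open or close a string literal, because the grammar requires a literal '"' at both ends of the token.

```python
import re

# Illustrative patterns for the numeric (UCHAR) and string (ECHAR) escapes.
UCHAR = r"\\u[0-9A-Fa-f]{4}|\\U[0-9A-Fa-f]{8}"
ECHAR = r"\\[tbnrf\"'\\]"

# A token must start and end with a literal '"'; escapes may only appear inside.
STRING_LITERAL_QUOTE = re.compile(
    r'"((?:[^"\\\n\r]|' + ECHAR + "|" + UCHAR + r')*)"'
)

# One pass over the token body, so an escaped backslash is never re-read.
_ESCAPE = re.compile(r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|([tbnrf\"'\\]))")
_STRING_ESCAPES = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    '"': '"', "'": "'", "\\": "\\",
}


def unescape(body, allow_string_escapes):
    """Map an escaped unicode string to a unicode string.

    Numeric escapes are always accepted; string escapes only where the
    context table allows them (strings, but not IRIs).
    """
    def replace(match):
        hex_digits = match.group(1) or match.group(2)
        if hex_digits:
            return chr(int(hex_digits, 16))
        if not allow_string_escapes:
            raise ValueError("string escape not allowed in this context")
        return _STRING_ESCAPES[match.group(3)]

    return _ESCAPE.sub(replace, body)


def lexical_form(token):
    """Tokenize first, then unescape the characters between the quotes."""
    match = STRING_LITERAL_QUOTE.fullmatch(token)
    if match is None:
        raise ValueError("not a STRING_LITERAL_QUOTE token")
    return unescape(match.group(1), allow_string_escapes=True)
```

For example, lexical_form(r'"a\u0022b"') yields the three-character string a"b, while r'\u0022abc\u0022' is not a string token at all. For an IRIREF body, unescape(r"http://example/\u00E9", allow_string_escapes=False) decodes the numeric escape, and a string escape such as \n in the same context is rejected.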
Proposal: no change
Please indicate whether this addresses the stated issue.
--
-ericP
Received on Tuesday, 2 October 2012 11:22:44 UTC