- From: Dan Connolly <connolly@w3.org>
- Date: Thu, 09 Mar 2006 17:39:30 -0600
- To: Eric Prud'hommeaux <eric@w3.org>
- Cc: public-rdf-dawg@w3.org
On Thu, 2006-03-09 at 17:35 -0500, Eric Prud'hommeaux wrote:
> [...]I propose the following change to

I reviewed this 1st change, and I support it:

> [[
> ... For compatibility with future
> versions of Unicode, the characters in this string may include unassigned
> Unicode codepoints (see Identifier and Pattern Syntax [UNIID] section 4
> Pattern Syntax). ...
> ]]

But that's all the energy I have for this sort of thing for today.
I leave it to others to check the other change:

> Further, I would like to address Bjoern's comments on escape sequences by
> modifying
> [[
> A.5 Escape sequences in strings
>
> Strings are used for the lexical form of RDF terms and in expressions.
> Within a string, the following escape sequences apply. The escape
> character is backslash "\" (#x5C). No other escape sequences are defined
> for strings. Names for characters given are the common names.
>
> These escape sequences apply to all rules making up the rule for string
> (rules: STRING_LITERAL1, STRING_LITERAL2, STRING_LITERAL_LONG1,
> STRING_LITERAL_LONG2).
>
> <table>
>
> where HEX is a hexadecimal character
>
> HEX ::= [0-9] | [A-F] | [a-f]
>
> Examples:
> ...
> ]]
> to
> [[
> A.5 Escape sequences in strings
>
> The following escape sequences may be used in any string production
> (e.g. STRING_LITERAL1, STRING_LITERAL2, STRING_LITERAL_LONG1,
> STRING_LITERAL_LONG2):
>
> <table>
>
> Any escaped character in the range #x00 - #xEFFFFF may appear in any
> string production. For instance, "\n" may appear in a STRING_LITERAL1 even
> though the unescaped form is not valid in that production.
> ]]
>
> This clarifies n points:
> - parsers must be able to process currently unassigned Unicode characters.
> - SPARQL strings include the character #x00.
> - which codepoints can be produced through \uU escape sequences.
> - there *is* a difference between escaped characters in strings and
>   escaped characters in variable names and IRI references.
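
To make the escaped-versus-unescaped point concrete, here is a minimal
Python sketch of decoding the escapes in a string body. It is an
illustration only, not the text proposed for A.5. Assumptions: the escape
table is elided above as <table>, so the single-character escapes below are
taken from the ECHAR production of the published SPARQL grammar; the
\uXXXX and \UXXXXXXXX forms are inferred from the HEX rule and the "\uU"
remark; the names unescape and SIMPLE_ESCAPES are hypothetical.

```python
import re

# ASSUMPTION: the email elides the escape table; this set mirrors the ECHAR
# production of the published SPARQL grammar, not the text under discussion.
SIMPLE_ESCAPES = {
    "t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
    "\\": "\\", '"': '"', "'": "'",
}

# A backslash followed by 'u' + 4 hex digits, 'U' + 8 hex digits,
# or any single character (possibly none, at end of input).
_ESCAPE = re.compile(
    r"\\(?:u([0-9A-Fa-f]{4})|U([0-9A-Fa-f]{8})|(.?))", re.DOTALL
)


def unescape(body: str) -> str:
    """Decode escape sequences in the body of a string literal (hypothetical helper)."""

    def replace(match: re.Match) -> str:
        hex4, hex8, simple = match.groups()
        if hex4 or hex8:
            # chr() raises ValueError above 0x10FFFF, Python's own upper limit.
            return chr(int(hex4 or hex8, 16))
        if simple in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[simple]
        raise ValueError(f"undefined escape sequence: \\{simple}")

    return _ESCAPE.sub(replace, body)
```

With this sketch, unescape(r"a\nb") yields a string containing a real
newline, which could not appear unescaped in a STRING_LITERAL1, and
unescape(r"\u0000") yields the #x00 character.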
>
> I specify the range to be #x00 - #xEFFFFF while XML 1.1 uses #x01 -
> #xEFFFFF, citing "Due to potential problems with APIs, #x0 is still
> forbidden both directly and as a character reference." I read our LC
> document as allowing #x00 - #xEFFFFF and am trying to avoid any
> changes to the language at this late date. I don't think the
> liberalization will hurt us.

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541 0875 0F91 96DE 6E52 C29E
Received on Thursday, 9 March 2006 23:39:36 UTC