.. always UTF8 ...

> Unicode code points may also be expressed using an \uXXXX (U+0 to
> U+FFFF) or \UXXXXXXXX syntax (for U+10000 onwards) where X is a
> hexadecimal digit [0-9A-F]

I assume that what is meant here is the use of 7-bit-safe characters to express Unicode code points. This raises the following questions:

-> Can this be mixed with true UTF-8 in the same payload?

-> My advice would be NOT to allow this; think of cross-site scripting as an example of the pain you may get into at some point in the future.

-> Is there 'escaping' for the \u and \U sequences themselves? And if there is, can this be mixed with UTF-8? And if not, how does one know for a fact what mode one is in?

Or in other words:

-> If you really want this, better define it more narrowly, OR

-> Drop it altogether.

So as to give strict parsers in hostile environments a chance.

DW

Received on Thursday, 9 March 2006 09:43:54 GMT
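To make the questions above concrete, here is a minimal sketch of what a "narrowly defined" strict decoder could look like. It is an illustration only, not taken from any specification: the function name `decode_strict`, the `allow_mixed` flag, the `EscapeError` type, and the use of `\\` as the escape for a literal backslash are all assumptions. It accepts only upper-case hexadecimal digits, as in the quoted text, and by default refuses a payload that mixes \u or \U escapes with raw non-ASCII characters.

```python
import re

# Hypothetical grammar (an assumption, not from any spec): \uXXXX, \UXXXXXXXX,
# or \\ for a literal backslash. Hex digits must be upper-case [0-9A-F].
_ESCAPE = re.compile(r'\\(?:u([0-9A-F]{4})|U([0-9A-F]{8})|(\\))')


class EscapeError(ValueError):
    """Raised when the payload does not follow the strict escape rules."""


def decode_strict(text: str, allow_mixed: bool = False) -> str:
    # Raw non-ASCII characters mean the payload already carries "true" UTF-8 text.
    has_raw_non_ascii = any(ord(ch) > 0x7F for ch in text)
    saw_unicode_escape = False
    out = []
    pos = 0
    while pos < len(text):
        if text[pos] != '\\':
            out.append(text[pos])
            pos += 1
            continue
        match = _ESCAPE.match(text, pos)
        if match is None:
            # A backslash that does not start a known escape is rejected outright.
            raise EscapeError(f"malformed escape at offset {pos}")
        if match.group(3):
            out.append('\\')  # escaped backslash, so a literal \u can be written
        else:
            code = int(match.group(1) or match.group(2), 16)
            if code > 0x10FFFF or 0xD800 <= code <= 0xDFFF:
                raise EscapeError(f"invalid code point at offset {pos}")
            out.append(chr(code))
            saw_unicode_escape = True
        pos = match.end()
    # Assumed policy: one mode per payload unless the caller opts in to mixing.
    if saw_unicode_escape and has_raw_non_ascii and not allow_mixed:
        raise EscapeError("payload mixes \\u escapes with raw non-ASCII text")
    return "".join(out)
```

Under these assumptions a parser always knows which mode it is in: a backslash either starts one of three well-formed sequences or the input is rejected, and a payload is either pure 7-bit with escapes or raw UTF-8, never both. Whether mixing should be forbidden is exactly the open question raised above; the flag is there only to show both choices.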
This archive was generated by hypermail 2.2.0+W3C-0.50 : Thursday, 1 October 2009 14:42:05 GMT