- From: Thomas Broyer <t.broyer@gmail.com>
- Date: Wed, 13 Jun 2007 10:26:48 +0200
2007/6/13, Simon Pieters:
> On Wed, 13 Jun 2007 09:11:31 +0200, Thomas Broyer wrote:
>
> > I'd rather change the #tokenisation section to generate more parse
> > errors.
>
> Why? What if you want to pass a parameter to a plugin with non-ASCII
> characters using <embed>?

What would you do if you had to recode the document into 7-bit ASCII?
Would you recode the attribute name with a "pseudo-entity" (and would the
plugin then correctly interpret the parameter name)? Would you drop the
non-ASCII character? Would you rather drop the attribute?

Btw, we'd have a similar problem if you use non-ASCII characters in
CDATA elements... Should they be changed to RCDATA to accept entities?
Or should the recoder assume that \uNNNN escapes will be understood by
the <script> or <style> parser/processor? If so, what should it do with
code points outside the Basic Multilingual Plane: should it use
\UNNNNNNNN escapes or a surrogate pair of \uNNNN escapes?

-- 
Thomas Broyer
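
To make the two recoding options concrete, here is a rough Python sketch of what such a recoder might do. The function names (to_surrogate_pair, escape_script_ascii, escape_text_ascii) are hypothetical and not taken from any spec or tool. The sketch assumes the surrogate-pair answer for script content and numeric character references for RCDATA text or attribute values. It does not address attribute names, which is exactly the case with no obvious escape.

```python
from typing import Tuple


def to_surrogate_pair(cp: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is not outside the BMP")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def escape_script_ascii(text: str) -> str:
    """Replace non-ASCII characters with \\uNNNN escapes.

    Assumption: the consumer understands \\uNNNN and reassembles surrogate
    pairs, so code points beyond the BMP become two escapes.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)  # already 7-bit clean
        elif cp <= 0xFFFF:
            out.append(f"\\u{cp:04X}")
        else:
            high, low = to_surrogate_pair(cp)
            out.append(f"\\u{high:04X}\\u{low:04X}")
    return "".join(out)


def escape_text_ascii(text: str) -> str:
    """Replace non-ASCII characters with hexadecimal numeric character references.

    Only meaningful where references are parsed (RCDATA, attribute values),
    not inside CDATA elements and not in attribute names.
    """
    return "".join(ch if ord(ch) < 0x80 else f"&#x{ord(ch):X};" for ch in text)


if __name__ == "__main__":
    sample = "caf\u00e9 \U0001D11E"
    print(escape_script_ascii(sample))
    print(escape_text_ascii(sample))
```

Note that the numeric-reference form needs no special handling beyond the BMP, whereas the \uNNNN form forces the recoder to pick one of the two conventions asked about above.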
Received on Wednesday, 13 June 2007 01:26:48 UTC