- From: Tony Graham <tgraham@mentea.net>
- Date: Wed, 12 Sep 2012 14:16:43 +0100 (IST)
- To: public-microxml@w3.org
On Mon, September 10, 2012 2:49 am, James Clark wrote:
...
> So my question for Tony would be: what is the difference between
>
> - 0xFFFE - 0xFFFF, and
> - the other 64 noncharacters
>
> that justifies forbidding the former but not the latter?

Nothing. If I were using the other non-characters instead and had stated that I would find it personally inconvenient if they were eventually disallowed by the tools that I wanted to use, then you could ask the same question the other way around just as easily.

> You could argue that the right approach for noncharacters is to recommend
> against their use for interchange rather than forbid them, but given that
> XML 1.0 has forbidden U+FFFE-U+FFFF, it seems to me that the cleanest
> approach is to forbid all noncharacters.

Without arguing for or against the inclusion of non-characters, I don't understand the motivation for forbidding them. If the goal is radical simplicity, then it would be simpler to allow the whole slew of characters. If the goal is to "complement rather than replace XML, JSON and HTML" [1], then if one of the three disallows them (I don't know about JSON), they should be forbidden.

I don't know whether this has been discussed, but while the current draft specifies UTF-8 only, another way to simplify the character processing (post-parser) would be to also specify Normalization Form C [2][3], which would mean there would be only one way in MicroXML documents to represent particular characters.

Regards,

Tony Graham                                   tgraham@mentea.net
Consultant                                 http://www.mentea.net
Mentea       13 Kelly's Bay Beach, Skerries, Co. Dublin, Ireland
 --  --  --  --  --  --  --  --  --  --  --  --  --  --  --  --
    XML, XSL-FO and XSLT consulting, training and programming

[1] http://www.w3.org/community/microxml/wiki/Editor%27s_Draft
[2] http://www.unicode.org/reports/tr15/#Norm_Forms
[3] http://www.w3.org/TR/charmod-norm/#sec-ChoiceNFC
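
For illustration only, here is a minimal sketch of the two character-level checks discussed in this message: testing whether a code point is one of the Unicode noncharacters, and testing whether a string is already in Normalization Form C. It assumes Python's standard `unicodedata` module. The function names `is_noncharacter` and `is_nfc` are hypothetical and come from neither the MicroXML draft nor any existing tool.

```python
import unicodedata


def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters.

    These are U+FDD0..U+FDEF (32 code points) plus the last two code
    points of each of the 17 planes, U+nFFFE and U+nFFFF (34 code points).
    """
    if not 0 <= cp <= 0x10FFFF:
        return False
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


def is_nfc(text: str) -> bool:
    """Return True if text is unchanged by normalization to NFC."""
    return unicodedata.normalize("NFC", text) == text


# U+00E9 and "e" followed by U+0301 are canonically equivalent;
# only the first, precomposed form is in NFC.
precomposed = "\u00e9"
decomposed = "e\u0301"
```

Under this sketch, `is_nfc(precomposed)` holds and `is_nfc(decomposed)` does not, which is the "only one way to represent particular characters" property that requiring NFC would provide. The "other 64 noncharacters" in the quoted question are the 66 minus U+FFFE and U+FFFF.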
Received on Wednesday, 12 September 2012 13:17:07 UTC