- From: John Cowan <cowan@mercury.ccil.org>
- Date: Mon, 10 Sep 2012 14:42:55 -0400
- To: David Lee <David.Lee@marklogic.com>
- Cc: Michael Kay <mike@saxonica.com>, "public-microxml@w3.org" <public-microxml@w3.org>
David Lee scripsit:

> What do prevailing (yes I know that not precise) XML parsers and
> processors do today when they encounter this range of discouraged
> Unicode characters ?

Nothing special. Upstream applications may or may not handle them.

> Secondly is it a goal that if 1) A microxml parser successfully
> parses a microxml document then a "off the shelf" XML parser should
> also successfully parse the same document (by "parse" here I mean not
> abort or generate fatal errors)

Yes, that's a goal.

> Conversely 2) An "off the shelf" XML parses a MicroXML document then
> all MicroXML parsers should also parse that document without failure.
> I guess this one is self referential.

By definition, a MicroXML parser parses MicroXML documents and fails to
parse, or partially parses, or parses-with-warnings things that are not
MicroXML documents.

> What I am getting at is 'what would the user see that is bad if we
> allowed these discouraged characters"

The purpose of the non-characters is to provide a range of codepoints
that can be represented as part of Unicode strings but can't appear in
the input to a program, and can therefore be used for internal purposes
such as string termination, string segmentation, or the representation
of application-specific magic. The existence of XSLT, whose programs
are XML documents, somewhat blurs the distinction between internal and
external uses, but I doubt XSLT will ever be ported to MicroXML.

-- 
John Cowan    http://ccil.org/~cowan    cowan@ccil.org
Lope de Vega: "It wonders me I can speak at all. Some caitiff rogue
did rudely yerk me on the knob, wherefrom my wits yet wander."
An Englishman: "Ay, belike a filchman to the nab'll leave you
crank for a spell." --Harry Turtledove, Ruled Britannia
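For readers who want to see which codepoints are at issue, here is a minimal sketch in Python. It assumes the "discouraged characters" under discussion are the 66 noncharacters defined by Unicode (U+FDD0..U+FDEF, plus the last two codepoints of each of the 17 planes). The function names `is_noncharacter` and `find_noncharacters` are hypothetical and are not taken from any MicroXML or XML parser.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    # U+FDD0..U+FDEF: a contiguous run of 32 noncharacters in the BMP.
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane: U+nFFFE and U+nFFFF.
    return (cp & 0xFFFE) == 0xFFFE


def find_noncharacters(text: str) -> list[tuple[int, str]]:
    """List (index, 'U+XXXX') for each noncharacter found in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if is_noncharacter(ord(ch))
    ]
```

A parser that rejects these codepoints could run such a check on its decoded input, while an application remains free to use the same codepoints internally, for example as string terminators.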
Received on Monday, 10 September 2012 18:43:17 UTC