- From: John Cowan <cowan@mercury.ccil.org>
- Date: Thu, 3 Jul 2003 12:15:07 -0400
- To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
- Cc: Paul Grosso <pgrosso@arbortext.com>, www-xml-blueberry-comments@w3.org
Elliotte Rusty Harold scripsit:

> I absolutely do not accept this one. I think you have a major problem
> here, and I very much would like to record a formal objection. I went
> back and reread the currently published draft spec of XML 1.1. The
> current published version of this document leaves no room for the
> interpretation that parsers may validate and check for
> well-formedness against the normalized forms of characters when the
> unnormalized forms are present. As written <e'></é> is malformed
> (where e' means e followed by combining accent acute).
>
> This is actually what I think should be the case. However, it appears
> that some members of the working group do not believe this is true,
> and think it is optional for parsers to report a fatal error when
> encountering such an element. This may be what the working group
> intended to say, but it is not what the spec does say. If this is
> your intent, then you need to change the language of the spec to
> indicate that the BNF productions, well-formedness constraints, and
> validity rules are verified only after normalization has taken place.

You sound as if you think XML 1.1 parsers MAY or MUST normalize their
inputs. This is absolutely false, and indeed XML 1.1 says "MUST NOT
normalize". What XML 1.1 parsers MAY (indeed, SHOULD) do is check
whether their inputs are already normalized, and if not, report.

The example you give can't possibly be anything but a fatal error
even assuming that the parser reports non-normalized input (as it
SHOULD do) and the application elects to continue.

-- 
John Cowan  jcowan@reutershealth.com  www.ccil.org/~cowan  www.reutershealth.com
"If he has seen farther than others, it is because he is standing on a stack
of dwarves."  --Mike Champion, describing Tim Berners-Lee  (adapted)
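The distinction between checking and normalizing in the <e'></é> example can be illustrated with a small sketch. It is not taken from the XML 1.1 draft or from any parser: the helper name is_nfc is hypothetical, and plain NFC is used here only as a stand-in for the fuller normalization check the draft describes. The sketch compares the two tag names exactly as written, without normalizing them, and separately tests whether each one is already normalized.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """Check (without modifying the input) whether text is already in NFC."""
    return unicodedata.normalize("NFC", text) == text


# Start-tag name: "e" followed by U+0301 COMBINING ACUTE ACCENT.
start_name = "e\u0301"
# End-tag name: the precomposed U+00E9 LATIN SMALL LETTER E WITH ACUTE.
end_name = "\u00e9"

# The names are compared as written; the input is never normalized.
names_match = start_name == end_name

# The normalization check is a separate, reportable observation.
start_is_normalized = is_nfc(start_name)
end_is_normalized = is_nfc(end_name)
```

Under these assumptions names_match is false, so the mismatched tags remain a fatal error whether or not the non-normalized start tag is also reported.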
Received on Thursday, 3 July 2003 12:17:42 UTC