Re: ACTION NW xmlChunk-44: Chunk of XML - Canonicalization and equality

Apologies for coming so late to this party.

I followed the thread explaining [version]'s absence from the REC.

I'm still not happy that in this and at least one other area, the
draft finding judges two infosets equivalent even though one is
'well-formed' and the other is not, that is, one could be serialized
as a well-formed XML document and the other could not.

For example:

  * An EII with a [local name] containing, e.g., a long s (U+017F) is
    well-formed only if [version]=1.1;

  * An EII with a [namespace name] that has a value, and an [in-scope
    namespaces] with no declaration for that value.
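
A rough sketch of both checks in Python, purely illustrative. The
names ElementItem, in_ranges and namespace_is_declared are made up for
this example, the XML 1.0 table is only the excerpt of the BaseChar
production around U+017F (as in the editions current at the time of
writing), and the dict for [in-scope namespaces] is an assumed
simplification of the infoset property:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Excerpt only: the XML 1.0 BaseChar ranges surrounding U+017F.
XML10_BASECHAR_EXCERPT = [
    (0x0100, 0x0131), (0x0134, 0x013E), (0x0141, 0x0148),
    (0x014A, 0x017E), (0x0180, 0x01C3),
]

# XML 1.1 NameStartChar ranges (beyond ':', '_', A-Z and a-z).
XML11_NAME_START = [
    (0xC0, 0xD6), (0xD8, 0xF6), (0xF8, 0x2FF), (0x370, 0x37D),
    (0x37F, 0x1FFF), (0x200C, 0x200D), (0x2070, 0x218F),
    (0x2C00, 0x2FEF), (0x3001, 0xD7FF), (0xF900, 0xFDCF),
    (0xFDF0, 0xFFFD), (0x10000, 0xEFFFF),
]


def in_ranges(ch: str, ranges) -> bool:
    """True if the code point of ch falls in any (low, high) range."""
    cp = ord(ch)
    return any(low <= cp <= high for low, high in ranges)


@dataclass
class ElementItem:
    """Hypothetical, much-reduced model of an element information item."""
    local_name: str
    namespace_name: Optional[str] = None
    # Assumed shape: prefix -> namespace name ('' for the default).
    in_scope_namespaces: Dict[str, str] = field(default_factory=dict)


def namespace_is_declared(e: ElementItem) -> bool:
    """True if [namespace name] is absent or bound by some in-scope prefix."""
    return (e.namespace_name is None
            or e.namespace_name in e.in_scope_namespaces.values())


long_s = "\u017f"
print(in_ranges(long_s, XML10_BASECHAR_EXCERPT))  # not a 1.0 BaseChar
print(in_ranges(long_s, XML11_NAME_START))        # a 1.1 NameStartChar

orphan = ElementItem("doc", namespace_name="http://example.org/ns")
print(namespace_is_declared(orphan))              # no binding in scope
```

An equivalence test that ignores [version] and [in-scope namespaces]
never looks at either condition, which is how it can pair an infoset
that passes these checks with one that fails them.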

If the answer is that we're only interested in equivalence of infosets
arising from the conformant parsing of well-formed character
sequences, then at the very least this should be made clear in the
finding.

I hope that's not the answer, in which case I'd be interested not only
in a specific explanation of why the [in-scope namespaces] EII
property was not included, but also in views on the more general
suggestions implicit in the above, namely that "Everyone knows what
'well-formed' means when applied to infosets" and that "It doesn't
make sense to define equivalence such that a well-formed infoset can
be equivalent to a non-well-formed infoset."

ht
-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]

Received on Wednesday, 25 August 2004 21:01:40 UTC