Re: tag name state from David Carlisle on 2012-03-03 (public-xml-er@w3.org from March 2012)

From: David Carlisle <davidc@nag.co.uk>
Date: Sat, 03 Mar 2012 17:53:33 +0000
To: public-xml-er@w3.org
Message-ID: <4F525A9D.6060706@nag.co.uk>

On 03/03/2012 16:40, Noah Mendelsohn wrote:

> Still, it's not impossible that because of the advantages in the
> many cases where the output will correspond to well-formed XML, we
> may want to document text output for all cases. Specifically, and
> this is just a proposal for consideration:
>
> 1. When the input is well formed, we specify a 1-to-1 mapping
> (unless someone sees value in using c14n)

I'm not sure exactly what you mean by 1-to-1 here as opposed to c14n.
I think that a certain amount of canonicalisation is inevitably
implied when comparing two XML documents. Encoding weirdness and
attribute order at least mean we can't insist that you get byte-for-byte
identical output given a well-formed document as input.
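
As a rough illustration of that point (a sketch only, not anything
proposed for the spec), the following uses Python's standard
xml.etree.ElementTree.canonicalize, which implements C14N 2.0 and needs
Python 3.8 or later. The two sample documents are made up for the
example; a character reference stands in here for the broader class of
encoding differences.

```python
from xml.etree.ElementTree import canonicalize

# Two hypothetical serialisations of the same document: they differ in
# attribute order, quote style, and how the e-acute is spelled.
doc_a = '<p b="2" a="1">caf&#233;</p>'
doc_b = "<p a='1' b='2'>caf\u00e9</p>"

# A byte-for-byte (here, character-for-character) comparison fails...
same_text = doc_a == doc_b

# ...but the canonical forms agree, so some canonicalisation is implied
# by any useful notion of "the same document".
same_canonical = canonicalize(doc_a) == canonicalize(doc_b)

print(same_text, same_canonical)
```

So whatever "1-to-1" ends up meaning, it presumably has to be defined
modulo differences of this kind.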

>
> 2. In the many cases where the fixed up output does correspond to
> well formed XML, we specify a mapping to a well formed XML stream

I really think that this should always be true, and that we should define
things so that (3.) below can't happen.

>
> 3. In the cases where the output does not correspond to a well
> formed stream, e.g. because we allow non-XML characters, we specify
> a mapping to a non-well formed stream, e.g. one in which the tags
> contain characters that XML considers illegal.

Despite the fact that there are obvious uses with a DOM/CSS-based stack, I
do not think we should allow this. If xml-er denotes error recovery then
really it should recover from errors, and bad element names are in just the
same category of WF error as missing end tags: if we recover from one, we
should recover from both. In the first iteration of the spec we should
be aiming to limit (and preferably abolish) optional behaviours, so
since the XML stack requires well-formed XML, XML-ER should produce that,
even though some of the requirements implied by that are not needed for
DOM/CSS usage.
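
To make the "same category of error" point concrete, here is a minimal
sketch using Python's standard (expat-based) xml.etree.ElementTree
parser. The helper name is_well_formed and the sample strings are
invented for the illustration; the only claim is that a conforming XML
parser rejects both inputs.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if a strict XML parser accepts the text (hypothetical helper)."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# Made-up samples: one missing end tag, one illegal element name.
missing_end_tag = "<doc><item>text</doc>"
bad_element_name = "<doc><1item/></doc>"
fine = "<doc><item>text</item></doc>"

for label, sample in [
    ("missing end tag", missing_end_tag),
    ("bad element name", bad_element_name),
    ("well formed", fine),
]:
    print(label, is_well_formed(sample))
```

Both faulty inputs fail in the same way as far as the XML stack is
concerned, which is why output that still contains bad names is of no
more use to that stack than output with unclosed elements.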

David

Received on Saturday, 3 March 2012 17:53:56 UTC