Re: 2. Effect of normalisation step on the DOM/Infoset from Norman Walsh on 2005-01-20 (public-xml-id@w3.org from January 2005)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Thu, 20 Jan 2005 07:52:01 -0800
To: Ian Hickson <ian@hixie.ch>
Cc: public-xml-id@w3.org
Message-id: <87vf9sxfmm.fsf@nwalsh.com>

/ Ian Hickson <ian@hixie.ch> was heard to say:
| On Wed, 5 Jan 2005, Norman Walsh wrote:
|> | | 4 Processing xml:id Attributes
|> | |
|> | | Each xml:id attribute is processed in the following way:
|> |
|> | It is unclear whether this processing is intended to change the DOM
|> | (or the infoset, for that matter) or not. If one has the following
|> | document:
|> |
|> |    <test xml:id=" test "/>
|> |
|> | ...what would be returned by:
|> |
|> |    document.documentElement.getAttributeNS('http://www.w3.org/XML/1998/namespace', 'id');
|> |
|> | Should it be " test " or "test"?
|> |
|> | I think it should be made clear that the processing mentioned in this
|> | section is merely internal to the xml:id processor and does not affect
|> | the infoset or the DOM. (This comment obviously doesn't apply to the
|> | "ID Assignment" phase, where you definitely do want the infoset and
|> | the DOM to be updated, but that's another matter.)
|> 
|> On the contrary, I think the purpose of attribute value normalization is
|> so that down-stream processes will see the normalized value.
|
| This causes a backwards-compatibility issue. A document processed by a 
| DOM-aware XML processor will create a different DOM than one processed by 
| a DOM-aware XML processor with XML ID support.

This issue already exists. Consider:

<!DOCTYPE test SYSTEM "test.dtd">
<test id=" test "/>

Assuming that test.dtd declares the 'id' attribute as type ID, some
parsers will report that attribute value as " test " and others will
report it as "test", depending on whether or not they read the external
declaration.
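
To make the difference concrete, here is a rough sketch in Python (my choice of language for illustration only) of the extra normalization step that XML 1.0 applies to attributes declared with a non-CDATA type such as ID. The helper name normalize_id_value is hypothetical, and the sketch assumes the parser has already converted tabs and line breaks to space characters:

```python
import re


def normalize_id_value(value: str) -> str:
    """Hypothetical helper: the extra step for non-CDATA attribute values.

    Assumes the parser has already turned tabs and line breaks into
    space (#x20) characters, so only #x20 is handled here.
    """
    # Replace each run of spaces with a single space ...
    collapsed = re.sub(r" +", " ", value)
    # ... then discard leading and trailing spaces.
    return collapsed.strip(" ")


# A parser that never reads test.dtd reports the value as written.
seen_without_dtd = " test "
# A parser that reads test.dtd knows the attribute is an ID and normalizes it.
seen_with_dtd = normalize_id_value(seen_without_dtd)

print(repr(seen_without_dtd), repr(seen_with_dtd))
```

The two parsers disagree only because one of them knows the attribute's declared type.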

| I think this is a very bad situation to be in. Scripts frequently trip 
| over this kind of problem already, and I feel quite strongly that we 
| should not make it worse.

The xml:id specification improves the situation by encouraging uniform
behavior (irrespective of validation or processing of the external subset)
for attributes named "xml:id".

Adopting the resolution that I believe you would prefer, namely that
xml:id processing would use the value presented in the infoset without
any additional normalization, perpetuates the existing
interoperability problems.
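
As a minimal sketch of the uniform behavior I have in mind, the following Python uses the standard xml.etree.ElementTree module, which as far as I know performs no xml:id processing of its own. The function xml_id_of is a hypothetical name. It only computes the normalized value; it does not settle whether that value should be written back to the infoset or the DOM:

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"


def xml_id_of(element):
    """Hypothetical xml:id lookup that normalizes regardless of any DTD."""
    raw = element.get(f"{{{XML_NS}}}id")
    if raw is None:
        return None
    # Drop leading/trailing spaces and collapse runs of spaces to one.
    return " ".join(part for part in raw.split(" ") if part)


# No DTD is involved, so the parser reports the value as written.
root = ET.fromstring('<test xml:id=" test "/>')
raw_value = root.get(f"{{{XML_NS}}}id")
normalized = xml_id_of(root)

print(repr(raw_value), repr(normalized))
```

Every processor that applies the same step arrives at the same ID, whether or not it ever saw a declaration for the attribute.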

| Thus I disagree with this resolution.

Are you persuaded by my observations to change your mind?

                                        Be seeing you,
                                          norm

-- 
Norman.Walsh@Sun.COM / XML Standards Architect / Sun Microsystems, Inc.

Received on Thursday, 20 January 2005 15:52:06 UTC