Re: 4. ID assignment and the empty string from Ian Hickson on 2005-01-20 (public-xml-id@w3.org from January 2005)

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 20 Jan 2005 16:40:42 +0000 (UTC)
To: Norman Walsh <Norman.Walsh@Sun.COM>
Cc: public-xml-id@w3.org
Message-ID: <Pine.LNX.4.61.0501201626231.15038@dhalsim.dreamhost.com>

On Thu, 20 Jan 2005, Norman Walsh wrote:
> |
> | The problem is that invalid values do not stop the value from becoming an 
> | ID according to the current wording as I understand it. However, while it 
> | is not a big problem if "!(&" becomes an ID value that can be looked up 
> | using the DOM getElementById() method or accessed in CSS via the #\!\(\& 
> | selector, it is a more serious problem if the empty string is used as an 
> | element's identifier.
> 
> This problem already exists:
> 
> <!DOCTYPE doc [
> <!ATTLIST p id ID #IMPLIED>
> ]>
> <doc>
> <p id=""/>
> </doc>

XML does not, as far as I can tell, require that the UA treat the <p> 
element as having an ID of "". (In fact, XML does not seem to require 
processors to recognise or process IDs at all, except for correctness 
checking in validating parsers.) Could you point me to where in the XML 
specification there is that requirement?

> | Also, without the proposed modification quoted above, it is unclear that 
> | ID assignment is performed using the normalised attribute value, as 
> | opposed to any other value. (While only one processing makes common sense, 
> | other processings would not be non-conformant without the explicit 
> | statement above).
> 
> That is now clearly specified:
> 
>    1. The attribute's value is normalized according to the rules for
>       attribute-value normalization on attributes of type ID. For more
>       details, see E Attribute Value Normalization on IDs.
> 
>       The infoset [normalized value] property is updated with the
>       normalized value.

(Note that I don't like this new paragraph, as per:
   http://lists.w3.org/Archives/Public/public-xml-id/2005Jan/0022.html
...)

>    2. ID assignment is performed with the normalized value.

Great!

> | Thus I disagree with this resolution.
> 
> Does the observation that the problem already exists persuade you to 
> withdraw your objection?

The change to step 2 above satisfies half of my comment, but as I do not 
see how the problem exists already (that is, I don't see how XML requires 
the empty string to be treated as the element's ID), I am not satisfied 
that the second half of my issue is resolved.

While I agree that existing specifications (XML, DOM, HTML, etc) are vague 
about how empty string IDs should be handled, current practice is to not 
normalise IDs at all, and to consider empty strings to not be IDs. While I 
do not mind the normalisation step for ID processing (as this is a new 
attribute, the processing can be defined to normalise -- so long as the 
DOM representation doesn't differ between new and old implementations!), I 
firmly believe that the empty string should never be required to be 
treated as an ID. Since you already consider it an error condition, you 
presumably agree -- it is merely the next step that I am asking for, 
namely, explicitly stating that empty strings should not trigger ID 
processing even if errors are otherwise being ignored.
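
To make the requested behaviour concrete, here is a rough sketch in Python. 
It is only an illustration under stated assumptions, not proposed spec text: 
normalize_id() and assign_ids() are hypothetical names, the normalisation 
handles only #x20 characters (any earlier parser-level normalisation is 
assumed to have happened already), and keeping the first element for a 
duplicate ID is an arbitrary choice made for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the fixed XML namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def normalize_id(value):
    """Sketch of ID-type normalisation: drop leading and trailing
    spaces and collapse runs of spaces into a single space."""
    return " ".join(part for part in value.split(" ") if part)


def assign_ids(root):
    """Return a mapping from normalised ID value to element.

    Hypothetical helper: invalid names such as "!(&" still become
    IDs, but the empty string never does.
    """
    ids = {}
    for element in root.iter():
        raw = element.get(XML_ID)
        if raw is None:
            continue
        value = normalize_id(raw)
        if value == "":
            # The step being asked for: an empty value does not
            # trigger ID processing, even when errors are ignored.
            continue
        # Assumption for this sketch: the first element wins on duplicates.
        ids.setdefault(value, element)
    return ids


doc = ET.fromstring(
    '<doc><p xml:id=""/><p xml:id="  a  "/><p xml:id="!(&amp;"/></doc>'
)
ids = assign_ids(doc)
```

With this input, the lookup table contains entries for "a" and "!(&" but 
none for the empty string, which is the distinction drawn above between an 
invalid-but-harmless ID and an empty one.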

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Thursday, 20 January 2005 16:40:45 UTC