Re: xml:id from Henri Sivonen on 2008-01-04 (public-cdf@w3.org from January 2008)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Fri, 4 Jan 2008 12:14:33 +0200
To: Maciej Stachowiak <mjs@apple.com>
Cc: Timur Mehrvarz <timur.mehrvarz@web.de>, "public-cdf@w3.org" <public-cdf@w3.org>, Eric Seidel <eric@webkit.org>
Message-Id: <D3E9DE00-0692-42CE-B188-178C65F74944@iki.fi>

On Jan 4, 2008, at 04:41, Maciej Stachowiak wrote:

> SVG Tiny 1.2 allows both id and xml:id, so content using it is
> conforming.


SVG Tiny 1.2 is actually worse than just allowing both id and xml:id.  
SVG Tiny 1.2 makes the IDness of id conditional depending on the  
presence of xml:id. I think this is a very bad idea and id should be  
unconditionally an ID.
http://lists.w3.org/Archives/Public/www-svg/2007Oct/0086.html

More generally, for deployed Web XML languages (XHTML, SVG and MathML)  
xml:id is a feature with a cost but no payoff. Browsers will have to  
continue to support ID semantics for the id attribute in no namespace  
on XHTML, SVG and MathML elements forever. Thus, xml:id can only add
complexity--not remove it. When the IDness of id has to be supported  
anyway, xml:id does not add any functionality--it is additional  
complexity with no upside in the Web context.

Now, one might argue that xml:id is beneficial for generic XML
processing, such as making the XPath id() function work with generic
XML. However, the mere existence of the xml:id spec doesn't make
pre-existing XPath implementations support xml:id. Instead, you have
to add an xml:id Processor between the XML processor (a.k.a. XML
parser) and the XPath implementation.

The better way to solve the non-Web XML processing problem would be to  
add a filter that assigns IDness to id instead of an xml:id Processor  
in the pipeline.

I have implemented such a filter *and* an xml:id Processor. In my
experience, implementing a filter that assigns IDness to id is
simpler than implementing an xml:id Processor. (An xml:id Processor is
required to perform additional and, in my opinion, rather useless
operations.)
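
To illustrate the idea of assigning IDness to id outside a browser,
here is a minimal sketch in Python. It is not the Validator.nu filter
(which sits in the parsing pipeline); it only builds a lookup table
after parsing, and the name build_id_index is made up for this
example. It assumes that the id attribute in no namespace is an ID on
every element and that the first element with a given value wins.

```python
import xml.etree.ElementTree as ET


def build_id_index(root):
    """Map each no-namespace id value to the first element carrying it.

    Hypothetical helper: treats id as an ID unconditionally, with no DTD
    and no xml:id processing.
    """
    index = {}
    for elem in root.iter():
        # ElementTree exposes a no-namespace attribute under the plain key
        # "id"; a namespaced attribute such as xml:id would appear as
        # "{http://www.w3.org/XML/1998/namespace}id" and is ignored here.
        value = elem.get("id")
        if value is not None and value not in index:
            index[value] = elem
    return index


doc = ET.fromstring(
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<g id="layer"><rect id="box"/></g>'
    "</svg>"
)
index = build_id_index(doc)
```

With such a table, a getElementById-style lookup is a plain
dictionary access, e.g. index["box"].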

Validator.nu uses XPath-based Schematron to implement part of HTML5  
and XHTML5 conformance checking. To make this work on the HTML5 side,  
the HTML5 parser assigns IDness to id. To make this work on the XHTML5  
side, there's the aforementioned filter that assigns IDness to id. It  
works.

Now, one might argue that it is horribly wrong to assign IDness to id  
without DTD processing or to always assign it regardless of the  
element that the attribute is on, because someone out there might have  
an XML vocabulary where the attribute id in no namespace does not have  
IDness. The argument of wrongness of ID assignment in the absence of a  
DTD is without merit: Assigning IDness without a DTD is exactly what  
xml:id does! Moreover, for practical observability like getElementById  
or the CSS # selector, browsers have to implement de facto IDness for  
the id attribute in no namespace for Web XML languages.

The argument that someone out there might have an XML language that  
has an id without IDness is not without merit but is still a mere  
distraction. After I deployed the unconditional IDness assignment in  
Validator.nu, I got a bug report that Validator.nu, which is used as
the back end of a CML validator (http://cmlcomp.org/validator/), was
wrongly complaining about duplicate id attribute values in CML. So
yes, it is true that there exists at least one XML language where id  
is not an ID.

Yet, the existence of CML as a counterexample refuting the
assumption that id is always an ID is utterly irrelevant to what to do
with deployed Web languages (XHTML, SVG and MathML) or
known-as-upcoming Web languages (XBL2). For an app like Validator.nu,
it is
clear that assigning IDness to id on XHTML, SVG, MathML or XBL2  
elements is desirable and on CML elements undesirable. What to do with  
known languages isn't the question. The question really is what to do  
with unknowns. I currently err on the side of assigning IDness for the  
unknowns.
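
The policy described above (ID for known Web languages, not an ID for
CML, err on the side of ID for unknowns) could be sketched roughly as
follows. This is an illustration, not Validator.nu code: the function
names are invented, the CML namespace URI is my assumption of what CML
documents use, and XBL2 is left out to keep the example short.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
CML_NS = "http://www.xml-cml.org/schema"  # assumed CML namespace URI

ID_NAMESPACES = {XHTML_NS, SVG_NS, MATHML_NS}  # id is known to be an ID
NON_ID_NAMESPACES = {CML_NS}                   # id is known not to be an ID
ASSIGN_IDNESS_TO_UNKNOWN = True                # err on the side of IDness


def namespace_of(elem):
    """Return the namespace URI of an element, or "" if it has none."""
    tag = elem.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def id_has_idness(elem):
    """Decide whether the no-namespace id attribute on elem is an ID."""
    ns = namespace_of(elem)
    if ns in ID_NAMESPACES:
        return True
    if ns in NON_ID_NAMESPACES:
        return False
    return ASSIGN_IDNESS_TO_UNKNOWN


def duplicate_ids(root):
    """Return the sorted id values that occur more than once as IDs."""
    counts = Counter(
        elem.get("id")
        for elem in root.iter()
        if elem.get("id") is not None and id_has_idness(elem)
    )
    return sorted(value for value, n in counts.items() if n > 1)


svg = ET.fromstring(
    '<svg xmlns="http://www.w3.org/2000/svg"><g id="a"/><g id="a"/></svg>'
)
cml = ET.fromstring(
    '<cml xmlns="http://www.xml-cml.org/schema">'
    '<atom id="a1"/><atom id="a1"/></cml>'
)
unknown = ET.fromstring('<doc><item id="x"/><item id="x"/></doc>')
```

Here duplicate_ids(svg) reports the repeated value, duplicate_ids(cml)
reports nothing, and the unknown vocabulary is treated like the Web
languages because of the default.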

But even this is irrelevant to browsers as long as they don't support  
CML. Browsers could unconditionally assign IDness to id without  
creating real problems. And at that point, xml:id offers no additional  
value--just cost.

P.S. I think making id non-ID is a design bug in CML. Calling the  
attribute 'name' would be more consistent with XML design patterns,  
but second-guessing CML is pointless and it should just be considered  
grandfathered.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Friday, 4 January 2008 10:14:44 UTC