Re: Options for dealing with IDs

Chris Lilley wrote:
> On Wednesday, January 8, 2003, 11:06:08 AM, Robin wrote:
> RB> This is by far my favourite option, it's simple and efficient. I've been using 
> RB> something similar (an id:id attribute) to ease processing of multi-namespace 
> RB> documents and have been happy with it.
> 
> It's simple and efficient and suits those who are happy with it, but
> requires those who currently use a different name or a different
> namespace or the per-element partition of unqualified names to change
> if they want reliable processing.

True, but that's not much work.

> RB> I'd be in favour of having the declared attribute take precedence
> RB> there as it'll be more backwards compatible.
> 
> Take precedence on that element, or take precedence on all elements?

On that element. xml:id (or whichever other scheme) is new and I'd expect adding 
it to documents that already had ID declarations not to modify the behaviour. 
Where all elements are concerned picking the first makes best sense imho. Having 
two DTD declared ID attributes is, hmm, pathological.

> Given the crucial and central nature of URIs and URI references in W3C
> specifications, then I would argue that xml:base should be added as a
> part of the XML spec at the next rev, and be mandatory not optional.

Agreed. I'm not a big fan of xml:base, but eliminating the dozens of intricate 
conformance levels ought to be very high on the priority list for XML++.

>>>  5) Add an inline, per-instance ID declaration method
>>>  6) Add an inline, per subtree ID declaration method
> 
> RB> I could live with these two but I think they open cans of worms
> RB> here and there.
> 
> All of the options open cans of worms someplace (including the "live
> with this mess" option). It's a case of choosing your can.

Then I must say I find that the ones in can #4 wriggle less when you try to 
catch them (and sport that unique mezcal aftertaste). You've convinced me that 
can #6 isn't as bad as it looked at first, though it may be more complicated 
than necessary.

> RB> Using a QName would be an
> RB> option, but I'd rather keep away from QNames-in-content.
> 
> QNames in attribute content is toothpaste that is already out of the
> tube and all over the sink. And it seems to be less of a problem in
> practice than might have been thought.

I wouldn't go that far. I admit that QNames-in-content are often the best/only 
solution and a necessary evil in some cases. I also think that if option 6 
is picked, then it needs to be able to point to namespaced attributes. Reasons 
for this are 1) it wouldn't be complete otherwise and 2) if the advantage of 6 
over 4 is that it allows people to keep the names of their ID attributes with 
minimal change, we need to cater for people that use namespaced attributes. 
Otherwise, I don't see what 6 has over 4.

One issue with QNames in content stems from the fact that prefixes are not 
usually considered to be first-class information. They can thus be rewritten, 
which in turn breaks any QNames in content (QICs) unless type information tells 
the processor which values to rewrite along with them. I've hit that wall myself.
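
To make the prefix problem concrete, here is a minimal sketch in Python using 
only the standard library's ElementTree. The document and its "type" attribute 
are made up for illustration. ElementTree does not keep the original prefix and 
picks one of its own on output, while the attribute value passes through as a 
plain string:

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the "type" attribute holds a QName in content.
source = '<a:doc xmlns:a="http://foo.org/" type="a:thing"/>'

root = ET.fromstring(source)

# ElementTree keeps the expanded element name but not the original prefix,
# so on output it invents its own prefix for http://foo.org/.
rewritten = ET.tostring(root, encoding="unicode")

# The attribute value is an opaque string to the serializer, so it still
# says "a:thing" even though "a" is no longer declared anywhere.
prefix_survived = 'xmlns:a="http://foo.org/"' in rewritten
qic_untouched = 'type="a:thing"' in rewritten

print(rewritten)
print(prefix_survived, qic_untouched)
```

The element itself round-trips fine at the namespace level; only the QName 
hidden in the attribute value is left dangling.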

Another issue is that they are normally (at least in every case I've seen) 
sensitive to the default namespace. This causes an unfortunate mismatch 
because unprefixed attribute names are not. So the following would not make id an ID attribute:

   <foo xmlns='http://foo.org/' xml:idAttr='id' id='worm'/>

The value of xml:idAttr here resolves to {http://foo.org/}id, whereas the id 
attribute is {}id. Of course, nothing keeps us from specifying xml:idAttr's 
content as Attribute-QName-In-Attribute-Content but that's adding yet another 
subtle and confusing thing for users to know about, and contradicts common 
practice from XSLT, XML Schema, etc. wriggle, wriggle, squirm, wriggle.
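
The mismatch can be checked with a short Python sketch, again using only the 
standard library's ElementTree. The resolve_qname helper and the hand-written 
in_scope table are my own illustrations, not part of any API, and xml:idAttr is 
of course only the attribute proposed in this thread:

```python
import xml.etree.ElementTree as ET

doc = "<foo xmlns='http://foo.org/' xml:idAttr='id' id='worm'/>"
elem = ET.fromstring(doc)

# The element name picks up the default namespace; the unprefixed
# attribute name does not and stays in no namespace.
element_name = elem.tag
attribute_names = sorted(elem.attrib)


def resolve_qname(qname, in_scope, use_default):
    """Resolve a QName found in content to a (namespace, local) pair.

    in_scope maps prefixes to namespace names, with "" standing for the
    default namespace. use_default selects element-style resolution (the
    common practice) over attribute-style resolution.
    """
    prefix, _, local = qname.rpartition(":")
    if prefix:
        return (in_scope[prefix], local)
    if use_default:
        return (in_scope.get("", ""), local)
    return ("", local)


# Namespace bindings for the example, written out by hand.
in_scope = {"": "http://foo.org/"}

as_element_style = resolve_qname(elem.attrib["id"] and "id", in_scope, True)
as_attribute_style = resolve_qname("id", in_scope, False)
actual_attribute = ("", "id")

print(as_element_style == actual_attribute)
print(as_attribute_style == actual_attribute)
```

Only the attribute-style resolution matches the real id attribute, which is 
exactly the special case users would have to learn.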

This appears to me to add cost for little practical value. Devising a tiny SAX 
tool that can check for existing ID attributes and switch them to xml:id seems 
to me to be a lot simpler than jumping through option 6's hoops.
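
As a rough idea of how small such a tool could be, here is a sketch of a SAX 
filter in Python (standard library only). The IdRenamer class, the rename_ids 
function and the id_attrs table are hypothetical names of mine. As far as I 
know Python's expat-based SAX reader does not report DTD attribute types, so 
this version assumes the element-to-ID-attribute mapping is supplied by hand 
rather than detected:

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl


class IdRenamer(XMLFilterBase):
    """Rename known ID attributes to xml:id while streaming."""

    def __init__(self, parent, id_attrs):
        super().__init__(parent)
        # Assumption: id_attrs maps an element name to the name of its ID
        # attribute, e.g. copied by hand from the DTD.
        self.id_attrs = id_attrs

    def startElement(self, name, attrs):
        old = self.id_attrs.get(name)
        if old is not None and old in attrs:
            renamed = {}
            for key, value in attrs.items():
                renamed["xml:id" if key == old else key] = value
            attrs = AttributesImpl(renamed)
        super().startElement(name, attrs)


def rename_ids(xml_text, id_attrs):
    """Return xml_text with the listed ID attributes switched to xml:id."""
    out = io.StringIO()
    reader = IdRenamer(xml.sax.make_parser(), id_attrs)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


result = rename_ids('<doc><item name="worm" colour="pink"/></doc>',
                    {"item": "name"})
print(result)
```

This runs in non-namespace SAX mode, so "xml:id" is written as a raw attribute 
name; the xml prefix needs no declaration. A real tool would read the ID 
declarations from the DTD instead of taking a table.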

> RB> A PI of course could work,
> RB> but I suspect I'll be the only one to think
> RB> that ;) (and it would have issues unless it's constrained to
> RB> appear before the root element).
> 
> I assume it would have a global scope, rather than being a
> stream-based directive that applies "from that point on" so the
> constraint on being above the root element would be to avoid multiple
> passes in parsing or backtracking and fixup.

Yes, exactly. And if you could have the PI midstream apply from that point on 
you would always forget to move it around when moving parts of the tree, and 
you'd get a lot of breakage.

-- 
Robin Berjon <robin.berjon@expway.fr>
Research Engineer, Expway        http://expway.fr/
7FC0 6F5F D864 EFB8 08CE  8E74 58E6 D5DB 4889 2488

Received on Thursday, 9 January 2003 07:51:30 UTC