- From: Chris Lilley <chris@w3.org>
- Date: Tue, 7 Jan 2003 19:27:04 +0100
- To: www-tag@w3.org
Hello www-tag,
As requested by Dave Orchard, a listing of the options for dealing
with IDs.
1) Require DTD validation of all instances.
A fully validating XML processor will, almost as a side effect,
result in all attributes of type ID being so noted in the Infoset.
Advantages:
- existing mechanism (DTDs)
Disadvantages:
- existing mechanism is poor,
- not namespace aware,
- can't declare a content model of 'any' that really means 'any',
- can't use with mixed namespace documents easily
- hinders composability
- needlessly conflates validation with decoration
- leaves well formed documents in a backwater
- retrogressive step
2) Steal all attributes of name id in the per-element partition
Declare retrospectively that all attributes whose name is 'id' are of
type ID because this is common practice anyway.
Advantages
- much existing content becomes conformant without change
- easy to explain
Disadvantages
- no help for content that uses a different name for its IDs
- some existing content becomes changed retrospectively
- may clash with declarations in DTDs or Schemas
- user outrage: XML can only control the syntax and the
xml namespace, not other namespaces
- different behavior in validating and non-validating parsers
- requires a change to CSS
- requires a change to XPath 1.0
- requires a change to DOM levels 1, 2 and 3
- requires a change to XSLT
- requires a change to (insert your spec here)
3) Steal undeclared attributes of name id
In well formed content that does not have a DTD, or that has a
partial DTD used for decoration (declaring ID, declaring attribute
defaults, etc.), if an attribute is called id and has not been
declared in the DTD, it is of type ID.
Advantages
- much existing content becomes conformant without change
- fairly easy to explain
Disadvantages
- no help for content that uses a different name for its IDs
- some existing content becomes changed retrospectively
- may clash with declarations in Schemas
- user annoyance: XML can only control the syntax and the
xml namespace, not other namespaces
- different behavior in validating and non-validating parsers
- requires a change to CSS
- requires a change to XPath 1.0
- requires a change to DOM levels 1, 2 and 3
- requires a change to XSLT
- requires a change to (insert your spec here)
4) Add a predeclared id attribute to the xml namespace
In the same way that xml:base added a predeclared attribute to the
existing xml:lang and xml:space attributes, add another one called
xml:id. It is of type ID. It cannot be declared (or redeclared)
and thus its type cannot be changed. It can be used wherever you
want a reliable, interoperable identifier.
Advantages
- easy to explain
- easy to use
- easy to change content to use the new syntax
- no clash with DTDs or Schemas
- existing content not inadvertently affected
Disadvantages
- requires a (small) change to XML spec and XML parsers
- no help for (all existing) content that uses a different
name for its IDs
- requires revision in any content specs that want to make use of it
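
To make option 4 concrete, here is a minimal sketch of how a non-validating consumer might collect identifiers if a fixed xml:id attribute existed. It uses Python's standard xml.etree.ElementTree purely for illustration; the function name index_xml_ids and the sample document are my own assumptions, not part of any spec.

```python
import xml.etree.ElementTree as ET

# ElementTree reports attributes in the xml namespace under this expanded name.
XML_NS = "http://www.w3.org/XML/1998/namespace"
XML_ID = "{" + XML_NS + "}id"

def index_xml_ids(root):
    """Map each xml:id value to the first element carrying it (hypothetical helper)."""
    index = {}
    for elem in root.iter():  # iter() visits elements in document order
        value = elem.get(XML_ID)
        if value is not None:
            index.setdefault(value, elem)  # keep the first occurrence only
    return index

# Sample document: only the xml:id attribute is treated as an ID;
# the plain "id" attribute is left alone.
doc = ET.fromstring('<doc><p xml:id="intro">Hi</p><p id="not-an-id">There</p></doc>')
ids = index_xml_ids(doc)
```

No DTD or Schema is consulted here, which is the point of the option: the attribute name alone carries the type.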
5) Add an inline, per-instance ID declaration method
In the same way that xml:base added a predeclared attribute to the
existing xml:lang and xml:space attributes, add another one called
xml:idAttr. It takes as value the local name of an attribute. All
attributes of that name in the per-element partition become of type
ID. It may only be used on the root element of the instance.
Advantages
- easy to explain (easier than the DTD syntax, probably)
- easy to use
- existing content not inadvertently affected
- very easy to change content to use the new syntax
Disadvantages
- requires a (small) change to XML spec and XML parsers
- may clash with declarations in DTDs or Schemas
- different behavior in validating and non-validating parsers
- limits composability
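
As a rough illustration of option 5, the sketch below reads a proposed xml:idAttr attribute from the root element only and treats every un-namespaced attribute of that name as an ID. xml:idAttr is the proposal under discussion, not an existing XML feature, and the helper name index_root_declared_ids and the sample document are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Expanded name of the *proposed* xml:idAttr attribute (hypothetical).
XML_IDATTR = "{http://www.w3.org/XML/1998/namespace}idAttr"

def index_root_declared_ids(root):
    """Index elements by the attribute that the root element designates as an ID."""
    index = {}
    name = root.get(XML_IDATTR)
    if not name:
        return index  # no declaration on the root: nothing is an ID
    for elem in root.iter():  # document order
        value = elem.get(name)  # a plain local name selects the un-namespaced attribute
        if value is not None:
            index.setdefault(value, elem)
    return index

doc = ET.fromstring(
    '<doc xml:idAttr="key"><a key="k1"/><b id="ignored"/><c key="k2"/></doc>'
)
ids = index_root_declared_ids(doc)
undeclared = index_root_declared_ids(ET.fromstring('<doc><a id="x"/></doc>'))
```

Because the declaration is honored only on the root, a fragment pasted in from another vocabulary inherits the host document's choice, which is the composability limit noted above.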
6) Add an inline, per subtree ID declaration method
In the same way that xml:base added a predeclared attribute to the
existing xml:lang and xml:space attributes, add another one called
xml:idAttr. It takes as value the local name of an attribute. All
attributes of that name in the per-element partition, on that
element and its children become of type ID. It can be used on any
element. It can also take the value "" in which case, no attributes
on that element or its children are declared to be of type ID (used
when composing multiple namespaces).
Advantages
- fairly easy to explain (easier than the DTD syntax, probably)
- easy to use
- existing content not inadvertently affected
- very easy to change content to use the new syntax
- aids composability
- does not affect well-formed portions of multi-namespace documents
Disadvantages
- requires a (small) change to XML spec and XML parsers
- may clash with declarations in DTDs or Schemas
- different behavior in validating and non-validating parsers
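
The scoping rule in option 6 can be sketched as a recursive walk in which each element either inherits the designated attribute name from its parent or overrides it, with the empty string switching ID recognition off for a subtree. This is only an illustration under my reading of the proposal (I take "children" to mean all descendants); xml:idAttr is hypothetical, as are the helper name index_scoped_ids and the sample document.

```python
import xml.etree.ElementTree as ET

# Expanded name of the *proposed* xml:idAttr attribute (hypothetical).
XML_IDATTR = "{http://www.w3.org/XML/1998/namespace}idAttr"

def index_scoped_ids(root):
    """Index IDs where xml:idAttr applies to an element and its descendants."""
    index = {}

    def walk(elem, inherited):
        declared = elem.get(XML_IDATTR)
        # An explicit declaration (even "") overrides whatever was inherited.
        name = inherited if declared is None else declared
        if name:  # "" means: no attribute is an ID in this subtree
            value = elem.get(name)
            if value is not None:
                index.setdefault(value, elem)  # first in document order wins
        for child in elem:
            walk(child, name)

    walk(root, "")
    return index

doc = ET.fromstring(
    '<doc xml:idAttr="id">'
    '<sec id="s1">'
    '<foreign xml:idAttr=""><item id="42"/></foreign>'
    '<other xml:idAttr="name"><item name="n1" id="x"/></other>'
    '</sec>'
    '</doc>'
)
ids = index_scoped_ids(doc)
```

In the sample, the id inside the foreign subtree is left alone, and the other subtree uses name rather than id, so only s1 and n1 are indexed.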
7) Muddle along
Do nothing. Accept weasel wording in the DOM spec about knowledge of
'well known namespaces' and conformance loopholes in the CSS spec
about possible breakage in namespaces other than HTML and accept
that we can't really point into XML documents unless we can be sure
the client uses a validating parser and besides, it works in HTML so
far and no-one really uses XML on the client anyway.
Advantages
- familiar pain
- no changes to existing specs
Disadvantages
- new specs need similar weasel wording
- interoperability headaches
- user confusion about when it is an ID and when it is not
- interoperability depends on the transmission of secret
knowledge among cognoscenti
- multi-namespace document integration not made easier
- cross-namespace XML DOM scripting still hit and miss
- it's a wart, and a readily fixable one
8) Require W3C XML Schema validation of all instances.
A fully validating XML processor will, almost as a side effect,
result in all attributes of type ID being so noted in the Infoset.
Advantages:
- existing mechanism starting to see acceptance
Disadvantages:
- existing mechanism is not fully deployed
- too heavyweight for such a simple problem, will not be
used on mobile platforms or other small devices
- needlessly conflates validation with decoration
- leaves well formed documents in a backwater
An optional variation on 5) and 6) is to accept either a local name or
a QName; if it's a QName, then resolve it to a (namespace URI, local name)
pair on the element that has xml:idAttr, and then all attributes
with that local name in that namespace are of type ID.
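
A small sketch of the QName variation follows. ElementTree discards prefix declarations after parsing, so this example assumes the caller supplies the namespace bindings in scope on the element that carries xml:idAttr; the function name resolve_id_attr, the prefix map and the example namespace URI are all hypothetical.

```python
import xml.etree.ElementTree as ET

def resolve_id_attr(value, in_scope):
    """Turn a local name or QName into an ElementTree attribute key.

    `in_scope` maps prefixes to namespace URIs as bound on the element
    that carries the (proposed) xml:idAttr attribute.
    """
    if ":" not in value:
        return value  # plain local name: the un-namespaced attribute
    prefix, local = value.split(":", 1)
    # An unbound prefix raises KeyError; a real design would define that error.
    return "{" + in_scope[prefix] + "}" + local

ns = {"my": "http://example.org/ns"}
doc = ET.fromstring(
    '<doc xmlns:my="http://example.org/ns"><a my:ref="r1" ref="plain"/></doc>'
)
key = resolve_id_attr("my:ref", ns)
value = doc.find("a").get(key)
```

Resolving once, at the declaring element, means a later rebinding of the same prefix deeper in the tree does not change which attribute is the ID.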
In passing, note that the separation of validation from decoration has
an additional benefit: ID uniqueness remains a validation constraint,
so in well formed XML there can be multiple IDs with the same value.
If that happens, the first one in document order is the correct one
(or some better scheme could be devised, but it's not an error).
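
The split between lookup and validation can be sketched as two separate functions: one that resolves an ID to the first match in document order and never fails on duplicates, and one that reports duplicates for whoever cares about validity. The function names and the sample document are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

def first_with_id(root, attr, value):
    """Return the first element in document order whose `attr` equals `value`."""
    for elem in root.iter():  # iter() walks the tree in document order
        if elem.get(attr) == value:
            return elem
    return None

def duplicate_ids(root, attr):
    """Return ID values that occur more than once (a validity concern only)."""
    seen, dupes = set(), []
    for elem in root.iter():
        value = elem.get(attr)
        if value is None:
            continue
        if value in seen and value not in dupes:
            dupes.append(value)
        seen.add(value)
    return dupes

doc = ET.fromstring('<doc><a id="x">first</a><b id="x">second</b><c id="y"/></doc>')
hit = first_with_id(doc, "id", "x")
repeated = duplicate_ids(doc, "id")
```

A well formed consumer would call only the first function; a validator could additionally call the second and raise a validity error.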
If I have omitted a solution, or omitted significant advantages or
disadvantages, I would be glad to hear them.
My personal preference is for option 6) Add an inline, per subtree ID
declaration method. It would require work on what the precedence is
(or what sort of error it is) if the DTD or Schema declares the
designated attribute to be of a type other than ID.
Most (but not all) attributes called id are of type ID. Most (but not
all) attributes of type ID are called id. 100% of single-namespace
documents could be brought into conformance with this proposal by
adding a single attribute to the root element. 99% of them would be
brought into conformance by adding
xml:idAttr="id"
to the root element. Crucially, the 1% that do not are still catered
for, a big advantage over options 2, 3 and 4.
Requiring DTD validation to get IDs is too big a retrogressive step;
it essentially throws away well formedness as a concept and also XML
namespaces, and needlessly conflates validation with decoration.
Requiring W3C XML Schema validation to get IDs is too big a forwards
step; it adds a lot of machinery to get a simple but crucial step
forward and needlessly conflates validation with decoration.
However, I would prefer that W3C XML Schema be revised so that the
behavior of documents that use xml:idAttr *and* use a W3C XML Schema
is consistent with regard to the attribute declared of type ID in the
instance, whether the Schema is used or not. In other words, an
implicit declaration in the instance is the same in the PSVI as if the
attribute had been declared of type ID in the Schema, except for the
part of the PSVI that traces which Schema provided the rule; that part
would report that the instance provided the rule.
--
Chris mailto:chris@w3.org
Received on Tuesday, 7 January 2003 13:27:07 UTC