- From: Chris Lilley <chris@w3.org>
- Date: Tue, 7 Jan 2003 19:27:04 +0100
- To: www-tag@w3.org
Hello www-tag,

As requested by Dave Orchard, a listing of the options for dealing with IDs.

1) Require DTD validation of all instances

A fully validating XML processor will, almost as a side effect, result in all attributes of type ID being so noted in the Infoset.

Advantages:
- existing mechanism (DTDs)

Disadvantages:
- existing mechanism is poor
- not namespace aware
- can't declare a content model of 'any' that really means 'any'
- can't use with mixed namespace documents easily
- hinders composability
- needlessly conflates validation with decoration
- leaves well formed documents in a backwater
- retrogressive step

2) Steal all attributes of name id in the per-element partition

Declare retrospectively that all attributes whose name is 'id' are of type ID because this is common practice anyway.

Advantages:
- much existing content becomes conformant without change
- easy to explain

Disadvantages:
- no help for content that uses a different name for its IDs
- some existing content becomes changed retrospectively
- may clash with declarations in DTDs or Schemas
- user outrage: XML can only control the syntax and the xml namespace, not other namespaces
- different behavior in validating and non-validating parsers
- requires a change to CSS
- requires a change to XPath 1.0
- requires a change to DOM levels 1, 2 and 3
- requires a change to XSLT
- requires a change to (insert your spec here)

3) Steal undeclared attributes of name id

In well formed content that does not have a DTD, or that has a partial DTD used for decoration (declaring ID, declaring attribute defaults, etc.), if an attribute is called id and has not been declared in the DTD, it is of type ID.
Advantages:
- much existing content becomes conformant without change
- fairly easy to explain

Disadvantages:
- no help for content that uses a different name for its IDs
- some existing content becomes changed retrospectively
- may clash with declarations in Schemas
- user annoyance: XML can only control the syntax and the xml namespace, not other namespaces
- different behavior in validating and non-validating parsers
- requires a change to CSS
- requires a change to XPath 1.0
- requires a change to DOM levels 1, 2 and 3
- requires a change to XSLT
- requires a change to (insert your spec here)

4) Add a predeclared id attribute to the xml namespace

In the same way that xml:base added a predeclared attribute to the existing xml:lang and xml:space attributes, add another one called xml:id. It is of type ID. It cannot be declared (or redeclared) and thus its type cannot be changed. It can be used wherever you want a reliable, interoperable identifier.

Advantages:
- easy to explain
- easy to use
- easy to change content to use the new syntax
- no clash with DTDs or Schemas
- existing content not inadvertently affected

Disadvantages:
- requires a (small) change to XML spec and XML parsers
- no help for (all existing) content that uses a different name for its IDs
- requires revision in any content specs that want to make use of it

5) Add an inline, per-instance ID declaration method

In the same way that xml:base added a predeclared attribute to the existing xml:lang and xml:space attributes, add another one called xml:idAttr. It takes as value the local name of an attribute. All attributes of that name in the per-element partition become of type ID. It may only be used on the root element of the instance.
Advantages:
- easy to explain (easier than the DTD syntax, probably)
- easy to use
- existing content not inadvertently affected
- very easy to change content to use the new syntax

Disadvantages:
- requires a (small) change to XML spec and XML parsers
- may clash with declarations in DTDs or Schemas
- different behavior in validating and non-validating parsers
- limits composability

6) Add an inline, per subtree ID declaration method

In the same way that xml:base added a predeclared attribute to the existing xml:lang and xml:space attributes, add another one called xml:idAttr. It takes as value the local name of an attribute. All attributes of that name in the per-element partition, on that element and its children become of type ID. It can be used on any element. It can also take the value "" in which case, no attributes on that element or its children are declared to be of type ID (used when composing multiple namespaces).

Advantages:
- fairly easy to explain (easier than the DTD syntax, probably)
- easy to use
- existing content not inadvertently affected
- very easy to change content to use the new syntax
- aids composability
- does not affect well-formed portions of multi-namespace documents

Disadvantages:
- requires a (small) change to XML spec and XML parsers
- may clash with declarations in DTDs or Schemas
- different behavior in validating and non-validating parsers

7) Muddle along

Do nothing. Accept weasel wording in the DOM spec about knowledge of 'well known namespaces' and conformance loopholes in the CSS spec about possible breakage in namespaces other than HTML and accept that we can't really point into XML documents unless we can be sure the client uses a validating parser and besides, it works in HTML so far and no-one really uses XML on the client anyway.
Advantages:
- familiar pain
- no changes to existing specs

Disadvantages:
- new specs need similar weasel wording
- interoperability headaches
- user confusion about when it is an ID and when it is not
- interoperability depends on the transmission of secret knowledge among cognoscenti
- multi-namespace document integration not made easier
- cross-namespace XML DOM scripting still hit and miss
- it's a wart, and a readily fixable one

8) Require W3C XML Schema validation of all instances

A fully validating XML processor will, almost as a side effect, result in all attributes of type ID being so noted in the Infoset.

Advantages:
- existing mechanism starting to see acceptance

Disadvantages:
- existing mechanism is not fully deployed
- too heavyweight for such a simple problem, will not be used on mobile platforms or other small devices
- needlessly conflates validation with decoration
- leaves well formed documents in a backwater

An optional variation on 5) and 6) is to accept either a local name or a qname; if it's a qname, then resolve it to a (namespace URI, local name) pair on the element that has xml:idAttr, and then all attributes with that local name in that namespace are of type ID.

In passing, note that the separation of validation from decoration has an additional benefit: ID uniqueness remains a validation constraint, so in well formed XML there can be multiple IDs with the same value, and if that happens, well, the first one in document order is the correct one (or some better scheme to be devised, but it's not an error).

If I have omitted a solution, or omitted significant advantages or disadvantages, I would be glad to hear them.

My personal preference is for option 6) Add an inline, per subtree ID declaration method. It would require work on what the precedence is (or what sort of error it is) if the DTD or Schema declares the designated attribute to be of a type other than ID.

Most (but not all) attributes called id are of type ID.
Most (but not all) attributes of type ID are called id.

100% of single-namespace documents could be brought into conformance with this proposal by adding a single attribute to the root element. 99% of them would be brought into conformance by adding xml:idAttr="id" to the root element. Crucially, the 1% that do not are still catered for, a big advantage over options 2, 3 and 4.

Requiring DTD validation to get IDs is too big a retrogressive step; it essentially throws away well formedness as a concept and also XML namespaces, and needlessly conflates validation with decoration.

Requiring W3C XML Schema validation to get IDs is too big a forwards step; it adds a lot of machinery to get a simple but crucial step forward and needlessly conflates validation with decoration.

However, I would prefer that W3C XML Schema be revised so that the behavior of documents that use xml:idAttr *and* use a W3C XML Schema is consistent with regard to the attribute declared of type ID in the instance, whether the Schema is used or not. In other words, an implicit declaration in the instance is the same in the PSVI as if the attribute had been declared of type ID in the Schema, except for the part of the PSVI that traces which Schema provided the rule; that part would report that the instance provided the rule.

--
Chris
mailto:chris@w3.org
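
To make the per-subtree scoping of option 6 concrete, here is a minimal sketch in Python. It rests on assumptions: xml:idAttr is only the attribute proposed in this message, not something existing parsers implement; the helper collect_ids and the sample document are hypothetical; and duplicate IDs are resolved by the "first one in document order" rule suggested above. Only unqualified attributes (the per-element partition) are considered, so the qname variation is not covered.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
# Hypothetical attribute from option 6; no parser treats it specially.
ID_ATTR = f"{{{XML_NS}}}idAttr"


def collect_ids(root):
    """Map each ID value to the first element (document order) carrying it."""
    ids = {}

    def walk(elem, inherited):
        # xml:idAttr on this element overrides the inherited attribute name;
        # the empty string switches ID typing off for this subtree.
        name = elem.get(ID_ATTR, inherited)
        if name:
            value = elem.get(name)  # unqualified attribute of that local name
            if value is not None:
                ids.setdefault(value, elem)  # duplicates: first one wins
        for child in elem:
            walk(child, name)

    walk(root, None)
    return ids


# Invented sample: "id" is the ID attribute by default, a foreign-namespace
# island opts out with "", and one subtree uses "name" instead.
doc = ET.fromstring(
    '<doc xml:idAttr="id">'
    '<sec id="a"/>'
    '<m:math xmlns:m="urn:example:m" xml:idAttr="">'
    '<m:row id="b"/>'
    '</m:math>'
    '<part xml:idAttr="name"><item name="c" id="d"/></part>'
    '<sec id="a" n="2"/>'
    '</doc>'
)
ids = collect_ids(doc)
print(sorted(ids))  # ['a', 'c']
```

In this sketch "b" is not an ID because its subtree opted out, "d" is not an ID because "name" is the designated attribute there, and the second element with id="a" is ignored rather than reported as an error.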
Received on Tuesday, 7 January 2003 13:27:07 UTC