RE: Options for dealing with IDs

Chris,

This is awesome!  Wow, I was going to volunteer to help, but this is great.

I don't see any mention of IDREF.  Seems a little strange to have idref be a
different mechanism.  For example, say xml:id was used.  Would idref be
defined in each vocabulary?  I think I can live with that, but probably it
needs to be called out.

Another potential aspect to look at is how XPointer would deal with these
various approaches.  Perhaps show how XPointer is affected by each,
particularly for bare names.  For example, Options #1 and #8 have the
disadvantage that XPointer would then require DTD or Schema validation.
Options #5 and #6 have the disadvantage that the XPointer processor has to
look at the xml:idAttr to figure out what the ID attribute name is.
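
Here's a rough sketch of what I mean for bare names, in Python with the
standard xml.etree.ElementTree module.  The function names and sample
documents are made up for illustration, and xml:idAttr is only the
attribute proposed in Option #5, not something any parser treats
specially today:

```python
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace URI.
XML_NS = "http://www.w3.org/XML/1998/namespace"


def resolve_bare_name_xml_id(root, name):
    """Option #4: the ID attribute is always xml:id, so no lookup step is needed."""
    for elem in root.iter():
        if elem.get(f"{{{XML_NS}}}id") == name:
            return elem
    return None


def resolve_bare_name_id_attr(root, name):
    """Option #5: read the (hypothetical) xml:idAttr on the root, then search."""
    id_attr = root.get(f"{{{XML_NS}}}idAttr")
    if not id_attr:
        return None  # nothing in the instance says which attribute is an ID
    for elem in root.iter():
        if elem.get(id_attr) == name:
            return elem
    return None


doc4 = ET.fromstring('<doc><sec xml:id="intro"/><sec xml:id="body"/></doc>')
doc5 = ET.fromstring(
    '<doc xml:idAttr="name"><sec name="intro"/><sec name="body"/></doc>'
)
```

Neither function needs a DTD or Schema, but the second one has the extra
step of finding the attribute name before it can resolve anything.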

#4) I see that XForms is working on an xml:id spec at
http://www.w3.org/MarkUp/Forms/Group/2002/10/xml-id.  Isn't another
disadvantage that the user can't use different names for IDs?  Not just
existing content, but new content.

#5, #6) another disadvantage is that they are more complicated than #4.
With #6 there may be many subtrees that change the name, and so many
xml:idAttr declarations.
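
To illustrate the scoping a processor would have to track under #6, here
is a minimal Python sketch.  It assumes my reading of the proposal (the
nearest xml:idAttr on an ancestor-or-self element wins, and "" switches
IDs off); collect_ids and the sample document are hypothetical:

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"
ID_ATTR = f"{{{XML_NS}}}idAttr"  # hypothetical attribute from Option #6


def collect_ids(elem, inherited=None, found=None):
    """Return (id value, element tag) pairs, honouring per-subtree xml:idAttr."""
    found = [] if found is None else found
    # A declaration on this element overrides whatever was inherited;
    # an empty string means "no ID attribute in this subtree".
    current = elem.get(ID_ATTR, inherited)
    if current and elem.get(current) is not None:
        found.append((elem.get(current), elem.tag))
    for child in elem:
        collect_ids(child, current, found)
    return found


doc = ET.fromstring(
    """<doc xml:idAttr="id">
      <a id="a1"/>
      <part xml:idAttr="name" name="p1" id="ignored">
        <b name="b1"/>
      </part>
      <raw xml:idAttr="">
        <c id="not-an-id"/>
      </raw>
    </doc>"""
)

ids = collect_ids(doc)
declarations = sum(1 for e in doc.iter() if ID_ATTR in e.attrib)
```

Even this small document needs three declarations, and the same
attribute name ("id") is an ID in one subtree and not in the others.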

I lean towards #4.  But then I wanted include to be in the xml ns too :-)

Cheers,
Dave


> -----Original Message-----
> From: www-tag-request@w3.org
> [mailto:www-tag-request@w3.org]On Behalf Of
> Chris Lilley
> Sent: Tuesday, January 07, 2003 10:27 AM
> To: www-tag@w3.org
> Subject: Options for dealing with IDs
>
>
>
> Hello www-tag,
>
>   As requested by Dave Orchard, a listing of the options for dealing
>   with IDs.
>
>   1) Require DTD validation of all instances.
>   A fully validating XML processor will, almost as a side effect,
>   result in all attributes of type ID being so noted in the Infoset.
>
>   Advantages:
>   - existing mechanism (DTDs)
>
>   Disadvantages:
>   - existing mechanism is poor,
>   - not namespace aware,
>   - can't declare a content model of 'any' that really means 'any',
>   - can't use with mixed namespace documents easily
>   - hinders composability
>   - needlessly conflates validation with decoration
>   - leaves well formed documents in a backwater
>   - retrogressive step
>
>   2) Steal all attributes of name id in the per-element partition
>   Declare retrospectively that all attributes whose name is 'id' are of
>   type ID because this is common practice anyway.
>
>   Advantages
>   - much existing content becomes conformant without change
>   - easy to explain
>
>   Disadvantages
>   - no help for content that uses a different name for its IDs
>   - some existing content becomes changed retrospectively
>   - may clash with declarations in DTDs or Schemas
>   - user outrage, xml can only control the syntax and the
>     xml namespace, not other namespaces
>   - different behavior in validating and non-validating parsers
>   - requires a change to CSS
>   - requires a change to XPath 1.0
>   - requires a change to DOM levels 1, 2 and 3
>   - requires a change to XSLT
>   - requires a change to (insert your spec here)
>
>   3) Steal undeclared attributes of name id
>   In well formed content that does not have a DTD, or that has a
>   partial DTD used for decoration (declaring ID, declaring attribute
>   defaults, etc) if an attribute is called id and has not been
>   declared in the DTD, it is of type ID.
>
>   Advantages
>   - much existing content becomes conformant without change
>   - fairly easy to explain
>
>   Disadvantages
>   - no help for content that uses a different name for its IDs
>   - some existing content becomes changed retrospectively
>   - may clash with declarations in Schemas
>   - user annoyance, xml can only control the syntax and the
>     xml namespace, not other namespaces
>   - different behavior in validating and non-validating parsers
>   - requires a change to CSS
>   - requires a change to XPath 1.0
>   - requires a change to DOM levels 1, 2 and 3
>   - requires a change to XSLT
>   - requires a change to (insert your spec here)
>
>   4) Add a predeclared id attribute to the xml namespace
>   In the same way that xml:base added a predeclared attribute to the
>   existing xml:lang and xml:space attributes, add another one called
>   xml:id. It is of type ID. It cannot be declared (or redeclared)
>   and thus its type cannot be changed. It can be used wherever you
>   want a reliable, interoperable identifier.
>
>   Advantages
>   - easy to explain
>   - easy to use
>   - easy to change content to use the new syntax
>   - no clash with DTDs or Schemas
>   - existing content not inadvertently affected
>
>   Disadvantages
>   - requires a (small) change to XML spec and XML parsers
>   - no help for (all existing) content that uses a different
>     name for its IDs
>   - requires revision in any content specs that want to make use of it
>
>   5) Add an inline, per-instance ID declaration method
>   In the same way that xml:base added a predeclared attribute to the
>   existing xml:lang and xml:space attributes, add another one called
>   xml:idAttr. It takes as value the local name of an attribute. All
>   attributes of that name in the per-element partition become of type
>   ID. It may only be used on the root element of the instance.
>
>   Advantages
>   - easy to explain (easier than the DTD syntax, probably)
>   - easy to use
>   - existing content not inadvertently affected
>   - very easy to change content to use the new syntax
>
>   Disadvantages
>   - requires a (small) change to XML spec and XML parsers
>   - may clash with declarations in DTDs or Schemas
>   - different behavior in validating and non-validating parsers
>   - limits composability
>
>   6) Add an inline, per subtree ID declaration method
>   In the same way that xml:base added a predeclared attribute to the
>   existing xml:lang and xml:space attributes, add another one called
>   xml:idAttr. It takes as value the local name of an attribute. All
>   attributes of that name in the per-element partition, on that
>   element and its children become of type ID. It can be used on any
>   element. It can also take the value "" in which case, no attributes
>   on that element or its children are declared to be of type ID (used
>   when composing multiple namespaces).
>
>   Advantages
>   - fairly easy to explain (easier than the DTD syntax, probably)
>   - easy to use
>   - existing content not inadvertently affected
>   - very easy to change content to use the new syntax
>   - aids composability
>   - does not affect well-formed portions of multi-namespace documents
>
>   Disadvantages
>   - requires a (small) change to XML spec and XML parsers
>   - may clash with declarations in DTDs or Schemas
>   - different behavior in validating and non-validating parsers
>
>   7) Muddle along
>   Do nothing. Accept weasel wording in the DOM spec about knowledge of
>   'well known namespaces' and conformance loopholes in the CSS spec
>   about possible breakage in namespaces other than HTML and accept
>   that we can't really point into XML documents unless we can be sure
>   the client uses a validating parser and besides, it works in HTML so
>   far and no-one really uses XML on the client anyway.
>
>   Advantages
>   - familiar pain
>   - no changes to existing specs
>
>   Disadvantages
>   - new specs need similar weasel wording
>   - interoperability headaches
>   - user confusion about when it is an ID and when it is not
>   - interoperability depends on the transmission of secret
>     knowledge among cognoscenti
>   - multi-namespace document integration not made easier
>   - cross-namespace XML DOM scripting still hit and miss
>   - it's a wart, and a readily fixable one
>
>
>   8) Require W3C XML Schema validation of all instances.
>   A fully validating XML processor will, almost as a side effect,
>   result in all attributes of type ID being so noted in the Infoset.
>
>   Advantages:
>   - existing mechanism starting to see acceptance
>
>   Disadvantages:
>   - existing mechanism is not fully deployed
>   - too heavyweight for such a simple problem, will not be
>     used on mobile platforms or other small devices
>   - needlessly conflates validation with decoration
>   - leaves well formed documents in a backwater
>
> An optional variation on 5) and 6) is to accept either a local name or
> a QName; if it's a QName then resolve to a namespace URI, local name
> pair on the element that has xml:idAttr and then all attributes
> with that local name in that namespace are of type ID.
>
> In passing, note that the separation of validation from decoration has
> an additional benefit: ID uniqueness remains a validation constraint
> so in well formed XML, there can be multiple IDs with the same value
> and if that happens, well the first one in document order is the
> correct one (or some better scheme to be devised, but it's not an
> error).
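
The "first one in document order" rule quoted above can be sketched in a
few lines of Python.  This is only an illustration: build_id_index is a
hypothetical helper, and it assumes the ID attribute is simply named
"id":

```python
import xml.etree.ElementTree as ET


def build_id_index(root, id_attr="id"):
    """Map each ID value to the first element, in document order, that carries it."""
    index = {}
    for elem in root.iter():  # iter() visits elements in document order
        value = elem.get(id_attr)
        if value is not None:
            # setdefault keeps the earlier element; a duplicate is not an error.
            index.setdefault(value, elem)
    return index


doc = ET.fromstring(
    '<doc><p id="x" n="first"/><p id="x" n="second"/><p id="y" n="third"/></doc>'
)
index = build_id_index(doc)
```

A lookup of "x" then lands on the element marked n="first", and the
duplicate is silently ignored rather than reported.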
>
> If I have omitted a solution, or omitted significant advantages or
> disadvantages, I would be glad to hear them.
>
> My personal preference is for option 6) Add an inline, per subtree ID
> declaration method. It would require work on what the precedence is
> (or what sort of error it is) if the DTD or Schema declares the
> designated attribute to be of a type other than ID.
>
> Most (but not all) attributes called id are of type ID. Most (but not
> all) attributes of type ID are called id. 100% of single-namespace
> documents could be brought into conformance with this proposal by
> adding a single attribute to the root element. 99% of them would be
> brought into conformance by adding
>
> xml:idAttr="id"
>
> to the root element. Crucially, the 1% that do not are still catered
> for, a big advantage over options 2, 3 and 4.
>
> Requiring DTD validation to get IDs is too big a retrogressive step;
> it essentially throws away well formedness as a concept and also XML
> namespaces, and needlessly conflates validation with decoration.
>
> Requiring W3C XML Schema validation to get IDs is too big a forwards
> step; it adds a lot of machinery to get a simple but crucial step
> forward and needlessly conflates validation with decoration.
>
> However, I would prefer that W3C XML Schema be revised so that the
> behavior of documents that use xml:idAttr *and* use a W3C XML Schema
> is consistent with regards to the attribute declared of type ID in the
> instance, whether the Schema is used or not (in other words, an
> implicit declaration in the instance is the same in the PSVI as if the
> attribute had been declared of type ID in the Schema, except that part
> of the PSVI that traces which Schema provided the rule - that part
> would report that the instance provided the rule).
>
>
> --
>  Chris                          mailto:chris@w3.org
>
>

Received on Tuesday, 7 January 2003 16:46:45 UTC