Request for new TAG issue: Adding terms to a namespace from Norman Walsh on 2005-02-09 (public-xml-core-wg@w3.org from February 2005)

From: Norman Walsh <Norman.Walsh@Sun.COM>
Date: Wed, 09 Feb 2005 15:28:15 -0500
To: www-tag@w3.org
Cc: public-xml-core-wg@w3.org, w3c-xml-cg@w3.org
Message-id: <87u0olbhr4.fsf@nwalsh.com>
On behalf of both the XML Core WG and the XML Coordination Group, I have
been asked to present an issue to the TAG. This issue has actually
surfaced on this list already[1] and there is a long and sometimes
heated discussion of it underway on the public-xml-id mailing list[2].

  [In the interest of full disclosure, I am the editor of the
   xml:id specification and I have strong feelings about this
   issue; however, I have attempted to compose this message in
   an impartial manner.]

At its most philosophical level, the question is about the identity of
a namespace, in particular, the xml: namespace. One perspective is
that the xml: namespace consists of xml:space, xml:lang, and xml:base
(and no other names) because there was a point in time in which those
were the only three names from that namespace that had a defined
meaning. Another perspective is that the xml: namespace consists of
all possible local names and that only a finite (but flexible) number
of them are defined at any given point in time.

The question has practical ramifications and is biting rather hard
just at the moment because of a particular set of issues surrounding
xml:id[3].

The Canonical XML[4] specification describes a process by which a
canonicalizer copies attributes from the xml: namespace. In
particular, it says:

   The processing of an element node E MUST be modified slightly when
   an XPath node-set is given as input and the element's parent is
   omitted from the node-set. ... All element nodes along E's ancestor
   axis are examined for nearest occurrences of attributes in the xml
   namespace ... From this list of attributes, remove any that are in
   E's attribute axis (whether or not they are in the node-set). Then,
   lexicographically merge this attribute list with the nodes of E's
   attribute axis that are in the node-set. The result of visiting
   the attribute axis is computed by processing the attribute nodes
   in this merged attribute list.

This process interacts badly with the xml:id attribute because it may
copy that attribute onto arbitrarily many elements in the resulting
node set and that will surely violate the uniqueness constraints of
xml:id. (Note that this is not, in fact, a fatal error and xml:id-aware
processors will be able to process the resulting document, though it
may have somewhat ambiguous semantics.)
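
To make the interaction concrete, here is a rough sketch in Python of
the inheritance step quoted above. It is not a C14N implementation:
the helper names (inherited_xml_attrs, orphan_attrs) and the sample
document are invented for illustration, a "node-set" is reduced to a
set of selected elements, and all of an element's own attributes are
assumed to be in the node-set.

```python
import xml.etree.ElementTree as ET

# ElementTree spells xml:* attribute names with this namespace prefix.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

def inherited_xml_attrs(elem, parent_of):
    """Collect the nearest xml:* attributes from elem's ancestors,
    skipping any that elem already carries itself."""
    inherited = {}
    ancestor = parent_of.get(elem)
    while ancestor is not None:
        for name, value in ancestor.attrib.items():
            if name.startswith(XML_NS) and name not in elem.attrib:
                inherited.setdefault(name, value)  # nearest ancestor wins
        ancestor = parent_of.get(ancestor)
    return inherited

def orphan_attrs(root, selected):
    """For each selected element whose parent is not selected, return
    its own attributes merged with the inherited xml:* attributes.
    Results are keyed by tag name (a simplification for this demo)."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    result = {}
    for elem in root.iter():
        if elem in selected and parent_of.get(elem) not in selected:
            merged = dict(elem.attrib)
            merged.update(inherited_xml_attrs(elem, parent_of))
            result[elem.tag] = dict(sorted(merged.items()))
    return result

doc = ET.fromstring(
    '<doc xml:id="d1" xml:lang="en"><a/><b xml:lang="fr"/></doc>'
)
# Select the two children but omit their parent <doc>.
selected = {doc.find("a"), doc.find("b")}
attrs = orphan_attrs(doc, selected)
ids = [a[XML_NS + "id"] for a in attrs.values() if XML_NS + "id" in a]
print(ids)  # the same ID value now appears on both <a> and <b>
```

In this sketch both children inherit xml:id="d1" from the omitted
parent, which is harmless for xml:lang but breaks uniqueness for an
ID-typed attribute.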

Many people view this as a bug in the C14N specification. C14N
anticipated the semantics of all future attributes that might be added
to the xml: namespace and assumed that they would be inheritable. The
C14N specification, one can argue, had no authority to predict the
semantics of attributes in a namespace that it does not control.

However, it is clear that this problem cannot be addressed in C14N as
an erratum. Existing C14N processors (used at a very low level in
application stacks for processes such as digital signatures and
encryption) will continue to do what they do. The XML Core WG is in
the process of getting its charter revised so that it will be able to
address this defect in C14N, but the result will be a new
specification and will not immediately address the legacy problem.

There is a proposed solution to the xml:id problem: name the attribute
xmlid (without a colon). In fact the XML Recommendation reserves all
identifiers that begin with "[xX][mM][lL]" so this is a technically
possible solution. This solution has some support in the Core WG,
although it is by no means a consensus position at this time. Many
members of the Core WG (and the XML CG) prefer xml:id.

Beyond the particular issues of xml:id, the Core WG and the XML CG
look to the TAG to provide some architectural guidance for the
maintenance of namespaces in general. Can a working group define
previously undefined names in a namespace, or is a namespace, once
published, a closed, immutable set?

Proponents of the position that the WG is free to define new names can
point to xml:base as providing a precedent in this direction, although
I have not found any clear record of this particular issue being
discussed at any length during the development of xml:base, so
proponents of the other position can argue that xml:base was an
accident and does not set a precedent.

We hope that the TAG will agree to accept this issue and provide an
architectural principle which can be used to guide the development not
only of xml:id but also of all future names in the xml: and other
namespaces.

On behalf of the XML Core and XML CG,
Norman Walsh

[1] http://lists.w3.org/Archives/Public/www-tag/2005Feb/0014.html
[2] http://lists.w3.org/Archives/Public/public-xml-id/2005Feb/thread.html
[3] http://www.w3.org/TR/xml-id/
[4] http://www.w3.org/TR/2001/REC-xml-c14n-20010315

-- 
Norman.Walsh@Sun.COM / XML Standards Architect / Sun Microsystems, Inc.
Received on Wednesday, 9 February 2005 20:28:19 UTC