last-call comments on Namespaces in XML 1.1 of 5 September 2002

The following are personal comments on Namespaces in XML 1.1 from
C. M. Sperberg-McQueen.  I apologize for being so late with them.
They arise out of my work preparing draft comments for the XML Schema
Working Group, but since it looks as if that WG may take a while to
reach consensus on some issues, I send these now as is, as personal
comments.

I think the technical modifications made in version 1.1 are not
problematic.

I do think your plan to issue a new version of the specification
without addressing some outstanding problems is problematic.  As the
comments below show, I think many of the outstanding problems in the
spec -- particularly the large and unnecessary discrepancies between
what it says its mechanism does and what the mechanism actually does
-- can be eliminated very easily.  Unlike some other people, I believe
the outright false and misleading statements can be removed from the
spec even without achieving consensus on the 'philosophy' of
namespaces.

Within sections of the document, I indicate locations by paragraph (p)
and sentence (s) number; negative numbers are counted from the end of
the section or paragraph.

Section 1: Motivation and Summary

p1 s2 typo: for 'modularity;' read 'modularity:'

p1 s-1 editorial: "if ... a markup vocabulary exists ..., it is better
to re-use this markup rather than re-invent it."  I believe this
statement in its current form, expressed without qualification or
exception, is too strong.  It does not in fact mirror the views or
practice of all experienced language designers.  For "it is better" I
suggest "it may be better" or "some designers will prefer".

p2 s2 editorial: for "tags" I suggest "elements".

p3 editorial, SEVERE: The phrase "universal names", in conjunction
with similar phrases (e.g. "universally unique" in p8) suggests to
some readers names which are universally unambiguous.  Given a
"universal name", such readers expect to be able to identify, without
further information, a single object denoted by the universal name.

Since (in the view of some observers including me) the specification
could, in fact, if written differently, have provided identifiers with
such globally unique denotation, it will not be immediately obvious to
all readers that the interpretation just given of the phrase
'universal names' is erroneous.  Since the spec does not, however, in
fact provide globally unambiguous identifiers, it is unacceptable to
describe it as if it did.

The misleading description in paragraph 3 cost the XML Schema Working
Group a substantial amount of time -- my estimate at the time was that
it cost us six months, but it may have been more -- owing to
misconceptions about the nature of the Namespace Recommendation caused
more or less directly by this paragraph.  I believe this paragraph
should be deleted and replaced by one which accurately describes what
the specification does.  Possible replacement text:

     These considerations require that document constructs should have
     names constructed so as to avoid name clashes between names
     assigned by different designers, specifications, or naming
     authorities.  This specification describes a mechanism, XML
     namespaces, which accomplishes this.

     It should be noted that the namespace-qualified names described by
     this specification are not guaranteed to have globally unique
     denotations; because this specification does not constrain the
     construction or internal structure of namespaces, it is possible
     for the same qualified name to denote more than one object.  (For
     example, in a namespace for a typical XML vocabulary, an element
     type and a global attribute may have the same qualified name.)
     Such names must be disambiguated by means not prescribed by this
     specification; in practice, they are often disambiguated by
     reference to the context in which they are used.
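
As a rough illustration of the parenthetical example in the proposed
text, the following sketch uses Python's standard
xml.etree.ElementTree module.  The vocabulary, the namespace name
http://example.org/vocab, and the local name 'title' are all invented
for the illustration; nothing here is taken from the specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the namespace name and the local name "title"
# are invented for this illustration.
DOC = """\
<v:book xmlns:v="http://example.org/vocab" v:title="as attribute">
  <v:title>as element</v:title>
</v:book>"""

root = ET.fromstring(DOC)

# ElementTree writes a qualified name as "{namespace name}local name".
expanded = "{http://example.org/vocab}title"

# The same qualified name identifies a child element ...
element = root.find(expanded)
# ... and a namespace-qualified (global) attribute of the parent.
attribute_value = root.get(expanded)

print(element.text, "/", attribute_value)
```

The one qualified name picks out two different objects; which one is
meant can be told only from where the name is used (element lookup
vs. attribute lookup), not from the name itself.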

p4 s-2/-1 editorial: These last two sentences have proven more
misleading than they seem to me to be worth.  They are in any case
false, since (1) XML namespaces need not have any particular structure
and (2) the namespaces used in conventional programming languages
(e.g. the variable and function-name namespaces of Algol 60, C, Lisp,
Pascal, etc.) are not sets.

On (1): Nothing in this specification or in the XML 1.0 or 1.1
specifications constrains the internal organization of a namespace,
requires those responsible for a namespace to follow any particular
discipline, or makes any guarantees about the nature of namespaces.
The description of 'namespace partitions' in Appendix B is not
normative and does not provide an adequate account of the naming
discipline of XML vocabularies defined by any known schema language.
XML 1.0 DTDs have naming constraints not described there, and newer
schema languages including XML Schema 1.0 do not place all elements
into the same symbol space.

On (2): in many conventional programming languages, the namespace used
for variables (for example) is not guaranteed to be a set.  There may
be arbitrarily many distinct variables with the same name in a
program; in Algol and its descendants they are distinguished by their
lexical scope, and in some other languages by their dynamic scope.
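
A minimal sketch of point (2), written in Python rather than in any of
the languages named above, and assuming only that Python's lexical
scoping is close enough to that of the Algol family to make the point.
The function and variable names are arbitrary.

```python
# Three distinct variables, all named "x", coexist in one program;
# lexical scope decides which one a given use of "x" denotes.
x = "module-level x"


def outer():
    x = "outer's x"          # a second variable named x

    def inner():
        x = "inner's x"      # a third variable named x
        return x

    return x, inner()


def lookup():
    # No local binding here, so "x" denotes the module-level variable.
    return x


results = (x, *outer(), lookup())
print(results)
```

The name 'x' alone does not determine a variable; the name together
with the scope in which it occurs does.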

On the whole, I think it may be wisest simply to delete these two
sentences.  I would be happy if it were possible to replace them with
a coherent account of the notion of 'namespace' as used in this
specification and how it relates to other uses of the term.  But I
recognize that clear, coherent descriptions of namespaces as defined
by this specification have proven remarkably difficult to construct or
to elicit.  The best I have managed is:

     The namespaces described by this specification differ from the
     'namespaces' conventionally used in computing disciplines in that
     this specification does not define any particular internal
     structure for namespaces, nor prescribe any rules for resolving
     name clashes; such rules are the responsibility of specifications
     which use namespaces as part of their naming rules.

p5 s2 editorial: delete 'therefore'; in 'escaping , ' delete excess
blank before comma.

p6 editorial, medium serious: for 'cannot be' I think 'must not be'
should probably be substituted.  I think, that is, that this is a
normative prohibition, not the statement of a fact which could be
established as a logical consequence of rules elsewhere in the spec; I
realize I may be wrong in this thought.

p8 editorial, serious: I am slightly alarmed to find what appears to
be a serious divergence between our terminology and that defined here.
The XML Schema WG has found it exceedingly helpful, both in the XML
Schema specification and in the internal discussions of the Working
Group, to have several different pairs of terms, which denote
respectively:

   (a) names which are allowed to have a colon, vs. names which
       are not allowed to have a colon
   (b) names which are associated with a namespace by the rules
       of the Namespace Recommendation, vs. names which are not
       assigned directly to any namespace
   (c) names which in fact have a colon, vs. names with no colon

Distinction (a) is conveyed by the terms QName and NCName, both in
this spec and in ours.  Distinction (b) we have often made by means of
the terms 'namespace-qualified name' vs. 'unqualified name'.  We
denote distinction (c) with the terms 'prefixed name' and 'unprefixed
name'. It would be useful, I think, for all of those who must discuss
the ins and outs of namespaces in XML, if the Namespaces in XML
specification defined terms which conveyed distinctions (b) and (c).
The failure of the 1.0 version to do so is an annoying but easily
reparable flaw.
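
Distinctions (b) and (c) can be made concrete with a short sketch
using Python's standard xml.dom.minidom module.  The document, the
namespace names, and the helper function classify() are hypothetical
and serve only to show that the two distinctions are independent of
each other.

```python
from xml.dom import minidom

# Hypothetical document; the namespace names are placeholders.
DOC = """\
<order xmlns="http://example.org/orders"
       xmlns:p="http://example.org/parts" status="new">
  <p:item p:code="42"/>
</order>"""


def classify(node):
    """Return (name, prefixed?, namespace-qualified?) for a node."""
    prefixed = ":" in node.nodeName        # distinction (c)
    qualified = bool(node.namespaceURI)    # distinction (b)
    return node.nodeName, prefixed, qualified


doc = minidom.parseString(DOC)
order = doc.documentElement
item = order.getElementsByTagName("p:item")[0]

rows = [
    classify(order),                             # unprefixed, qualified
    classify(order.getAttributeNode("status")),  # unprefixed, unqualified
    classify(item),                              # prefixed, qualified
    classify(item.getAttributeNode("p:code")),   # prefixed, qualified
]
for row in rows:
    print(row)
```

The 'order' element is the interesting case: its name has no prefix,
yet the default namespace declaration makes it namespace-qualified.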

p8 s-2 SEVERE: The term 'universally unique' is undefined and
misleading.  It suggests to some readers that the identifiers so
described will or must have universally unique denotations (see the
note on p3 above), but such universally unique denotation is neither
guaranteed nor required by the mechanism defined in this spec.

It is true that a qualified name is universally unique in that it is
necessarily distinct from any OTHER qualified name.  But this is true
of unqualified names as well: the identifier 'p' is necessarily
distinct from any other identifier, i.e. from any identifier which is
not 'p', and it is thus universally unique.

Neither qualified names nor unqualified names are guaranteed to have
unique denotations; what is achieved in practice by the Namespaces
specification is that different naming authorities can assign names
without the risk of name clashes between names assigned by different
authorities.

(I note in passing that the use of namespaces can only guarantee
freedom from name collisions if different naming authorities can be
relied on to choose different URIs to serve as namespace names; in
practice, they do, but I see nothing in the Namespaces in XML
specification which provides any guidance on the matter.  If two
different naming authorities were to attempt to define names for the
namespace 'http://ecommerce.org/schema', for example, there is no
guarantee that they would successfully avoid name collisions.  But I
am unable to identify any rule in the Namespaces specification which
would make their practice non-conforming.)

p9 s3 substantive, SEVERE: This sentence describes the syntax of
namespace declarations as "attribute-based", but this is incompatible
with the decision on this matter made by the XML Core Working Group in
the development of the XML Information Set Recommendation.  I recall
that the Core WG decided then, over the protests of some commentators,
that namespace declarations are not attributes. The XML Schema spec
has followed the lead of the Infoset spec in this matter and I do not
propose that the decision should be reversed or revisited; that means,
however, that the text here must be revised to conform to it.  The
syntax of namespace declarations may be described as attribute-like,
or namespace declarations may be described as pseudo-attributes, but
they MUST NOT be defined here as attributes and in the infoset spec as
non-attributes.
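
The practical cost of leaving the question open can be seen in how
software treats declarations.  The sketch below compares two modules
from the Python standard library; it shows only how those two
libraries happen to behave and is not evidence of what either
specification requires.  The document and its namespace name are
placeholders.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical one-element document; the namespace name is a placeholder.
DOC = '<n:a xmlns:n="http://example.org/ns" n:b="1"/>'

# ElementTree resolves the declaration and omits it from the attribute map.
et_attrs = sorted(ET.fromstring(DOC).attrib)

# minidom keeps the declaration as an attribute node beside the real one.
dom_attrs = sorted(minidom.parseString(DOC).documentElement.attributes.keys())

print(et_attrs)
print(dom_attrs)
```

One library reports a single attribute, the other reports two, for the
same start-tag.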

Section 2 Declaring Namespaces

p1 s1 substantive, SEVERE: according to the infoset spec, namespaces
are NOT declared using a family of attributes, but using namespace
declarations (see above on section 1, p9).  For 'attributes' perhaps
read 'pseudo-attributes' or for 'using a family of reserved
attributes' perhaps read 'using a special attribute-like syntax'.

Section 4 Using Qualified Names

p2 s1 (after production [11]) editorial, medium serious: for "a
qualified name serving as an element type" read "a qualified element
type name" or "a qualified name serving to identify an element type".
Element types are not identical to their names.

p5 Namespace constraint: Prefix defined, s-1. For "empty" (which is
undefined and prone to confusion) I recommend substituting "the empty
string".

Section 5.1 Namespace Scoping

p-1, example: the start-tag for the third 'x' element is not aligned
properly; should be aligned with the illegal 'n1:a' element above it.

Section 5.2 Namespace Defaulting

p2, example: I believe it would be desirable to use the actual XHTML
namespace in this example.

p3 s-1, editorial: For "the same effect ... of there being" read "the
effect ... of there being" or "the effect ... as there being".

Section 7 Internationalized Resource Identifiers (IRIs)

p2, ordered list, item 1, editorial: for 'bytes' read 'octets', to
avoid confusion with the historic meaning of 'byte' as 'the number of
bits needed to represent a single character'. (et passim)

Appendix B The Internal Structure of XML Namespaces

This appendix may have seemed a good idea when Namespaces 1.0 was
issued; it has not worn well, and has caused more confusion than it
has avoided.  I believe it should either be rewritten or deleted.  At
the very least, the description of 'partitions' should be related
explicitly to XML 1.0 and 1.1 DTDs, and it should be pointed out that
other schema languages use different naming disciplines.

Received on Wednesday, 13 November 2002 11:52:46 UTC