- From: C. M. Sperberg-McQueen <cmsmcq@acm.org>
- Date: Wed, 13 Nov 2002 09:51:56 -0700
- To: xml-names-editor@w3.org
The following are personal comments on Namespaces in XML 1.1 from C. M. Sperberg-McQueen. I apologize for being so late with them. They arise out of my work preparing draft comments for the XML Schema Working Group, but since it looks as if that WG may take a while to reach consensus on some issues, I send these now as is, as personal comments.

I think the technical modifications made in version 1.1 are not problematic. I do think your plan to issue a new version of the specification without addressing some outstanding problems is problematic. As the comments below show, I think many of the outstanding problems in the spec -- particularly the large and unnecessary discrepancies between what it says its mechanism does and what the mechanism actually does -- can be eliminated very easily. Unlike some other people, I believe the outright false and misleading statements can be removed from the spec even without achieving consensus on the 'philosophy' of namespaces.

Within sections of the document, I indicate locations by paragraph (p) and sentence (s) number; negative numbers are counted from the end of the section or paragraph.

Section 1: Motivation and Summary

p1 s2 typo: for 'modularity;' read 'modularity:'

p1 s-1 editorial: "if ... a markup vocabulary exists ..., it is better to re-use this markup rather than re-invent it." I believe this statement in its current form, expressed without qualification or exception, is too strong. It does not in fact mirror the views or practice of all experienced language designers. For "it is better" I suggest "it may be better" or "some designers will prefer".

p2 s2 editorial: for "tags" I suggest "elements".

p3 editorial, SEVERE: The phrase "universal names", in conjunction with similar phrases (e.g. "universally unique" in p8) suggests to some readers names which are universally unambiguous.
Given a "universal name", such readers expect to be able to identify, without further information, a single object denoted by the universal name. Since (in the view of some observers including me) the specification could, in fact, if written differently, have provided identifiers with such globally unique denotation, it will not be immediately obvious to all readers that the interpretation just given of the phrase 'universal names' is erroneous. Since the spec does not, however, in fact provide globally unambiguous identifiers, it is unacceptable to describe it as if it did.

The misleading description in paragraph 3 cost the XML Schema Working Group a substantial amount of time -- my estimate at the time was that it cost us six months, but it may have been more -- owing to misconceptions about the nature of the Namespace Recommendation caused more or less directly by this paragraph.

I believe this paragraph should be deleted and replaced by one which accurately describes what the specification does. Possible replacement text:

    These considerations require that document constructs should have names constructed so as to avoid name clashes between names assigned by different designers, specifications, or naming authorities. This specification describes a mechanism, XML namespaces, which accomplishes this. It should be noted that the namespace-qualified names described by this specification are not guaranteed to have globally unique denotations; because this specification does not constrain the construction or internal structure of namespaces, it is possible for the same qualified name to denote more than one object. (For example, in a namespace for a typical XML vocabulary, an element type and a global attribute may have the same qualified name.) Such names must be disambiguated by means not prescribed by this specification; in practice, they are often disambiguated by reference to the context in which they are used.
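
The parenthetical example in the replacement text (an element type and a global attribute sharing one qualified name) can be illustrated with a minimal sketch using Python's standard xml.etree.ElementTree. The namespace URI http://example.org/vocab and the names 'book' and 'title' are hypothetical, invented only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the namespace URI and the names "book" and "title"
# are assumptions made for illustration, not taken from any real schema.
DOC = """<v:book xmlns:v="http://example.org/vocab" v:title="attribute value">
  <v:title>element content</v:title>
</v:book>"""

root = ET.fromstring(DOC)

# One namespace-qualified name, written in ElementTree's {uri}local notation.
NAME = "{http://example.org/vocab}title"

# The same qualified name identifies a child element ...
element = root.find(NAME)
# ... and also a global (prefixed) attribute on the parent element.
attribute_value = root.get(NAME)

print(element.text)
print(attribute_value)
```

Here the qualified name alone does not say which of the two objects is meant; only the context of use (element position versus attribute position) disambiguates it.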
p4 s-2/-1 editorial: These last two sentences have proven more misleading than they seem to me to be worth. They are in any case false, since (1) XML namespaces need not have any particular structure and (2) the namespaces used in conventional programming languages (e.g. the variable and function-name namespaces of Algol 60, C, Lisp, Pascal, etc.) are not sets.

On (1): Nothing in this specification or in the XML 1.0 or 1.1 specifications constrains the internal organization of a namespace, requires those responsible for a namespace to follow any particular discipline, or makes any guarantees about the nature of namespaces. The description of 'namespace partitions' in Appendix B is not normative and does not provide an adequate account of the naming discipline of XML vocabularies defined by any known schema language. XML 1.0 DTDs have naming constraints not described there, and newer schema languages including XML Schema 1.0 do not place all elements into the same symbol space.

On (2): in many conventional programming languages, the namespace used for variables (for example) is not guaranteed to be a set. There may be arbitrarily many distinct variables with the same name in a program; in Algol and its descendants they are distinguished by their lexical scope, and in some other languages by their dynamic scope.

On the whole, I think it may be wisest simply to delete these two sentences. I would be happy if it were possible to replace them with a coherent account of the notion of 'namespace' as used in this specification and how it relates to other uses of the term. But I recognize that clear, coherent descriptions of namespaces as defined by this specification have proven remarkably difficult to construct or to elicit.
The best I have managed is:

    The namespaces described by this specification differ from the 'namespaces' conventionally used in computing disciplines in that this specification does not define any particular internal structure for namespaces, nor prescribe any rules for resolving name clashes; such rules are the responsibility of specifications which use namespaces as part of their naming rules.

p5 s2 editorial: delete 'therefore'; in 'escaping , ' delete excess blank before comma.

p6 editorial, medium serious: for 'cannot be' I think 'must not be' should probably be substituted. I think, that is, that this is a normative prohibition, not the statement of a fact which could be established as a logical consequence of rules elsewhere in the spec; I realize I may be wrong in this thought.

p8 editorial, serious: I am slightly alarmed to find what appears to be a serious divergence between our terminology and that defined here. The XML Schema WG has found it exceedingly helpful, both in the XML Schema specification and in the internal discussions of the Working Group, to have several different pairs of terms, which denote respectively:

(a) names which are allowed to have a colon, vs. names which are not allowed to have a colon
(b) names which are associated with a namespace by the rules of the Namespace Recommendation, vs. names which are not assigned directly to any namespace
(c) names which in fact have a colon, vs. names with no colon

Distinction (a) is conveyed by the terms QName and NCName, both in this spec and in ours. Distinction (b) we have often made by means of the terms 'namespace-qualified name' vs. 'unqualified name'. We denote distinction (c) with the terms 'prefixed name' and 'unprefixed name'. It would be useful, I think, for all of those who must discuss the ins and outs of namespaces in XML, if the Namespaces in XML specification defined terms which conveyed distinctions (b) and (c).
The failure of the 1.0 version to do so is an annoying but easily reparable flaw.

p8 s-2 SEVERE: The term 'universally unique' is undefined and misleading. It suggests to some readers that the identifiers so described will or must have universally unique denotations (see the note on p3 above), but such universally unique denotation is neither guaranteed nor required by the mechanism defined in this spec.

It is true that a qualified name is universally unique in that it is necessarily distinct from any OTHER qualified name. But this is true of unqualified names as well: the identifier 'p' is necessarily distinct from any other identifier, i.e. from any identifier which is not 'p', and it is thus universally unique. Neither qualified names nor unqualified names are guaranteed to have unique denotations; what is achieved in practice by the Namespaces specification is that different naming authorities can assign names without the risk of name clashes between names assigned by different authorities.

(I note in passing that the use of namespaces can only guarantee freedom from name collisions if different naming authorities can be relied on to choose different URIs to serve as namespace names; in practice, they do, but I see nothing in the Namespaces in XML specification which provides any guidance on the matter. If two different naming authorities were to attempt to define names for the namespace 'http://ecommerce.org/schema', for example, there is no guarantee that they would successfully avoid name collisions. But I am unable to identify any rule in the Namespaces specification which would make their practice non-conforming.)

p9 s3 substantive, SEVERE: This sentence describes the syntax of namespace declarations as "attribute-based", but this is incompatible with the decision on this matter made by the XML Core Working Group in the development of the XML Information Set Recommendation.
I recall that the Core WG decided then, over the protests of some commentators, that namespace declarations are not attributes. The XML Schema spec has followed the lead of the Infoset spec in this matter and I do not propose that the decision should be reversed or revisited; that means, however, that the text here must be revised to conform to it. The syntax of namespace declarations may be described as attribute-like, or namespace declarations may be described as pseudo-attributes, but they MUST NOT be defined here as attributes and in the infoset spec as non-attributes.

Section 2 Declaring Namespaces

p1 s1 substantive, SEVERE: according to the infoset spec, namespaces are NOT declared using a family of attributes, but using namespace declarations (see above on section 1, p9). For 'attributes' perhaps read 'pseudo-attributes' or for 'using a family of reserved attributes' perhaps read 'using a special attribute-like syntax'.

Section 4 Using Qualified Names

p2 s1 (after production [11]) editorial, medium serious: for "a qualified name serving as an element type" read "a qualified element type name" or "a qualified name serving to identify an element type". Element types are not identical to their names.

p5 Namespace constraint: Prefix defined, s-1. For "empty" (which is undefined and prone to confusion) I recommend substituting "the empty string".

Section 5.1 Namespace Scoping

p-1, example: the start-tag for the third 'x' element is not aligned properly; should be aligned with the illegal 'n1:a' element above it.

Section 5.2 Namespace Defaulting

p2, example: I believe it would be desirable to use the actual XHTML namespace in this example.

p3 s-1, editorial: For "the same effect ... of there being" read "the effect ... of there being" or "the effect ... as there being".
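
Two of the distinctions drawn above, unprefixed vs. unqualified names (Section 1, p8) and namespace declarations vs. attributes (Section 1, p9; Section 2, p1), show up together in a namespace-defaulting document. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the XHTML namespace name is the real one, while the 'x' prefix, its URI http://example.org/extra, and the attribute names are hypothetical. It shows how one particular parser reports the document and is not a statement of what either specification requires.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# The "x" prefix, its URI, and the "note" attribute are assumptions made up
# for this illustration; only the XHTML namespace name is real.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml" xmlns:x="http://example.org/extra">
  <body><p class="c" x:note="n">text</p></body>
</html>"""

root = ET.fromstring(DOC)

# 'html' is unprefixed (no colon) yet namespace-qualified via the default namespace.
print(root.tag)

# This parser does not report the two namespace declarations as attributes.
print(root.attrib)

# On 'p', the unprefixed attribute 'class' is in no namespace, while the
# prefixed attribute 'x:note' is namespace-qualified.
p = root.find(f".//{{{XHTML}}}p")
for name in p.attrib:
    print(name)
```

The element name 'html' is thus an unprefixed name that is nonetheless namespace-qualified, whereas the attribute name 'class' is both unprefixed and unqualified; the terms 'prefixed' and 'qualified' are not interchangeable.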
Section 7 Internationalized Resource Identifiers (IRIs)

p2, ordered list, item 1, editorial: for 'bytes' read 'octets', to avoid confusion with the historic meaning of 'byte' as 'the number of bits needed to represent a single character'. (et passim)

Appendix B The Internal Structure of XML Namespaces

This appendix may have seemed a good idea when Namespaces 1.0 was issued; it has not worn well, and has caused more confusion than it has avoided. I believe it should either be rewritten or deleted. At the very least, the description of 'partitions' should be related explicitly to XML 1.0 and 1.1 DTDs, and it should be pointed out that other schema languages use different naming disciplines.
Received on Wednesday, 13 November 2002 11:52:46 UTC