RE: reference to element, elementFormDefault unqualified from noah_mendelsohn@us.ibm.com on 2006-04-16 (xmlschema-dev@w3.org from April 2006)

From: <noah_mendelsohn@us.ibm.com>
Date: Sun, 16 Apr 2006 11:37:39 -0400
To: "Michael Kay" <mike@saxonica.com>
Cc: "'Oliver Kusche'" <oli@trip.net>, xmlschema-dev@w3.org
Message-ID: <OFCCD6B794.ED34C98B-ON85257152.00536142-85257152.0055D856@lotus.com>
Michael Kay writes:

> elementFormDefault="unqualified" means that a locally-declared 
> element (one declared with <element name="x"> as part of a complex 
> type) will be in no namespace. (I've always thought this was a weird
> thing to want to do.)

It seems weird to me too, but since you brought it up I thought it might 
be worth reminding folks of an argument made by those who do advocate the 
use of unqualified local names: namely, that attributes give us a precedent. 
 In the realm of attributes, it's very common to have in the same 
instance:

        <people:name title="Mr." 
                xmlns:people="...peopleURI...">
                Smith
        </people:name>
        <library:book 
                xmlns:library="...bookURI..."
                title="War and Peace"/>

In other words, we see unqualified attributes that are scoped to their 
potentially qualified parent elements.  This usage is indirectly 
encouraged by the Namespaces recommendation, insofar as it declines to 
apply default namespaces to attributes.  So, the following are equivalent 
to the samples above.

        <name title="Mr." 
                xmlns="...peopleURI...">
                Smith
        </name>
        <book 
                xmlns="...bookURI..."
                title="War and Peace"/>

So, the elements pick up the qualification and the attributes do not; 
Namespaces thus encourages or at least facilitates use of unqualified 
attributes with qualified parents, and indeed this usage is now quite 
idiomatic in XML.
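
A minimal sketch of that behaviour, assuming Python's standard xml.etree.ElementTree parser and a made-up namespace URI (http://example.org/people) standing in for the elided ...peopleURI...; it is only an illustration, not part of the original argument. It checks that the prefixed and default-namespace spellings yield the same expanded names, and that the attribute stays in no namespace either way:

```python
import xml.etree.ElementTree as ET

# Hypothetical URI used in place of the elided "...peopleURI...".
PEOPLE_NS = "http://example.org/people"

prefixed = ET.fromstring(
    f'<people:name title="Mr." xmlns:people="{PEOPLE_NS}">Smith</people:name>'
)
defaulted = ET.fromstring(
    f'<name title="Mr." xmlns="{PEOPLE_NS}">Smith</name>'
)

# ElementTree reports expanded names as "{namespace-uri}local-name".
element_name = defaulted.tag
attribute_names = list(defaulted.attrib)

# The two spellings are equivalent once namespaces are resolved.
equivalent = (prefixed.tag, prefixed.attrib) == (defaulted.tag, defaulted.attrib)

print(element_name)      # element is namespace-qualified
print(attribute_names)   # attribute name carries no namespace
print(equivalent)
```

The element comes back qualified by the namespace, while the attribute key is the bare local name: the default namespace declaration never reaches it.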

Another and perhaps more compelling argument is that you can make the case 
that qualified names aspire to a uniqueness and perhaps a presence on the 
Web that unqualified names do not.  From that perspective, it's disturbing 
to see:

        <root xmlns:n="...sampleURI..."> 
          <n:personName n:title="Mr." >
            Smith
          </n:personName>
          <n:book 
                n:title="War and Peace"/>
        </root>

In the above example, the same fully qualified QName is used to refer to 
two semantically different things, the book title and the person title. By 
the same reasoning, the following is a bad thing to do in XML, 
particularly in the context of the Web:

        <root xmlns:n="...sampleURI..."> 
          <n:person>
            <n:title>Mr.</n:title>
            <n:name>Smith</n:name>
          </n:person>
          <n:book>
            <n:title>War and Peace</n:title>
          </n:book>
        </root>

and that's the example in question in this thread.  If someone wants to 
publish at ...sampleURI...#title, information about this qualified name 
(perhaps in a RDDL document), which semantic should it document?   So, 
from that perspective, the following is preferable, on the theory that the 
unqualified TAG names raise fewer expectations of global uniqueness:

        <root xmlns:n="...sampleURI..."> 
          <n:person>
            <title>Mr.</title>
            <n:name>Smith</n:name>
          </n:person>
          <n:book>
            <title>War and Peace</title>
          </n:book>
        </root>
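
To make the uniqueness argument concrete, here is a rough sketch, again assuming Python's standard xml.etree.ElementTree and a hypothetical URI (http://example.org/sample) in place of ...sampleURI...; the helper title_names is invented for the illustration. It lists the expanded name of each title element alongside that of its parent, for the qualified and the unqualified style:

```python
import xml.etree.ElementTree as ET

# Hypothetical URI used in place of the elided "...sampleURI...".
NS = "http://example.org/sample"

qualified = f"""
<root xmlns:n="{NS}">
  <n:person><n:title>Mr.</n:title><n:name>Smith</n:name></n:person>
  <n:book><n:title>War and Peace</n:title></n:book>
</root>""".strip()

unqualified = f"""
<root xmlns:n="{NS}">
  <n:person><title>Mr.</title><n:name>Smith</n:name></n:person>
  <n:book><title>War and Peace</title></n:book>
</root>""".strip()


def title_names(xml_text):
    """Return (parent expanded name, title expanded name) pairs."""
    root = ET.fromstring(xml_text)
    return [
        (parent.tag, child.tag)
        for parent in root
        for child in parent
        if child.tag.endswith("title")
    ]


qualified_titles = title_names(qualified)
unqualified_titles = title_names(unqualified)

for parent, title in qualified_titles + unqualified_titles:
    print(parent, "->", title)
```

In the qualified document both titles resolve to the single expanded name {http://example.org/sample}title, even though one is a person's title and the other a book's. In the unqualified document each title is a bare local name that makes no claim beyond its parent.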

Those are the reasons I've heard why some people like the unqualified 
names that Mike finds mysterious.

I suspect that many workgroups can point to one or two issues that 
consumed months of time, on which there were strongly held opinions on 
more than one side, and that would just not resolve.  For the Schema WG 
perhaps  years ago, this was a famous one.  The provision for the form 
attribute, and especially elementFormDefault, was in essence an acknowledgement that 
we could not get agreement from the community as to which idiom 
represented the 80/20 tradeoff.  I think that many people on both sides 
felt it was an ugly compromise, and we all knew that it would cause both 
complexity and confusion down the road.  Still, it was the best we could 
manage.  It was, I think, in part a reflection of the lack of deep 
architecture in XML and namespaces for any sort of local scoping;  when 
schemas tried to add local scoping without active cooperation from those 
other parts of the stack, the results were perhaps unavoidably messy.

FWIW:  my own positions are (1) in retrospect we probably should have kept 
it simple by avoiding local scoping at all (we could still have allowed 
element declarations syntactically within other element declarations, 
while treating all as global); some users would no doubt miss the ability 
to create conflicting definitions of the same QName, but the language 
would have been much simpler and the uniqueness questions would have been 
avoided;  (2) given that we have local scoping, I, like Mike, feel that 
using qualified names for locals makes more sense: to do otherwise 
clumsily reflects in the instance what is really an artifact of the 
schema, which is whether the definitions were packaged as locally scoped.

Anyway, I hope this bit of history sheds some light on how things came to 
be as they are.  I often use this as an example of the downsides of 
freezing one part of your stack (XML and Namespaces)  before you fully 
architect closely related ones (Schema and maybe Query.)  It's also an 
example of the main lesson I claim to have learned from XML Schema: 
"there's no such thing as a simple feature."  We heard early in our design 
that local scoping was widely used in other languages (it is) and that the 
implications for XML Schema would therefore be straightforward (they 
weren't).

Noah

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Sunday, 16 April 2006 15:38:12 UTC