Re: Attribute uniqueness test: a radical proposal from Paul W. Abrahams on 2000-05-28 (xml-uri@w3.org from May 2000)

From: Paul W. Abrahams <abrahams@valinet.com>
Date: Sun, 28 May 2000 14:44:14 -0400
To: David Carlisle <david@dcarlisle.demon.co.uk>
CC: abrahams@acm.org, XML-uri@w3.org
Message-ID: <393168FE.BE8CA1CB@valinet.com>

David Carlisle wrote:

> So if x: and y: are bound to the same namespace  x:x="1" and y:x="2"
> are two settings of the same attribute to different values. It makes
> no more sense than saying that you should allow
> x="1" and x='2' in the same element start tag, as they use different
> quote forms so can be distinguished.
> Certainly in my own namespace processor, if the namespace rec was
> changed to drop this restriction that an attribute could appear more
> than once I would have to completely re-write it, it is built
> assuming that an attribute only is used once (and it really doesn't
> keep the prefix information, that is lost at the same time that
> the " ' distinction or white space around the = is lost.
>
> If x: and y: were bound to a namespace in the document and z: was
> bound to the same namespace in the stylesheet
>
> then given
>
> <xxxx  x:x="1" and y:x="2"/>
>
> in the document what would you want
>
> <xsl:value-of xxxx/@z:x />
>
> to be?
>
> > Can you give an example showing how a
> > document that is well-formed according to the modified uniqueness test
> > is turned into a document that is not well-formed?
>
> Yes, the above document, to a namespace parser has the same element
> attribute structure as
>
> <xxxx  x:x="1" and x:x="2"/>
>
> but this isn't well formed XML.

I agree, that's a compelling argument for retaining the uniqueness test.
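
To make the point concrete, here is a minimal sketch (in Python, not taken from anyone's actual parser) of a uniqueness test done on expanded names, with namespace names compared as literal strings. The function names and the namespace name "urn:example:ns" are made up for illustration.

```python
def expanded_name(qname, bindings):
    """Map 'prefix:local' to (namespace_name, local) via the prefix bindings."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        # An unprefixed attribute is in no namespace.
        return (None, qname)
    return (bindings[prefix], local)


def duplicate_attributes(qnames, bindings):
    """Return the expanded names that occur more than once in a start tag."""
    seen = set()
    dups = []
    for qname in qnames:
        name = expanded_name(qname, bindings)
        if name in seen:
            dups.append(name)
        seen.add(name)
    return dups


# x: and y: bound to the same (hypothetical) namespace name.
bindings = {"x": "urn:example:ns", "y": "urn:example:ns"}
print(duplicate_attributes(["x:x", "y:x"], bindings))
# prints [('urn:example:ns', 'x')]
```

Once the prefixes are replaced by the namespace name, x:x and y:x are the same attribute, so a processor that drops prefix information has no way to tell them apart.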

> While namespaces were being discussed on xml-dev there were proposals
> that already at the level of sax it would be the case that the
> prefix info was dropped. In fact a sax2 parser will keep some other
> information about names, and about entity references and other stuff
> but that is essentially extra info not forced by the model.
> As I say, if x: and y: are bound to the same prefix then my own system
> will produce exactly the same internal structure for these two
> <xxxx  x:x="1" and y:x="2"/>
> <xxxx  x:x="1" and x:x="2"/>
> which actually is unspecified behaviour since the second isn't well
> formed XML.

Your parser, it seems, depends very explicitly on the assumption of literal
string comparison.  Fair enough, since that's what the namespace spec says.
But it's then necessary to carry that assumption to all other specs that use
the namespace spec.

For example (you've seen this one before):

<gobble xml:base="http://www.jones.org/docA"
      xmlns:a="./bar">
      <squawk xml:base="http://www.smith.org/docB"
              xmlns:b="./bar">
            <peep a:x="1" b:x="2"> </peep>
      </squawk>
</gobble>

Your parser, I gather, will reject this document even though the apparent
intent is that a:x and b:x, though referring to the same literal namespace
name, refer to two different namespaces.   So the literal comparison rule
implies that the apparent intent must be ignored.
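
The difference between the two readings can be sketched as follows. This is only an illustration in Python, using the standard urllib.parse.urljoin to stand in for relative-reference resolution against xml:base; the variable names are mine, and nothing here is claimed to be how any existing parser behaves.

```python
from urllib.parse import urljoin

# (base URI in effect, namespace name as written) for prefixes a: and b:
a_base, a_name = "http://www.jones.org/docA", "./bar"
b_base, b_name = "http://www.smith.org/docB", "./bar"

# Literal comparison: only the strings as written are compared.
literal_same = (a_name == b_name)

# "Apparent intent": resolve each relative name against its base first.
a_resolved = urljoin(a_base, a_name)
b_resolved = urljoin(b_base, b_name)
resolved_same = (a_resolved == b_resolved)

print(literal_same)   # prints True
print(a_resolved)     # prints http://www.jones.org/bar
print(b_resolved)     # prints http://www.smith.org/bar
print(resolved_same)  # prints False
```

Under literal comparison a:x and b:x collide and the uniqueness test fails; under resolution they would name attributes in two different namespaces.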

If namespace names never used the path syntax and never used schemes such as
http, then the question of intent would never arise.   We're stuck with some
of those namespace names because they appear in existing specs.  But there's
no need to continue down that path, and deprecation offers a way of following
a different path, e.g., using something like

   xmlns:pics="data:www.w3.org.TR.xxxx.WD-PICS-labels"

or something like

   xmlns:pics="nsid:19990807.120348.w3.org"

There's no problem with the denotation of URIs that have the form of URLs;
the problem is with the connotation.  They enticingly suggest something that
the namespace spec says isn't there, namely, retrievable information.   And
it seems that existing software believes that connotation: witness the
Microsoft use of relative URIs to locate schemas.

TimBL initiated this whole brouhaha because he was disturbed about the
failure of namespace names to identify useful retrievable information.
There's a vital function here that's remaining unfulfilled.  But I think now
that the best way to fulfill it is to add another attribute that locates
metadata for the namespace, whatever form that metadata might take.

That, of course, would not break your parser or anyone else's.  It would
require redoing the bit of XPath that defines expanded names, together
perhaps with other specs that I'm not so familiar with.

Paul Abrahams
Received on Sunday, 28 May 2000 14:46:08 UTC