RE: Determining attribute uniqueness seems to require namespace prefix in Infoset

Dear Don,

I can appreciate that you would like to characterize my explorations of this
topic as those of a "narrow-minded programmer" who should simply learn to
"be happy with the under-specified fuzzyness [sic] of it all".

However, as you well know, there are few things in this world that are more
narrow in acceptance of change than a sha-1 hash.  Ultimately, our
processing model boils down to taking a digital digest of some information.

The job of the implementers is to cut through that fuzziness and tell the
computer to do something specific.  If two implementers cut through that
fuzziness in different ways, the result is non-interoperable
implementations.

Perhaps I've spent too long at the business end of the whip wielded by QA
teams who do things like

...
<dsig:Reference xmlns:dsig="&dsig;" URI="#X" dsig:URI="#Y>
...

If my implementation picks up URI and your implementation picks up dsig:URI,
then only one of us will correctly validate this beastly signature.

Below you state, "If the text documenting N and/or E doesn't tell you what
to do here, you get to toss a coin or something."  Tossing a coin undermines
the interoperability that is supposedly the cornerstone of XML. The 'if'
part of your statement should not be true for any application of XML, and
therefore not for XML DSig in particular.

As for the issue of why unqualified attributes do not simply default to the
namespace of their parent element, I have been trying to find out from "the
XML authorities" why they did not do it this way because the way they did it
is so specific that it belies intent.  If we toss a coin and pick a
processing model, it may be in violation of their intent. We don't know
unless one of them gives the reason.

Finally, I would add that everything you say below justifies a design in XML
Names that applies an element's namespace to its unqualified attributes.
The fact that it was not this way is the original point that took both
Gregor and me by surprise.  Why is it characterized as 'narrow-minded' to
ask a public working consortium to explain why a very specific feature was
added when it is somewhat contrary to reason and when noone can explain it
except to accuse the interrogator of narrow-mindedness for not processing it
in the way that it would be processed if the design were as we expected it?

Further specific points and answers to your questions are below.

Thanks for your continued patience,
John Boyer
Development Team Leader,
Distributed Processing and XML
PureEdge Solutions Inc.
Creating Binding E-Commerce
v: 250-479-8334, ext. 143  f: 250-479-3772
1-888-517-2675   http://www.PureEdge.com <http://www.pureedge.com/>



-----Original Message-----
From: Donald E. Eastlake 3rd [mailto:dee3@torque.pothole.com]
Sent: Wednesday, August 16, 2000 9:47 AM
To: John Boyer
Cc: xml-names-editor@w3.org; www-xml-infoset-comments@w3.org; XML DSig
Subject: Re: Determining attribute uniqueness seems to require namespace
prefix in Infoset


Hi John,

From:  "John Boyer" <jboyer@PureEdge.com>
To:  "Donald E. Eastlake 3rd" <dee3@torque.pothole.com>,
            "Joseph Reagle" <reagle@w3.org>
Cc:  <xml-names-editor@w3.org>, <www-xml-infoset-comments@w3.org>,
            "XML DSig" <w3c-ietf-xmldsig@w3.org>
Date:  Fri, 11 Aug 2000 17:13:09 -0700
Message-ID:  <BFEDKCINEPLBDLODCODKEEFCCEAA.jboyer@PureEdge.com>
In-Reply-To:  <200008112057.QAA06998@torque.pothole.com>

>Hi Don,
>
>I see where you are coming from now, and it is both interesting and
correct,
>but it means we need to be saying something more in DSig, if not in XML
>Names.
>
>The appendix refers to the attributes as 'unqualified'.  The section is not
>saying that unqualified attributes are qualified by per element partitions
>(because then it wouldn't be unqualified).  Instead, because it is
>non-normative, it is saying that one is free to look at it this way if it
>makes one feel better.
>
>The implication is that "per element partitions" don't really exist and
need
>not be supported by conforming applications.  In other words, the material
>on per element partitions is a convoluted way of saying that the only thing
>which matters is that an attribute appears within an element and should
>therefore be processed as though it were in that element's namespace unless

But the attribute is NOT "directly" in the element's namespace.  It is
in orbit around that element where the full coordinates of the element
includes the elements namespace.

>it has a non-empty namespace other than the element's.  As well, I think
>this is what you are saying.
>
>Unfortunately, it actually does matter quite a bit.  For starters, suppose
>you have an HTML anchor with two hrefs, one unqualified and the other in
the
>HTML namespace.  Which does one use?  Put in another more general way,
given

Why would there be an href qualified by a namespace?

<john>
It was a specific example of this more general case.
It doesn't matter why someone would do it; what matters is that they can do
it, which opens the question of what our software should do about it.
</john>

>element E in namespace N, if one wants to obtain attribute A from E, which
>of the following processing models is correct?
>
>1) Attr = (N:E).get(N, A)
>2) if Attr is null, Attr = (N:E).get("", A)
>
>or
>
>1) Attr = (N:E).get("", A)
>2) if Attr is null, Attr = (N:E).get(N, A)

The attitude of the XML authorities seems to be "Most of the time it
doesn't matter and where it does the documentation for N:E should tell
you and if it doesn't mention it, then maybe the second processing
model is better, but it really doesn't matter much and you should stop
being such a narrow minded programmer and being so algorithmic and be
happy with the under-specified fuzzyness of it all."

<john>
Fortunately, none of the other XML authorities choose to be so directly
insulting.
Hopefully the message above helps to clear up why this issue remains a
concern.
</john>

>It makes a difference if the XML is <N:E xmlns:N="N" A="2" N:A="1">.  The
>first model gives the attribute whose value is 1, and the second model
gives
>the attribute whose value is 2.

The fuzzy handwaving implications of the wording used in the XML
Namespaces specification is that N:A should be some sort of global
attribute which can appear in many elements while plain A is some sort
of normal local attribute to N:E.  If the text documenting N and/or E
doesn't tell you what to do here, you get to toss a coin or something.

<john>
Coin tosses undermine interoperability.

The problem here is what to do when the local A and the global N:A are both
allowed in the same element.
</john>

>I don't think this can be explained away as a simple differing viewpoint
>because I don't agree that all applications process attributes solely
within
>the context of the parent element.  The prime example of this is XSLT,
which
>uses XPath for template matching.

What I meant to say was that it is a matter of point of view if you
consider attributes without prefixes to be in no namespace and to be
defined relative to their element or you considered them to have a
globally unique name composed of their name, their element's local
name, and their element's namespace if any.  The first point of view
says they are not in any namespace while the second point of view
implies they are, in some sense, in their element's namespace,
although not directly in that namespace.

<john>
There are some XML authorities like Tim B-L who feel that namespace
qualification is an important method for assigning unambiguous meaning to
XML.  With both viewpoints, the unqualified attribute is NOT DIRECTLY in the
namespace.

Still, this is beside the point.  The point remains that the processing
model for an element can govern the type of XPath expression one writes to
match nodes based on the effect they will have when processed.
</john>

>Lets compare the following four occurences of an element E by considering
>the xsl-template's XPath expression that must match E based on whether the
>value of an attribute A is 'test':
>
>1) <N1:E xmlns:N1="http://www.w3.org" xmlns:N2="http://www.PureEdge.com"
>A="test"/>
>2) <N1:E xmlns:N1="http://www.w3.org" xmlns:N2="http://www.PureEdge.com"
>N2:A="test"/>
>3) <N1:E xmlns:N1="http://www.w3.org" xmlns:N2="http://www.PureEdge.com"
>N1:A="Yikes" N2:A="test"/>
>4) <N1:E xmlns:N1="http://www.w3.org" A="test" N1:A="Yikes"/>
>
>The xsl-template cannot simply try to match on the local name A or else
>element E in #2 would be matched, which is clearly not intended. If we add
>to the XPath test a call to namespace-uri() to match the URI for N1, then
#3
>works but #1 doesn't.  If we add a test that finds the local name of A and
a
>namespace-uri() that is either blank or equal to the URI for N1, then #1 -
>#3 work as expected, but #4 has two attributes that match the test.

Aren't XPath node tests by QName?  If you specify the namespace, it
will only look at N1:A, if you don't specify any namespace, it will
only look at A, right?

<john>
Yes, exactly.  It's not a question of how an XPath expression should be
written.  It's a question of which XPath expression one should write.  The
XPath expression is dependent upon the processing rules for the target
application.  As you so eloquently put it above, the processing rules may be
based on a coin toss, which means that creating an XPath expression is
non-deterministic.
</john>

>If the claim is true that attributes should be processed within the context
>of the parent, then A="test" would indicate that the xsl-template should
>match N1:E in #4.  However, N1:A is clearly the attribute A in element E's
>namespace, so for the purposes of the primary processing application for
the
>N1 namespace, N1:A and A are duplicates.

N1:A is A directly in namespace N1.  It is qualified only by N1.  A is
defined to be different because, while it can be viewed as qualified
by both E and N1 and it is defined as not being directly in namespce
N1.

<john>
I understand this.  Hopefully it is now evident that this has nothing to do
with the problem.
</john>

>Personally, I don't believe in duplicate attributes.  The normative part of
>XML Names makes it clear that the attributes are not duplicates in name
>(which is a well-formedness requirement of XML), but if the non-normative
>part of the spec invents some concept that makes the two attributes equal
>for the purposes of processing, then a precedence must be specified. Which
>one has precedence?  Until we know this, the proper xsl-template cannot be
>written.

Why do you say that the non-normative part "makes the two attributes
equal for the purposes of processing"?  It distinguishes them by
saying that one is in the Per-Element-Type attribute partition
associated with that element of the namesapce while the other is in
the Global attribute partition of the namesapce.

<john>
It would be helpful if you could specify how this partitioning changes the
way the attributes should be processed.  Based on your discourse above, N1:A
and A are differentiated by nothing more than a coin toss unless a
particular processing behavior is specified by the application.  Are you now
saying that there is a standard XML technique for processing attributes in
differing partitions?
</john>

>If I had to guess on a processing precedence, I would guess that the
>namespace qualified version should take precedence.  Here's why:
>
>I've been sitting here trying to figure out why the default namespace does
>not apply to attributes, and the only thing that I can think of is that
>unqualified attributes can then be used to apply an attribute value across
>all namespaces.  In other words, a sort of default value to be used when no
>namespace specific attribute is given. Consider the following bit of XML:

As I read it under the namesapces spec, "unqualified" attributes are
super-local, not super-global as you suggest above.

<john>
I was only trying to make a guess at why the authors of XML Names decided
that the default namespace would not apply to unqualified attributes.  Despi
te all of the dancing around and text messages that have flung far and wide,
noone has given an explanation.

Clearly, there was an intent, but in the absence of any of those XML
authorities coming out and saying what the intent was, the only thing we can
do is guess.

Everything you have said so far would justify a design that applies the
element's namespace to unqualified attributes.  If A is super-local, then it
does not seem to make sense to have

<N1:E A="" N1:A=" "/>

so why did the authors of XML names seemingly go out of their way to make
this construction legal?  This is not narrow-mindedness.  I am asking a
question of the language designers about their design.  Why can they not
answer it?  If they did answer it, then the answer might have a bearing on
the processing model that we *should* specify for dsig.
</john>

><E xmlns:N1="http://www.w3.org" xmlns:N2="http://www.PureEdge.com"
>A="default" N1:A="Value for App that processes N1" N2:A="Value for App that
>processes N2"/>
>
>The information in element E can be used by one application that conforms
to
>specification given at the URI for N1, or it could be processed by a second
>application that conforms to specifications given at N2, or it could be
>processed by some other generic XML processing application.  If, by
>convention, namespace qualified attributes took precedence over unqualified
>attributes for processing, then the value for A would be
>application-dependent.  If unqualified attributes took precedence, then A
>would always equal default.

You are assuming that E is defined in both N1 and N2?  Then I think
you need to read the doucmentation of E in N1 and the documentation of
E in N2 to figure this out.  XML is just plain under-specified and no
amount of thought will change its ambiguous parts into an un-ambiguous
specifiction.

<john>
Agreed, but this is such a specific concept that they seem to have gone out
of their way to talk about (e.g. per-element partitions).  In this case,
it's not a matter of adding more thought.  It's a matter of one of those
authorities saying why they did this so that we can choose a processing
model that doesn't violate their intent.
</john>

>Perhaps XML names will not be changed or clarified, and perhaps they will
>make no recommendation, but for a particular application, esp. a security
>conscious one such as DSig, we should be saying what we expect our
>processors to do when there are qualified and unqualified attributes with
>matching local names.

Sure, probably a good idea to add a few words to our specification to
say what should be done for such "duplicate" attributes for elements
we define.

<john>
Good.  Why so much resistance if you end up agreeing that we should add a
few words to address the problem?
</john>

Received on Wednesday, 16 August 2000 14:19:49 UTC