RE: Use cases from Michael Rys on 2000-05-18 (xml-uri@w3.org from May 2000)

From: Michael Rys <mrys@microsoft.com>
Date: Thu, 18 May 2000 05:46:27 -0700
To: "'Tim Berners-Lee'" <timbl@w3.org>, xml-uri@w3.org
Message-ID: <783D93998201D311B0CF00805FEAA07B08AEAC3B@RED-MSG-42>

Thanks, Tim, for answering.

> I'd like to look in more detail at the question as to whether any actual
> damage would occur were the NS spec to be changed to require
> absolutizing for comparison. I contend that, because in fact that is
> consistent with URI behavior, it will not cause problems with existing
> documents in practice.

The real problem is that nobody knows what the practice w.r.t. relative
namespaces is. People have had 15 months to use the current namespace
spec, so there may be a large or a small number of documents that use
relative namespaces and that rely on literal comparison.

> >The issue is not really an MS issue. The issue is that a relatively old rec
> >exists that requires literal interpretation of namespaces for equality. Any
> >change to this interpretation, in particular introducing additional
> >processing of namespace URIs to determine equality will break current
> >documents and their processing.
> 
> Convince me.  If the URIs are actually resolved then they must be valid
> relative URIs. They must absolutize to a valid and correct absolute URI.

All the relative URIs that our tools generate (AFAIK) are intradocument
URIs. An example for inline schemas is:

<doc>
<Schema name="Schema" xmlns="urn:schemas-microsoft-com:xml-data"
xmlns:dt="urn:schemas-microsoft-com:datatypes">
<ElementType name="customer" content="empty" model="closed">
	<AttributeType name="CustomerID" dt:type="string"/>
	<AttributeType name="CompanyName" dt:type="string"/>
	<AttributeType name="ContactName" dt:type="string"/>
	<attribute type="CustomerID"/>
	<attribute type="CompanyName"/>
	<attribute type="ContactName"/>
</ElementType>
</Schema>
<customer xmlns="x-schema:#Schema" CustomerID="ALFKI" CompanyName="Alfreds
Futterkiste" ContactName="Maria Anders" />
</doc>

Here the customer element refers to the schema with the name Schema that is
local to the document. So in this case (disregarding copies), absolutization
or not does not make a difference. However, as soon as I make a copy to a
different server, absolutization will change the identity of the namespace.
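
As a rough illustration of the copy problem (a sketch only, not something our tools do): the two document locations below are invented for the example, and a bare `#Schema` reference stands in for the relative namespace name. Resolving the same literal string against two different base URIs yields two different absolute URIs.

```python
from urllib.parse import urljoin

# Hypothetical relative namespace name, exactly as written in the document.
RELATIVE_NS = "#Schema"

# Hypothetical locations of the original document and of its copy.
ORIGINAL_BASE = "http://server-a.example/docs/customers.xml"
COPY_BASE = "http://server-b.example/mirror/customers.xml"


def absolutize(base_uri: str, namespace_name: str) -> str:
    """Resolve a namespace name against the base URI of its document."""
    return urljoin(base_uri, namespace_name)


original_ns = absolutize(ORIGINAL_BASE, RELATIVE_NS)
copy_ns = absolutize(COPY_BASE, RELATIVE_NS)

print(original_ns)             # http://server-a.example/docs/customers.xml#Schema
print(copy_ns)                 # http://server-b.example/mirror/customers.xml#Schema
print(original_ns == copy_ns)  # False: the copy now names a different namespace
```

The document text is byte-for-byte identical in both places; only the base URI used for resolution has changed.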

Now assume that people author their inline schemas such that they use unique
inline schema names. For example, the customer schema is called
CustomerSchema, the publication schema is called PubSchema, etc. For some
reason, these schemas are always provided inline, but it is understood that
the names of two CustomerSchema::Customer elements from different document
instances have to be considered the same. This is possible with the
current reading of literal string equality; it will break if we require
absolutization (since the two elements are located in two different
documents, their absolute URIs will always differ).
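
The two comparison rules can be put side by side in a small sketch. Everything here is assumed for illustration: the `QName` record, the two function names, and the document locations are hypothetical, and `#CustomerSchema` stands in for the relative namespace name used by both documents.

```python
from typing import NamedTuple
from urllib.parse import urljoin


class QName(NamedTuple):
    base_uri: str        # hypothetical location of the document instance
    namespace_name: str  # namespace name exactly as written in the document
    local_name: str


def equal_literal(a: QName, b: QName) -> bool:
    """Current reading: compare the namespace names character for character."""
    return (a.namespace_name, a.local_name) == (b.namespace_name, b.local_name)


def equal_absolutized(a: QName, b: QName) -> bool:
    """Proposed reading: resolve each namespace name against its document first."""
    abs_a = urljoin(a.base_uri, a.namespace_name)
    abs_b = urljoin(b.base_uri, b.namespace_name)
    return (abs_a, a.local_name) == (abs_b, b.local_name)


first = QName("http://server-a.example/orders/1.xml", "#CustomerSchema", "Customer")
second = QName("http://server-b.example/orders/2.xml", "#CustomerSchema", "Customer")

print(equal_literal(first, second))      # True
print(equal_absolutized(first, second))  # False
```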

Since this is perfectly legal according to the current namespace spec, we
would break anybody that was using a naming scheme as described in this
example. I do not think that this is acceptable without giving clear
alternatives and a long enough transition period.

> If comparing the relative URIs didn't give misleading results, comparing
> the absolute URIs certainly won't.

See scenario above.

> > While we as tool implementers have control
> >over the tools we write, we do not have control over our customers'
> >documents.
> 
> However, I would expect your customer to use relative URIs in the normal
> spec-consistent way rather than try to construct tortured proofs by example
> which we have had on this list.

I don't quite understand this statement. Our customers who understand the
namespace spec are well aware of the basic difference between name equality
(literal) and schema resolution (fetching a resource).

> >In general retroactive spec changes would be acceptable "if possible",
> >namely:
> >
> >1. retroactive changes have virtually no impact on the conformance of
> >existing documents (e.g. loosen constraints, not tighten),
> 
> I would suggest that anything which passed well-formedness before
> and fails after was bound to fail at a later date anyway when 
> dereferencing occurred.

I do not understand this statement either (I guess I am just too
jetlagged). Any document that currently is parsed and even validated using
inline schemas will continue to be. I do not see this as a problem. The
problem is that some of these documents are used with applications (e.g.,
DOM-based applications) that compare namespace-qualified names
and assume a literal interpretation of the namespace URI. If this is changed,
the documents that use relative namespace URIs will no longer conform to the
assumptions made by these applications (and the guarantees given by the
namespace 1.0 spec): their information content will change, and the data will
break the applications under the new DOM behaviour.
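
A sketch of the kind of customer code I have in mind, using Python's `xml.dom.minidom` purely as a stand-in DOM (the document text, the `#CustomerSchema` name, and the absolutized URI are made up for the example). The application looks elements up by the literal namespace name; the absolutized form is a different string and therefore a different name.

```python
from xml.dom.minidom import parseString

# Hypothetical customer document that uses a relative namespace name.
DOCUMENT = '<doc><Customer xmlns="#CustomerSchema" CustomerID="ALFKI"/></doc>'

dom = parseString(DOCUMENT)

# The application asks for the name exactly as it is written in the document.
literal_hits = dom.getElementsByTagNameNS("#CustomerSchema", "Customer")

# The same lookup with an absolutized form of the namespace name
# (the base URI here is an assumption) matches nothing.
absolutized_hits = dom.getElementsByTagNameNS(
    "http://server-a.example/orders/1.xml#CustomerSchema", "Customer"
)

print(len(literal_hits))      # 1
print(len(absolutized_hits))  # 0
print(literal_hits[0].getAttribute("CustomerID"))  # ALFKI
```

An application written against the first lookup stops finding its elements as soon as the DOM reports the second form.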

> 
> >2. retroactive changes can be introduced by vendors with minimal customer
> >disruption,
> 
> That I would think would be the case. Much larger changes 
> have been made.

The problem is really that we write tools, not our customers' complete
applications. All customer applications that use the DOM, and the customer
documents, are outside of our control.

> >3. that changes larger than these employ a versioning mechanism,

A new version of the spec may be OK (subject to point 4).

> >4. that a new version have compelling feature benefits to drive adoption by
> >vendors and customers.
> 
> We are talking about a move to the way Microsoft customers have been
> using relative URIs in other contexts for years. This would IMHO go under
> the heading of bug fix rather than new feature.

Please note that the current namespace spec uses namespace URIs to determine
equality of names, nothing more and nothing less. In addition, by using
URIs, it allows certain applications/processors that are layered on top
of namespaces (such as schema) to retrieve resources at the namespace
URI that provide semantics to the namespace. These are in principle two
orthogonal concepts (you could define name equality based on the provided
semantics, an expensive undertaking in the general case, or you could define
semantics with mechanisms other than namespace URIs). I do not see why
changing the name equality definition can be considered a bug fix. In my
opinion, it looks more like a design change request...

> >In the specific case being considered, none of these conditions appear to
> >obtain, and thus changes to the NS recommendation should not be considered
> >as a possible option.
> 
> Your current software is quite inconsistent in that it uses 
> them as relative URIs at one moment and strings the next.

No. It layers. Basic name equality is decoupled from the schema resolution
interpretation of certain namespace URIs. There may have been better
mechanisms to do the latter (I was not there when the decision was made to
utilize namespace URIs for schema references), but the mechanism is not
inconsistent and follows the letter and a reasonable interpretation of the
namespace spec.

> In fact, it only runs because none of your users have tried the rather
> obscure test cases which have been generated to show this inconsistency.
> But one day they will. One day, some document will fail a well-formedness
> test even though it has been quite properly constructed with pointers to
> completely valid URIs of real schemas.  Two namespaces will have different
> URIs but happen to be declared in contexts such that the relative URIs are
> in fact the same. The two namespaces will have to happen to have attributes
> of the same name and some actual instance will have to happen to validly
> use both attributes on the same element.  I bet it hasn't happened yet.

I am confused about your well-formedness remark. If documents are not
well-formed, they will not parse, period. 

This scenario is the main reason why I think that relative namespace URIs
should be discouraged (but not forbidden) for the purpose of defining
name scoping. However, that problem can also happen with absolute namespace
URIs that are not globally unique (e.g., file://localhost/foo).

If two relative schema references in different documents refer to
different inline schemas that have the same name, e.g., Schema in both
documents, then the above can happen. The current namespace spec addresses
this problem in its definition of namespace URIs, which says that they
should be distinct if used for name referencing. People have a workaround:
do not rely on arbitrary relative URIs (or non-globally-unique absolute
URIs) for name equivalence. Either use globally unique absolute URIs or
take extra care in naming your relative URIs.
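
The clash and the workaround, as a minimal sketch (the `same_name` helper and the `urn:example:...` names are invented for the illustration): two documents that both call their inline schema `Schema` produce literally equal names even though the schemas are unrelated, while globally unique absolute names keep them apart.

```python
def same_name(a: tuple[str, str], b: tuple[str, str]) -> bool:
    """Literal comparison of (namespace name, local name) pairs."""
    return a == b


# Two documents, each with an unrelated inline schema that happens to be
# called "Schema": the names compare equal although the schemas differ.
doc_a_customer = ("#Schema", "customer")
doc_b_customer = ("#Schema", "customer")
print(same_name(doc_a_customer, doc_b_customer))  # True (unintended clash)

# Workaround: globally unique absolute namespace names (hypothetical URNs).
crm_customer = ("urn:example:schemas:crm", "customer")
billing_customer = ("urn:example:schemas:billing", "customer")
print(same_name(crm_customer, billing_customer))  # False
```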

> But one day. The document will fail the
> well-formedness test though quite valid. That will be a bug.
> If you don't fix it now then you will have to explain this problem in
> great detail to your poor users, or just explain that there is a bug when
> using relative URIs sometimes.

Currently, this is not a bug; it is according to spec.

Best regards
Michael

PS: Tim, feel free to contact me in Amsterdam for a direct discussion (I
will be giving a talk tomorrow in the Web Publishing track).
Received on Thursday, 18 May 2000 09:16:49 UTC