Re: "canonical" URIs from noah_mendelsohn@us.ibm.com on 2002-03-20 (www-tag@w3.org from March 2002)

From: <noah_mendelsohn@us.ibm.com>
Date: Tue, 19 Mar 2002 21:21:33 -0500
To: Mark Baker <distobj@acm.org>
Cc: david.orchard@bea.com (David Orchard), www-tag@w3.org
Message-ID: <OFA24D20CE.D9077D55-ON85256B82.000D6A25@lotus.com>
Mark Baker asks:

>> Why is it (comparing anyURIs) any of XML's business?

Because schema has an "enumeration" facet which allows you to build 
"enumerated" subtypes.  The question is, if I build a subtype of anyURI 
that accepts only:

"http://www.ibm.com"

will it also accept the same URI written in uppercase?  That's surely a 
question for which schema needs an answer.  The other place this comes up 
is in the so-called 
key/keyref constructions, where we can assert that the value of one 
element or attribute must match that of another somewhere else in the 
document. 
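
To make the question concrete, here is a minimal Python sketch (not schema's actual validation code) contrasting the two possible answers. The function names and the host-lowercasing step are assumptions made for illustration only; lowercasing the whole authority is a simplification.

```python
from urllib.parse import urlsplit

# The single permitted value from the example above.
ENUMERATION = {"http://www.ibm.com"}


def accepts_by_string_compare(value):
    """Enumeration check using exact, character-by-character comparison."""
    return value in ENUMERATION


def normalize_scheme_and_host(uri):
    """Hypothetical scheme-aware step: lowercase the scheme and authority only."""
    parts = urlsplit(uri)
    return parts._replace(
        scheme=parts.scheme.lower(),
        netloc=parts.netloc.lower(),
    ).geturl()


def accepts_scheme_aware(value):
    """Enumeration check that first applies the hypothetical normalization."""
    allowed = {normalize_scheme_and_host(item) for item in ENUMERATION}
    return normalize_scheme_and_host(value) in allowed


print(accepts_by_string_compare("HTTP://WWW.IBM.COM"))  # False
print(accepts_scheme_aware("HTTP://WWW.IBM.COM"))       # True
```

The second check needs scheme-specific knowledge that the first one does not.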

David's proposal, with which I concur, is "no": schema shouldn't have to 
know about every URI scheme and its rules.  For purposes of the 
enumeration facet, string compare it is.  I think anything else is 
impractical within the sorts of settings where schema validation is done. 
Schema is explicit that the notion of equality that we have for our 
datatypes is not always that which might be wanted for applications (let's 
not start a schema flame war here.)  We have some related problems in 
dealing with -0.0 and +0.0 in IEEE floating point, for example, though I 
don't remember the details.  We fully expect that applications using 
schema-validated data will follow exactly the IEEE rules for negative and 
positive zero, just as we would expect web applications to respect the 
scheme's rules for URI comparison.  When computing enumerations, we need 
to make 
some practical compromises.
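
As a reminder of why signed zero is awkward, this short Python sketch shows only the IEEE 754 behaviour as Python exposes it. It does not claim to show what schema specifies for its floating-point datatypes.

```python
import math
import struct

pos_zero, neg_zero = 0.0, -0.0

# IEEE comparison treats the two zeros as equal...
ieee_equal = pos_zero == neg_zero

# ...but their 64-bit representations differ in the sign bit.
same_bits = struct.pack(">d", pos_zero) == struct.pack(">d", neg_zero)

# The sign is still observable through operations such as copysign.
sign_of_neg_zero = math.copysign(1.0, neg_zero)

print(ieee_equal, same_bits, sign_of_neg_zero)  # True False -1.0
```

Any notion of equality has to pick one of these views, and applications may reasonably want the other.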

------------------------------------------------------------------
Noah Mendelsohn                              Voice: 1-617-693-4036
IBM Corporation                                Fax: 1-617-693-8676
One Rogers Street
Cambridge, MA 02142
------------------------------------------------------------------

Mark Baker <distobj@acm.org>
Sent by: www-tag-request@w3.org
03/19/2002 09:19 PM

 
        To:     david.orchard@bea.com (David Orchard)
        cc:     www-tag@w3.org, (bcc: Noah Mendelsohn/Cambridge/IBM)
        Subject:        Re: "canonical" URIs


Dave,

> TAG members,
> 
> I don't see URI comparison officially listed as a TAG issue.  I'd like
> Joseph/Stephen's issue added to the TAG issues list.
> 
> Equivalence rules for URIs are defined by the URI scheme.

Not all of them.  DAML can be used to assert equivalence about any
resources.  HTTP redirection can be used for the same purposes, for
resources other than those using the HTTP URI scheme.

I recall a TimBL message where he enumerated the possible layers of
equivalence.  Can't find it though.

>  HTTP has a
> section on URI comparison.
>
> However, XML does not have a default comparison function for the XML Schema
> anyURI data type.  I think a reasonable approach would be to say that the
> default comparison function for anyURI is to use the HTTP URI comparison
> algorithm, but that it is overridable by any scheme.

Why is it any of XML's business?  RFC 2396 already says everything that
needs saying: syntactic comparison is a function of the URI scheme.
If you don't recognize the URI scheme, you can only compare for an exact
match.
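
The rule just described (scheme-specific comparison when the scheme is recognized, exact match otherwise) can be sketched in Python as follows. This is an illustrative sketch, not an implementation of any specification: the names `SCHEME_KEYS`, `http_key` and `uris_match` are hypothetical, and the HTTP rules are reduced to host case, default port 80 and the empty path being equivalent to "/".

```python
from urllib.parse import urlsplit


def http_key(uri):
    """Simplified comparison key for http URIs (assumed rules, see lead-in)."""
    parts = urlsplit(uri)
    return (
        parts.scheme,            # urlsplit already lowercases the scheme
        parts.hostname or "",    # hostname is returned lowercased
        parts.port or 80,        # missing port means the default port
        parts.path or "/",       # empty path is equivalent to "/"
        parts.query,
        parts.fragment,
    )


# Hypothetical registry of schemes whose comparison rules we know.
SCHEME_KEYS = {"http": http_key}


def uris_match(a, b):
    """Scheme-aware comparison with an exact-match fallback."""
    if a == b:
        return True
    scheme_a, scheme_b = urlsplit(a).scheme, urlsplit(b).scheme
    if scheme_a != scheme_b or scheme_a not in SCHEME_KEYS:
        # Unrecognized scheme: only an exact match counts.
        return False
    key = SCHEME_KEYS[scheme_a]
    return key(a) == key(b)


print(uris_match("http://EXAMPLE.com", "http://example.com:80/"))  # True
print(uris_match("urn:example:Foo", "urn:example:foo"))            # False
```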

Of course, anyURI is a URI reference, not a URI.  In order to compare
two URI refs that aren't syntactically identical, I believe you would
have to dereference them, as the media type is presumably the authority
on whether the frag ids are case sensitive or not.  Oh joy. 8-/
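
A rough Python sketch of why URI references are harder: the part before the "#" and the fragment identifier answer to different authorities. The `fragment_rule` parameter below is hypothetical and stands in for whatever the media type would say, which in practice is only known after dereferencing. Comparing the non-fragment part exactly is also a simplification.

```python
from urllib.parse import urldefrag


def references_match(a, b, fragment_rule=None):
    """Compare two URI references, treating the fragment separately.

    fragment_rule is a hypothetical callable (frag_a, frag_b) -> bool
    representing the media type's rules; without it, fragments must be
    identical.
    """
    base_a, frag_a = urldefrag(a)
    base_b, frag_b = urldefrag(b)
    if base_a != base_b:
        return False
    if fragment_rule is None:
        return frag_a == frag_b
    return fragment_rule(frag_a, frag_b)


def case_insensitive(frag_a, frag_b):
    """Example rule for an imagined media type with case-insensitive frag ids."""
    return frag_a.lower() == frag_b.lower()


ref_1 = "http://example.org/doc#Sec1"
ref_2 = "http://example.org/doc#sec1"
print(references_match(ref_1, ref_2))                    # False
print(references_match(ref_1, ref_2, case_insensitive))  # True
```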

MB
-- 
Mark Baker, Chief Science Officer, Planetfred, Inc.
Ottawa, Ontario, CANADA.      mbaker@planetfred.com
http://www.markbaker.ca   http://www.planetfred.com
Received on Tuesday, 19 March 2002 21:37:47 UTC