Re: xsd:anyURI, rdf URIs, information resources from Jonathan Rees on 2008-07-03 (public-awwsw@w3.org from July 2008)

From: Jonathan Rees <jar@creativecommons.org>
Date: Thu, 3 Jul 2008 09:29:36 -0400
To: "Alan Ruttenberg" <alanruttenberg@gmail.com>
Cc: "Booth, David (HP Software - Boston)" <dbooth@hp.com>, "public-awwsw@w3.org" <public-awwsw@w3.org>, "Stasinos Konstantopoulos" <konstant@iit.demokritos.gr>, "Ivan Herman" <ivan@w3.org>, "Dan Connolly" <connolly@w3.org>, "Phil Archer" <parcher@icra.org>, "W3C SW Coordination Group" <w3c-semweb-cg@w3.org>, "Matt Womer" <mdw@w3.org>, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Message-ID: <760bcb2a0807030629t61617458gd10b76c8a274dda@mail.gmail.com>

> The point is that in this case they are the same responses by definition,
> because it is the same server responding to the same (from its point of
> view) message. It's not possible that there is some different process that
> happens to be responding the same way.
>

True. However I see no reason the two http:-noncanonical URIs couldn't
*denote* different things (following OWL-DL model theory, for example) even
if they must *identify* the same thing according to the HTTP protocol.

One could easily write OWL that says contradictory things about the referents
of the two http:-noncanonical URIs (asserting something of one and denying it
of the other) without running into any formal inconsistency or any other
trouble with any spec. You would only run afoul of common sense, that's all.
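
To make that concrete, here is a toy sketch in Python, not OWL-DL reasoning
proper. The two example.org URIs, the class name "Person", and the helpers
`satisfiable` and `merge_identified` are all made up for illustration. The
check only looks for a direct clash on the same name, which is enough to show
that nothing forces two distinct URI strings to share a referent unless
something extra (here, a hypothetical protocol-equivalence table) merges them.

```python
# Two URI strings that an HTTP server would treat as the same request target,
# but which are different names as far as the logic is concerned.
A = "http://example.org/doc"
B = "http://EXAMPLE.org:80/doc"

# Assertions of the form (subject URI, class name, is_member).
assertions = [(A, "Person", True), (B, "Person", False)]


def satisfiable(facts):
    """Toy check: fail only if one name is both in and not in a class."""
    seen = {}
    for uri, cls, member in facts:
        if seen.setdefault((uri, cls), member) != member:
            return False
    return True


def merge_identified(facts, same_resource):
    """Rewrite subjects through a (hypothetical) protocol-equivalence table."""
    return [(same_resource.get(u, u), c, m) for u, c, m in facts]


# Denotation only: A and B may denote different things, so no clash.
print(satisfiable(assertions))

# Tie denotation to identification: B is folded into A and the clash appears.
print(satisfiable(merge_identified(assertions, {B: A})))
```

The first check succeeds and the second fails, which is the gap between
denotation and identification in miniature.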

RDF and OWL might have come out and said that URIs *must* denote (or be
interpreted to mean) what they are constrained to identify according to
protocol (when anything is said about that in an RFC). This would support
your case since the HTTP RFC says that http:-equivalent URIs identify the
same resource (IIRC... I'd have to check this). However, as far as I know
the denotation (RDF/OWL) / identification (http:/mid:/ etc.) relationship
isn't stated anywhere except in some email somewhere from Pat Hayes. (I
don't think such a rigid rule linking protocol to denotation is even
desirable, but that's another story.) Again, one could apply common sense,
which I think is what you're doing, but I'm not convinced it works here.

The POWDER draft starts down a path that might lead to a reasonable theory
relating protocol to denotation, but (a) it's not a rec yet, and (b) it is
not very explicit about the theory it espouses. From my reading it appears
to be purely operational ("normalize the URI in this way for this particular
purpose within POWDER"), not principled ("we canonicalize in this way for
this reason, with interoperability with the following RFCs and recs"). This
is not a particular criticism of POWDER, since all the other RFCs and recs
work pretty much the same way.

In short: I fully sympathize, and I fully agree that the situation is
confusing. I'm just not sure it's quite as clear-cut as you suggest. Of
course I am happy to be proven otherwise.

Personally I think I would try to keep scheme-specific canonicalization out
of all the recs, as it is now (mostly), and make a very clear distinction
between denotation and identification (the latter being aware of
protocol-specific canonicalization, the former not). Maybe bring
scheme-insensitive canonicalization, such as case insensitivity of scheme
name, into *all* of the recs uniformly, including XSD, RDF, SPARQL, ....
POWDER may be a special case that would know about http:-specific
canonicalization because it needs to for its own purposes.
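
A rough Python sketch of the split I have in mind, under stated assumptions:
the function names are mine, not from any rec, and the http: step covers only
the comparison rules HTTP/1.1 gives for http: URIs (case-insensitive host,
port 80 as default, empty path equivalent to "/"). It ignores userinfo, IPv6
literals, and percent-encoding, so treat it as an illustration rather than a
complete normalizer.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Generic URI syntax: a scheme is a letter followed by letters, digits, +, -, .
_SCHEME = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):")


def canonicalize_generic(uri: str) -> str:
    """Scheme-independent step: lowercase the scheme name and nothing else."""
    match = _SCHEME.match(uri)
    if not match:
        return uri  # no scheme (relative reference); leave it alone
    return match.group(1).lower() + uri[match.end(1):]


def canonicalize_http(uri: str) -> str:
    """http:-specific step, layered on top of the generic one."""
    generic = canonicalize_generic(uri)
    parts = urlsplit(generic)
    if parts.scheme != "http" or parts.hostname is None:
        return generic  # other schemes get only the generic treatment
    netloc = parts.hostname  # urlsplit already lowercases the host
    if parts.port is not None and parts.port != 80:
        netloc += f":{parts.port}"  # keep only non-default ports
    path = parts.path or "/"  # empty path is equivalent to "/"
    return urlunsplit(("http", netloc, path, parts.query, parts.fragment))


print(canonicalize_generic("HTTP://Example.COM:80"))
print(canonicalize_http("HTTP://Example.COM:80"))
print(canonicalize_http("MID:ABC@Example.org"))
```

Under this split, XSD, RDF, SPARQL and friends would need only something like
`canonicalize_generic`, while a consumer such as POWDER could opt into
`canonicalize_http` for its own purposes.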

Another approach would be to make scheme-specific canonicalization rules
discoverable through some kind of follow-your-nose mechanism; maybe the
scheme registry could lead to some RDF that could lead to some... Perl
code??? Yet another would be to say that no future URI scheme can specify a
new canonicalization method, limiting what needs to be understood to the
scheme-specific canonicalization rules in use as of such and such a date.
Both of these approaches sound nutty to me, but if we conclude they promote
safety and/or consistency they have to be considered.

Jonathan

Received on Thursday, 3 July 2008 13:30:12 UTC