Re: xsd:anyURI, rdf URIs, information resources from Alan Ruttenberg on 2008-07-03 (public-awwsw@w3.org from July 2008)

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Thu, 3 Jul 2008 11:23:38 -0400
To: Jonathan Rees <jar@creativecommons.org>
Cc: "Booth, David (HP Software - Boston)" <dbooth@hp.com>, "public-awwsw@w3.org" <public-awwsw@w3.org>, "Stasinos Konstantopoulos" <konstant@iit.demokritos.gr>, "Ivan Herman" <ivan@w3.org>, "Dan Connolly" <connolly@w3.org>, "Phil Archer" <parcher@icra.org>, "W3C SW Coordination Group" <w3c-semweb-cg@w3.org>, "Matt Womer" <mdw@w3.org>, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Message-Id: <4EF5605A-3A79-493C-B685-EC2FE15B8C6E@gmail.com>
On Jul 3, 2008, at 9:29 AM, Jonathan Rees wrote:

>
> The point is that in this case they are the same responses by  
> definition, because it is the same server responding to the same  
> (from its point of view) message. It's not possible that there is  
> some different process that happens to be responding the same way.
>
> True. However I see no reason the two http:-noncanonical URIs  
> couldn't *denote* different things (following OWL-DL model theory,  
> for example) even if they must *identify* the same thing according  
> to the HTTP protocol.
>
> One could easily write OWL that says inconsistent things about the  
> referents of the two http:-noncanonical URIs, and not run into any  
> inconsistency or any other trouble with any spec.

Indeed. I think I've emphasized that.

> You would only run afoul of common sense, that's all.

I don't think I'd put it that way. Rather, I'd say that our advice
to use the http transport uniformly to retrieve documentation about
resources doesn't work if you do that. I think that needs to be
fixed, one way or another. There's more than one way to do that. Seems
to me that's within the scope of our work here.

> RDF and OWL might have come out and said that URIs *must* denote (or  
> be interpreted to mean) what they are constrained to identify  
> according to protocol (when anything is said about that in an RFC).  
> This would support your case since the HTTP RFC says that
> http:-equivalent URIs identify the same resource (IIRC... I'd have to
> check this).

That is not my case. My case is that there is a mismatch between
RDF's use of URI References and the actual semantics of those URIs as
defined by http. To the extent that we want them to match, such as
when we advise people to use http URIs (advice based primarily on the
utility of the protocol interpretation of those strings; otherwise we
could use some other kind of string), it's in our interest to repair
that mismatch, since otherwise people will rightfully point out that
our argument for using http URIs as identifiers is weakened.

> However, as far as I know the denotation (RDF/OWL) / identification  
> (http:/mid:/ etc.) relationship isn't stated anywhere except in some  
> email somewhere from Pat Hayes. (I don't think such a rigid rule  
> linking protocol to denotation is even desirable, but that's another  
> story.)

I haven't claimed there is.

> Again, one could apply common sense, which I think is what you're  
> doing, but I'm not convinced it works here.

In this forum, I thought our job was to stop relying on common sense  
and instead document the common sense so that it becomes something  
that people can build reliable systems on. In its status as common  
sense it is too subject to misunderstanding and confusion.

The only thing that doesn't work, at the moment, is our advice that
http URIs are good identifiers for semantic web use because it is
possible to use them to retrieve documentation about resources. It's
not that it completely doesn't work, it's that there is a set of cases
where it doesn't work even if you try.

> The POWDER draft starts down a path that might lead to a reasonable  
> theory relating protocol to denotation, but (a) it's not a rec yet,  
> and (b) it is not very explicit about the theory it espouses. From  
> my reading it appears to be purely operational ("normalize the URI  
> in this way for this particular purpose within POWDER"), not  
> principled ("we canonicalize in this way for this reason, with  
> interoperability with the following RFCs and recs"). This is not a  
> particular criticism of POWDER, since all the other RFCs and recs  
> work pretty much the same way.
>
> In short: I fully sympathize, and I fully agree that the situation  
> is confusing. I'm just not sure it's quite as clear-cut as you  
> suggest. Of course I am happy to be proven otherwise.

There's not much that I've claimed is clear cut. Let me restate, in  
the hope that we can agree:
- there is no reason to believe that <HTTP://purl.org/obo/obi.owl> is  
the same as <http://purl.org/obo/obi.owl> according to the RDF  
semantics.
- There is reason to believe that <http://purl.org/obo/obi.owl> and
<http://purl.org/obo/obi.owl> are the same thing.
- According to the http protocol, the semantics of
http://purl.org/obo/obi.owl and HTTP://purl.org/obo/obi.owl are the same.
- These two semantics differ.
- Moreover there is a proliferation of escaping and canonicalization  
rules in various of the W3C specs that are not, upon quick inspection,  
obviously compatible. Therefore there may be further distinct  
semantics beyond the RDF and HTTP cases.
- There are some options for what to do about this. Following are  
some, there may be others.
   - Nothing (not an option I support)
   - Define a set of rules/assertions that formalize the http  
semantics in RDF and then promote their use in the context of semantic  
web protocols. (a similar exercise to that which David undertook re:  
IRs, and which Tim wants for tabulator)
   - Work cross organization towards harmonizing the variations in the  
specifications
   - Work to make more fundamental changes in the underlying  
specifications.
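
To make the mismatch in the first bullets concrete, here is a minimal sketch in Python. It assumes that RDF-style comparison can be approximated by plain string equality, and that "http semantics" can be approximated by lowercasing the scheme and host (the case-insensitive parts under RFC 3986). The function name `normalize_scheme_and_host` is made up for illustration; it is not taken from any spec.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_scheme_and_host(uri: str) -> str:
    """Lowercase only the scheme and the host of a URI (hypothetical helper).

    Userinfo, path, query and fragment are left untouched because they
    are not case-insensitive in general.
    """
    parts = urlsplit(uri)
    # Split off any "user:password@" prefix so that it is not lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    return urlunsplit(
        (parts.scheme.lower(), netloc, parts.path, parts.query, parts.fragment)
    )


upper = "HTTP://purl.org/obo/obi.owl"
lower = "http://purl.org/obo/obi.owl"

# Comparison as plain strings: the two names are distinct.
same_as_strings = upper == lower

# Comparison after scheme/host normalization: the two names coincide.
same_after_normalization = (
    normalize_scheme_and_host(upper) == normalize_scheme_and_host(lower)
)

print(same_as_strings)           # False
print(same_after_normalization)  # True
```

The two booleans disagree, which is the "these two semantics differ" point: whichever comparison a given spec adopts decides whether the two strings name one thing or two.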


> Personally I think I would try to keep scheme-specific  
> canonicalization out of all the recs, as it is now (mostly), and  
> make a very clear distinction between denotation and identification  
> (the latter being aware of protocol-specific canonicalization, the  
> former not).

Not sure about this one.

> Maybe bring scheme-insensitive canonicalization, such as case  
> insensitivity of scheme name, into *all* of the recs uniformly,  
> including XSD, RDF, SPARQL, ....

I'm all for any uniformity we can get.

> POWDER may be a special case that would know about http:-specific  
> canonicalization because it needs to for its own purposes.

I think it's an option to make all the specs aware of http: specific  
canonicalization because of its importance across all of our work, and  
because we promote its use so much.

> Another approach would be to make scheme-specific canonicalization  
> rules discoverable through some kind of follow-your-nose mechanism;  
> maybe the scheme registry could lead to some RDF that could lead to  
> some... perl code???  Another approach would be to say that no  
> future URI scheme can specify a new canonicalization method,  
> limiting what needs to be understood to the scheme-specific  
> canonicalization rules in use as of such and such a date... both of  
> these approaches sound nutty to me, but if we conclude they promote  
> safety and/or consistency they have to be considered.

They don't sound nutty to me.
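
As a rough illustration of what per-scheme rules could look like once discovered, here is a sketch of a lookup table from scheme name to canonicalization function. Everything in it is an assumption made for the example: the names `SCHEME_RULES`, `register`, `canonicalize_http` and `canonicalize` are hypothetical, the http rule covers only case, default port 80 and the empty path, and it ignores userinfo and IPv6 literals.

```python
from typing import Callable, Dict
from urllib.parse import urlsplit, urlunsplit

# Hypothetical registry: scheme name -> scheme-specific canonicalizer.
SCHEME_RULES: Dict[str, Callable[[str], str]] = {}


def register(scheme: str):
    """Decorator that records a canonicalizer for one scheme."""
    def wrap(fn: Callable[[str], str]) -> Callable[[str], str]:
        SCHEME_RULES[scheme] = fn
        return fn
    return wrap


@register("http")
def canonicalize_http(uri: str) -> str:
    """Simplified http rule: lowercase host, drop port 80, empty path -> "/".

    Assumes no userinfo and no IPv6 literal in the authority.
    """
    parts = urlsplit(uri)
    host = parts.hostname or ""  # urlsplit lowercases .hostname
    netloc = host if parts.port in (None, 80) else f"{host}:{parts.port}"
    path = parts.path or "/"
    return urlunsplit(("http", netloc, path, parts.query, parts.fragment))


def canonicalize(uri: str) -> str:
    """Apply the registered rule for the URI's scheme, if there is one."""
    scheme = urlsplit(uri).scheme  # already lowercased by urlsplit
    rule = SCHEME_RULES.get(scheme)
    if rule is None:
        # Unknown scheme: apply only the scheme-insensitive step,
        # i.e. lowercase the scheme name and leave the rest alone.
        head, sep, rest = uri.partition(":")
        return head.lower() + sep + rest
    return rule(uri)


print(canonicalize("HTTP://PURL.org:80/obo/obi.owl"))
print(canonicalize("MID:abc@example.org"))
```

Freezing the contents of such a table as of a given date would correspond to the second approach, where no later scheme may add a rule of its own.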

-Alan
Received on Thursday, 3 July 2008 15:24:48 UTC