Re: owl:sameAs use/misuse/abuse Re: homonym URIs from Jacek Kopecky on 2007-06-25 (semantic-web@w3.org from June 2007)

From: Jacek Kopecky <jacek.kopecky@deri.org>
Date: Mon, 25 Jun 2007 16:39:42 +0200
To: Richard Cyganiak <richard@cyganiak.de>
Cc: Bernard Vatant <bernard.vatant@mondeca.com>, Tim Berners-Lee <timbl@w3.org>, semantic-web@w3.org
Message-Id: <1182782382.3417.51.camel@localhost>
Richard, 
thanks for the response. I have to admit I'm not yet sure I got it.
I have below a chain of reasoning that I assume you will see as broken
at some point. Please let me know where it's broken and why. 8-)

On Mon, 2007-06-25 at 12:06 +0200, Richard Cyganiak wrote:
> 
> On 25 Jun 2007, at 01:01, Jacek Kopecky wrote:
> 
> > Wow. It's a nice description of the difference between the symbol and
> > its referent. But I guess I should be able to state that
> >
> > @prefix ex: <http://example.org/> .
> > ex:A http:redirects303To <http://example.org/A/doc> .
> > ex:B http:redirects303To <http://example.org/B/doc> .
> > ex:A owl:sameAs ex:B .
> >
> > (which I think was Bernard's point)
> >
> > Or is it wrong to create http:redirects303To as a statement about the
> > HTTP resource? If so, why?
> 
> It's wrong indeed. Because RDF statements always are about the  
> referents, and never about the identifier. The redirection is a  
> property of the identifier system (the URI), and not of the  
> identified thing. If I say:
> 
> <http://dbpedia.org/resource/Berlin> http:redirectsTo <http://dbpedia.org/page/Berlin> .
> 
> Then I have said “the city of Berlin redirects to a web page about  
> the city of Berlin.” Which is nonsense.
> 
> Same with things like:
> 
> <http://dbpedia.org/resource/Berlin> str:numOfCharacters 33 .

It's not the same. The redirection is not a property of the URI (you
can't tell if a URI will redirect just by looking at the URI), it's a
property of the dereferencing mechanism and the server setting.

I expect we can say for an information resource something like this:

<http://example.com/>  http:representation "<HTML>...</HTML>" .

Because information resources do have representations. Let's assume that
http:representation means "at one point in time had this
representation", or it could be timestamped and conneg-qualified etc.

But IMO an information resource can have the other HTTP properties just
as legitimately as it has a representation, e.g. the status code returned
when GET is done on the resource:

<http://example.com/> http:getStatusCode "200"^^xs:int .

Or is this not allowed because of something similar to the difference
between "response headers" and "entity headers" in the HTTP response? 

If http:getStatusCode is allowed, what exactly is the line between this
and the following?

<http://example.com/foo> http:getStatusCode "303"^^xs:int .

Especially if this triple is asserted by an automated crawler that tries
to dereference URIs and records the status codes *returned by the
resources*.

And my http:redirects303To is IMO on par with http:getStatusCode.
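
To make the crawler scenario concrete, here is a minimal Python sketch of
what such a crawler might record. It is only an illustration: the
namespace in HTTP_NS is made up (nobody in this thread has defined an
http: vocabulary), probe() needs network access, and the Location header
is assumed to be an absolute URI.

```python
import http.client
from urllib.parse import urlsplit

# Hypothetical vocabulary namespace; the thread never defines one.
HTTP_NS = "http://example.org/http-vocab#"
XSD_INT = "http://www.w3.org/2001/XMLSchema#int"


def status_triples(uri, status, location=None):
    """Turn one observed GET response into N-Triples lines about `uri`."""
    triples = [f'<{uri}> <{HTTP_NS}getStatusCode> "{status}"^^<{XSD_INT}> .']
    # Only a 303 with a Location header yields a redirects303To triple.
    if status == 303 and location:
        triples.append(f"<{uri}> <{HTTP_NS}redirects303To> <{location}> .")
    return triples


def probe(uri):
    """GET `uri` once, without following redirects (needs network access)."""
    parts = urlsplit(uri)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)  # http.client never follows redirects
        resp = conn.getresponse()
        # Assumption: Location, if present, is already an absolute URI.
        return status_triples(uri, resp.status, resp.getheader("Location"))
    finally:
        conn.close()
```

The point of the sketch is that both triples come out of the same
observation, a single GET on the URI, which is why I see them as being on
the same footing.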

You see, I'm not trying to talk about the URI (e.g. being 33 chars long)
but about the resource. An HTTP information resource is available for
dereferencing (communication) over HTTP, so it should have HTTP
properties. And if so, any resource identified by a URI starting with
http:// with no fragID gives me the license to talk to it over HTTP, so
it should also have HTTP properties.
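
The "license" I have in mind can be read off the URI alone. A rough
Python sketch, where the function name is my own invention and the rule
is just my reading above (http scheme, no fragment identifier):

```python
from urllib.parse import urlsplit


def licensed_for_http(uri):
    """True if `uri` is an http:// URI with no fragment identifier."""
    parts = urlsplit(uri)
    # Check for '#' directly so that an empty fragment ("...#") also counts.
    return parts.scheme == "http" and bool(parts.netloc) and "#" not in uri
```

Under this reading a 303 URI such as http://example.org/A passes, while a
hash URI such as http://example.org/A#thing does not.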

> [snip] 

> > I've always been uneasy about the 303 approach to having http: URIs
> > denote non-information resources; I guess I'd be in the 'hash' camp.
> > Basically, my feeling is that 303 does not fully solve the issue, so it
> > should be a softer recommendation than a W3C Recommendation MUST.
> 
> It isn't a MUST, and I've never seen anyone suggest that it should be.
> 
> Hash URIs and 303 URIs are both perfectly fine as identifiers for
> non-information resources, both with their pros and cons (discussed at
> length in e.g. [1], [2] and [3]).

Well, the HttpRange draft [3] says: 

        According to the HTTP specification, when a code of 200 is
        received in response to an HTTP GET request, it indicates that
        "an entity corresponding to the requested resource" has been
        returned in the response. The contents of this entity is what we
        understand as a representation of the resource. This
        correspondence between a resource and a representation is
        defined in [AWWW] as characterising an information resource.
        Consequently, we can assume that if we receive this particular
        response code in response to an HTTP GET request, we have also
        received a representation and that the URI references an
        information resource.

This is a chain of statements with nothing qualifying them as less than
strictly true (as a SHOULD-level recommendation would be). I interpret
MUST as "it just is so", same as factual statements. MUST is used in
specese to make sure the reader understands it, but in my reading, "the
client sends a message" is the same as "the client MUST send a message".

So this is where I get the "if you get 200, the URI MUST identify an
information resource", and this is what I'm not comfy with.
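
Spelled out as code, my reading of the quoted passage is an unconditional
rule. This Python sketch is only that reading, not anything the draft
itself provides; the function name and return strings are mine.

```python
def nature_from_get_status(status):
    """What the quoted passage lets a client conclude from a GET status."""
    if status == 200:
        # The passage states this as fact, with no SHOULD-style hedging.
        return "information resource"
    # The passage draws no conclusion for other codes, 303 included.
    return "undetermined"
```

There is no branch in which a 200 response leaves the question open, and
that is exactly the part I am uneasy about.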

Best regards,
Jacek

> 
> Richard
> 
> [1] http://www.w3.org/TR/swbp-vocab-pub/
> [2] http://www.dfki.uni-kl.de/~sauermann/2006/11/cooluris/
> [3] http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14
> 
>
Received on Monday, 25 June 2007 14:40:23 UTC