Re: The Case For Redirection

>From: Sandro Hawke <sandro@w3.org>
>Subject: The Case For Redirection (was Re: Some Requirements)
>Date: Thu, 25 Sep 2003 14:08:06 -0400
>
>>  > > On behalf of RDF authors everywhere, I'd like to make a URI which is
>>  > > usable in several, simultaneous ways:
>>  > >
>>  > >   r1.  As a not-overloaded name, essentially a logical constant term,
>>  > >        as specified in RDF Semantics [1].   The URI will be my name
>>  > >        for something; others may make other names for the thing or
>>  > >        reuse my name for it.
>>  >
>>  > Seems already covered.
>>  >
>>  > >   r2.  As a web page address for human-readable content, working in
>>  > >        currently-deployed browsers, giving users direct access to web
>>  > >        content which I supply associated with the URI;
>>  > >
>>  > >   r3.  As a web address for RDF/XML content, allowing simple systems
>>  > >        to fetch a small to medium size knowledge base, which I supply,
>>  > >        associated with the URI
>>  >
>>  > Can't you use content negotiation for this? 
>>
>>  Sort of.  See below.
>>
>>  > >   r4.  As the address of a query answering service, allowing more
>>  > >        complex systems faster access to larger knowledge bases, which
>>  > >        I also supply associated with the URI
>>  >
>>  > I don't know how this would work.  Which knowledge base?  How is it
>>  > related to the URI reference with optional fragment identifier?
>>
>>  r3 and r4 both involve access to the RDF knowledge base that the host of
>>  the URI provides for people asking about it; the difference is that
>>  with r3 you get a dump of the whole thing; with r4 you get to pose
>>  your query, eg with DQL.
>>
>>  > It seems to me that you can already do all this.  Use a URI reference with
>>  > a fragment identifier as the name.  The web page and the RDF/XML content
>>  > (and maybe OWL content (and maybe FOL content (and ...))) can all live at
>>  > the URI address.  The query answering system could also live there, I
>>  > guess.
>>  >
>>  > So you have
>>  >
>>  >	http://sandro.org/foaf#sandro		for the name
>>  >	http://sandro.org/foaf.html		for the human readable page
>>  >	http://sandro.org/foaf.rdf		for the RDF/XML content
>>  >	(http://sandro.org/foaf.owl		for the OWL content)
>>  >	http://sandro.org/foaf#sandro?...	for other purposes
>>  >
>>  > All, except the last, would be accessible via
>>  > http://sandro.org/foaf#sandro.  The last, I think, needs extra parameters,
>>  > and thus can't be just http://sandro.org/foaf#sandro. 
>>  >
>>  > This has the advantage that related names can share URIs.
>>
>>  With the right view of content negotiation, this might work.  But I'm
>>  not sure it's possible.  This brings up the issue of what
>>  "representation" means in HTTP, since content negotiation allows
>>  selecting one representation from among many, according to its
>>  media-type (aka MIME type, aka content type).
>
>Sure.  Sounds good to me.  You write an RDF-only tool and get RDF content.
>I write an OWL tool and get OWL content.  Netscape writes a web browser and
>gets HTML content.

I agree this picture makes short-term pragmatic 
sense. What bothers me about it is that the 
RDF/OWL kind of distinction doesn't seem anything 
like a media type distinction. Maybe I'm being 
overly fussy, but to subsume semantic depth under 
media type feels like a hack, and I am sure that 
if we do this then it will come back to bite us 
fairly soon.  And we can do it in a cleaner way 
by using our own technology, see below.

>
>>  If you tried to fetch <http://sandro.org/foaf#sandro> (which would be
>>  truncated to <http://sandro.org/foaf> before going out) with "Accept:
>>  text/html" you would presumably get back a representation of
>>  http://sandro.org/foaf.html (that is, you would receive a
>>  serialization of a hypertext document) and would probably think your
>>  original URI identified an anchor in that document.
>
>Sounds fine to me.  If I have a tool that wants to display HTML then URI
>references with fragment identifiers should be somehow related to HTML
>anchors.
>
>>  If you tried to fetch the same URI with "Accept: application/rdf+xml"
>>  you would presumably get back a representation of
>>  http://sandro.org/foaf.rdf (that is, you would receive a serialization
>>  of an RDF graph) and would probably think your original URI identified
>>  an instance of foaf:Person.
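
[A minimal sketch, in Python, of the two fetches described above.
The REPRESENTATIONS table is hypothetical; it stands in for whatever
the server at sandro.org would actually do. The Accept header is
assumed to carry exactly one media type. The point it shows is only
that the fragment never reaches the server, so both requests hit the
same URI and differ only in the Accept value.]

```python
from urllib.parse import urldefrag

# Hypothetical server-side table: which document answers which media type.
REPRESENTATIONS = {
    "text/html": "http://sandro.org/foaf.html",
    "application/rdf+xml": "http://sandro.org/foaf.rdf",
}


def request_target(uri_ref):
    """Split a URI reference into (URI sent over HTTP, fragment kept by the client)."""
    uri, fragment = urldefrag(uri_ref)
    return uri, fragment


def negotiate(uri_ref, accept):
    """Return (requested URI, document served, fragment) for one Accept value."""
    uri, fragment = request_target(uri_ref)
    # None means the table has nothing for this media type.
    document = REPRESENTATIONS.get(accept)
    return uri, document, fragment
```

[The client is left to interpret the same fragment, "sandro", against
whichever document came back: as an anchor in one case, as a name in
an RDF graph in the other.]
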
>
>Again this sounds great to me.  If I have a tool that prefers RDF then this
>seems to be just what the doctor ordered - URI references with fragment
>identifiers are names, and it is possible to get information related to
>that name by accessing an RDF document at the URI that is the URI reference
>with the fragment identifier stripped off.


You are assuming that we will be using the same 
kind of tools we have now, and that browsing and 
SW inference have nothing to do with one another. 
But surely we can hope for a future superBrowser 
which can use SW markup in some way to provide a 
more useful browsing experience. I have always 
assumed that we should be thinking about ways of 
including SW markup *inside* web pages, linked to 
the current visible HTML content. Then the 
conflation of HTML-# with RDF-# seems like a 
problem.  Like, if I can refer to a document, why 
can't I refer to a place in a document? Or to the 
anchor at that place in the document? I should be 
able to refer to anything, and it might be very 
useful to have an SW ontology which is about HTML 
documents.

>
>>  So this doesn't seem so good.  I've been over this and over this, and
>>  have been unable to find a way to get both r2 and r3, with URIs like
>>  http://foo#bar.
>
>I don't see what is missing from the above approach.
>
>>  > You could even use redirection to use a URI reference without a fragment
>>  > identifier.  I think, however, that this is a mistake.
>>
>>  With redirection and no fragment identifiers, I can get r2 and r3:
>>
>>  1.  You fetch http://sandro.org/foaf/sandro with "Accept: text/html"
>>      and you are redirected to http://sandro.org/foaf/sandro.html, a web
>>      page about sandro (me).
>>
>>  2.  You fetch http://sandro.org/foaf/sandro with "Accept:
>>      application/rdf+xml" and you are redirected to
>>      http://sandro.org/foaf/sandro.rdf, an RDF/XML document which
>>      probably says some formal things about me, referring to me as
>>      <http://sandro.org/foaf/sandro>.
>>
>>  The redirection is important if we interpret "representation"
>>  strictly, since it lets us give an answer without claiming the
>>  returned byte string is "a representation" of the thing you asked
>>  about.  The "303 See Other" redirection says "I'm not going to give
>>  you a representation; why don't you ask over there...."
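
[A rough sketch of the redirection scheme in steps 1 and 2 above, as a
pure Python function rather than a real server. The SEE_OTHER table
and the respond() helper are illustrative assumptions, and the Accept
header is again assumed to hold a single media type with no q-values.]

```python
# The name itself; no representation of it is ever returned.
NAME = "http://sandro.org/foaf/sandro"

# Hypothetical table: media type -> document *about* the named thing.
SEE_OTHER = {
    "text/html": "http://sandro.org/foaf/sandro.html",
    "application/rdf+xml": "http://sandro.org/foaf/sandro.rdf",
}


def respond(uri, accept):
    """Return (status, headers) for a GET on uri with one Accept value."""
    if uri != NAME:
        return 404, {}
    target = SEE_OTHER.get(accept)
    if target is None:
        return 406, {}  # Not Acceptable: nothing to offer for this type
    # 303 See Other: point at a document about NAME instead of
    # claiming that any byte string is a representation of NAME.
    return 303, {"Location": target, "Vary": "Accept"}
```

[Every successful answer is a 303, never a 200, which is what lets the
name and the documents about it stay distinct.]
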
>
>I agree that using redirection gets some of the same benefits.  I just
>worry about the problems of uniformly using URI references without fragment
>identifiers as names.  What happens if there is a document at that URI?
>Is the URI reference then required to denote that document?

Better say not, given what you find at the URIs 
already out there, eg the RDF, RDFS and XSD 
no-frag URIs.  But why do we need to specify what 
they *denote* at all? What they *fetch* is one 
thing, what they denote can be another. I don't 
see any reason to impose general conditions on 
denotation in this case.  If we have to say 
something, it would make more sense to say that 
IF there is a document at that URI which asserts 
that they denote something, THEN that is what 
they denote: ie what the document says, not the 
document. That is, they SHOULD denote something 
that makes that document true.  We don't have to 
go into details about what kind of 'true' that 
is.  That has the merit of making a rough kind of 
semantic sense and also covering the actual 
examples which are out there, eg XSD asserts that 
'XSD' denotes a namespace.

It occurs to me that this all follows from a 
simple principle: a URI should not Web-retrieve 
anything that is SW-false of itself.  So if A 
fetches a document which says that A denotes, 
say, a certain namespace, then it has to denote 
that namespace. Again, never mind for now about 
HOW it says that.

>
>>  It's also important if you want to help users keep straight the
>>  difference between the URI of a person and the URI of a web page about
>>  the person.  That is, I'd like people to distinguish between saying
>>
>>       <http://sandro.org/foaf/sandro> a Liar.      # that person lies
>>
>>  and
>>       <http://sandro.org/foaf/sandro.html> a Liar. # that page lies

I think a better way of handling this would be to 
use a convention in RDF itself which treats the 
various documents as properties of a single thing 
denoted by the URI.  Think of a bare-ass URI as 
denoting an intensional thing called a 'webdoc'. 
Never mind what webdocs 'really are'; all that 
matters about them is that they have properties 
you can detect, such as their HTML rendering, 
their RDF rendering, whatever. Not all of these 
may exist, but when they do, they are genuine 
syntactic thingies that you can express as 
strings and look at. Now you can say things like

http://sandro.org/foaf/sandro form:HTML _:x .

and maybe some tool knows enough to automatically 
bind _:x to an HTML document, which it can show 
you if you like; but you can also assert things 
like

_:x rdf:type Liar .

which is not the same as

http://sandro.org/foaf/sandro rdf:type Liar .

or as

http://sandro.org/foaf/sandro form:RDF _:y .
_:y rdf:type Liar .

or  indeed as

http://sandro.org/foaf/sandro#sandro rdf:type Liar .

and to just say

http://sandro.org/foaf/sandro rdf:type Liar  .

is probably a category error, since webdocs just 
aren't the kind of thing that can be said to be 
Liars.
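
To make the distinction concrete, here is a minimal sketch in 
Python, with triples as plain tuples. The helper names liars() 
and forms_of() are made up for illustration, and the form: 
properties are the hypothetical ones used above, not part of any 
existing vocabulary.

```python
RDF_TYPE = "rdf:type"
WEBDOC = "http://sandro.org/foaf/sandro"

# The webdoc has two renderings; only the HTML one is called a Liar.
graph = {
    (WEBDOC, "form:HTML", "_:x"),
    (WEBDOC, "form:RDF", "_:y"),
    ("_:x", RDF_TYPE, "Liar"),
}


def liars(triples):
    """Return every subject asserted to be of type Liar."""
    return {s for (s, p, o) in triples if p == RDF_TYPE and o == "Liar"}


def forms_of(triples, webdoc):
    """Map each form: property of a webdoc to the node it points at."""
    return {
        p: o
        for (s, p, o) in triples
        if s == webdoc and p.startswith("form:")
    }
```

Here liars(graph) contains only _:x, so the accusation lands on 
the HTML rendering and not on the webdoc or its RDF rendering.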

(BTW, these bnodes can always be skolemized into 
'new' URIs by some binding tool if you wanted to 
do that, and if there were some URI-form 
conventions then it would be easy to arrange that 
the appropriate forms were used, so this is quite 
compatible with your other style of doing things. 
But it doesn't require it.)

The basic idea is to use the fact that we now 
have ways of *describing* things and denoting 
them by names, so that we can distinguish them. 
Why not use that, rather than trying to hack it 
by using old tricks like file extensions or media 
types, which are already loaded with legacy 
meaning from other technologies anyway? We can 
write specs for RDF ontologies like this pretty 
easily, I'm sure.  Then we don't have to muck up 
or re-write existing Web low-level conventions, 
and we can use XML to convey all the information, 
and it's automatically accessible to the SW 
reasoners if we want to try getting fancy. For 
example, it occurs to me that one way to tell a 
simple-minded OWL engine to just use a bare URL 
to denote the HTML document, say, would be to 
tell it to work out the consequences of

form:RDF rdfs:subPropertyOf owl:sameAs .

which is kind of sneaky, but a lot simpler than 
reprogramming the low-level MIME handling code, 
or tweaking the file extensions in Perl, or some 
other Unix-geekish hack.
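
A rough sketch of what such an engine would conclude, again in 
Python with triples as tuples. This is not a real OWL reasoner: 
it applies the subproperty rule once and then copies statements 
across each owl:sameAs pair in a single pass. The function names 
and the sample graph are assumptions made for illustration.

```python
SUBPROP = "rdfs:subPropertyOf"
SAME_AS = "owl:sameAs"
WEBDOC = "http://sandro.org/foaf/sandro"


def same_as_pairs(triples):
    """From (s p o) and (p rdfs:subPropertyOf owl:sameAs), infer (s owl:sameAs o)."""
    sub = {s for (s, p, o) in triples if p == SUBPROP and o == SAME_AS}
    return {(s, SAME_AS, o) for (s, p, o) in triples if p in sub}


def closure(triples):
    """One pass of identity substitution; not a full fixpoint."""
    inferred = set(triples) | same_as_pairs(triples)
    for (a, p, b) in list(inferred):
        if p != SAME_AS:
            continue
        for (s, q, o) in list(inferred):
            if s == b:
                inferred.add((a, q, o))  # what holds of b holds of a
            if s == a:
                inferred.add((b, q, o))  # and the other way round
    return inferred


graph = {
    ("form:RDF", SUBPROP, SAME_AS),
    (WEBDOC, "form:RDF", "_:y"),
    ("_:y", "rdf:type", "Liar"),
}
```

With this graph, closure(graph) contains the statement that the 
bare URI is itself a Liar: the engine has been told to stop 
distinguishing the webdoc from its RDF rendering. The same would 
go for form:HTML if that were the property declared.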

Pat



>>
>>  and the redirection helps keep that clear, by changing what's
>>  displayed in the "Address" or "Location" bar.
>
>Perhaps it would be possible to use only URIs without file extensions to
>denote non-pages.  This would however go against the RDF way of doing
>things.
>
>>  Also, we can get r4, too: the RDF/XML document at
>>  http://sandro.org/foaf/sandro.rdf can describe some query answering
>>  services and declare them to be as authoritative as itself.
>
>Perhaps.
>
>[...]
>
>>       -- sandro
>
>peter


-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes

Received on Friday, 26 September 2003 00:32:33 UTC