Re: The Case For Redirection from pat hayes on 2003-09-26 (public-sw-meaning@w3.org from September 2003)

From: pat hayes <phayes@ihmc.us>
Date: Fri, 26 Sep 2003 14:22:43 -0500
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: public-sw-meaning@w3.org
Message-Id: <p06001f27bb9a0c9e1b62@[10.0.100.25]>
>From: pat hayes <phayes@ihmc.us>
>Subject: Re: The Case For Redirection
>Date: Thu, 25 Sep 2003 23:32:13 -0500
>
>[...]
>
>>  >>  > It seems to me that you can already do 
>>all this.  Use a URI reference with
>>  >>  > a fragment identifier as the name.  The 
>>web page and the RDF/XML content
>>  >>  > (and maybe OWL content (and maybe FOL 
>>content (and ...))) can all live at
>>  >>  > the URI address.  The query answering system could also live there, I
>>  >>  > guess.
>>  >>  >
>>  >>  > So you have
>>  >>  >
>>  >>  >	http://sandro.org/foaf#sandro		for the name
>>  >>  >	http://sandro.org/foaf.html 
>>	for the human readable page
>>  >>  >	http://sandro.org/foaf.rdf		for the RDF/XML content
>>  >>  >	(http://sandro.org/foaf.owl		for the OWL content)
>>  >>  >	http://sandor.org/foaf#sandro?...	for other purposes
>>  >>  >
>>  >>  > All, except the last, would be accessible via
>>  >>  > http://sandro.org/foaf#sandro.  The 
>>last, I think, needs extra parameters,
>>  >>  > and thus can't be just http://sandro.org/foaf#sandro. 
>>  >>  >
>>  >>  > This has the advantage that related names can share URIs.
>>  >>
>>  >>  With the right view of content negotiation, this might work.  But I'm
>>  >>  not sure it's possible.  This brings up the issue of what
>>  >>  "representation" means in HTTP, since content negotiation allows
>>  >>  selecting one representation from among many, according to its
>>  >>  media-type (aka MIME type, aka content type).
>>  >
>>  >Sure.  Sounds good to me.  You write an RDF-only tool and get RDF content.
>>  >I write an OWL tool and get OWL content.  Netscape writes a web browser and
>>  >gets HTML content.
>>
>>  I agree this picture makes short-term pragmatic
>>  sense. What bothers me about it is that the
>>  RDF/OWL kind of distinction doesn't seem anything
>>  like a media type distinction. Maybe Im being
>>  overly fussy, but to subsume semantic depth under
>>  media type feels like a hack, and I am sure that
>>  if we do this then it will come back to bite us
>>  fairly soon.  And we can do it in a cleaner way
>>  by using our own technology, see below.
>
>I don't see that there is any problem here.  Why shouldn't the difference
>between an RDF document and on OWL document that attempt to represent the
>same intuitive notions be just like the difference between a GIF document
>and a JPEG document that attempt to represent the same picture?

Because it just isn't like it. In the second case 
it is reasonably clear what the single thing is 
that these are alternative renderings of, viz. 
the actual picture; the same is not true for the 
RDF/OWL contrast. I think that to think of RDF 
and OWL as representing the "same intuitive 
notions" might be a very risky intuition to rely 
on.  Second, in the case of the OWL/RDF, there 
are no issues of presentation involved, ie how a 
document is best shown to a reader. That is what 
media type is centrally concerned with, however. 
Right now we havnt really dealt with the issues 
of media type for SW content, but one could 
easily imagine them arising: suppose someone 
wants to provide for blind SW users by reading 
OWL/RDF/XML in some special acoustic mode, for 
example? (BTW, the RDF WG ran into a late 
roadblock which arose from just this kind of 
concern, about the international readability of 
RDF/XML; these issues are out there waiting to 
cause trouble.) Third, media type just hasn't got 
anything to do with entailment or semantic 
content. I don't know how to explain this in more 
depth, it just seems kind of obvious.

Look, I can see that it would work with the 
present and immediately envisioned uses for 
RDF/OWL and HTML and so on. But it feels like 
using a piece of string to tie together some 
stuff that really needs to be tied with wire. It 
will work for a while and then it will break.

>
>>  >>  If you tried to fetch <http://sandro.org/foaf#sandro> (which would be
>>  >>  truncated to <http://sandro.org/foaf> before going out) with "Accept:
>>  >>  text/html" you would presumably get back a representation of
>  > >>  http://sandro.org/foaf.html (that is, you would receive a
>>  >>  serialization of a hypertext doument) and would probably think your
>>  >>  original URI identified an anchor in that document.
>>  >
>>  >Sounds fine to me.  If I have a tool that wants to display HTML then URI
>>  >references with fragment identifiers should be somehow related to HTML
>>  >anchors.
>>  >
>>  >>  If you tried to fetch the same URI with "Accept: application/rdf+xml"
>>  >>  you would presumably get back a representation of
>>  >>  http://sandro.org/foaf.rdf (that is, you would receive a serialization
>>  >>  of an RDF graph) and would probably think your original URI identified
>>  >>  an instance of foaf:Person.
>>  >
>>  >Again this sounds great to me.  If I have a tool that prefers RDF then this
>>  >seems to be just what the doctor ordered - URI references with fragment
>>  >identifiers are names, and it is possible to get information related to
>>  >that name by accessing an RDF document at the URI that is the URI reference
>>  >with the fragment identifier stripped off.
>>
>>  You are assuming that we will be using the same
>>  kind of tools we have now, and browsing and SW
>  > inference have nothing to do with one another.
>>  But surely we can hope for a future superBrowser
>>  which can use SW markup in some way to provide a
>>  more useful browsing experience. I have always
>>  assumed that we should be thinking about ways of
>>  including SW markup *inside* web pages, linked to
>>  the current visible HTML content.
>
>Sounds nice.  Propose a content type that allows this.  Browsers and RDF
>tools alike could employ these sorts of document, after suitable upgrades
>of course.

Right, exactly.  But I don't feel competent to 
suggest a content type. Why do we need one? 
Couldn't we find a way to include RDF/XML markup 
inside a XHTML header, for example, so that it 
had no effect on the browser rendering but could 
use all the HTML anchors? Then the RDF tools 
would have to know where to look, I guess.

>
>>  Then the
>>  conflation of HTML-# with RDF-# seems like a
>>  problem.  Like, if I can refer to a document, why
>>  can't I refer to a place in a document? Or to the
>>  anchor at that place in the document? I should be
>>  able to refer to anything, and it might be very
>>  useful to have an SW ontology which is about HTML
>>  documents.
>
>I don't see anything wrong with having a theory of HTML documents in RDF.
>This theory would provide denotations for certain URI references, tied to
>implementation using World Wide Web protocols.  If you have a URI reference
>that has been determined to denote an HTML document or a portion of an HTML
>document then the above situation would seem to still work.  The URI
>reference still has a denotation, which is a document or document section.
>Browsers can still use the URI reference as before.  Semantic Web tools can
>still use the URI reference as before, including accessing relevant
>information by retrieving a Semantic Web document at the related location.
>
>Where this does begin to break down is if you want to be able to access
>Semantic Web documents in a similar fashion, because there is no easy way
>to extend this to have Semantic Web documents that relate to other Semantic
>Web documents in the same way that Semantic Web documents relate to HTML
>documents.

Lets say that what we want is a way to refer to 
parts of an XML document. Web pages are XHTML and 
the SW docs are RDF/XML.  What is the essential 
difference, document-ontology-wise?  Documents 
are all kind of the same in general outline, they 
consist of text and markup arranged into 
recursive structures with anchors and embedded 
thingies like images. (I would like to be able to 
embed images directly into the RDF, actually, but 
no doubt we will have to use URIs as anchors to 
refer to them.)

>  Some (new) redirect mechanism would be needed to handle this
>case, but I don't think that it is needed right now.
>
>[...]
>
>>  >I agree that using redirection gets some of the same benefits.  I just
>>  >worry about the problems of uniformly using URI references without fragment
>>  >identifiers as names.  What happens if there is a document at that URI?
>  > >Is the URI reference then required to denote that document?
>>
>>  Better say not, given what you find at the URIs
>>  already out there, eg the RDF, RDFS and XSD
>>  no-frag URIs.  But why do we need to specify what
>>  they *denote* at all?
>
>Well at some stage of the game we would like to be able to have a Semantic
>Web theory of World Wide Web documents.  I think that at this stage we need
>to be able to specify that a particular URI reference denotes a particular
>document, just as the typed literal "3"^^xsd:integer denotes the integer 3.

At that stage, yes.  But I would like this 
denoting to follow the same kind of general rules 
as all the other denoting that goes on, rather 
than have to be tied to special media-type or 
file-extension machinery under the hood. And I 
forsee that there will be many schemes defined 
which set out to provide RDF/OWL vocabularies for 
the express purpose of referring to things like 
documents (pictures, football teams, state 
highways, whatever); so we could join this future 
club and set the ball rolling in the right 
direction by doing it that way ourselves.  Here 
we are defining a new way to describe things, so 
lets use it ourselves to describe what we want to 
describe. Dog food, right?

Right now, however, I don't think that the SW 
machinery requires us to specify any particular 
denotation for bareass URIs. (Except possibly for 
the object of owl:imports, but there in fact we 
really only need the HTML-accesed document, 
strictly, and the denotation is only to keep the 
semantics looking slightly decent: it plays no 
real role in anything.)

>
>>  What they *fetch* is one
>>  thing, what they denote can be another. I don't
>>  see any reason to impose general conditions on
>>  denotation in this case.  If we have to say
>>  something, it would make more sense to say that
>>  IF there is a document at that URI which asserts
>>  that they denote something, THEN that is what
>>  they denote: ie what the document says, not the
>>  document. That is, they SHOULD denote something
>>  that makes that document true.  We don't have to
>>  go into details about what kind of 'true' that
>>  is.  That has the merit of making a rough kind of
>>  semantic sense and also covering the actual
>>  examples which are out there, eg XSD asserts that
>>  'XSD' denotes a namespace.
>
>I don't follow this at all.

See my long 'notes' rant about the difference 
between denotation and network identification. 
For the examples, just look at what you get when 
you give these to your browser:
http://www.w3.org/2001/XMLSchema
http://www.w3.org/2000/01/rdf-schema
http://www.w3.org/1999/02/22-rdf-syntax-ns

>
>>  It occurs to me that this all follows from a
>>  simple principle: an URI should not Web-retrieve
>>  anything that is SW-false of itself.  So if A
>>  fetches a document which says that A denotes,
>>  say, a certain namespace, then it has to denote
>>  that namespace. Again, never mind for now about
>>  HOW it says that.
>
>I think that this is incredibly prone to problems.  This seems to be the
>``Use Implies Consent'' notion in another guise. If the meaning of any URI
>in the Semantic Web is inextricably tied up with a document that can be
>fetched at that URI,

Just for the record, I didn't say that. I said IF 
there is a document there which also says 
something about the referent of the URI, THEN.... 
This isn't likely to happen by accident, and in 
many cases it won't happen at all.  I don't see 
that this is prone to any special problems; in 
many ways its just a way of saying that a URI 
points to something that says something.

>  then I would argue that Lucent should not use any URIs
>that are not controlled by Lucent or organizations that Lucent trusts.
>This would make it very hard for Lucent to communicate with its suppliers
>or customers, many of which do not belong to this select group.

Seems to me that Lucent management should 
consider such a step very carefully; but in any 
case, I don't see the problem or why you would 
make such a recommendation.  Lucent uses URIs 
right now that point to documents that are not 
under their control, I am sure.  Why would the SW 
change this? Nobody is suggesting that Lucent 
would be in any way *responsible* for the 
meanings of URIs not under their control; but if 
Lucent is suffering from such corporate paranoia 
that it is unwilling to use, for example, 
W3C-defined protocols like XML, then Lucent will 
go down the tubes fairly rapidly.  Commerce 
requires *some* degree of trust.

If the reason for this feeling, by the way, is 
that there would be nothing to stop some careless 
or evil guy changing the document at the URI, 
thereby sabotaging the Lucent content which 
points to it, I think this is a genuine issue 
which the SW needs to face up to, but I don't see 
any grounds for paranoia. Either the URI 
references can be archived and dated, or else the 
general Web conventions will absolve the agent 
who points to another agent's document from being 
responsible for changes made to it.  Either way, 
Lucent can cover its ass appropriately and still 
use the machinery of the Web along with everyone 
else. This is a general issue on the Web, but the 
Web seems to get along reasonably well without 
having officially solved it.

Pat


>[...]
>
>peter


-- 
---------------------------------------------------------------------
IHMC	(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32501			(850)291 0667    cell
phayes@ihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 26 September 2003 15:22:55 UTC