Re: [URI vs. URIViews] draft-frags-borden-00.txt from Jonathan Borden on 2002-02-25 (www-rdf-logic@w3.org from February 2002)

From: Jonathan Borden <jonathan@openhealth.org>
Date: Mon, 25 Feb 2002 00:55:04 -0500
To: "Pat Hayes" <phayes@ai.uwf.edu>
Cc: <www-rdf-logic@w3.org>
Message-ID: <000701c1bdc1$05cc56e0$0301a8c0@ne.mediaone.net>
Pat,

Followup here to rdf-logic,

> >
> >Does the current MT say that a "subject is a URIref"? If so this seems to
> >be a significant change rather than a clarification.
>
> I believe the MT has always said this. Certainly that is my
> understanding of the basic graph syntax: triples consist of a
> subject, a property and an object, all of which can be urirefs. We
> are talking about the actual graph syntax here, right? Not what it
> denotes. So in this sense of 'subject' the subject of a sentence is a
> word, not what the word names.


Sigh... well, we've been over the issue of the conflation of abstract syntax
and semantics before, eh? But read RDF 1 (again): it literally does say that
the subject is a resource (not a URIref) and that the object may be either a
resource or a literal.

>
> (Now, of course, urirefs are themselves resources, since everything
> is a resource....)

Nice try, but again the "data" scheme is used to quote:

data:text/uri;http://example.org/Unicorn

>
> Good question. I will respond for myself, not in the name of the WG.
>
> Answer: Yes and no.
[snip]

> So to the extent that RDF inference depends on this ability to
> cross-identify urirefs in various documents, the answer is No.

Note that this ultimately gets back to my own issue with how RDF constructs
a URIref from a namespace and a property name (XML element local name).

RDF concatenates. This isn't correct and usually works _only_ because RDF
namespaces are essentially required to end in '#'. Of course the local name
_should_ refer to part of an RDF document (particularly an RDF Schema that
declares the property). What _should_ be done is to acknowledge this
explicitly and insert a '#' between the namespace name (e.g.
http://www.w3.org/TR/REC-rdf-syntax) and the local name. This way the local
name refers to the part of the RDF Schema identified by rdf:ID and which is
HTTP GETable at the namespace URI.
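
To make the difference concrete, here is a minimal Python sketch. The helper
names concat_uriref and fragment_uriref are hypothetical, made up for
illustration; neither is anything RDF itself defines. The first does what RDF
does today, the second does what I am arguing for.

```python
from urllib.parse import urldefrag


def concat_uriref(namespace: str, local_name: str) -> str:
    """Current RDF behaviour: plain string concatenation."""
    return namespace + local_name


def fragment_uriref(namespace: str, local_name: str) -> str:
    """Proposed behaviour (hypothetical helper): the local name is always
    a fragment identifier relative to the namespace URI."""
    return namespace.rstrip("#") + "#" + local_name


ns = "http://www.w3.org/TR/REC-rdf-syntax"

# Concatenation only works if the namespace happens to end in '#'.
print(concat_uriref(ns + "#", "type"))
print(concat_uriref(ns, "type"))

# With an explicit '#', the URIref splits back into the GETable
# namespace URI and the local name.
base, fragment = urldefrag(fragment_uriref(ns, "type"))
print(base, fragment)
```

Without the trailing '#', concatenation yields a URIref whose namespace part
can no longer be recovered, which is the failure described above.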

>
> Notice that this is not a contradiction, but it is an equivocation
> upon 'meaning'. As far as RDF *meaning* is concerned, urirefs are
> opaque. But as far as what might be called the RDF global *syntax* is
> concerned, they are not opaque. RDF (and all web ontology languages)
> depend on a global agreement about the ability to recognize identity
> of *symbols* across documents, and that in turn - although simply
> considered a 'primitive' feature of the syntax and hence of the model
> theory - depends on the internal structure of urirefs being treated
> in a certain coherent way.

I agree. Again this depends on RDF employing HTTP GET, which, err, means
that RDF/SW and the (sic) REST of the Web need to coexist.
...
>
> And although this is not specified formally, I would expect to be
> able to use the absolute URL as a likely place to locate RDF
> assertions which use the uriref. However, the rest of the WG might
> shoot me down on that.

There is a big tension on the SW about where to find definitive information
'about' a URIref. On one hand, HTTP GET tells you something that the owner of
the URIref intends to say about the URI. On the other hand, RDF allows anyone
to say anything about anything. How is one to know what to believe? RDF has
a big shrug on that one -- and a heck of a lot more handwaving about trust
than I see about any of the 'resource' stuff.

> >
> >[[
> >Resources
> >
> >All things being described by RDF expressions are called resources. A
> >resource may be an entire Web page; such as the HTML document
> >"http://www.w3.org/Overview.html" for example. A resource may be a part of a
> >Web page; e.g. a specific HTML or XML element within the document source. A
> >resource may also be a whole collection of pages; e.g. an entire Web site. A
> >resource may also be an object that is not directly accessible via the Web;
> >e.g. a printed book. Resources are always named by URIs plus optional anchor
> >ids (see [URI]). Anything can have a URI; the extensibility of URIs allows
> >the introduction of identifiers for any entity imaginable.
> >]]
> >
> >Note in particular: "A resource might be part of a Web page e.g. a specific
> >HTML or XML element ..." This seems to indicate that a URIref _when used by
> >RDF_ is NOT intended to point to ONLY RDF documents.
>
> We have to distinguish here between two senses of 'point to'. The
> quoted passage is talking about the sense 'mean' or 'refer to' (AKA
> 'denote'), which is the RDF semantic notion of naming. I was
> referring to the notion of 'point to' meaning 'indicate the source of
> (the name)'
>
> >Are URIrefs used in RDF statements assumed to point to locations in RDF
> >documents? If so this is a big change.
>
> The convention that I have been talking about is implicit in every
> use of RDF in every document on the web. Why else would one include
> things like this in RDF headers?
>
> <RDF
>    xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>    xmlns:s="http://www.w3.org/2000/01/rdf-schema#">
>
> Those URL's don't *denote* anything in RDF, but it is sure important
> not to type them wrong.

Those URLs identify namespaces. I don't grok why that makes your convention
implicit.

Also, note that if you leave off the '#', RDF becomes hopelessly confused,
whereas a browser couldn't care less.

> >
> >This confuses me. Does an RDF application need to follow each URIref? What
> >about non "http" URI schemes, e.g. "urn"s. Are non resolvable URI refs
> >illegal in RDF?
>
> No, sorry if I gave that impression. But those that are resolvable
> are often used in a way that presupposes that they are resolved
> 'properly'.

This is where we disagree, if by 'resolve properly' you mean to a piece of
RDF.

> >
> >Right. Hmm. Perhaps you are using the term "mean" in a technical sense and I
> >am using it in an English sense. The URIref http://example.org/Unicorn
> >doesn't 'mean' Unicorn, but the URIref may be used to name the concept
> >"Unicorn". When dereferencing the URI a document entity of type text/plain
> >may be returned reading: "Unicorns are mythical creatures ..."
>
> If that ever happens on the semantic web, it ought to generate an
> error. Plain text is meaningless to software.


Ouch, your semantic web, not mine. I _explicitly_ need to be able to make
assertions _about_ non-RDF containing resources on the Web. In particular
medical documents -- which have already been standardized to non-XML
formats. I need to be able to make assertions regarding pieces of these
documents -- don't worry they have schemas, so they are syntactically
constrained.

If the "semantic web" won't allow that, then I want a different semantic
web.

> >
> >Well I guess what is important is that such assumptions may not be
> >reasonable. Because my reading of the current RDF REC says that I can make
> >statements about parts of XML or HTML documents. I interpret this to mean
> >that the URIref http://example.org/Unicorn#LeftButtock either may not
> >resolve at all, else may resolve to a piece of HTML
>
> I agree this is an ambiguity which we have not resolved or even
> discussed properly (since I've been on the WG, maybe they did
> earlier.) Of course it MAY resolve to a piece of HTML, and indeed
> that would not make it unusable in RDF as a name; but it would not
> automatically make it into the RDF name of that piece of HTML. We
> could adopt this as a convention, I guess, but then we would have
> serious problems with use/mention ambiguities.
>
> [Later. It occurs to me that there is a quick-and-dirty way around
> the use/mention problem that might actually be just what we need. A
> URL-plus-fragID uriref is assumed to *denote* the relevant part of
> the document (where the fragID is interpreted according to the mime
> type), except when that part of the document consists of RDF, in
> which case it is interpreted as being the same identifier as that
> identified by the fragId in the document. In other words, RDF *uses*
> all the RDF it can find, but it treats all other fragIds as *names
> of* parts of documents. The only thing this can't do is refer to RDF
> in RDF, but that's what we have reification for, right? Highly
> unofficial proposal, needless to say.]
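
Read literally, the unofficial rule quoted above is a small decision function.
A minimal Python sketch follows; the function name interpret, the labels it
returns, and the use of "application/rdf+xml" as the media type that marks a
document as RDF are all assumptions made for illustration, not part of any
spec.

```python
from urllib.parse import urldefrag

# Assumption: which media types count as "consisting of RDF".
RDF_MEDIA_TYPES = {"application/rdf+xml"}


def interpret(uriref: str, media_type: str) -> tuple[str, str]:
    """Classify a uriref under the quick-and-dirty proposal.

    Returns ("use", uriref) when the fragment lives in an RDF document,
    so the uriref is the same identifier as the one used there, and
    ("mention", uriref) when it denotes a part of a non-RDF document.
    The proposal says nothing about urirefs without a fragID.
    """
    _, fragment = urldefrag(uriref)
    if not fragment:
        return ("unspecified", uriref)
    if media_type in RDF_MEDIA_TYPES:
        return ("use", uriref)
    return ("mention", uriref)


print(interpret("http://example.org/Unicorn#LeftButtock", "text/html"))
print(interpret("http://example.org/Unicorn#LeftButtock", "application/rdf+xml"))
```

Note that the answer depends on the media type of whatever comes back, so the
same uriref can change category if the owner changes what is served.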

We are getting to the heart of the issue. I don't agree that every URIref of
a non-RDF/XML document need be interpreted as denoting part of a document.
Take RDDL (http://www.rddl.org/) for example: suppose I state that RDDL (which
uses XLink) has a defined interpretation under the RDF MT. This is quite easy
because XLink maps to RDF very well; in fact I have an XSLT stylesheet that
easily transforms RDDL, or any XLink, into RDF.

Why can't a URIref which references a RDDL document (that is to say, an HTTP
GET on the URI returns a RDDL document) be interpreted in the same way as if
it were RDF? It could be interpreted exactly as if one does an HTTP GET,
applies the XSLT to the result, and interprets the URIref with respect to
this transform.
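
As a rough illustration of the "GET, transform, interpret" idea, here is a
Python sketch that stands in for the XSLT step. It is not the stylesheet
mentioned above. The mapping it uses (containing document as subject,
xlink:arcrole as property, xlink:href as object) is an assumption, and the
sample document and the function name xlinks_to_triples are made up for the
example.

```python
import xml.etree.ElementTree as ET

XLINK = "http://www.w3.org/1999/xlink"
RDDL = "http://www.rddl.org/"

# Illustrative stand-in for what an HTTP GET on the URI might return.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:rddl="http://www.rddl.org/"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <body>
    <rddl:resource xlink:type="simple"
                   xlink:arcrole="http://www.rddl.org/purposes#schema-validation"
                   xlink:href="http://example.org/Unicorn.rdfs"/>
  </body>
</html>"""


def xlinks_to_triples(document_uri: str, xml_text: str) -> list[tuple[str, str, str]]:
    """Turn each rddl:resource simple link into a (subject, property, object)
    triple. Assumed mapping: document -> arcrole -> href."""
    root = ET.fromstring(xml_text)
    triples = []
    for element in root.iter("{%s}resource" % RDDL):
        href = element.get("{%s}href" % XLINK)
        arcrole = element.get("{%s}arcrole" % XLINK)
        if href and arcrole:
            triples.append((document_uri, arcrole, href))
    return triples


for triple in xlinks_to_triples("http://example.org/Unicorn", SAMPLE):
    print(triple)
```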

>
> >
> ><div id="LeftButtock">
> >     <p>This is a description of the Left Buttock of the mythical Unicorn
> ></div>
> >
> >(note use of non-well formed i.e. SGML based HTML)
> >
> >Now of course one
> >>  might want to say something in RDF about a document with a URL, and
> >>  it allows one to do that. But that use of an absolute URI as an RDF
> >>  name is a very special use.
> >
> >Why is that a special case? Where does it say that? I assert it is not a
> >special case.
>
> It's special because in all the examples I've seen, such use has been
> taken to mean that the *document* is the thing named by the URI.  I
> agree this is not stated anywhere, but it seems to be universally
> understood.

No. This is the resource/representation distinction. According to REST:
http://www1.ics.uci.edu/~fielding/pubs/dissertation/evaluation.htm#sec_6_2

6.2
[[
6.2.1 Redefinition of resource
Early web architecture defined URI as document identifiers...
REST accomplishes this by defining a resource to be the semantics of what
the author intends to identify, rather than the value corresponding to those
semantics at the time the reference is created. It is then left to the
author to ensure that the identifier chosen for a reference does indeed
identify the intended semantics.

6.2.2 Manipulating Shadows
Defining resource such that a URI identifies a concept rather than a
document leaves us with another question: how does a user access,
manipulate, or transfer a concept such that they can get something useful
when a hypertext link is selected? REST answers that question by defining
the things that are manipulated to be representations of the identified
resource, rather than the resource itself. An origin server maintains a
mapping from resource identifiers to the set of representations
corresponding to each resource. A resource is therefore manipulated by
transferring representations through the generic interface defined by the
resource identifier.
]]

At the very least the identification of URI and document is not universally
understood, rather universally misunderstood.
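
The mapping described in 6.2.2 can be pictured as a two-level table: resource
identifier first, then media type. A toy Python sketch, in which the table
contents and the function name get are made up for illustration:

```python
# Toy origin server state: resource identifier -> {media type: representation}.
# The resource is the concept; only the representations are ever transferred.
REPRESENTATIONS = {
    "http://example.org/Unicorn": {
        "text/plain": "Unicorns are mythical creatures ...",
        "text/html": "<p>Unicorns are mythical creatures ...</p>",
    },
}


def get(resource_uri: str, accept: str = "text/plain"):
    """Return (media_type, body) for the requested resource, or None when
    the server holds no representation of that type for it."""
    available = REPRESENTATIONS.get(resource_uri, {})
    if accept in available:
        return accept, available[accept]
    return None


print(get("http://example.org/Unicorn", "text/html"))
print(get("http://example.org/Unicorn", "image/png"))
```

Nothing in the table is the unicorn, and nothing requires any entry to be a
"document" in its own right.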

>
> >
> >>
> >>  >This
> >>  >is the whole argument. Should RDF treat URI references as opaque or not?
> >>  >Should all URIs that use the "http" scheme identify _documents_ or might not
> >>  >the URI http://example.org/Unicorn identify a Unicorn..
> >>
> >>  I would say that if someone wants to try to use it in that way, then
> >>  nothing should prevent them from doing so, but they should be ready
> >>  to take the consequences of doing something that makes such fragile
> >>  semantic sense. Probably what they write will have ludicrous
> >>  consequences.
> >
> >I dearly hope that RDF is not designed to make such usage ludicrous,
> >otherwise we may have huge problems for RDF's usability. At the very least
> >this would be a large architectural hole.
>
> Well, as I understand it, it would amount to saying that a unicorn
> had an http URL. (After all, that URI *is* a URL, right?) And that is
> ludicrous, right?

Err, no. Exactly the point, see above. Pat, just sit back and learn to love
the bomb. It will be alright :)

> >
> >Well that is the issue. I will argue strongly that OWL be able to make
> >statements about parts of arbitrary XML and HTML documents.
>
> I agree that would be great. Also parts of images, sound files, parts
> of all kinds of things.
>
> But hold on a second. You want it to be able to REFER TO parts of
> documents. OK, fine: but what I was talking about earlier was a
> global convention that allows RDF/DAML/OWL to USE names which are
> USED in other OWL documents. I wasn't talking about *reference to*
> the documents at all, which is another issue altogether. As far as I
> know, RDF has no official means for referring to documents (though
> absolute URLs are often interpreted that way) let alone parts of
> documents . We seem to have a use/mention disconnect here.
>
> BTW, I would predict that most of OWL isn't going to be ABOUT
> documents, but it's all going to be WRITTEN IN documents.
>
> >
> >True, but the network entity returned by an HTTP GET on a URI _is not the
> >same as the resource identified by the URI_.
> >
> >This needs to be totally clear.
>
> Agreed in principle, though in many cases they might well be the
> same. Certainly that would seem to be a useful and harmless
> convention: how else is one supposed to refer to a web document,
> other than by using its URL? I agree this isn't formally stated
> anywhere in the RDF specs, but it's often assumed, e.g. in the 'Ora
> said' examples in the original M&S.

That is because the 'early Web architecture' did it that way, or so they
say.

>
> But now you have me puzzled, by the way. You seem to be *wanting* to
> use urirefs to identify parts of web documents, yet you are insistent
> that they do not refer to them. (Or is your point that RDF doesn't
> provide a way to re

No, I am saying that the author of a Web document can define a concept and
give it a name (a URIref). What is identified by the URIref is the concept;
what is referenced is the representation of the concept, which might be an
HTML description.

>
> Just as a general point, RDF is a very 'weak' language in a strict
> logical sense, but it can be used in the context of what might be
> called extra-logical assumptions which if mutually understood by all
> users of the RDF, can impose a much more precise 'meaning'. The use
> of fragIds to refer to parts of documents might be one such
> convention, and datatyping conventions are another.
>
> >A URIref which _identifies_ a network resource would use the "data" scheme:
> >e.g.
> >
> >data:text/plain,A "Unicorn" is a mythical creature ...
>
> I fail to follow this. How does plain text identify, say, my CV, or
> the front page of the NYT for 13 October 1989?

That's the point. Let's assume that you name your resume with a URI that you
own:

http://pathayes.com/resume

When I do an HTTP GET on it, I get back a representation of your resume.
Let's say this is an HTML document. If you wish to publish a URI that
identifies the _document_ that describes your resume, just give it a new
URI, e.g.

http://pathayes.com/resume/index.html

and there is even a URI for the network entity that is returned by an HTTP
GET at a particular point in time:

data:text/html,<html><head><title>Pat's resume</title></head><body> ...
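
The three kinds of identifier can be put side by side in a short Python
sketch. The helper names data_uri and data_uri_body and the snapshot string
are illustrative assumptions; the only real machinery is percent-encoding the
returned entity into a "data" URI.

```python
from urllib.parse import quote, unquote


def data_uri(media_type: str, body: str) -> str:
    """Build a 'data' URI that literally contains the given entity body."""
    return "data:" + media_type + "," + quote(body, safe="")


def data_uri_body(uri: str) -> str:
    """Recover the entity body from a non-base64 'data' URI."""
    _, _, encoded = uri.partition(",")
    return unquote(encoded)


# 1. The resource: the resume as a concept, named by its owner.
resource = "http://pathayes.com/resume"
# 2. The document that describes it, given its own URI.
document = "http://pathayes.com/resume/index.html"
# 3. The network entity returned by one particular GET (made-up snapshot).
snapshot = "<html><head><title>Pat's resume</title></head><body></body></html>"
entity = data_uri("text/html", snapshot)

print(resource)
print(document)
print(entity)
```

Only the third identifier is tied to the exact bytes that came back; the first
two keep identifying the same things when the page is edited.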

> >
> >...So in your example, the document fragment obtained
> >>  by resolving http://example.org/Unicorn#LeftButtock had better be a
> >>  piece of XML (well, RDF in any case).
> >
> >again, no it could be (non XML) HTML for example.
>
> I meant, if it is not a piece of RDF, then an RDF inference engine
> might get very confused trying to figure out where the identifiers
> are in it. There is certainly no official RDF assumption that the
> intermediate hash is in any way concerned with *referring to* a part
> of a document.

I don't think an RDF inference engine will ever be able to assume that
there is anything definitive retrievable at the URIref, except in the
special case of rdf:type -- which is why I maintain that rdf:type is
special.

>
> ?? I fail to follow this. As far as I can see, rdf:type is on a par
> with the rest of the RDF vocabulary and is not particularly special.

Well because when inferencing, engines are programmed to follow the isa
links.

You see I am saying that inference engines cannot generally follow URIrefs,
unless RDF limits itself to having RDF at the end of all URIrefs, which
isn't very interesting -- at least to me.

>
> >but otherwise, an RDF 'engine' whatever that may be, generally won't even
> >try to dereference a URI so this should be a non-issue. Correct?
>
> Well, a DAML or OWL engine certainly will, since URIs are used to
> import one ontology into another.

Ok, <daml:import > is handled specially.


>Even in RDF, engines like CWM and
> Euler often assume that some absolute URIs identify pieces of
> well-formed RDF, and act on that assumption, though this is not
> 'official'.

Again these rules need to be made official otherwise who knows.

Point: It drives me fairly insane that it has become common practice to
refer to CWM as RDF, because it is not RDF 1, having all sorts of
improvements and other features. For example, in N3/CWM not all statements
are truths, etc. Not RDF, is it?

> >
> >Yes and perhaps this is why RDF needs to very precisely define what a
> >"Resource" is,
>
> I think we do. Anything and everything is a resource. "Resource"
> simply means "entity", ie anything that the human mind can imagine or
> give a name to, and maybe some other things as well.

That is quite close to how RFC 2396 defines it, though note that "resource"
and "entity" are used there to refer to two distinct things.

>
> >to the point, perhaps, of stating that there is (?) no
> >relationship between the RFC 2396 resource identified by a URI, and the RDF
> >resource identified by the URI. RDF can then define what it means by a
> >fragment identifier etc.
> >
> >The thorny issue, however, gets back to the fact that RDF needs to be able
> >to make assertions about Web pages and parts of Web pages e.g. arbitrary XML
> >and HTML documents. So try as you like you probably are stuck with RFC 2396
> >resources,
>
> ??? But that explicitly says that resources are NOT just things like
> web pages, but include off-web entities like books and people.
>

My point. But you are going further in claiming that RDF expects URIrefs
that it uses to resolve to parts of RDF documents. The definition of
resource I am using is anything the owner of the URI wants it to be, and
what you get back on resolving a URI is whatever the owner of the URI puts
there.

Jonathan
Received on Monday, 25 February 2002 00:18:04 UTC