- From: Pat Hayes <phayes@ai.uwf.edu>
- Date: Sun, 24 Feb 2002 21:44:30 -0600
- To: "Jonathan Borden" <jonathan@openhealth.org>
- Cc: <www-rdf-comments@w3.org>
> With apologies to Brian: either Pat or I are deeply confused about some fundamental issues central to RDF. What this means is likely a problem in specification that desperately needs to be clarified.
>
> In any case my responses to this round include specific points that I would like clarified by the RDF WG:

OK, I'm CCing this as before. Sorry, Brian.

> Pat Hayes wrote:
>> > Careful, RDF uses frags in two ways:
>> >
>> > 1) as you say
>> > 2) any subject, predicate or object of any statement may be identified by a URI reference.
>>
>> May BE a uriref, actually; but OK.
>
> In the current RDF REC, Section 5 says "sub is a resource ...", which indicates to me that the _subject_ is a _resource_ not a URIreference, hence my specific language.
>
> Does the current MT say that a "subject is a URIref"? If so this seems to be a significant change rather than a clarification.

I believe the MT has always said this. Certainly that is my understanding of the basic graph syntax: triples consist of a subject, a property and an object, all of which can be urirefs. We are talking about the actual graph syntax here, right? Not what it denotes. So in this sense of 'subject' the subject of a sentence is a word, not what the word names. (Now, of course, urirefs are themselves resources, since everything is a resource....)

>> > Such URI references may have a fragment id.
>>
>> Sure, but what that *means* is not specified. It could well be meaningless. RDF syntax allows arbitrary urirefs to occur - it provides no constraints forbidding any URI combinations as illegal or ill-formed - but RDF provides no semantic guarantees that any such usage is meaningful. In particular, the one you provide seems nonsensical to me:
>
> Precisely my point. Nowhere in any RDF specification have I read anything to suggest that a URI reference has any _meaning_ other than what can be determined by the RDF statements made about the referenced resource.
> That is to say, there is nothing to suggest that one can determine any meaning from the syntactic structure of the URI ref. The example that I provide is supposed to be "nonsensical" _only_ if you presume to interpret what the URI ref 'means' based on its syntax. I am suggesting that RDF treat URI references as opaque identifiers, and that it ought not be possible to derive meaning by parsing the structure of the URI ref.
>
> To the WG: does RDF mean to say otherwise?

Good question. I will respond for myself, not in the name of the WG.

Answer: Yes and no.

Yes, as far as RDF semantics is concerned, urirefs are opaque identifiers, and their internal structure is of no consequence as far as their referential semantics is concerned. All that matters to the MT is identity of the uriref, so that two urirefs in two distinct documents can be compared for syntactic equality. RDF assumes only that they are the same name, and have the same denotation wherever they occur.

However, that identity test means that RDF needs to be able to discover coincidence between a uriref used in one document, consisting of an absolute URL plus a fragId, and the uriref consisting of that fragId used in the RDF document which is retrievable by conventional web transfer protocols using the absolute URL. So to the extent that RDF inference depends on this ability to cross-identify urirefs in various documents, the answer is No.

Notice that this is not a contradiction, but it is an equivocation upon 'meaning'. As far as RDF *meaning* is concerned, urirefs are opaque. But as far as what might be called the RDF global *syntax* is concerned, they are not opaque. RDF (and all web ontology languages) depend on a global agreement about the ability to recognize identity of *symbols* across documents, and that in turn - although simply considered a 'primitive' feature of the syntax and hence of the model theory - depends on the internal structure of urirefs being treated in a certain coherent way.
For example, if A contains

<http://example.org/Unicorn#Bottock> rdf:type foo:Bar .

and the document at the URL <http://example.org/Unicorn> contains

<Bottock> rdf:type Bra .

then I would want A to be able to infer that http://example.org/Unicorn#Bra and foo:Bar had a nonempty intersection. And although this is not specified formally, I would expect to be able to use the absolute URL as a likely place to locate RDF assertions which use the uriref. However, the rest of the WG might shoot me down on that.

>> > e.g.
>> >
>> > <http://example.org/Unicorn#Bottock> rdf:type foo:Bar
>> > <http://example.org/Unicorn> rdf:type foo:Unicorn
>> >
>> > does not imply any relationship between foo:Bar and foo:Unicorn
>>
>> Agreed; precisely my point. But the reason why it does not, is that there is no implied relationship between those two urirefs, either, other than that the *very use* of the first one implicitly assumes that the absolute URI is a URL of a document which contains some RDF using the fragID 'Buttock' as a name.
>
> According to the current RDF rec this is not true, there is no assumption that a URIref used by an RDF application 'point to' anything in an RDF document, explicitly:
>
> [[
> Resources
>
> All things being described by RDF expressions are called resources. A resource may be an entire Web page; such as the HTML document "http://www.w3.org/Overview.html" for example. A resource may be a part of a Web page; e.g. a specific HTML or XML element within the document source. A resource may also be a whole collection of pages; e.g. an entire Web site. A resource may also be an object that is not directly accessible via the Web; e.g. a printed book. Resources are always named by URIs plus optional anchor ids (see [URI]). Anything can have a URI; the extensibility of URIs allows the introduction of identifiers for any entity imaginable.
> ]]
>
> Note in particular: "A resource might be part of a Web page e.g. a specific HTML or XML element ..."
> This seems to indicate that a URIref _when used by RDF_ is NOT intended to point to ONLY RDF documents.

We have to distinguish here between two senses of 'point to'. The quoted passage is talking about the sense 'mean' or 'refer to' (AKA 'denote'), which is the RDF semantic notion of naming. I was referring to the notion of 'point to' meaning 'indicate the source of (the name)'.

> Are URIrefs used in RDF statements assumed to point to locations in RDF documents? If so this is a big change.

The convention that I have been talking about is implicit in every use of RDF in every document on the web. Why else would one include things like this in RDF headers?

<RDF xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:s="http://www.w3.org/2000/01/rdf-schema#">

Those URLs don't *denote* anything in RDF, but it is sure important not to type them wrong.

>> If there is no such document, or no such use of that fragId, then RDF has no way to make sense of the first triple, and would probably generate a 409 error.
>
> This confuses me. Does an RDF application need to follow each URIref? What about non "http" URI schemes, e.g. "urn"s? Are non resolvable URI refs illegal in RDF?

No, sorry if I gave that impression. But those that are resolvable are often used in a way that presupposes that they are resolved 'properly'.

>> > The URI reference that identifies the subject of the first statement has a fragment identifier.
>> >
>> >> If http://example.org/Unicorn really means a unicorn, then it should never have a fragId attached to it in RDF.
>> >
>> > Really! This is exactly Aaron's argument.
>>
>> ?? It is? Then I REALLY have not understood what Aaron is saying.
>>
>> > A unicorn is an example of what some people call an "abstract resource".
>>
>> A unicorn is, sure. But the URI is a name, not what is named. Nobody is talking about adding a fragId to a unicorn, right?
>
> Right. Hmm.
> Perhaps you are using the term "mean" in a technical sense and I am using it in an English sense. The URIref http://example.org/Unicorn doesn't 'mean' Unicorn, but the URIref may be used to name the concept "Unicorn". When dereferencing the URI a document entity of type text/plain may be returned reading: "Unicorns are mythical creatures ..."

If that ever happens on the semantic web, it ought to generate an error. Plain text is meaningless to software.

>> > No this is the whole point. If one RDF treats URI references as opaque identifiers, then one can make any statement about any URI reference.
>>
>> What does 'can' mean? RDF syntax does not forbid it, sure. However, it does make some implicit assumptions about how to interpret it, which are really part of the syntax of RDF, though implicitly so: they are incorporated into the very notion of 'merging' two RDF graphs. Those assumptions were sketched above.
>
> Well I guess what is important is that such assumptions may not be reasonable. Because my reading of the current RDF REC says that I can make statements about parts of XML or HTML documents. I interpret this to mean that the URIref http://example.org/Unicorn#LeftButtock either may not resolve at all, else may resolve to a piece of HTML

I agree this is an ambiguity which we have not resolved or even discussed properly (since I've been on the WG, maybe they did earlier). Of course it MAY resolve to a piece of HTML, and indeed that would not make it unusable in RDF as a name; but it would not automatically make it into the RDF name of that piece of HTML. We could adopt this as a convention, I guess, but then we would have serious problems with use/mention ambiguities.

[Later. It occurs to me that there is a quick-and-dirty way around the use/mention problem that might actually be just what we need.
A URL-plus-fragID uriref is assumed to *denote* the relevant part of the document (where the fragID is interpreted according to the mime type), except when that part of the document consists of RDF, in which case it is interpreted as being the same identifier as that identified by the fragId in the document. In other words, RDF *uses* all the RDF it can find, but it treats all other fragIds as *names of* parts of documents. The only thing this can't do is refer to RDF in RDF, but that's what we have reification for, right? Highly unofficial proposal, needless to say.]

> <div id="LeftButtock">
> <p>This is a description of the Left Buttock of the mythical Unicorn
> </div>
>
> (note use of non-well formed i.e. SGML based HTML)
>
>> Now of course one might want to say something in RDF about a document with a URL, and it allows one to do that. But that use of an absolute URI as an RDF name is a very special use.
>
> Why is that a special case? Where does it say that? I assert it is not a special case.

It's special because in all the examples I've seen, such use has been taken to mean that the *document* is the thing named by the URI. I agree this is not stated anywhere, but it seems to be universally understood.

>> > This is the whole argument. Should RDF treat URI references as opaque or not? Should all URIs that use the "http" scheme identify _documents_ or might not the URI http://example.org/Unicorn identify a Unicorn..
>>
>> I would say that if someone wants to try to use it in that way, then nothing should prevent them from doing so, but they should be ready to take the consequences of doing something that makes such fragile semantic sense. Probably what they write will have ludicrous consequences.
>
> I dearly hope that RDF is not designed to make such usage ludicrous, otherwise we may have huge problems for RDF's usability. At the very least this would be a large architectural hole.
Well, as I understand it, it would amount to saying that a unicorn had an http URL. (After all, that URI *is* a URL, right?) And that is ludicrous, right?

>> > For example, does your model theory contain anything pertaining to the syntactic substructure of a URI reference? scheme, authority, hierarchical part, fragment id? I don't see it.
>>
>> No, it does not, because the WG consciously decided to avoid going into that territory. It would have been fun to try it, but it was outside our charter. But an adequate semantics for a web language should address such issues, eventually.
>
> Well that is the issue. I will argue strongly that OWL be able to make statements about parts of arbitrary XML and HTML documents.

I agree that would be great. Also parts of images, sound files, parts of all kinds of things. But hold on a second. You want it to be able to REFER TO parts of documents. OK, fine: but what I was talking about earlier was a global convention that allows RDF/DAML/OWL to USE names which are USED in other OWL documents. I wasn't talking about *reference to* the documents at all, which is another issue altogether. As far as I know, RDF has no official means for referring to documents (though absolute URLs are often interpreted that way) let alone parts of documents. We seem to have a use/mention disconnect here.

BTW, I would predict that most of OWL isn't going to be ABOUT documents, but it's all going to be WRITTEN IN documents.

>> >> But the referring thing here is the whole uriref, not the absolute URI. That doesn't refer to anything but the document. The relationship between http://example.org/Unicorn and http://example.org/Unicorn#LeftButtock is not one of resource to subresource;
>> >
>> > Read the internet draft carefully. There is no _relationship_ defined between _resource_ and _subresource_. A document does contain fragments.
>One >> >might consider a sub resource to be contained by a resource but one can >> >make entirely independent assertions about a resource and any of the >> >subresources that it supposedly contains. >> >> Ive read this several times and it still seems incoherent to me, I >> think because it applies 'sub' to 'resource' rather than 'network >> entity'. > >Suppose I change the term "subresource" to "node", does that make more >sense? Maybe. It was the 'sub' that was puzzling me. > >> >What is returned is not a resource >> >but, _by definition_, a network entity. >> >> Why is a network entity not a resource? Surely *anything* can be a >resource. > >True, but the network entity returned by an HTTP GET on a URI _is not the >same as the resource identified by the URI_. > >This needs to be totally clear. Agreed in principle, though in many cases they might well be the same. Certainly that would seem to be a useful and harmless convention: how else is one supposed to refer to a web document, other than by using its URL? I agree this isn't formally stated anywhere in the RDF specs, but its often assumed, eg in the 'Ora said' examples in the original M&S. BUt now you have me puzzled, by the way. You seem to be *wanting* to use urirefs to identify parts of web documents, yet you are insistent that they do not refer to them. (Or is your point that RDF doesnt provide a way to re Just as a general point, RDF is a very 'weak' language in a strict logical sense, but it can be used in the context of what might be called extra-logical assumptions which if mutually understood by all users of the RDF, can impose a much more precise 'meaning'. The use of fragIds to refer to parts of documents might be one such convention, and datatyping conventions are another. >A URIref which _identifies_ a network resource would use the "data" scheme: >e.g. > >data:text/plain,A "Unicorn" is a mythical creature ... I fail to follow this. 
How does plain text identify, say, my CV, or the front page of the NYT for 13 October 1989?

>> > So yes the _document fragment_ obtained by _resolving_ http://www.w3.org/1999/02/22-rdf-syntax-ns#Class is a piece of XML. And the _document fragment_ is indeed contained in the document (entity).
>>
>> We seem to agree.
>
> finally ...
>
>> ...So in your example, the document fragment obtained by resolving http://example.org/Unicorn#LeftButtock had better be a piece of XML (well, RDF in any case).
>
> again, no it could be (non XML) HTML for example.

I meant, if it is not a piece of RDF, then an RDF inference engine might get very confused trying to figure out where the identifiers are in it. There is certainly no official RDF assumption that the intermediate hash is in any way concerned with *referring to* a part of a document.

>> In other words, http://example.org/Unicorn had better be the URL of a document. The RDF semantics might *interpret* it as anything at all, but that's completely irrelevant to its role in making connections across the semantic web; and it is only the latter role that is relevant to how fragIds are treated by an RDF engine.
>
> This is exactly why "rdf:type" is a special kind of property, because the resource that an rdf:type points to really does need to be RDF (perhaps),

?? I fail to follow this. As far as I can see, rdf:type is on a par with the rest of the RDF vocabulary and is not particularly special.

> but otherwise, an RDF 'engine' whatever that may be, generally won't even try to dereference a URI so this should be a non-issue. Correct?

Well, a DAML or OWL engine certainly will, since URIs are used to import one ontology into another. Even in RDF, engines like CWM and Euler often assume that some absolute URIs identify pieces of well-formed RDF, and act on that assumption, though this is not 'official'.
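
The cross-document convention discussed in this message (a uriref is compared as an opaque string, yet its absolute part is used as a likely place to find the RDF that uses the fragId) might be sketched roughly as follows. This is only an illustration in Python using the standard `urllib.parse` module; the helper names `split_uriref`, `resolve_local_name` and `same_symbol` are hypothetical and do not come from any RDF specification or engine, and treating a local name as a fragId of the enclosing document is the unofficial assumption described above, not a rule of RDF.

```python
from urllib.parse import urldefrag, urljoin


def split_uriref(uriref):
    """Split a uriref into (absolute URI, fragId).

    The absolute part is where an engine might look for RDF that
    uses the name; RDF semantics itself never requires this lookup.
    """
    absolute, frag = urldefrag(uriref)
    return absolute, frag


def resolve_local_name(document_url, local_name):
    """Build the full uriref for a fragId used inside a document.

    Assumption: local_name is a bare fragId (no leading '#') defined
    in the document retrievable at document_url.
    """
    return urljoin(document_url, "#" + local_name)


def same_symbol(uriref_a, uriref_b):
    """Urirefs are opaque names: identity is plain string equality."""
    return uriref_a == uriref_b


# The uriref used in document A, and the name 'Bottock' as used in
# the document at http://example.org/Unicorn.
used_in_a = "http://example.org/Unicorn#Bottock"
defined_in_doc = resolve_local_name("http://example.org/Unicorn", "Bottock")
```

Under these assumptions, `same_symbol(used_in_a, defined_in_doc)` holds, which is the coincidence an engine would need to detect before merging the two graphs. Nothing in the sketch inspects the structure of the uriref to decide what it *denotes*.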
>> > It is very common to conflate a resource and the entity that represents it at any point in time. But whether you agree or not, this is how the language is defined. It is not possible to understand anything about "REST" until this distinction is understood at least from a terminological point of view.
>>
>> I think we are in violent agreement here.
>
> Yes and perhaps this is why RDF needs to very precisely define what a "Resource" is,

I think we do. Anything and everything is a resource. "Resource" simply means "entity", i.e. anything that the human mind can imagine or give a name to, and maybe some other things as well.

> to the point, perhaps, of stating that there is (?) no relationship between the RFC 2396 resource identified by a URI, and the RDF resource identified by the URI. RDF can then define what it means by a fragment identifier etc.
>
> The thorny issue, however, gets back to the fact that RDF needs to be able to make assertions about Web pages and parts of Web pages e.g. arbitrary XML and HTML documents. So try as you like you probably are stuck with RFC 2396 resources,

??? But that explicitly says that resources are NOT just things like web pages, but include off-web entities like books and people.

Pat

--
---------------------------------------------------------------------
IHMC                                    (850)434 8903   home
40 South Alcaniz St.                    (850)202 4416   office
Pensacola, FL 32501                     (850)202 4440   fax
phayes@ai.uwf.edu
http://www.coginst.uwf.edu/~phayes
Received on Sunday, 24 February 2002 22:44:43 UTC