Re: [URI vs. URIViews] draft-frags-borden-00.txt from Brian McBride on 2002-02-26 (www-rdf-comments@w3.org from January to March 2002)

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: Tue, 26 Feb 2002 06:23:21 +0000
To: Pat Hayes <phayes@ai.uwf.edu>, "Jonathan Borden" <jonathan@openhealth.org>
Cc: <www-rdf-comments@w3.org>
Message-Id: <5.1.0.14.0.20020226052837.06fa5578@0-mail-1.hpl.hp.com>
At 21:44 24/02/2002 -0600, Pat Hayes wrote:
>>With apologies to Brian: either Pat or I are deeply confused about some
>>fundamental issues central to RDF. What this means is likely a problem in
>>specification that desperately needs to be clarified.
>>
>>In any case my responses to this round include specific points that I would
>>like clarified by the RDF WG:
>
>OK, I'm CCing this as before. Sorry, Brian.

S'ok



>>Pat Hayes wrote:
>>>  >
>>>  >Careful, RDF uses frags in two ways:
>>>  >
>>>  >1) as you say
>>>  >2) any subject,predicate or object of any statement may be identified by
>>a
>>>  >URI reference.
>>>
>>>  May BE a uriref, actually; but OK.
>>
>>In the current RDF REC, Section 5 says "sub is a resource ...", indicates to
>>me that the _subject_ is a _resource_ not a URIreference, hence my specific
>>language.
>>
>>Does the current MT say that a "subject is a URIref" ? If so this seems to
>>be a significant change rather than a clarification.
>
>I believe the MT has always said this. Certainly that is my understanding 
>of the basic graph syntax: triples consist of a subject, a property and an 
>object, all of which can be urirefs. We are talking about the actual graph 
>syntax here, right? Not what it denotes. So in this sense of 'subject' the 
>subject of a sentence is a word, not what the word names.

OK, so my take is that the model theory is clarifying some use/mention 
distinctions that were unclear in the original M&S.  There is a reading 
of M&S (look at the glossary) which makes a triple a syntactic entity and a 
statement the abstraction it denotes.  But let's not go there; M&S is not 
clear and the new model theory makes this distinction clearer.

As I see it though, the job is only half done.  We need a model theory, 
defined at TAG level, to explain how resources and HTTP work.  I think the 
issues you guys have been discussing are to do with that model.  I've been 
reading a paper by McCarthy in the red book where he talks about concepts 
as first class objects in FOL, which struck me as an excellent fit for this 
idea that a resource is itself an abstraction (concept) which also has a 
denotation.  Pat probably winced - but a resource is in the domain of 
discourse - it's not syntactic.  One could approach things two ways:

   - two domains of discourse and one man's syntax is the other man's domain
   - one domain of discourse and an HTTP GET is a function from name cross 
setof mimetype to seq of bytes

Enough of my ramblings; I'm just trying to tease the experts into doing 
a proper theory of the basis of the web.

So arcs in the RDF graph are syntactic thingies with a sharp end, a blunt 
end and a line linking them and some labels.  The label at the blunt end is 
a string; in fact it's a URI.  A use/mention problem in M&S is that its formal 
model confused the syntactic representation of a statement with its 
denotation.  A statement turned out to be used as both in different 
places.  Someone was aware of the distinction because the glossary defines 
triple as a syntactic representation of a statement, if I recall 
correctly.  Now, the arc in a graph is a syntactic representation.  Its 
denotation is a reified statement, though the model theory doesn't actually 
say that.  The concept of statement itself is downplayed.  The reified 
statement still takes a resource, not a name of a resource, as its 
subject.  We have ducked putting anything into the model theory that makes 
a link between the arc in the graph and a reified statement, largely, I 
think, because we think this approach to reification is broken.  So whilst 
preserving it for those who might currently use it, the WG is not 
emphasizing it.
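
To make the use/mention split above concrete, here is a minimal sketch. It is not taken from the model theory; the class name `Arc`, the helper `denote`, the `interpretation` table and the `http://example.org/foo#Bar` URI are all hypothetical, chosen only for illustration. The arc holds three strings (names); an interpretation maps those names to whatever they denote.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Arc:
    """A purely syntactic arc: three labels, each a uriref string."""
    subject: str    # label at the blunt end
    predicate: str  # label on the line
    obj: str        # label at the sharp end


RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical interpretation: names (urirefs) -> things in the domain of
# discourse. Plain object() instances stand in for the resources.
interpretation = {
    "http://example.org/Unicorn#Bottock": object(),
    RDF_TYPE: object(),
    "http://example.org/foo#Bar": object(),
}


def denote(arc, interp):
    """Return the resources named by the arc's three labels."""
    return (interp[arc.subject], interp[arc.predicate], interp[arc.obj])


arc = Arc("http://example.org/Unicorn#Bottock", RDF_TYPE,
          "http://example.org/foo#Bar")
subject_resource, _, _ = denote(arc, interpretation)
```

Here `arc.subject` is a string (a name), while `subject_resource` is the thing the name stands for; two arcs are equal exactly when their labels are equal strings.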

That seems to hang together; but it's really early in the morning and I 
haven't had much sleep.


>(Now, of course, urirefs are themselves resources, since everything is a 
>resource....)
>
>>
>>>
>>>  >Such URI references may have a fragment id.
>>>
>>>  Sure, but what that *means* is not specified. It could well be
>>>  meaningless. RDF syntax allows arbitrary urirefs to occur - it
>>>  provides no constraints forbidding any URI combinations as illegal or
>>>  ill-formed -  but RDF provides no semantic guarantees that any such
>>>  usage is meaningful. In particular, the one you provide seems
>>>  nonsensical to me:
>>
>>Precisely my point. Nowhere in any RDF specification have I read anything to
>>suggest that a URI reference has any _meaning_ other than what can be
>>determined by the RDF statements made about the referenced resource. That is
>>to say, there is nothing to suggest that one can determine any meaning from
>>the syntactic structure of the URI ref. The example that I provide is
>>supposed to be "nonsensical" _only_ if you presume to interpret what the URI
>>ref 'means' based on its syntax. I am suggesting that RDF treat URI
>>references as opaque identifiers, and that it ought not be possible to
>>derive meaning by parsing the structure of the URI ref.
>>
>>To the WG: does RDF mean to say otherwise?

Hmmm, nice one.  RDF operates in the context of the web where there is a 
function GET (URI, setof mimetypes) to byteseq.  So far we have no formal 
connection between these, but maybe having one would be helpful.
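
As a rough illustration of what "a function GET (URI, setof mimetypes) to byteseq" could look like if written down, here is a toy sketch. The in-memory table `_WEB` and the function name `get` are assumptions made up for this example; no network access is involved and nothing here is a formal model.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical in-memory "web": (URI, MIME type) -> representation bytes.
_WEB: Dict[Tuple[str, str], bytes] = {
    ("http://example.org/Unicorn", "text/plain"):
        b"Unicorns are mythical creatures ...",
}


def get(uri: str, accept: FrozenSet[str]) -> bytes:
    """Toy GET: return a representation of `uri` in one of the accepted types."""
    for mimetype in sorted(accept):  # sorted only to make the choice deterministic
        body = _WEB.get((uri, mimetype))
        if body is not None:
            return body
    raise LookupError(f"no acceptable representation for {uri}")


body = get("http://example.org/Unicorn", frozenset({"text/plain"}))
```

Note that the function's first argument is a URI without a fragment id, and that its result is a sequence of bytes, not the resource itself.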


>Good question. I will respond for myself, not in the name of the WG.
>
>Answer: Yes and no.
>
>Yes, as far as RDF semantics is concerned, urirefs are opaque identifiers, 
>and their internal structure is of no consequence as far as their 
>referential semantics is concerned. All that matters to the MT is identity 
>of the uriref, so that two urirefs in two distinct documents can be 
>compared for syntactic equality. RDF assumes only that they are the same 
>name, and have the same denotation wherever they occur.
>
>However, that identity test means that RDF needs to be able to discover 
>coincidence between a uriref used in one document, consisting of an 
>absolute URL plus a fragId, and the uriref consisting of that fragId used 
>in the RDF document which is retrievable by conventional web transfer 
>protocols using the absolute URL. So to the extent that RDF inference 
>depends on this ability to cross-identify urirefs in various documents, 
>the answer is No.

Hmmm, I think of that as a feature of the RDF/XML transfer syntax, not of 
RDF per se.  What comes out of the parser is absolute URIs with optional frag 
ids - that's what's in the graph.  If the model theory is the essence of 
RDF, it operates on the graph and isn't bothered by this.  The graph 
contains only absolute URIs with optional frag ids - right?
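
A small sketch of that parser step, using Python's standard `urllib.parse.urljoin` as a stand-in for whatever resolution an RDF/XML parser performs (that equivalence is an assumption here, not something the specs are quoted as saying). The base URL and fragment come from Pat's example; the variable names are made up.

```python
from urllib.parse import urljoin

# The document's own URL acts as the base for relative references.
base = "http://example.org/Unicorn"

# A same-document reference to the fragment, as it might appear in the file.
relative = "#Bottock"

# After resolution only the absolute form remains.
absolute = urljoin(base, relative)

# Cross-document identity is then plain string equality on the absolute form.
same_name = absolute == "http://example.org/Unicorn#Bottock"
```

Once resolution has happened, the comparison never needs to look inside the uriref again.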


>Notice that this is not a contradiction, but it is an equivocation upon 
>'meaning'. As far as RDF *meaning* is concerned, urirefs are opaque. But 
>as far as what might be called the RDF global *syntax* is concerned, they 
>are not opaque. RDF (and all web ontology languages) depend on a global 
>agreement about the ability to recognize identity of *symbols* across 
>documents, and that in turn - although simply considered a 'primitive' 
>feature of the syntax and hence of the model theory - depends on the 
>internal structure of urirefs being treated in a certain coherent way.
>
>For example, if A contains
>
><http://example.org/Unicorn#Bottock> rdf:type foo:Bar .
>
>and the document at the URL  <http://example.org/Unicorn>  contains
>
><Bottock> rdf:type Bra .
>
>then I would want A to be able to infer that 
>http://example.org/Unicorn#Bra and foo:Bar had a nonempty intersection.
>
>And although this is not specified formally, I would expect to be able to 
>use the absolute URL as a likely place to locate RDF assertions which use 
>the uriref. However, the rest of the WG might shoot me down on that.
>
>>
>>>
>>>  >e.g.
>>>  >
>>>  ><http://example.org/Unicorn#Bottock> rdf:type foo:Bar
>>>  ><http://example.org/Unicorn> rdf:type foo:Unicorn
>>>  >
>>>  >does not imply any relationship between foo:Bar and foo:Unicorn
>>>
>>>  Agreed; precisely my point. But the reason why it does not, is that
>>>  there is no implied relationship between those two urirefs, either,
>>>  other than that the *very use* of the first one implicitly assumes
>>>  that the absolute URI is a URL of a document which contains some RDF
>>>  using the fragID 'Buttock' as a name.
>>
>>According to the current RDF rec this is not true, there is no assumption
>>that a URIref used by an RDF application 'point to' anything in an RDF
>>document, explicitly:
>>
>>[[
>>Resources
>>
>>All things being described by RDF expressions are called resources. A
>>resource may be an entire Web page; such as the HTML document
>>"http://www.w3.org/Overview.html" for example. A resource may be a part of a
>>Web page; e.g. a specific HTML or XML element within the document source. A
>>resource may also be a whole collection of pages; e.g. an entire Web site. A
>>resource may also be an object that is not directly accessible via the Web;
>>e.g. a printed book. Resources are always named by URIs plus optional anchor
>>ids (see [URI]). Anything can have a URI; the extensibility of URIs allows
>>the introduction of identifiers for any entity imaginable.
>>]]
>>
>>Note in particular: "A resource might be part of a Web page e.g. a specific
>>HTML or XML element ..." This seems to indicate that a URIref _when used by
>>RDF_ is NOT intended to point to ONLY RDF documents.
>
>We have to distinguish here between two senses of 'point to'. The quoted 
>passage is talking about the sense 'mean' or 'refer to' (AKA 'denote'), 
>which is the RDF semantic notion of naming. I was referring to the notion 
>of 'point to' meaning 'indicate the source of (the name)'
>
>>Are URIrefs used in RDF statements assumed to point to locations in RDF
>>documents? If so this is a big change.
>
>The convention that I have been talking about is implicit in every use of 
>RDF in every document on the web. Why else would one include things like 
>this in RDF headers?
>
><RDF
>   xmlns="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>   xmlns:s="http://www.w3.org/2000/01/rdf-schema#">
>
>Those URL's don't *denote* anything in RDF, but it is sure important not 
>to type them wrong.

Sure they do; I can do a GET on them.  I can get a representation of what 
they denote.



>>>  If there is no such document,
>>>  or no such use of that fragId, then RDF has no way to make sense of
>>>  the first triple, and would probably generate a 409 error.
>>
>>This confuses me. Does an RDF application need to follow each URIref. What
>>about non "http" URI schemes, e.g. "urn"s. Are non resolvable URI refs
>>illegal in RDF?
>
>No, sorry if I gave that impression. But those that are resolvable are 
>often used in a way that presupposes that they are resolved 'properly'.
>
>>
>>>  >
>>>  >The URI reference that identifies the subject of the first statement has
>>a
>>>  >fragment identifier.
>>>  >
>>>  >>  If http://example.org/Unicorn
>>>  >>  really means a unicorn, then it should never have a fragId attached
>>>  >>  to it in RDF.
>>>  >
>>>  >Really! This is exactly Aaron's argument.
>>>
>>>  ?? It is? Then I REALLY have not understood what Aaron is saying.
>>>
>>>  >A unicorn is an example of what
>>>  >some people call an "abstract resource".
>>>
>>>  A unicorn is, sure. But the URI is a name, not what is named. Nobody
>>>  is talking about adding a fragId to a unicorn, right?
>>
>>Right. Hmm. Perhaps you are using the term "mean" in a technical sense and I
>>am using it in an English sense. The URIref http://example.org/Unicorn
>>doesn't 'mean' Unicorn, but the URIref may be used to name the concept
>>"Unicorn". When dereferencing the URI a document entity of type text/plain
>>may be returned reading: "Unicorns are mythical creatures ..."
>
>If that ever happens on the semantic web, it ought to generate an error. 
>Plain text is meaningless to software.
>
>>
>>>  >
>>>  >No this is the whole point. If RDF treats URI references as opaque
>>>  >identifiers, then one can make any statement about any URI reference.
>>>
>>>  What does 'can' mean? RDF syntax does not forbid it, sure. However,
>>>  it does make some implicit assumptions about how to interpret it,
>>>  which are really part of the syntax of RDF, though implicitly so:
>>>  they are incorporated into the very notion of 'merging' two RDF
>>>  graphs. Those assumptions were sketched above.
>>
>>Well I guess what is important is that such assumptions may not be
>>reasonable. Because my reading of the current RDF REC says that I can make
>>statements about parts of XML or HTML documents. I interpret this to mean
>>that the URIref http://example.org/Unicorn#LeftButtock either may not
>>resolve at all, else may resolve to a piece of HTML

Ok, I've been itching to say this; might as well do it here.

I think the concept of subresource introduced by Jonathan isn't quite right.

The constraint that we have to live with is that HTTP only sends URIs over 
the wire - it doesn't send the fragid.  That means that if we want to get a 
representation of http://example.org/Unicorn#LeftButtock we have to 
retrieve a representation of http://example.org/Unicorn and then extract the 
relevant part.  I suggest that strictly, the "partof" relationship is one 
between representations of resources, not necessarily between the resources 
themselves, i.e. whilst it might be kinda strange, the left buttock in this 
example could be a cow's left buttock, not a unicorn's.
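
The two-step retrieval can be sketched with Python's standard `urllib.parse.urldefrag`. The helper name `split_for_http` is hypothetical and the sketch stops short of any actual fetch; it only shows which part would go over the wire and which part stays with the client.

```python
from urllib.parse import urldefrag


def split_for_http(uriref):
    """Split a uriref into the URL sent in the request and the fragid kept locally."""
    url, fragment = urldefrag(uriref)
    return url, fragment


# Only `request_url` would be sent to the server; `fragid` is applied
# afterwards, by the client, to the representation that comes back.
request_url, fragid = split_for_http("http://example.org/Unicorn#LeftButtock")
```

Since the fragid is only ever applied to the retrieved representation, this step alone says nothing about how the two resources relate.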

Brian
Received on Tuesday, 26 February 2002 01:30:34 UTC