Re: [URI vs. URIViews] draft-frags-borden-00.txt from Jonathan Borden on 2002-02-25 (www-rdf-comments@w3.org from January to March 2002)

From: Jonathan Borden <jonathan@openhealth.org>
Date: Sun, 24 Feb 2002 21:00:04 -0500
To: "Pat Hayes" <phayes@ai.uwf.edu>
Cc: <www-rdf-comments@w3.org>
Message-ID: <00a101c1bda0$29778400$0301a8c0@ne.mediaone.net>
With apologies to Brian: either Pat or I am deeply confused about some
fundamental issues central to RDF. This likely points to a problem in the
specification that desperately needs to be clarified.

In any case my responses to this round include specific points that I would
like clarified by the RDF WG:

Pat Hayes wrote:
> >
> >Careful, RDF uses frags in two ways:
> >
> >1) as you say
> >2) any subject, predicate or object of any statement may be identified by a
> >URI reference.
>
> May BE a uriref, actually; but OK.

In the current RDF REC, Section 5 says "sub is a resource ...", which
indicates to me that the _subject_ is a _resource_, not a URI reference;
hence my specific language.

Does the current MT say that a "subject is a URIref" ? If so this seems to
be a significant change rather than a clarification.

>
> >Such URI references may have a fragment id.
>
> Sure, but what that *means* is not specified. It could well be
> meaningless. RDF syntax allows arbitrary urirefs to occur - it
> provides no constraints forbidding any URI combinations as illegal or
> ill-formed -  but RDF provides no semantic guarantees that any such
> usage is meaningful. In particular, the one you provide seems
> nonsensical to me:

Precisely my point. Nowhere in any RDF specification have I read anything to
suggest that a URI reference has any _meaning_ other than what can be
determined by the RDF statements made about the referenced resource. That is
to say, there is nothing to suggest that one can determine any meaning from
the syntactic structure of the URI ref. The example that I provide is
supposed to be "nonsensical" _only_ if you presume to interpret what the URI
ref 'means' based on its syntax. I am suggesting that RDF treat URI
references as opaque identifiers, and that it ought not be possible to
derive meaning by parsing the structure of the URI ref.

To the WG: does RDF mean to say otherwise?
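
A minimal sketch of the opaque-identifier reading described above, in Python.
The triple store, the `types_of` helper and the `foo:` names are hypothetical
illustrations and are not taken from any RDF specification. The point is only
that lookups use whole-string equality, so the shared absolute URI that a URI
parser can see plays no part in what the store asserts.

```python
from urllib.parse import urldefrag

# Hypothetical in-memory triple store: every term is just a string.
triples = {
    ("http://example.org/Unicorn#Buttock", "rdf:type", "foo:Bar"),
    ("http://example.org/Unicorn", "rdf:type", "foo:Unicorn"),
}

def types_of(subject):
    """Return the rdf:type objects asserted for exactly this URIref string."""
    return {o for (s, p, o) in triples if s == subject and p == "rdf:type"}

# Opaque reading: only whole-string equality matters.
unicorn_types = types_of("http://example.org/Unicorn")
buttock_types = types_of("http://example.org/Unicorn#Buttock")

# A URI parser can see that the two strings share an absolute URI ...
base, frag = urldefrag("http://example.org/Unicorn#Buttock")
# ... but nothing in the store above makes use of that fact.
```

Under this reading, any relationship between the two subjects would have to
be stated as a further triple; it is not derived from the syntax.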

>
> >e.g.
> >
> ><http://example.org/Unicorn#Buttock> rdf:type foo:Bar
> ><http://example.org/Unicorn> rdf:type foo:Unicorn
> >
> >does not imply any relationship between foo:Bar and foo:Unicorn
>
> Agreed; precisely my point. But the reason why it does not, is that
> there is no implied relationship between those two urirefs, either,
> other than that the *very use* of the first one implicitly assumes
> that the absolute URI is a URL of a document which contains some RDF
> using the fragID 'Buttock' as a name.

According to the current RDF rec this is not true, there is no assumption
that a URIref used by an RDF application 'point to' anything in an RDF
document, explicitly:

[[
Resources

All things being described by RDF expressions are called resources. A
resource may be an entire Web page; such as the HTML document
"http://www.w3.org/Overview.html" for example. A resource may be a part of a
Web page; e.g. a specific HTML or XML element within the document source. A
resource may also be a whole collection of pages; e.g. an entire Web site. A
resource may also be an object that is not directly accessible via the Web;
e.g. a printed book. Resources are always named by URIs plus optional anchor
ids (see [URI]). Anything can have a URI; the extensibility of URIs allows
the introduction of identifiers for any entity imaginable.
]]

Note in particular: "A resource may be a part of a Web page; e.g. a specific
HTML or XML element ..." This seems to indicate that a URIref, _when used by
RDF_, is NOT intended to point ONLY to RDF documents.

Are URIrefs used in RDF statements assumed to point to locations in RDF
documents? If so this is a big change.

> If there is no such document,
> or no such use of that fragId, then RDF has no way to make sense of
> the first triple, and would probably generate a 409 error.

This confuses me. Does an RDF application need to follow each URIref? What
about non-"http" URI schemes, e.g. "urn"s? Are non-resolvable URI refs
illegal in RDF?
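
To make the question concrete, here is a small Python sketch, assuming the
reading in which a URIref can serve as a name whether or not it can be
fetched. `FETCHABLE_SCHEMES`, `describe`, `foo:Thing` and the
`urn:example:unicorn` name are all hypothetical; no RDF specification defines
such a check.

```python
from urllib.parse import urlsplit

# Schemes this hypothetical application knows how to fetch (an assumption).
FETCHABLE_SCHEMES = {"http", "https"}

def describe(uriref):
    """Split a URIref syntactically; says nothing about what it identifies."""
    parts = urlsplit(uriref)
    return {
        "scheme": parts.scheme,
        "fetchable": parts.scheme in FETCHABLE_SCHEMES,
        "fragment": parts.fragment,
    }

names = ["http://example.org/Unicorn#LeftButtock", "urn:example:unicorn"]

# Both strings work equally well as subjects; the store never fetches them.
store = {(name, "rdf:type", "foo:Thing") for name in names}

http_info = describe(names[0])
urn_info = describe(names[1])
```

In this sketch the "urn" name is stored and compared exactly like the "http"
one; only an application that chooses to dereference would treat them
differently.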

> >
> >The URI reference that identifies the subject of the first statement has a
> >fragment identifier.
> >
> >>  If http://example.org/Unicorn
> >>  really means a unicorn, then it should never have a fragId attached
> >>  to it in RDF.
> >
> >Really! This is exactly Aaron's argument.
>
> ?? It is? Then I REALLY have not understood what Aaron is saying.
>
> >A unicorn is an example of what
> >some people call an "abstract resource".
>
> A unicorn is, sure. But the URI is a name, not what is named. Nobody
> is talking about adding a fragId to a unicorn, right?

Right. Hmm. Perhaps you are using the term "mean" in a technical sense and I
am using it in an English sense. The URIref http://example.org/Unicorn
doesn't 'mean' Unicorn, but the URIref may be used to name the concept
"Unicorn". When dereferencing the URI a document entity of type text/plain
may be returned reading: "Unicorns are mythical creatures ..."

> >
> >No this is the whole point. If RDF treats URI references as opaque
> >identifiers, then one can make any statement about any URI reference.
>
> What does 'can' mean? RDF syntax does not forbid it, sure. However,
> it does make some implicit assumptions about how to interpret it,
> which are really part of the syntax of RDF, though implicitly so:
> they are incorporated into the very notion of 'merging' two RDF
> graphs. Those assumptions were sketched above.

Well I guess what is important is that such assumptions may not be
reasonable. Because my reading of the current RDF REC says that I can make
statements about parts of XML or HTML documents. I interpret this to mean
that the URIref http://example.org/Unicorn#LeftButtock either may not
resolve at all, or else may resolve to a piece of HTML:

<div id="LeftButtock">
    <p>This is a description of the Left Buttock of the mythical Unicorn
</div>

(note the use of non-well-formed, i.e. SGML-based, HTML)
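
As a rough illustration of what "resolving to a piece of HTML" could look
like, the Python sketch below picks out the element whose id matches the
fragment identifier, using the standard library's tolerant `html.parser`. The
`FragmentFinder` class and the surrounding sample document are hypothetical;
the unclosed <p> is kept deliberately.

```python
from html.parser import HTMLParser

class FragmentFinder(HTMLParser):
    """Collect the text inside the first element whose id matches frag_id."""

    def __init__(self, frag_id):
        super().__init__()
        self.frag_id = frag_id
        self.target_tag = None  # tag name of the matched element
        self.depth = 0          # > 0 while inside the matched element
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self.target_tag is None and dict(attrs).get("id") == self.frag_id:
            self.target_tag, self.depth = tag, 1
        elif self.depth and tag == self.target_tag:
            self.depth += 1     # nested element with the same tag name

    def handle_endtag(self, tag):
        if self.depth and tag == self.target_tag:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)

doc = """<html><body>
<div id="LeftButtock">
    <p>This is a description of the Left Buttock of the mythical Unicorn
</div>
<div id="Other"><p>Something else</div>
</body></html>"""

finder = FragmentFinder("LeftButtock")
finder.feed(doc)
finder.close()
# Normalise whitespace in the collected text.
fragment_text = " ".join("".join(finder.text).split())
```

Nothing here requires the document to be XML, let alone RDF; the fragment is
located purely by the HTML id attribute.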

> Now of course one
> might want to say something in RDF about a document with a URL, and
> it allows one to do that. But that use of an absolute URI as an RDF
> name is a very special use.

Why is that a special case? Where does it say that? I assert it is not a
special case.

>
> >This
> >is the whole argument. Should RDF treat URI references as opaque or not?
> >Should all URIs that use the "http" scheme identify _documents_ or might not
> >the URI http://example.org/Unicorn identify a Unicorn?
>
> I would say that if someone wants to try to use it in that way, then
> nothing should prevent them from doing so, but they should be ready
> to take the consequences of doing something that makes such fragile
> semantic sense. Probably what they write will have ludicrous
> consequences.

I dearly hope that RDF is not designed to make such usage ludicrous,
otherwise we may have huge problems for RDF's usability. At the very least
this would be a large architectural hole.

>
> >For example, does your model theory contain anything pertaining to the
> >syntactic substructure of a URI reference? scheme, authority, hierarchical
> >part, fragment id? I don't see it.
>
> No, it does not, because the WG consciously decided to avoid going
> into that territory. It would have been fun to try it, but it was
> outside our charter. But an adequate semantics for a web language
> should address such issues, eventually.

Well that is the issue. I will argue strongly that OWL be able to make
statements about parts of arbitrary XML and HTML documents.

>
> >>  But the referring
> >>  thing here is the whole uriref, not the absolute URI. That doesn't
> >>  refer to anything but the document. The relationship between
> >>  http://example.org/Unicorn and http://example.org/Unicorn#LeftButtock
> >>  is not one of resource to subresource;
> >
> >Read the internet draft carefully. There is no _relationship_ defined
> >between _resource_ and _subresource_. A document does contain fragments. One
> >might consider a sub resource to be contained by a resource but one can
> >make entirely independent assertions about a resource and any of the
> >subresources that it supposedly contains.
>
> I've read this several times and it still seems incoherent to me, I
> think because it applies 'sub' to 'resource' rather than 'network
> entity'.

Suppose I change the term "subresource" to "node", does that make more
sense?

> >What is returned is not a resource
> >but, _by definition_, a network entity.
>
> Why is a network entity not a resource? Surely *anything* can be a resource.

True, but the network entity returned by an HTTP GET on a URI _is not the
same as the resource identified by the URI_.

This needs to be totally clear.

A URIref which _identifies_ a network resource would use the "data" scheme:
e.g.

data:text/plain,A "Unicorn" is a mythical creature ...
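
The same distinction can be sketched in Python, assuming the standard
library's built-in handling of "data" URLs in `urllib.request`. The variable
names are illustrative only, and the text is percent-encoded here so that the
URI contains no raw spaces or quotes. With a "data" URI, what comes back from
dereferencing is exactly the entity the URI itself carries.

```python
from urllib.parse import quote
from urllib.request import urlopen

text = 'A "Unicorn" is a mythical creature'

# Build a data: URI that embeds the entity itself (percent-encoded).
data_uri = "data:text/plain," + quote(text)

# Dereferencing needs no network: the entity is decoded from the URI.
with urlopen(data_uri) as response:
    entity_bytes = response.read()
    media_type = response.headers.get_content_type()

entity_text = entity_bytes.decode("ascii")
```

By contrast, a GET on http://example.org/Unicorn could return different
entities at different times, which is why the entity and the identified
resource are kept apart.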

>
> >So yes the _document fragment_ obtained by _resolving_
> >http://www.w3.org/1999/02/22-rdf-syntax-ns#Class is a piece of XML. And the
> >_document fragment_ is indeed contained in the document (entity).
>
> We seem to agree.

finally ...

> ...So in your example, the document fragment obtained
> by resolving http://example.org/Unicorn#LeftButtock had better be a
> piece of XML (well, RDF in any case).

Again, no: it could be (non-XML) HTML, for example.

> In other words,
> http://example.org/Unicorn had better be the URL of a document.  The
> RDF semantics might *interpret* it as anything at all, but that's
> completely irrelevant to its role in making connections across the
> semantic web; and it is only the latter role that is relevant to how
> fragIds are treated by an RDF engine.

This is exactly why "rdf:type" is a special kind of property, because the
resource that an rdf:type points to really does need to be RDF (perhaps),
but otherwise, an RDF 'engine' whatever that may be, generally won't even
try to dereference a URI so this should be a non-issue. Correct?

>
> >It is very common to conflate a resource and the entity that represents it
> >at any point in time. But whether you agree or not, this is how the language
> >is defined. It is not possible to understand anything about "REST" until
> >this distinction is understood at least from a terminological point of view.
>
> I think we are in violent agreement here.
>

Yes and perhaps this is why RDF needs to very precisely define what a
"Resource" is, to the point, perhaps, of stating that there is (?) no
relationship between the RFC 2396 resource identified by a URI, and the RDF
resource identified by the URI. RDF can then define what it means by a
fragment identifier etc.

The thorny issue, however, gets back to the fact that RDF needs to be able
to make assertions about Web pages and parts of Web pages e.g. arbitrary XML
and HTML documents. So try as you like you probably are stuck with RFC 2396
resources, else devise a formalism for specifying parts of XML and HTML
documents ... oh that would be DTDs and schemas.

Jonathan
Received on Sunday, 24 February 2002 20:24:58 UTC