Another fragment issue

I have recently realized that RDF fragment issues are more confusing than I
previously thought. To add to the complication, there is the serious problem
that the RDF definition of resource is different from the established Web
definition.

Graham Klyne (first?) spotted this with his terminology definitions[1]. At
the time I didn't understand the issue, so when Danbri made it seem[2] as if
the problem were simply one of referring to things not available over the
Internet, I didn't worry about it anymore.

I now realize the problem was far more serious than I had ever imagined.
When a recent discussion on #rdfig[3] brought the issue to my attention once
again, I looked at the URI specification and tried to make sense of it. This
is what I discovered:

http://example.org/q.html?aaa is a URI. It is bound to a resource.

http://example.org/q.html?aaa#foo is a URI-reference. It refers to the URI
mentioned above, along with the fragment "foo" to aid applications in
viewing a representation of the resource. Thus, http://example.org/#foo and
http://example.org/#bar refer to the same resource based on the URI spec.
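As a rough illustration of the split described above, here is a small Python
sketch using the standard library's urllib.parse.urldefrag. The helper name
split_reference is my own, made up for this example; it only shows the
syntactic split of a URI-reference into a URI and a fragment, and it assumes
the reading of the URI spec given above.

```python
from urllib.parse import urldefrag


def split_reference(uri_reference):
    """Split a URI-reference into (URI, fragment).

    Hypothetical helper: a thin wrapper around urldefrag that returns a
    plain tuple.
    """
    uri, fragment = urldefrag(uri_reference)
    return uri, fragment


foo = split_reference("http://example.org/#foo")
bar = split_reference("http://example.org/#bar")

print(foo)
print(bar)
# Under the reading above, both references share one URI and so are bound
# to the same resource; only the fragment differs.
print(foo[0] == bar[0])
```

The same split applied to http://example.org/q.html?aaa#foo leaves the query
string in the URI part and separates only "foo".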

To understand why this is true, you must understand a bit about Web
architecture. When resolving a URI reference, the application puts the
fragment "in its back pocket" as TimBL said. It then sends the rest of the
URI to the server, gets back a document, and based on the MIME type of the
document decides how to use the fragment.
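The resolution steps just described can be sketched in Python without
touching the network. Everything named here (prepare_request, apply_fragment,
and the strings they return) is hypothetical and only for illustration; a
real client does far more. The point is simply that the fragment never
appears in what is sent to the server, and that its meaning is chosen
afterwards from the MIME type.

```python
from urllib.parse import urldefrag, urlsplit


def prepare_request(uri_reference):
    """Return (host, request_target, pocketed_fragment).

    The fragment is set aside on the client; only the host and the
    request target (path plus query) would go to the server.
    """
    uri, fragment = urldefrag(uri_reference)
    parts = urlsplit(uri)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return parts.netloc, target, fragment


def apply_fragment(mime_type, fragment):
    """Decide what to do with the pocketed fragment once the MIME type of
    the returned representation is known (illustrative rules only)."""
    if not fragment:
        return "show the whole representation"
    if mime_type == "text/html":
        return f"scroll to the element named {fragment!r}"
    return f"no fragment rule known for {mime_type}; ignore {fragment!r}"


host, target, fragment = prepare_request("http://example.org/q.html?aaa#foo")
print(host, target)      # what the server gets to see
print(apply_fragment("text/html", fragment))
print(apply_fragment("image/png", fragment))
```

Note that apply_fragment takes the MIME type as an argument: the same
fragment can mean different things, or nothing at all, depending on what
kind of representation comes back.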

Because the fragment is MIME-type-dependent and the server never sees it,
the fragment does not refer to a portion or a view of a resource. Instead,
it merely refers to a portion or view of a representation -- the bag of bits
that HTTP sends back when you ask it for a URI. Claiming that it does refer
to a portion of a resource creates serious inconsistencies in the Web
architecture, as RDF has discovered, to some extent.

The RDF spec defines resource differently, claiming that it also contains a
fragment, and refers to a no-longer-existing Internet-Draft:

    Resources are always named by URIs plus optional anchor ids (see [URI]).
    Anything can have a URI; the extensibility of URIs allows the
    introduction of identifiers for any entity imaginable.

[URI] http://www.ietf.org/internet-drafts/draft-fielding-uri-syntax-04.txt

The Internet-Draft was obsoleted by the current URI RFC, which DOES NOT
define resources in this way.

I queried the W3C's URI list[4] and no less an expert on Web architecture
than Roy Fielding responded that he believed my view was correct. Dan
Connolly disagreed with my interpretation and claimed that the spec was
neutral on the issue.

Either way, RDF (and the RDF Core WG) needs to justify and document this
significant change from the URI RFCs. Sadly, since so many RDF properties
and documents refer to URIs with fragment identifiers, this issue could
cause serious problems were RDF to change its definition.

I see no easy solution to this issue. Continuing with RDF's special
definition will exacerbate the problems with fragments that we have already
seen. Yet, if we change the definition, we will obsolete many RDF documents
and vocabularies.

    :We :between ( [ a :Rock ] [ a :Place ; :feel :Hard ] ) .

    "We're between a rock and a hard place."

Sowa might say[5].

[1] http://lists.w3.org/Archives/Public/www-rdf-interest/2001Jan/0006.html
[2] http://lists.w3.org/Archives/Public/www-rdf-interest/2001Jan/0104.html
[3] http://rdfig.xmlhack.com/2001/05/08/2001-05-08.html#989341115.745719
[4] http://lists.w3.org/Archives/Public/uri/2001May/0017.html
[5] http://www.bestweb.net/~sowa/cg/cgexampw.htm#Ex_4
-- 
[ Aaron Swartz | me@aaronsw.com | http://www.aaronsw.com ]

Received on Sunday, 13 May 2001 01:38:43 UTC