Re: fragment identifiers from Roy T. Fielding on 2002-07-24 (www-tag@w3.org from July 2002)

From: Roy T. Fielding <fielding@apache.org>
Date: Tue, 23 Jul 2002 17:01:44 -0700
To: Graham Klyne <GK@NineByNine.org>
Cc: www-tag@w3.org
Message-Id: <8A2D93AA-9E98-11D6-8CCC-000393753936@apache.org>

On Monday, July 22, 2002, at 11:20  PM, Graham Klyne wrote:
> At 07:27 PM 7/22/02 -0700, Roy T. Fielding wrote:
>> Document means a lot of different things to different people,
>> one of which is a bag of bits representing the framework for a
>> renderable page.  All uses of the term "document" in RFC 2396
>> refer to the virtual document described by the retrieved
>> representation of a resource, where the virtual document may
>> consist of multiple individual representations within a single
>> rendering framework (e.g., a web page may consist of HTML,
>> stylesheets, in-line images, etc.).  In HTML, a fragment
>> identifies a portion of the complete virtual document, not
>> just the bits within the HTML framework.
>
> Interesting point.  (Obvious enough with hindsight, but interesting to 
> see it explicitly called out.)
>
> I read that as saying that the idea is well established that a "view" 
> selected by a fragment identifier is not necessarily contained within the 
> resource representation to which it is applied.

The contents of that view are not necessarily entirely contained within
that representation, but the boundaries of the anchor are defined
entirely within that representation.  In other words, there still
needs to be something within the representation that maps the
fragment identifier to an identified thing, since the representation
will have to define what fragment ids are meaningful.
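
As a rough illustration of that point, here is a minimal Python sketch
that asks an HTML representation which fragment ids it defines.  It
assumes that an "id" attribute on any element, or a "name" attribute on
an <a> element, defines an anchor; the class and function names are
hypothetical and not taken from any specification.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the fragment ids that an HTML representation itself defines."""

    def __init__(self):
        super().__init__()
        self.anchors = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Assumption: id on any element, or name on <a>, defines an anchor.
            if value and (name == "id" or (tag == "a" and name == "name")):
                self.anchors.add(value)


def defined_fragments(html_text):
    """Return the set of fragment ids declared inside this representation."""
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return collector.anchors


def fragment_is_meaningful(html_text, fragment):
    """A fragment is meaningful only if the representation in hand maps it."""
    return fragment in defined_fragments(html_text)
```

The anchor is looked up only in the representation itself, even when
the rendered view pulls in stylesheets or images from elsewhere.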

>> [...]  The fragment identifier will, if the resource
>> provider has done it right, identify the same thing across multiple
>> representations.
>
> Well, yes:  that's a big "if".  As an ideal, that's fine, but is it truly 
> a reflection of web architecture as practised?

A reflection of the architecture, yes.  A reflection of individual
websites?  Not our problem to fix.  We only need to make it possible
for social institutions to manage their namespace in a consistent and
well-defined manner.  The technology cannot do so on its own.

> [...]
>
>> The aspect of fragments that is media-type-specific is the mechanism
>> of the indirect reference when it is dereferenced.  The mechanism is
>> not known (and cannot be known) until a representation is in hand.
>> That is, either the fragment identifier is used in a same-document
>> reference or an action equivalent to GET is performed on the URI
>> preceding the fragment in order to obtain that representation.
>> The representation, once in hand, determines what needs to be done
>> to complete the retrieval action.
>
> I think this is all fine as far as it goes, but I think it leaves 
> untouched an aspect of fragment identifiers that's been nagging at a few 
> of us who work with RDF.  (Maybe RDF's use of fragment identifiers is 
> broken - as has been suggested in the past - and needs to be fixed?  I 
> used to think so, but now think it can be reconciled with the Web usage 
> you describe.)

The part that RDF has wrong is the notion that a fragment identifier
is necessary to denote resources that are not documents.  That position
is neither supported by Web architecture nor supported by historical
use of anchors, nor does it make any logical sense to me as an implementor
of HTTP clients and servers.

> So where's the problem here?  I find your description is centred very 
> much around the actions of retrieval and rendering of a web resource 
> representation.  RDF uses URI+fragment identifiers as opaque identifiers,
> without any presumption of retrieval and/or rendering.  There's nothing 
> in your write-up that says how (other than "if the resource provider has 
> done it right") a fragment identifier can be part of an identifier that 
> denotes something independently of the form of representation that may be 
> to hand.

The paragraph above that stated that the only two operations available
on fragments are name equivalence and retrieval.  There's not much I can
say about name equivalence other than the fact that you must have the
URI resource's representation in hand in order to know what the
fragment identifies.
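
The two operations can be sketched in Python as follows.  This is only
a sketch under stated assumptions: name equivalence is reduced to plain
string comparison, and "fetch" and "interpreters" are hypothetical
stand-ins for an HTTP client and for per-media-type fragment handlers.

```python
from urllib.parse import urldefrag


def names_equivalent(ref_a, ref_b):
    """Name equivalence: compare the identifiers only, with no retrieval.

    Assumption: plain string comparison, without any URI normalization.
    """
    return ref_a == ref_b


def dereference(reference, fetch, interpreters):
    """Retrieval: fetch the URI without its fragment, then let the
    media type of the representation in hand interpret the fragment.

    fetch(uri) is a hypothetical callable returning (media_type, body).
    interpreters maps a media type to a callable (body, fragment) -> thing.
    """
    base, fragment = urldefrag(reference)
    media_type, body = fetch(base)  # the fragment is never passed to fetch
    if not fragment:
        return body
    # The mechanism is chosen only after the representation is in hand.
    return interpreters[media_type](body, fragment)
```

Note that dereference() cannot pick an interpreter until fetch() has
returned, which mirrors the ordering described in the quoted text.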

> So my questions to you are:
>
> (a) do you recognize (as for RDF) that a URI+fragment identifier can 
> denote some value quite independently of resource representation?

Of course.  So can any URI.  Any URI, when detached from the process of
accessing a resource, serves as a name and only a name.  It is an
identifier.

> (b) does your design rationale for fragment identifiers address this 
> issue (or provide a basis for addressing it)?

I still don't see the issue, other than the following:

References to <uri-that-results-in-rdf#foo> can identify anything
that the RDF representation explicitly declares "foo" to refer
to.  Likewise, an RDF representation can make assertions
about some "foo" that is entirely defined within that representation
using same-document references.  However, an RDF representation
cannot make assertions about the absolute URI of a
<uri-that-results-in-rdf#foo> using same-document references, since
the generator of the RDF has no idea what URI was actually used to
retrieve the RDF representation.  Therefore, it is semantically
invalid to construct an RDF reasoning tree in which same-document
references are replaced by the base URI#fragment, since doing so
loses critical information and replaces it with an invalid
assumption about the resource associated with the URI, rather than
the thing identified by the fragment.

I don't know if that issue has been fixed by more recent RDF drafts.
If the RDF processor requires an absolute URI, then it is more valid
to introduce a make-believe URI, such as "org.w3.rdf:this#fragment"
than it is to assume a fragment-only reference is applicable to
the base URI.
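
A minimal Python sketch of that workaround follows.  The placeholder
string comes from the paragraph above; the function name and the rule
that only references beginning with "#" qualify are assumptions made
for illustration.

```python
# The make-believe base URI suggested in the message above.
PLACEHOLDER_BASE = "org.w3.rdf:this"


def absolutize_same_document(reference, placeholder=PLACEHOLDER_BASE):
    """Turn a fragment-only reference into an absolute name without
    assuming which URI was used to retrieve the representation.
    """
    if not reference.startswith("#"):
        raise ValueError("not a fragment-only (same-document) reference")
    # Plain concatenation: the retrieval URI is deliberately not consulted.
    return placeholder + reference
```

With this approach, the same representation retrieved through two
different URIs yields the same absolute name for "#foo", and nothing
is asserted about either retrieval URI.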

The distinction between same-document references and same-URI
references is not media-type specific.  It is an attribute of
indirect referencing due to the retrieval action that is
inherent in any indirect reference (including RDF).


Cheers,

Roy T. Fielding, Chief Scientist, Day Software
                  (roy.fielding@day.com) <http://www.day.com/>

                  Chairman, The Apache Software Foundation
                  (fielding@apache.org)  <http://www.apache.org/>
Received on Tuesday, 23 July 2002 20:01:54 UTC