Re: Generic processing of Fragment IDs in RFC 3023bis from Roy T. Fielding on 2010-10-05 (www-tag@w3.org from October 2010)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Tue, 5 Oct 2010 14:52:13 -0700
To: Noah Mendelsohn <nrm@arcanedomain.com>
Cc: Norman Walsh <ndw@nwalsh.com>, www-tag@w3.org
Message-Id: <F9780469-D114-4A7F-BD16-0EC2EAEA02C4@gbiv.com>

On Oct 5, 2010, at 12:53 PM, Noah Mendelsohn wrote:

> Roy Fielding writes:
> 
>> Where ambiguity might be present, bare name fragments always refer to the semantics defined by the specific media type.
> 
> My impression is that Norm's preference is:
> 
> Where ambiguity might be present, bare name fragments always refer to the semantics defined for generic processing per 3023bis;  thus the semantics for each specific media type SHOULD be the same as the generic, at least insofar as the syntax overlaps.

I think that would contradict his category (1), but I see your point.
In any case, I don't think that generic processing has the ability
to go back in time and I don't think RDF has the ability to change
how the Web or data format processing works.

Here's another way to think of it.

There is a set (or graph) of meaning for identifiers as interpreted
"on the Web". That is, when processed according to Web tools to do
Web things, each URI has a given meaning that is hopefully implied by
the consistency of interactions with that identified resource over time.

There is another set of meaning for identifiers as interpreted during
generic processing of XML (or, for that matter, parsing HTML, SVG, etc.).
That set occasionally has different semantics, such as when
XML Namespaces declares a different algorithm for URI-comparison than
the one used "on the Web". Moreover, fragment identifiers are used not
for the identification of Web resources, but rather for the
identification of regions within a given representation because
that is what the processor is trying to do.
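
As a rough illustration of the comparison difference mentioned above,
here is a minimal Python sketch. It assumes that XML Namespaces treats
two namespace names as identical only when they match character for
character, and it approximates "on the Web" equivalence with a small
subset of the normalization described in RFC 3986. The function names
and example URIs are hypothetical, and real Web tools normalize more
than this (userinfo, IPv6 literals, and percent-encoding are ignored
here).

```python
from urllib.parse import urlsplit, urlunsplit

# Default ports for the schemes this sketch knows about (an assumption
# made for illustration; other schemes are left untouched).
DEFAULT_PORTS = {"http": 80, "https": 443}


def namespace_equal(a: str, b: str) -> bool:
    """Compare two namespace names as plain strings, with no normalization."""
    return a == b


def _normalize(uri: str) -> str:
    """Apply a simplified subset of URI normalization.

    Lowercases the scheme and host, drops a default port, and turns an
    empty path into "/". Userinfo is dropped and percent-encoding is
    not touched, so this is not a complete implementation.
    """
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    path = parts.path or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


def web_equivalent(a: str, b: str) -> bool:
    """Compare two URIs after the simplified normalization above."""
    return _normalize(a) == _normalize(b)
```

With these definitions, "http://www.Example.org/ns" and
"http://www.example.org/ns" are different namespace names but are
treated as equivalent by the Web-style comparison.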

There is another set of meaning for identifiers as interpreted
"within an RDF graph".  In theory, this set is a superset of
"on the Web", since it is supposed to be semantics "on the Web" plus
any given (presumed to be true) assertions.  In reality, however,
it is more like "what I observed today" plus assertions that
have not yet been proven false.

There is yet another set of meaning for identifiers that can be
loosely described as "this isn't really an ideal URI to use for
this purpose but it currently corresponds to some representation
that does represent what I wanted to identify and I am going to
use it anyway even though I have no control over the resource owner
to ensure its consistency over time".

I think some people would like to say that the RDF graph must be a
superset of "on the Web" or be in error.  That would be nice, but the
fact is that RDF graphs are almost never true when compared to
"on the Web" because RDF has no conception of time or representation
as being distinct from resource, and the real meaning of Web resources
does change over time, whether we like it or not, due to the last set
of meanings above or due to simple carelessness.

Likewise, I think some people would like to say that generic processing
cannot use URIs in ways that semantically differ from RDF, since that
would lead to situations where an identifier intended for generic
processing is subsequently used as an identifier "on the Web" and
thereby becomes part of an RDF graph.  And yet such generic processing
already exists today and arguably has since the first HTML processor
defined a specific rendering for named anchors.

I don't think there is a solution to this issue that will satisfy
everyone.  I could easily state the observable truth: that each of
the sets of meaning above is, in fact, a valuable world view that
is used by different processors at different times for different
purposes.  At best, what we can say is that *when* there is overlap
in identifiers, the meaning that an identifier might have during
"generic processing" is only considered significant while performing
that generic processing, and elsewhere is given lower precedence
than the "on the Web" meaning as defined by the media type of the
representation returned as a result of a retrieval action on the URI.
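
The precedence rule above could be sketched roughly as follows. This
is only an illustration under stated assumptions: the function name,
the context labels, and the idea of passing each candidate meaning as
an optional string are all hypothetical, not part of any specification.

```python
from typing import Optional


def fragment_meaning(
    context: str,
    media_type_meaning: Optional[str],
    generic_meaning: Optional[str],
) -> Optional[str]:
    """Pick which meaning of a bare-name fragment applies.

    context is "generic" while performing generic processing and any
    other value (for example "web") elsewhere. Both labels are
    hypothetical names used only in this sketch.
    """
    if context == "generic":
        # The generic meaning is significant only during generic processing.
        return generic_meaning
    # Elsewhere, the meaning defined by the media type of the retrieved
    # representation wins; the generic meaning has lower precedence and
    # is consulted only when the media type defines nothing.
    if media_type_meaning is not None:
        return media_type_meaning
    return generic_meaning
```

For example, when both meanings are defined and the context is "web",
the media type meaning is returned; the generic meaning is returned in
that context only if the media type defines none.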

Is that enough?

....Roy
Received on Tuesday, 5 October 2010 21:52:43 UTC