RE: Generic processing of Fragment IDs in RFC 3023bis from Larry Masinter on 2010-09-23 (www-tag@w3.org from September 2010)

From: Larry Masinter <masinter@adobe.com>
Date: Thu, 23 Sep 2010 02:24:34 -0700
To: "Roy T. Fielding" <fielding@gbiv.com>, Norman Walsh <ndw@nwalsh.com>, Martin Duerst <duerst@w3.org>
CC: "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <C68CB012D9182D408CED7B884F441D4D013BDF5649@nambxv01a.corp.adobe.com>
At the last W3C TAG teleconference, I suggested an action for myself, ACTION-466
http://www.w3.org/2001/tag/group/track/actions/466.


First, I would like to convey the TAG's apology for the delay in addressing the
issue. Much of the delay arose because (a) I had expressed some concern about the
TAG's sentiment and (b) I was on an extended vacation through July and August.

That said, I wanted to request more details for a specific use case for the requirement
in RFC 3023bis that XML content delivered with content-type application/something+xml
as the result of retrieving a URI of the form uriA#fragmentB must be treated in a 
way that the fragment identifier "fragmentB" follows generic XML fragment identifier
semantics.
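
A minimal sketch of the rule being asked about, for concreteness. The function
name and the example URI are hypothetical, and testing only for a "+xml" suffix
is an assumed simplification, not the text of 3023bis:

```python
from urllib.parse import urldefrag


def split_for_generic_xml(uri_reference, content_type):
    """Split uriA#fragmentB and report whether the delivered media type
    carries the "+xml" suffix that would trigger generic fragment handling.

    Returns (uri_without_fragment, fragment, generic_processing_applies).
    """
    uri, fragment = urldefrag(uri_reference)
    # Drop parameters such as "; charset=utf-8" before inspecting the type.
    media_type = content_type.split(";", 1)[0].strip().lower()
    generic = media_type.endswith("+xml")
    return uri, fragment, generic and bool(fragment)


result = split_for_generic_xml(
    "http://example.org/uriA#fragmentB",
    "application/something+xml; charset=utf-8",
)
```

Note that the decision depends on the Content-Type of the response, which is
only available for schemes that communicate a media type.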

* I wanted to understand the requirement that the 'generic XML fragment'
 actually be communicated via the fragment identifier of the URI used to access
 the XML, rather than by some other alternative communication channel. I understand
 the desire to support generic XML processing, just not the communication channel
 for the fragment.

* I wanted to understand why this requirement would only apply if the URI
 (uriA) was an HTTP URI or some other URI that entailed communication of a MIME type --
 since ftp: or file: or many other schemes don't have MIME types. Or is it
 expected that the fragments would also work the same way even for other
 schemes?

* I wanted to explore the hypothetical situation of having two MIME types, e.g.,
 application/frob and application/frob+xml, with identical definitions
 except for their use of fragment identifiers. How would this
 work in practice in the use cases of generic XML processing of fragment identifiers?

* I wanted to make sure we were not introducing incompatibilities in the
 HTML "polyglot" case, where the same content could be delivered as text/html and
 as application/xhtml+xml with the same overall result. Would
 this requirement of generic XML processing of fragment identifiers interfere
 with the use of fragment identifiers as a means of passing parameters to scripting
 content of text/html?
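
The polyglot concern can be illustrated with a rough sketch in which one
fragment string is read two ways. Everything here is assumed for illustration:
the function names, the sample document, the "page=3&zoom=2" parameter
convention, and the shortcut of treating a plain "id" attribute as an ID
without consulting a DTD.

```python
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

DOC = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<body><p id="intro">hi</p></body></html>'
)


def as_script_parameters(fragment):
    """text/html scripting reading: the fragment is a bag of name=value pairs."""
    return {name: values[0] for name, values in parse_qs(fragment).items()}


def as_generic_xml_target(document_text, fragment):
    """Generic XML reading: the fragment is a bare name matched against IDs."""
    root = ET.fromstring(document_text)
    for element in root.iter():
        if element.get(XML_ID) == fragment or element.get("id") == fragment:
            return element
    return None


params = as_script_parameters("page=3&zoom=2")
target = as_generic_xml_target(DOC, "page=3&zoom=2")
```

Under these assumptions the script sees two parameters while the generic
reading finds no element, so the same URI behaves differently depending on
which media type the content was delivered as.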

I'm looking for specific software that you've written, used, or at least
have pointers to documentation for that needs the 3023bis requirement, and
some explanation of how that software would work in the situations outlined
above.

Thanks,

Larry (for W3C TAG ACTION-466)
--
http://larry.masinter.net



-----Original Message-----
From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of Jonathan Rees
Sent: Wednesday, September 22, 2010 6:36 AM
To: Noah Mendelsohn
Cc: Martin J. Dürst; Roy T. Fielding; Norman Walsh; www-tag@w3.org; MURATA Makoto; Chris Lilley
Subject: Re: Generic processing of Fragment IDs in RFC 3023bis

On Tue, Sep 21, 2010 at 9:29 PM, Noah Mendelsohn <nrm@arcanedomain.com> wrote:
>
> On 9/20/2010 10:28 PM, Jonathan Rees wrote:
>>
>> I think it
>> does, but why? Something about the RDF/XML DTD?
>
> As best I can tell, yes that's the case.  If
> http://www.w3.org/XML/9710rdf-dtd/rdf.dtd is the normative DTD (something I
> haven't managed to prove), it seems to declare rdf:id to be of type ID,
> which as best I can tell makes it subject to references by XPointer id()
> functions and thus barenames.  Obviously, it would be helpful if someone
> with a more solid knowledge of the details could confirm this.
>
>> So suppose we have
>> <foaf:Person rdf:ID="B">... and that a generic XML processor treats
>> rdf:ID like xml:id. 3023bis says that A#B would have to "designate"
>> the foaf:Person *element*.  I don't think there is any normative
>> RDF-related document that says A#B has to "designate" anything other
>> than the element (say, a foaf:Person), and in fact I would be
>> distressed if there were. But in practice some applications act as if
>> A#B does "identify" (variously, "name", "denote", etc.) a foaf:Person
>> (i.e. that it does *not* identify an XML element), and it would be
>> natural to insist that "designate" and "identify" have to be coherent.
>
> Not my area of expertise, but I'm fairly sure that this whole thing came up
> because one or more TAG members asserted that, per the media type
> registration for application/rdf+xml, http://example.org/myrdf.xml#somename
> did NOT identify an element, but rather a node in an RDF graph.

Probably me, making an uninformed assertion.  But does it really
matter whether the conflict is between 3023bis-draft and
application/rdf+xml, or between 3023bis-draft and widespread RDF/XML
practice? We could drill down on this, but I've attempted to prove a
conflict with application/rdf+xml and it's not obviously there -- and
the question quickly becomes very complicated in very uninteresting
ways. If it's OK I'd like to avoid that question if I can, and just
consider what people are actually doing with RDF/XML, which is going
to be much harder to change than any spec.

(And by the way, no, not graph nodes, Noah. In an RDF application a
URI can by design "identify" anything at all, including an XML
element. I've used foaf:Person as a standin for "anything at all", and
a foaf:Person in the usual interpretation is definitely not a graph
node - it has a name and a home page, not in-degrees and neighbors. A
URI could "identify" a graph node but only in very unusual
circumstances, where RDF was being used to say something about the
infrastructure of RDF itself, as opposed to some more pedestrian
subject matter. This is not generally done.)

> So, if all that's true, we have a situation where:
>
> * Code written to normative specification 3023bis would conclude that the
> above URI reference resolved to an element (I.e. because the DTD says that
> rdf:id is of type ID, which implies that it identifies an element).

You mean "identified an element", not "resolved to an element", right?
We weren't talking about resolution.

> * Code written to normative reference [media type registration for
> application/rdf+xml] would conclude that the same URI reference, applied to
> the same retrieved representation, resolved to a node in an RDF graph.
>
> That seems broken, hence the concern.  What am I misunderstanding?
>
> Now, if it turns out that the DTD does not apply, there still seems
> something disturbing about the situation:  3023bis implies that the range of
> such URI references is elements;  rdf+xml implies that it's graph nodes.

No restriction on the "range" of a URI reference. But the idea that
applications take the URI references to mean non-elements is right.

>  Code written to the latter concludes that the URI above has a referent (a
> graph node), and code written to the former concludes that the reference
> doesn't resolve.  I suppose that's less clearly broken than having them
> successfully resolve differently, but it still doesn't feel great to me.

How is this any different from any other situation in which a
resource's fragid definitions are drawn from multiple sources (e.g.
conneg to "representations" having distinct media types)? If you get a
fragid defined according to one source or spec and undefined according
to another source or spec, we've already decided that's OK (I can dig
up the minutes if you like - I think it was this summer, I had an
action about it). It kills follow-your-nose, but we knew that. The
question of range is not relevant, except to the extent that it seems
to be (erroneously) asserted by various media type registrations. Now
*that* is something we could fix.

We've got the same business at work with the fragid patterns that
Raman documented last year (http://www.w3.org/TR/hash-in-uri/). Try
explaining these just by following your nose from the RFCs.

> By the way, one of the things I've convinced myself in drafting this note is
> that, if a normative DTD for rdf+xml says that rdf:id is of type ID, it's
> probably lying; the attribute is not an identifier for the element. Right?

Well, it's not clearly *inconsistent* given the DTD and the RDF/XML
spec to say that the rdf:ID fragid "identifies" an element. One could
conceivably interpret the fragid that way and stay in spec. It's just
that this is at variance with practice.
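
A small sketch of the two readings, using the foaf:Person example from earlier
in the thread. The base URI, the sample document, and both function names are
hypothetical, and the rdf:ID lookup is done by hand rather than through a DTD
or an RDF parser:

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

DOCUMENT = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:foaf="{FOAF_NS}">
  <foaf:Person rdf:ID="B"><foaf:name>Example</foaf:name></foaf:Person>
</rdf:RDF>"""


def element_reading(document_text, fragment):
    """Generic XML reading: A#B designates the element whose rdf:ID is B."""
    root = ET.fromstring(document_text)
    for element in root.iter():
        if element.get(f"{{{RDF_NS}}}ID") == fragment:
            return element
    return None


def rdf_reading(document_text, base_uri, fragment):
    """RDF/XML practice: A#B names whatever the typed node element describes.

    Returns the (subject, predicate, object) type triple for that node.
    """
    element = element_reading(document_text, fragment)
    if element is None:
        return None
    # ElementTree tags look like "{namespace}local"; rebuild the class URI.
    namespace, local = element.tag[1:].split("}", 1)
    return (f"{base_uri}#{fragment}", RDF_NS + "type", namespace + local)


element = element_reading(DOCUMENT, "B")
triple = rdf_reading(DOCUMENT, "http://example.org/A", "B")
```

The first function returns an XML element; the second says that the same URI
names something of type foaf:Person. Both start from the same attribute, which
is where "designate" and "identify" part ways.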

As I was saying we have an option to remove the inconsistency between
xml:id and RDF/XML practice by changing the DTD to "divorce" rdf:ID
from xml:id. Doing so would permit generic processing to stand,
without losing the idea that the rdf:ID fragid "identifies" a
non-element. However, changing a ubiquitously deployed DTD seems like
a very bad idea...

We could say that the RDF/XML DTD is right and the RDF/XML
applications are wrong, i.e. all the rdf:ID fragids really do
"designate" elements, darnit. But this would break a lot of things for
no good reason.

I tend to agree with Henry that there probably isn't any real problem
here, it's pinhead angels, and we can keep generic processing without
doing anything other than maybe alerting those who might care to the
situation. It seems a harmless exception to the already controversial
and possibly untenable URIs-mean-single-things-across-all-applications
idea. (But give them an inch...?) I am also happy with almost any
other simple solution such as 'grandfathering' RDF/XML through an
explicit exception in 3023bis.

Of course I'm with Roy: the world would be a better place if we could
deprecate RDF/XML.

I look forward to the results of Larry's related ACTION-466 on this matter.

Jonathan

> Noah
>
Received on Thursday, 23 September 2010 09:24:43 UTC