Re: Generic processing of Fragment IDs in RFC 3023bis from Jonathan Rees on 2010-09-22 (www-tag@w3.org from September 2010)

From: Jonathan Rees <jar@creativecommons.org>
Date: Wed, 22 Sep 2010 09:35:50 -0400
To: Noah Mendelsohn <nrm@arcanedomain.com>
Cc: Martin J. Dürst <duerst@it.aoyama.ac.jp>, "Roy T. Fielding" <fielding@gbiv.com>, Norman Walsh <ndw@nwalsh.com>, www-tag@w3.org, MURATA Makoto <eb2m-mrt@asahi-net.or.jp>, Chris Lilley <chris@w3.org>
Message-ID: <AANLkTik7fj8hWbVuodhSXWVWv3Wk98p61SynsSOhVEC4@mail.gmail.com>

On Tue, Sep 21, 2010 at 9:29 PM, Noah Mendelsohn <nrm@arcanedomain.com> wrote:
>
> On 9/20/2010 10:28 PM, Jonathan Rees wrote:
>>
>> I think it
>> does, but why? Something about the RDF/XML DTD?
>
> As best I can tell, yes that's the case.  If
> http://www.w3.org/XML/9710rdf-dtd/rdf.dtd is the normative DTD (something I
> haven't managed to prove), it seems to declare rdf:ID to be of type ID,
> which as best I can tell makes it subject to references by XPointer id()
> functions and thus barenames.  Obviously, it would be helpful if someone
> with a more solid knowledge of the details could confirm this.
>
>> So suppose we have
>> <foaf:Person rdf:ID="B">... and that a generic XML processor treats
>> rdf:ID like xml:id. 3023bis says that A#B would have to "designate"
>> the foaf:Person *element*.  I don't think there is any normative
>> RDF-related document that says A#B has to "designate" anything other
>> than the element (say, a foaf:Person), and in fact I would be
>> distressed if there were. But in practice some applications act as if
>> A#B does "identify" (variously, "name", "denote", etc.) a foaf:Person
>> (i.e. that it does *not* identify an XML element), and it would be
>> natural to insist that "designate" and "identify" have to be coherent.
>
> Not my area of expertise, but I'm fairly sure that this whole thing came up
> because one or more TAG members asserted that, per the media type
> registration for application/rdf+xml, http://example.org/myrdf.xml#somename
> did NOT identify an element, but rather a node in an RDF graph.

Probably me, making an uninformed assertion.  But does it really
matter whether the conflict is between 3023bis-draft and
application/rdf+xml, or between 3023bis-draft and widespread RDF/XML
practice? We could drill down on this, but I've attempted to prove a
conflict with application/rdf+xml and it's not obviously there -- and
the question quickly becomes very complicated in very uninteresting
ways. If it's OK I'd like to avoid that question if I can, and just
consider what people are actually doing with RDF/XML, which is going
to be much harder to change than any spec.

(And by the way, no, not graph nodes, Noah. In an RDF application a
URI can by design "identify" anything at all, including an XML
element. I've used foaf:Person as a stand-in for "anything at all", and
a foaf:Person in the usual interpretation is definitely not a graph
node - it has a name and a home page, not in-degrees and neighbors. A
URI could "identify" a graph node but only in very unusual
circumstances, where RDF was being used to say something about the
infrastructure of RDF itself, as opposed to some more pedestrian
subject matter. This is not generally done.)

> So, if all that's true, we have a situation where:
>
> * Code written to normative specification 3023bis would conclude that the
> above URI reference resolved to an element (i.e. because the DTD says that
> rdf:ID is of type ID, which implies that it identifies an element).

You mean "identified an element", not "resolved to an element", right?
We weren't talking about resolution.

> * Code written to normative reference [media type registration for
> application/rdf+xml] would conclude that the same URI reference, applied to
> the same retrieved representation, resolved to a node in an RDF graph.
>
> That seems broken, hence the concern.  What am I misunderstanding?
>
> Now, if it turns out that the DTD does not apply, there still seems
> something disturbing about the situation:  3023bis implies that the range of
> such URI references is elements;  rdf+xml implies that it's graph nodes.

There's no restriction on the "range" of a URI reference. But the idea
that applications take the URI references to mean non-elements is right.

>  Code written to the latter concludes that the URI above has a referent (a
> graph node), and code written to the former concludes that the reference
> doesn't resolve.  I suppose that's less clearly broken than having them
> successfully resolve differently, but it still doesn't feel great to me.

How is this any different from any other situation in which a
resource's fragid definitions are drawn from multiple sources (e.g.
conneg to "representations" having distinct media types)? If you get a
fragid defined according to one source or spec and undefined according
to another source or spec, we've already decided that's OK (I can dig
up the minutes if you like - I think it was this summer, I had an
action about it). It kills follow-your-nose, but we knew that. The
question of range is not relevant, except to the extent that it seems
to be (erroneously) asserted by various media type registrations. Now
*that* is something we could fix.
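
To make the multiple-sources point concrete, here is a rough sketch. The
table, the media types chosen, and the function names are all made up for
illustration; nothing here comes from any spec. It only shows a fragid that
is defined under one representation and undefined under another:

```python
# Hypothetical fragid definitions for a single resource, keyed by the media
# type of the representation that conneg happens to return. Illustrative only.
FRAGIDS = {
    "application/rdf+xml": {"B": "a foaf:Person, per RDF/XML practice"},
    "text/html": {},  # this variant defines no anchor named "B"
}


def fragid_meaning(media_type, fragid):
    """Return what the fragid is taken to identify under one representation,
    or None if that representation leaves it undefined."""
    return FRAGIDS.get(media_type, {}).get(fragid)


def readings(fragid):
    """Collect the reading of one fragid under every known representation."""
    return {media_type: fragid_meaning(media_type, fragid) for media_type in FRAGIDS}


if __name__ == "__main__":
    for media_type, meaning in readings("B").items():
        print(media_type, "->", meaning)
```

Defined in one place and undefined in the other is the case we already
decided we could live with.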

We've got the same business at work with the fragid patterns that
Raman documented last year (http://www.w3.org/TR/hash-in-uri/). Try
explaining these just by following your nose from the RFCs.

> By the way, one of the things I've convinced myself in drafting this note is
> that, if a normative DTD for rdf+xml says that rdf:ID is of type ID, it's
> probably lying; the attribute is not an identifier for the element. Right?

Well, it's not clearly *inconsistent* given the DTD and the RDF/XML
spec to say that the rdf:ID fragid "identifies" an element. One could
conceivably interpret the fragid that way and stay in spec. It's just
that this is at variance with practice.
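
Here is a small sketch of the two readings side by side, using only the
Python standard library. The document, the base URI http://example.org/A,
and both function names are my own assumptions for illustration, and the
second function is a toy, not a conforming RDF/XML parser. The first reading
is the generic XML one (the fragid designates the element carrying the
rdf:ID); the second is what RDF applications tend to do (A#B names the thing
the element describes):

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF_NS = "http://xmlns.com/foaf/0.1/"

# Hypothetical document, assumed to have been retrieved from BASE.
BASE = "http://example.org/A"
DOC = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:foaf="{FOAF_NS}">
  <foaf:Person rdf:ID="B">
    <foaf:name>Example Person</foaf:name>
  </foaf:Person>
</rdf:RDF>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' tag form into a plain URI string."""
    return tag[1:].replace("}", "")


def generic_xml_reading(doc, fragid):
    """Treat rdf:ID as an ID-typed attribute: the fragid designates an element."""
    root = ET.fromstring(doc)
    for elem in root.iter():
        if elem.get(f"{{{RDF_NS}}}ID") == fragid:
            return elem
    return None


def rdf_practice_reading(doc, base, fragid):
    """Toy reading: base#fragid is the subject of the statements the element
    encodes. Handles only a typed node with literal-valued child properties."""
    elem = generic_xml_reading(doc, fragid)
    if elem is None:
        return None
    subject = f"{base}#{fragid}"
    triples = [(subject, RDF_NS + "type", expand(elem.tag))]
    for child in elem:
        triples.append((subject, expand(child.tag), (child.text or "").strip()))
    return triples


if __name__ == "__main__":
    print(generic_xml_reading(DOC, "B").tag)
    for triple in rdf_practice_reading(DOC, BASE, "B"):
        print(triple)
```

Both functions start from the same attribute; they differ only in what they
take A#B to stand for afterwards, which is the whole disagreement.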

As I was saying, we have an option to remove the inconsistency between
xml:id and RDF/XML practice by changing the DTD to "divorce" rdf:ID
from xml:id. Doing so would permit generic processing to stand,
without losing the idea that the rdf:ID fragid "identifies" a
non-element. However, changing a ubiquitously deployed DTD seems like
a very bad idea...

We could say that the RDF/XML DTD is right and the RDF/XML
applications are wrong, i.e. all the rdf:ID fragids really do
"designate" elements, darnit. But this would break a lot of things for
no good reason.

I tend to agree with Henry that there probably isn't any real problem
here, it's pinhead angels, and we can keep generic processing without
doing anything other than maybe alerting those who might care to the
situation. It seems a harmless exception to the already controversial
and possibly untenable URIs-mean-single-things-across-all-applications
idea. (But give them an inch...?) I am also happy with almost any
other simple solution such as 'grandfathering' RDF/XML through an
explicit exception in 3023bis.

Of course I'm with Roy: the world would be a better place if we could
deprecate RDF/XML.

I look forward to the results of Larry's related ACTION-466 on this matter.

Jonathan

> Noah
>
Received on Wednesday, 22 September 2010 13:43:08 UTC