Re: RDFa Core last call comments - "have not yet caught up" from Jeni Tennison on 2011-05-10 (www-tag@w3.org from May 2011)

From: Jeni Tennison <jeni@jenitennison.com>
Date: Tue, 10 May 2011 21:50:51 +0100
To: Jonathan Rees <jar@creativecommons.org>
Cc: www-tag@w3.org
Message-Id: <75F7E885-5CFA-499C-B155-5C75F3325873@jenitennison.com>

Hi Jonathan,

Thanks for taking the time to walk me through this. I think that I get it now, but I've been wrong before so let me check. 3023bis [1] says:

  Conformant applications MUST interpret such fragment identifiers 
  as designating that part of the retrieved representation specified 
  by [XPointerFramework] and whatever other specifications define any 
  XPointer schemes used.

and in the XPointer Framework [2] it states:

  A shorthand pointer, formerly known as a barename, consists of an 
  NCName alone. It identifies at most one element in the resource's 
  information set; specifically, the first one (if any) in document 
  order that has a matching NCName as an identifier.

In other words, if you use a fragment identifier such as '#me' with an application/xml media type, then it must be interpreted as identifying an element information item within the document.
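
To make the shorthand-pointer rule concrete, here is a minimal Python sketch of how a generic XML processor might resolve a barename. The function name `resolve_barename` is hypothetical, and the sketch assumes that `xml:id` and un-namespaced `id` attributes are the identifiers; in practice ID-ness can also come from a DTD or schema, which this ignores.

```python
import xml.etree.ElementTree as ET

# Namespaced attribute name that ElementTree uses for xml:id.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

def resolve_barename(xml_text, fragment):
    """Return the first element in document order whose identifier
    equals the barename fragment, or None if there is no match.

    Assumption: xml:id and plain 'id' attributes count as identifiers.
    """
    if not fragment:
        return None
    root = ET.fromstring(xml_text)
    for elem in root.iter():  # iter() walks the tree in document order
        if elem.get(XML_ID) == fragment or elem.get("id") == fragment:
            return elem
    return None

doc = '<doc><p id="me">first</p><p id="me">second</p></doc>'
match = resolve_barename(doc, "me")
```

Under this reading, `match` is always an element (here the first `p`) or nothing at all; there is no way for '#me' to denote a person.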

Further 3023bis says:

  If [XPointerFramework] and [XPointerElement] are inappropriate for 
  some XML-based media type, it SHOULD NOT follow the naming convention
  '+xml'.

This says '+xml' media types should define fragments in the same way, but the media type registration for application/rdf+xml [3] says:

  In RDF, the thing identified by a URI with fragment identifier does
  not necessarily bear any particular relationship to the thing
  identified by the URI alone.

So application/rdf+xml specifically allows an interpretation of fragment identifiers outside that allowed in 3023bis.

I understand from the thread that you pointed to at

  http://lists.w3.org/Archives/Public/www-tag/2010Nov/0078.html

that the 3023bis authors intend to address this by special-casing application/rdf+xml to say that that particular media type is allowed to define fragment identifiers differently.

The problem that's come up now is that RDFa is providing a standard mechanism for adding RDF semantics to XML-based markup languages and this is likely to encourage people to use RDF-style hash URIs, in which 'http://www.example.org/#me' identifies a person rather than an element-information item, while serving up XML documents. (Indeed, the RDFa Core WD illustrates this practice.) To support these URIs, these languages will need their own media type registrations that describe the interpretation of fragments. These won't be able to use a '+xml' media type because they won't be special-cased in 3023bis.
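
As a small illustration of why the media type matters here, the following Python sketch splits a hash URI into the part that is actually retrieved and the fragment that is left for the client to interpret. The helper name `split_hash_uri` is hypothetical; `urldefrag` is from the standard library.

```python
from urllib.parse import urldefrag

def split_hash_uri(uri):
    """Split a hash URI into (base URI, fragment).

    Only the base URI is sent to the server; the fragment is then
    interpreted according to the media type of whatever comes back.
    """
    base, fragment = urldefrag(uri)
    return base, fragment

base, fragment = split_hash_uri("http://www.example.org/#me")
```

Here `base` is 'http://www.example.org/' and `fragment` is 'me', so whether 'me' can be a person depends entirely on the media type registration of the document served at the base URI.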

And of course there's one rather important language where this is a particular problem: XHTML. XHTML should be served as application/xhtml+xml [4], which currently defers to RFC 3023 for the interpretation of its fragments. So XHTML shouldn't be served at any location that is used as the base URI of an RDF-style hash URI. (This is true regardless of whether the XHTML holds RDFa or not.)

Linked data is the most obvious place where the constraints that 3023bis places on +xml media type registrations lead to issues with current practice, but there are others. For example, it won't be possible for application/svg+xml to adopt the "Media Fragments URI" schemes. And it means that using hash-bang and other similar fragment identifiers with XHTML is wrong. (As it is with text/html as well.)

Another example from my own experience: I deal with a markup language that uses milestone elements to indicate the start/end of overlapping structures within a document. I'd like to be able to address the 'virtual elements' indicated by these milestones, but since those aren't XML element information items, I can't use either a barename fragment or a non-XPointer fragment syntax to do so. If I want to use a +xml mime type (with the advantages that brings with generic processors), I have to register a new XPointer scheme and use that.

---

For what it's worth, assuming that I've got things right above, this is my current take on a way through this.

As I understand it, the aim of the 3023bis authors is to provide some kind of guarantee that generic applications can use XPointer fragment syntax to address into all XML documents. If +xml mime types define their own interpretations of fragment identifiers then that's no longer possible, particularly if it leads to a conflict between how a generic XML application and a language-specific application interpret a given fragment identifier.

It seems to me that 3023bis could achieve this aim while still enabling people to use RDF-style hash URIs, conneg, hash bangs with XHTML and so on if it said:

  1. fragment identifiers for application/xml are interpreted as defined by XPointer

  2. +xml mime types MAY define their own interpretation of 'barename' fragment identifiers but MUST NOT define their own interpretation of any other fragment identifiers that match the XPointer syntax; all +xml mime types MUST support the element XPointer scheme

  3. +xml mime types MAY define their own interpretation of fragment identifiers that do not match the XPointer syntax

This gives a guaranteed core set of fragment identifiers which can be interpreted consistently by generic XML applications across +xml types (namely scheme-based XPointers, with those using the element scheme being supported for all types), but also enables +xml types to support other fragment identifier syntaxes.

It does mean that generic XML applications would only be able to interpret barename fragment identifiers as addressing element information items by ID if a document were served as application/xml. For +xml mime types, the XPointer element scheme would have to be used to address elements instead. (Effectively this means using '#element(foo)' rather than '#foo'.)
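
The three proposed rules can be sketched as a small Python dispatcher that says who gets to interpret a given fragment. This is only an illustration of the proposal above, not anything in 3023bis: the names `who_interprets` and `as_element_pointer` are hypothetical, the NCName pattern is an ASCII-only approximation, and the test for scheme-based XPointer syntax is deliberately loose (it does not check for balanced or escaped parentheses).

```python
import re

# ASCII-only approximation of NCName (assumption: the real production
# also allows many non-ASCII characters).
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
BARENAME_RE = re.compile(rf"{NCNAME}\Z")
# Loose check for scheme-based pointers such as element(foo) or
# xpointer(id('foo')): a (possibly prefixed) scheme name, then (...).
SCHEME_RE = re.compile(rf"(?:{NCNAME}:)?{NCNAME}\(.*\)\Z", re.S)

def who_interprets(media_type, fragment):
    """Return 'xpointer' or 'media-type' under the proposed rules."""
    if media_type == "application/xml":
        return "xpointer"                 # rule 1
    if not media_type.endswith("+xml"):
        raise ValueError("not covered by the proposal")
    if BARENAME_RE.match(fragment):
        return "media-type"               # rule 2: barenames MAY be redefined
    if SCHEME_RE.match(fragment):
        return "xpointer"                 # rule 2: scheme-based is fixed
    return "media-type"                   # rule 3: non-XPointer syntax

def as_element_pointer(barename):
    """Rewrite a barename as the equivalent element() scheme pointer."""
    return f"element({barename})"
```

For example, `who_interprets("application/xhtml+xml", "me")` leaves '#me' to the XHTML registration, while `who_interprets("application/xhtml+xml", as_element_pointer("me"))` stays with XPointer, which is the guaranteed core that generic processors could rely on.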

In addition, 3023bis should point out that if barename hash URIs are used to identify things other than element information items (by id) then any XML that's served at the base URI needs to be served as a +xml mime type rather than application/xml.

If 3023bis said this, there wouldn't be any requirement to special-case application/rdf+xml and RDFa Core could then say something along the lines of:

  RDFa may be used within any XML. Note that if hash URIs are used to 
  address resources that are not element information items, any XML
  representation served at those locations must be served under a mime 
  type other than application/xml.

I haven't seen this approach suggested previously, but I haven't read everything. What are/would be the objections?

Cheers,

Jeni

[1]: http://www.w3.org/2006/02/son-of-3023/draft-murata-kohn-lilley-xml-04.html#frag
[2]: http://www.w3.org/TR/xptr-framework/#shorthand
[3]: http://www.ietf.org/rfc/rfc3870.txt
[4]: http://www.ietf.org/rfc/rfc3236.txt
-- 
Jeni Tennison
http://www.jenitennison.com
Received on Tuesday, 10 May 2011 20:51:17 UTC