HTML media type vs. # URIs that do not identify document elements from Jonathan Rees on 2010-02-05 (www-tag@w3.org from February 2010)

From: Jonathan Rees <jar@creativecommons.org>
Date: Fri, 5 Feb 2010 14:09:22 -0500
To: www-tag@w3.org, Ben Adida <ben@adida.net>
Message-ID: <760bcb2a1002051109u1062adfdl97a8fb1456e417db@mail.gmail.com>

http://www.w3.org/TR/swbp-vocab-pub/ advocates providing RDF and HTML
versions of ontologies using content negotiation, and this is a
pattern that is, I believe, widely deployed. The hack is that in the
HTML version you have
  <a name="foo"> ...documentation for http://blah/bar#foo ...
and in the RDF you have
  <... rdf:resource="http://blah/bar#foo" ...> ... properties of http://blah/bar#foo ...
There is a problem: the media type registration for text/html (also
application/xhtml+xml) says: "For documents labeled as text/html, the
fragment identifier designates the correspondingly named element". So
using the #foo URI to designate anything other than an element, as the
RDF 'representation' does, is out of spec (when there is an HTML
representation).
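
To make the conflict concrete, here is a minimal Python sketch of the two
readings of the same URI. It is an illustration under stated assumptions,
not anyone's actual deployment: the HTML snippet, the triple, and the
helper names (NamedElementFinder, html_referent, rdf_types) are all
hypothetical, and the RDF side is modelled as plain tuples rather than
parsed with a real RDF library.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"

# Hypothetical HTML representation served for http://blah/bar
HTML_DOC = '<html><body><a name="foo">Documentation for foo</a></body></html>'

# Hypothetical content of the RDF representation, as (subject, predicate, object)
RDF_TRIPLES = [("http://blah/bar#foo", RDF_TYPE, OWL_CLASS)]


class NamedElementFinder(HTMLParser):
    """Collects the tags of elements named by a given fragment identifier."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment
        self.matches = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Simplification: match any id, or the name attribute on <a> only.
        named_by_id = attrs.get("id") == self.fragment
        named_by_name = tag == "a" and attrs.get("name") == self.fragment
        if named_by_id or named_by_name:
            self.matches.append(tag)


def html_referent(uri, html_text):
    """text/html reading: the fragment designates the named element."""
    _, fragment = urldefrag(uri)
    finder = NamedElementFinder(fragment)
    finder.feed(html_text)
    return finder.matches[0] if finder.matches else None


def rdf_types(uri, triples):
    """RDF reading: the URI denotes whatever the triples describe."""
    return [o for s, p, o in triples if s == uri and p == RDF_TYPE]


uri = "http://blah/bar#foo"
print(html_referent(uri, HTML_DOC))  # the <a> element
print(rdf_types(uri, RDF_TRIPLES))   # an OWL class, not an element
```

Under these assumptions the one URI comes out as an anchor element on the
HTML reading and as an owl:Class on the RDF reading.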

The same problem can arise with RDFa, even in the absence of content
negotiation.
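
One way the RDFa case might look, again only as a hedged sketch: a single
document in which a fragment both names an element (via id) and appears as
an RDFa subject (via about="#..."). The markup and the names
FragmentCollector and overloaded_fragments are made up for illustration,
and the check looks only at the id and about attributes rather than
implementing RDFa processing.

```python
from html.parser import HTMLParser

# Hypothetical RDFa-annotated page; "foo" is both an element id and a subject.
RDFA_DOC = """
<div id="foo" about="#foo" typeof="owl:Class">
  <span property="rdfs:label">Foo</span>
</div>
<div id="intro">Introductory text</div>
"""


class FragmentCollector(HTMLParser):
    """Records fragments used as element names and as RDFa subjects."""

    def __init__(self):
        super().__init__()
        self.element_names = set()
        self.rdfa_subjects = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.element_names.add(attrs["id"])
        about = attrs.get("about") or ""
        if about.startswith("#"):
            # Same-document reference: keep only the fragment part.
            self.rdfa_subjects.add(about[1:])


def overloaded_fragments(html_text):
    """Fragments that name an element and are also used as an RDFa subject."""
    collector = FragmentCollector()
    collector.feed(html_text)
    return sorted(collector.element_names & collector.rdfa_subjects)


print(overloaded_fragments(RDFA_DOC))  # ['foo']
```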

Being out of spec does not seem to be a problem in practice for anyone
who does this kind of thing, but it is something of an embarrassment.

Since the text/html media type is under revision, I wonder if anyone
has looked into making it more RDF-friendly, so that this usage
becomes legitimate?

Jonathan

Received on Friday, 5 February 2010 19:09:59 UTC