HTML and URI References compatibility concerns

As the maintainer of a library that converts and parses URIs and IRIs, as
well as many Semantic Web-related libraries that use it, I was reading
through the HTML draft, and it appears that the core ingredient of RDF and
the Semantic Web--the URI [1] and IRI [2]--is not, in the current draft,
normatively referenced from the Web's key hypertext technology, HTML [3].

That is to say, the core technology that the Web uses to identify resources
is not actually referenced by the core technology that the Web uses to
make relationships _between_ those resources. I find this, to say the
least, a bit concerning.

Instead, HTML either misuses or fails to appropriately use the well-defined
vocabulary terms of URIs used in all other Web-related standards, including
the recent updates of RDF [4] and HTTP [5], and the newly published CoAP
[6]. Where I would expect to see terms like "URI Reference" and "IRI", I
see only "URL", which in the strictest sense would be incompatible with
RDF's IRIs. In practice, HTML redefines the URI algorithms in a manner that
appears to be subtly incompatible with the URI parsing routines used
everywhere else (especially in Semantic Web software). Implemented
literally, this would mean using a different IRI parser for the HTTP Link
header than for HTML itself. This does not sound like a good proposition
for Web growth.
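
As a rough illustration of the kind of divergence meant here, the sketch
below splits one input two ways: with the non-validating regular expression
from RFC 3986, Appendix B, and with the `URL` class found in current
browsers and Node.js. Treating that class as representative of the parsing
HTML describes is an assumption, and the helper name `splitRfc3986` and the
sample input are hypothetical, chosen only for this example.

```typescript
// Regular expression from RFC 3986, Appendix B. It only splits a URI
// reference into components; it does not validate or normalize anything.
const RFC3986_SPLIT =
  /^(([^:\/?#]+):)?(\/\/([^\/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?/;

// Hypothetical helper: returns the five generic components of a reference.
function splitRfc3986(ref: string) {
  // Every group is optional, so the expression matches any string.
  const m = RFC3986_SPLIT.exec(ref)!;
  return {
    scheme: m[2],
    authority: m[4],
    path: m[5],
    query: m[7],
    fragment: m[9],
  };
}

// Sample input containing a backslash in the path (the characters are
// http://example.com/a\b once the string escape is resolved).
const input = "http://example.com/a\\b";

const generic = splitRfc3986(input);
const browser = new URL(input);

// The generic split leaves the backslash untouched in the path.
console.log(generic.path);      // /a\b

// The URL class rewrites the backslash to a forward slash for http URLs,
// so the same input now names a different path.
console.log(browser.pathname);  // /a/b
```

A backslash is not a legal URI character in the first place, so a strict
RFC 3986 consumer might instead reject this input outright; either way, the
two code paths do not agree on what the reference identifies.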

The most recent discussion I can find on this was years ago. As HTML 5.0
goes into its final stages, I'd like to ensure that the result remains
compatible with Semantic Web principles and technologies.

I'd welcome any corrections to my cursory observations, and answers to the
following questions:

(1) I understand that many User Agents historically have never followed the
standard, and it's argued that standards-mode would actually break some
legacy documents (though I'm not aware of any such documents). But in many
cases (nearly all Semantic Web applications), alignment with the standards
*is* necessary and desired. Is there, or could there be, some way for
document authors to say "I demand standards-compliant behavior", e.g. by
merely declaring the HTML5 doctype, or by sending Content-Type:
application/xhtml+xml? How would this be brought up with the HTML WG?

(2) Has anyone thoroughly reviewed the compatibility of HTML's parsing of
URI References with Semantic Web and non-Web-browser implementations of the
URI (implementations like mine)? Is RDFa still compatible with HTML, in
every case, without subtle or unexpected bugs?

(3) The HTML WG Charter [7] specifies a role of liaison for related
Technical Reports and Community Groups. Does this include Semantic Web,
RDFa, and community groups like RDFJS (JavaScript/ECMAScript users of RDF
should be directly relevant to HTML)? Who is filling this role of liaison?

Thanks,

Austin Wright.

[1] http://tools.ietf.org/html/rfc3986
[2] http://tools.ietf.org/html/rfc3987
[3] http://www.w3.org/TR/2014/WD-html5-20140617/ (current revision as of now)
[4] http://www.w3.org/TR/2014/NOTE-rdf11-primer-20140624/
[5] http://tools.ietf.org/html/rfc7230
[6] http://tools.ietf.org/html/rfc7252
[7] http://www.w3.org/2013/09/html-charter.html

Received on Monday, 18 August 2014 11:54:48 UTC