HTML5 ISSUE-120 rdfa-prefixes : Proposal to use RDFa according to spec

ISSUE-120
Current Status [1,2] :
> - We have a single change proposal to simplify the HTML+RDFa specification
>   by removing prefixes.
> - We have another change proposal to clarify how prefixes work and
>   explain that they are optional.

I'd like to propose that HTML/HTML5 uses RDFa as found in the RDFa
specification [3]. This includes the use of namespace prefixes.

I'll counter the argument for changing the spec with regard to
namespace prefixes given (by Hixie) on the WHATWG Wiki [4]
(statistical evidence is my trump card), and then also offer a
sub-proposal that may help alleviate the perceived problems (but isn't
tied to the main proposal).

The Change Proposal summary (regarding namespace prefixes) is:
Simplify the specification by removing features that are documented to
be confusing to users.

First, this change is unnecessary as the use of namespace prefixes is
optional (full URIs can be used inline instead). If this feature is
actually confusing to users then confusion may be avoided by only
providing guidance in the HTML documentation on the use of RDFa
without prefixes. If that guidance covers the facility adequately, the
user won't have any need to consult the RDFa spec for the namespace
prefix-based alternative.
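
To illustrate why the two forms can be treated as interchangeable, here
is a minimal sketch in Python. It assumes the point made above (that a
full URI may stand in for a prefixed name) and is a simplification, not
the CURIE processing rules of the RDFa spec. The function name expand
and the PREFIXES table are made up for the example; the FOAF namespace
URI is the real one.

```python
# Simplified sketch of prefix expansion; NOT the full RDFa CURIE rules.
# PREFIXES stands in for a declaration such as xmlns:foaf="..." in markup.
PREFIXES = {"foaf": "http://xmlns.com/foaf/0.1/"}


def expand(term, prefixes):
    """Expand a prefixed name to a full URI.

    A term whose leading part is not a declared prefix (for example a
    full URI, whose leading part is "http") is returned unchanged.
    """
    prefix, sep, local = term.partition(":")
    if sep and prefix in prefixes:
        return prefixes[prefix] + local
    return term


# Both spellings identify the same property.
prefixed = expand("foaf:name", PREFIXES)
inline = expand("http://xmlns.com/foaf/0.1/name", PREFIXES)
```

Either spelling ends up as the same URI, so documentation that only
shows the inline form loses nothing in expressiveness.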

Second, the arguments given in the Change Proposal that support for
namespace prefixes is confusing are mostly anecdotal - i.e. person A,
B and C say it's confusing. (Given the size of the Web, such material
isn't in short supply on any issue you wish to choose - given a little
time with a search engine, arguments that the British Queen is an
alien lizard can be amassed). Additionally no real distinction is made
between issues faced by end-user publishers and tool developers. This
is significant because the only time full knowledge of the namespace
prefix mechanism is essential is when developers wish to write a
parser - this seems something of a minority activity.

Statistical evidence [5] would suggest that in reality the existence
of the option to use namespace prefixes* isn't a barrier to widespread
deployment of RDFa: "The data shows that the usage of RDFa has
increased 510% between March, 2009 and October, 2010, from 0.6% of
webpages to 3.6% of webpages (or 430 million webpages in our sample of
12 billion)".

(* It's possible that none of the pages analysed actually used
namespace prefixes, but that would still mean that their appearance in
the specs doesn't compromise the use of RDFa as-is)

A usability study is quoted, but as an internal Google study which was
flawed in design and limited in scope, I don't believe this can be
considered credible evidence.
(Personally my biggest issue there was that there were only 7
participants, but Hixie has assured me that conclusions can reasonably
be drawn from such small numbers of participants. On the blog it
states "people really don't have any problems dealing with URLs as
property names" - but as also stated there, this wasn't something that
the study was designed to test. A casual observation is not evidence.
There are also the issues mentioned in comments on the WHATWG blog [6]
: "Videos can’t be viewed out of Google. Bias on the part of the
creators of the study. Lack of outside involvement. No information
about where the people taking the study are employed. Lack of
diversity of demographics. Lack of proper, and neutral, oversight.
Interpretation by person or persons without proper background, and
neutrality. Single study, only.")

---

So on to a sub-proposal: a way of removing the need for the widespread
use of namespaces, and of allowing the use of short names rather than
URIs for common terms, would be to put such terms in the HTML namespace. In
other words, make a registry of terms along the same lines as already
used for common rel="" attributes. Of course such a registry could
never completely reflect the range of terms found in the wild, but it
does seem likely that in the near term at least, HTML developers
will predominantly use a limited range of terms, which could
be catered for in the HTML namespace.

This is akin to the approach taken by Google in their "Rich Snippets":
common terms are placed in a single namespace. As noted elsewhere, the
single-namespace approach is "hobbled" [7] and Google's particular
implementation is severely flawed [8] (the main flaw is
self-documenting, see http://rdf.data-vocabulary.org/name). But such
issues could be to some extent alleviated by providing references to
existing deployed vocabularies in the HTML namespace document, along
the lines of:

html:Person rdfs:subClassOf foaf:Person, vCard:Person, google:Person ...

(Probably done in RDFa)

Work would be needed in selecting suitable terms (the microformats
community could probably help there) and care taken in aligning them
appropriately with existing terms (i.e. where and in which direction
to use rdfs:subClassOf/rdfs:subPropertyOf,
owl:equivalentClass/owl:equivalentProperty etc).
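
As a rough sketch of how such a registry might behave (in Python, purely
for illustration): short names resolve to URIs in a single namespace,
and each entry carries its alignments to existing vocabularies. The
namespace URI below is a placeholder I've made up, as are the names
REGISTRY, resolve and subclass_triples; the aligned terms are the ones
from the example line above, left as prefixed names.

```python
# Hypothetical placeholder; the real HTML namespace document would
# supply the actual URI.
HTML_TERMS_NS = "http://example.org/html-terms#"

# Short name -> existing terms it would be declared a subclass of.
REGISTRY = {
    "Person": ["foaf:Person", "vCard:Person", "google:Person"],
}


def resolve(short_name):
    """Return the registry URI for a short name, or None if unregistered."""
    if short_name not in REGISTRY:
        return None
    return HTML_TERMS_NS + short_name


def subclass_triples(short_name):
    """List (subject, predicate, object) alignments for a registered term."""
    subject = resolve(short_name)
    return [
        (subject, "rdfs:subClassOf", parent)
        for parent in REGISTRY.get(short_name, [])
    ]


person_uri = resolve("Person")
person_alignments = subclass_triples("Person")
```

An unregistered name simply resolves to nothing here; a real registry
would need a policy for such terms, and the direction of each alignment
would need the care described above.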

Were this approach taken, I'd suggest it was used alongside including
RDFa as-is. As mentioned above, if the documentation guides the user
towards the syntactically simpler approach, any potential confusion
may be minimised.

Cheers,
Danny.

[1] http://www.w3.org/html/wg/tracker/issues/120
[2] http://dev.w3.org/html5/status/issue-status.html#ISSUE-120
[3] http://www.w3.org/TR/rdfa-syntax/
[4] http://wiki.whatwg.org/wiki/Change_Proposal_for_ISSUE-120
[5] http://tripletalk.wordpress.com/2011/01/25/rdfa-deployment-across-the-web/
[6] http://blog.whatwg.org/usability-testing-html5
[7] http://blog.iandavis.com/2009/05/13/googles-rdfa-a-damp-squib/
[8] http://www.jenitennison.com/blog/node/104

-- 
http://danny.ayers.name

Received on Friday, 4 February 2011 09:43:30 UTC