Last Call comments on RDFa Core from Harry Halpin on 2010-12-12 (public-rdfa-wg@w3.org from December 2010)

From: Harry Halpin <hhalpin@w3.org>
Date: Sun, 12 Dec 2010 19:26:17 -0000 (GMT)
To: public-rdfa-wg@w3.org
Message-ID: <19bea1536110e261979ca3cc1c74af3f.squirrel@webmail-mit.w3.org>
Following discussion of RDFa with some colleagues (including Kavi Goel of
Google and Peter Mika of Yahoo!, whose own review is coming shortly and
will hopefully give empirical evidence for some of these arguments), I
finally sat down and read what I think is the most current RDFa document
[4]. Evidently the deadline for review feedback has been extended for
Shelley Powers and Peter Mika, so I attach my feedback as well.

Overall, very good work.  I do have a number of comments aimed at reducing
the complexity, and so improving the usability, of RDFa. These comments are
my own personal opinion and do not represent any other person or
organization.

In general, making RDFa less complex is likely to bring it more users.
Right now, there is a multitude of ways to do a single thing, which
makes it hard for authors to remember which way is the right way, and also
makes it hard to predict which triples will be parsed. While the
difficulty of the HTML5 parsing algorithm is to be expected due to
deployment differences amongst massively deployed browsers, it seems that
when designing new technologies in the hope of massive take-off, simplicity
should be a goal. The current levels of use of RDFa are good and growing
(I hope we can get stats on that soon via Peter Mika), but RDFa is still
not a large enough portion of the Web to justify unnecessary complexity
for the sake of backwards compatibility.  Machine generation or authoring
tools are not a solution, as RDFa will also have to be understood by the
people writing the machine-generation scripts.


Taking people's cognitive limit to be 7 plus or minus 2, ideally a
vocabulary should add at most 9 new things to a language. Right now RDFa
adds 14; I suggest bringing it back down to 9. As backwards compatibility
with RDFa 1.0 is a goal, anything I suggest removing can be kept in, but
not highlighted (i.e. not used in examples) and explicitly marked as kept
in for backwards compatibility.

Here are the comments:

1) Move any reference to @href and @src out of RDFa Core and into the
XHTML and HTML5 documents about using RDFa. They do not really make sense
in the core, as they are obviously specialized for making RDFa easier to
use with a certain number of HTML elements, i.e. <a> and <img>. Having
them in RDFa Core needlessly complicates the document, and makes the
parsing algorithm much more complicated. It's OK to keep them in the XHTML
or HTML5 profiles of RDFa, I assume for ease of hand-authoring, although
much of their work can just be done by use of <span> tags.

2) Please pick either @rel or @property for marking predicates, and do not
encourage the use of both. This comment comes from Kavi Goel. I imagine the
use of both was caused by obscure features of HTML, like the difference
between <link> and <meta>, and a desire to stay XHTML compliant with <a>.
These are no longer design goals, as RDFa is now just adding new
attributes anyway. Based on feedback from Google RichSnippets, people
confuse the two constantly. The supposed reason for keeping them apart is
that @rel is supposed to point to URIs while @property points to literals.
However, why not just use a simple EBNF for URIs/IRIs to parse the object
and *then* use the result to determine whether it's a URI or not? In the
pathological edge case where someone wants a URI to be parsed as a string
rather than a resource, they can just add the correct datatype using
@datatype. In the even more pathological edge case where they would rather
have it parsed as a literal string rather than an xsd:string (and the fact
that these are not the same in RDF is broken, obviously), one could add
"literal" as a datatype to handle that.

Note that OGP already treats @property as something marking out URIs, not
literals, i.e. from [1]:

<meta property="og:url" content="http://www.imdb.com/title/tt0117500/" />
<meta property="og:image"
content="http://ia.media-imdb.com/images/rock.jpg" />

I'd ditch @rel, because as put by Kavi, it has two different possible uses
-- it's either used on a link to another page, or to convey a relationship
to another entity on the same page.
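
As a rough illustration of the "parse first, then decide" idea in comment
2, here is a minimal Python sketch. It is not taken from any RDFa draft:
the function name classify_object, the regular expression (a deliberately
loose stand-in for a real URI/IRI grammar), and the returned labels are
all assumptions made for the example. The "literal" pseudo-datatype is the
one suggested above.

```python
import re

# Loose approximation of "scheme ':' rest-without-whitespace".
# This is NOT the full URI/IRI grammar, only a stand-in for the sketch.
_LOOKS_LIKE_URI = re.compile(r'^[A-Za-z][A-Za-z0-9+.-]*:[^\s<>"]+$')


def classify_object(value, datatype=None):
    """Return (kind, value, datatype) for an object value.

    kind is "uri", "typed-literal" or "plain-literal" (hypothetical labels).
    """
    if datatype == "literal":
        # Proposed escape hatch: force a plain literal, even for URI-like text.
        return ("plain-literal", value, None)
    if datatype is not None:
        # An explicit datatype always wins over the URI test.
        return ("typed-literal", value, datatype)
    if _LOOKS_LIKE_URI.match(value):
        return ("uri", value, None)
    return ("plain-literal", value, None)


if __name__ == "__main__":
    for sample in ("http://www.imdb.com/title/tt0117500/", "The Rock"):
        print(sample, "->", classify_object(sample)[0])
```

One caveat with a test this loose: a short string such as "foaf:name"
also has the shape of a URI, so a real grammar or a stricter rule would be
needed to keep such values as literals.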

3) Similarly, pick either @resource or @content (or preferably, pick a
better name like @object) and do not encourage the use of both for marking
objects. Again, the distinction between @resource and @content is supposed
to mirror the distinction between URIs and literals.

3) Please just pick one way to do vocabularies, and so eliminate the use
of @vocab. Vocabularies should either reference an offline vocabulary
profile for the namespace (@profile), or specify the namespace directly
(@prefix). @profile serves a useful purpose of allowing prefixed CURIEs to
be used. Of course, @prefix is needed for compatibility with HTML5. I do
not really see any reason to add @vocab. Also, as a brief warning - was
not @profile eliminated by HTML5, and used for a bunch of other things?
Maybe just have @vocab be used for the current use of @profile if HTML5 is
determined to keep @profile out.

4) Is there really a reason to have this inline use of
@prefix="ex: http://www.example.org"? I'd remove it. It makes misuse of
URIs more likely, as Hixie is right on some things: people do not in
general type URIs correctly, and things like a trailing # are usually
forgotten.

5) Also, lots of people will forget to put the URI in the
@prefix/profile/xmlns, but they tend to get things like "ogp:title" and
"foaf:name" right in the content. Therefore, I suggest that the current
rudimentary state of the XHTML profile be fixed, so that the top 10 or so
vocabularies also have their namespaces stored there, and the common use
of "foaf:name" just works. Rather than hide that feature (I can barely
find mention of the default XHTML profile [3] in the doc), it should be
put in an example early on. Just using common namespaces with a
centralized directory is a lot more sensible for end-users than expecting
them to remember, or cut-and-paste, URIs correctly.
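
To make the "centralized directory" idea in comment 5 concrete, here is a
minimal Python sketch of CURIE expansion with a built-in prefix table. The
names DEFAULT_PREFIXES and expand_curie are hypothetical, and the choice
of which vocabularies sit in the table is an assumption for illustration,
not the content of the actual default XHTML profile.

```python
# Hypothetical default directory. Which vocabularies belong here is an
# assumption made for this sketch, not what the XHTML profile defines.
DEFAULT_PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dc": "http://purl.org/dc/terms/",
    "og": "http://ogp.me/ns#",
}


def expand_curie(curie, local_prefixes=None):
    """Expand "prefix:reference" to a full URI, or return None if unknown.

    Prefixes declared by the author (local_prefixes) override the defaults.
    """
    prefixes = dict(DEFAULT_PREFIXES)
    prefixes.update(local_prefixes or {})
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        return None
    return prefixes[prefix] + reference


if __name__ == "__main__":
    # No prefix declaration needed by the author for a well-known vocabulary.
    print(expand_curie("foaf:name"))
    # An author-declared prefix still works alongside the defaults.
    print(expand_curie("ex:thing", {"ex": "http://www.example.org/"}))
```

With a table like this, an author who writes "foaf:name" without declaring
the prefix still gets the intended URI, while locally declared prefixes
keep working and take precedence.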

6) Lastly, as shown by OGP, there are two distinct use-cases for
vocabularies: one where one is talking about the things a web-page is
about, and the other where one is talking about the web-page itself. Right
now RDFa is optimized to talk about the web-page. Is there a way we could
add something to the vocabulary profile that says, "for this vocabulary,
create a blank node"? Otherwise, to use, say, Facebook OGP correctly, one
would have to declare a blank node at the top of the page, i.e. about=""
or typeof="". I have to agree with David Recordon here: asking users to do
something like that is a bit silly. It's better to have it in the
vocabulary definition. You could easily add:

<span property="rdfa:defaultresource">A blank node</span>

Or if you want to give that a value (unlikely, but possible)

<span property="rdfa:defaultresource"
resource="http://www.example.com/#">The default subject of this vocabulary
is <a href="http://www.example.com">example.com</a>.</span>

But you probably want to give these blank nodes a type:

<span property="rdfa:defaultresource" typeof="abc:review">The default
subject of this vocabulary is a blank node of type "review".</span>

Speaking of that, we need a better description of the vocabulary profile
parsing algorithm. The use of rdfa:uri and things like that is only
mentioned offhand, and the details seem to be in this document [2]. Since
it's important, why not just move that doc into core?

7) And since OGP has shown that many users find it easier to cut-and-paste
<meta> and <link> into the head rather than annotate the body, why not
show how that can be used as the first example *before* going into the
body?

8) Also, I can't tell if this is allowed (again, thanks to Kavi for the
example):

Let's say I have a review about a restaurant. The markup to convey the
relationship is:
<span typeof="abc:Review">
   <span rel="abc:itemReviewed">
      <span typeof="abc:Restaurant">

Microdata and microformats both remove one layer of nested html elements
for this scenario. For example in microdata, it is:

<span itemtype="site.com/Review">
   <span itemprop="itemReviewed" itemscope itemtype="site.com/Restaurant">

And in microformats it would be shorter still:
<span class="hreview">
   <span class="item hrestaurant">

Can we just have this in RDFa?

<span typeof="abc:Review">
   <span rel="abc:itemReviewed" typeof="abc:Restaurant">

I can't see why not, but I am not sure what the parsing algorithm does here.

      cheers,
            harry

[1] http://ogp.me/
[2] http://www.w3.org/2010/02/rdfa/drafts/2010/ED-vocab-20100326/
[3] http://www.w3.org/1999/xhtml/vocab/
[4] http://www.w3.org/TR/2010/WD-rdfa-core-20101026/
Received on Sunday, 12 December 2010 19:26:19 UTC