Re: ISSUE-117 (about-on-HTML): Consider disallowing @about on <html> [RDFa 1.1 in HTML5]

Hi Ivan,

You wrote:
> What this rule says, in terms of elements, is that <html> (or any top element) _has_ this magic
> behaviour, ie, an @about="" is introduced on that level, conceptually (unless there is an explicit
> @about, that is). So what does <head> and <body> magic brings us?
>
> I am officially lost.:-)

:)

Bear with me, and I'll reconstruct the rationale for you...

As you know, part of the RDFa story was to provide an RDF
interpretation of already existing HTML metadata features -- i.e., if
an existing HTML document that had no RDFa in it was parsed with an
RDFa parser, we would still like to end up with something sensible.

Take an example from the LINK element section of the HTML 4 spec [1]:

  <HEAD>
    ...other head information...
    <TITLE>Chapter 5</TITLE>
    <LINK rel="prev" href="chapter4.html">
    <LINK rel="next" href="chapter6.html">
  </HEAD>

It's obvious that the 'something sensible' we would like to end up
with is two triples about the 'current document': one whose object is
a document that is somehow 'prior' to the 'current document', and one
whose object is a document that is somehow 'next'. Expressed as
triples we would want:

  <> some-ns:prev <chapter4.html> .
  <> some-ns:next <chapter6.html> .

By adopting this approach we get to derive RDF from pre-existing HTML
created by authors who have no knowledge of (or interest in) RDF. A
result. :)
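
As a rough illustration of that mapping, here is a minimal sketch in
Python (standard library only). It is not how any particular RDFa
parser works: the class name LinkTriples is made up for this example,
'some-ns:' is the same placeholder prefix as in the triples above, and
the subject is simply hard-wired to the 'current document'.

```python
from html.parser import HTMLParser

# The HTML 4 example from above, unchanged (upper-case tags, unclosed LINKs).
HTML4_HEAD = """
<HEAD>
  ...other head information...
  <TITLE>Chapter 5</TITLE>
  <LINK rel="prev" href="chapter4.html">
  <LINK rel="next" href="chapter6.html">
</HEAD>
"""


class LinkTriples(HTMLParser):
    """Hypothetical helper: turns <link rel href> into triples about '<>'."""

    def __init__(self):
        super().__init__()
        self.triples = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") and attrs.get("href"):
            # @rel may hold several space-separated link types.
            for rel in attrs["rel"].split():
                self.triples.append(
                    ("<>", f"some-ns:{rel}", f"<{attrs['href']}>")
                )


parser = LinkTriples()
parser.feed(HTML4_HEAD)
for triple in parser.triples:
    print(*triple, ".")
# <> some-ns:prev <chapter4.html> .
# <> some-ns:next <chapter6.html> .
```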

Now, let's add some RDFa to this document so that we can indicate that
the 'current document' is not just any old document, but it's actually
a book. This property of the document /could/ be indicated by adding
@typeof to the root element (excuse me whilst I also switch to
lower-case for element names!):

  <html typeof="bibio:book">
    <head>
      <title>Chapter 5</title>
      <link rel="prev" href="chapter4.html">
      <link rel="next" href="chapter6.html">
    </head>
    .
    .
    .

Unfortunately, @typeof with no subject automatically creates a bnode,
which would mean that the markup inside <head> would no longer
reference the 'current document' -- it would instead reference the
newly minted bnode. Given that we wanted authors to feel their way
through RDFa by incremental additions, it would be quite a surprise
for them if the simple addition of a type description for a document
caused all of the previous predicates attached to that document to
disappear.

One way to get around this would have been to put a default of
@about="" on the <html> element, so that our previous example would
be parsed as if the following were present:

  <html about="" typeof="bibio:book">
    <head>
      <title>Chapter 5</title>
      <link rel="prev" href="chapter4.html">
      <link rel="next" href="chapter6.html">
    </head>
    .
    .
    .

However, this only moves the problem to the <head> element; if,
instead of the above pattern, someone used the following one, we'd
once again have a bnode to contend with:

  <head typeof="bibio:book">
    <title>Chapter 5</title>
    <link rel="prev" href="chapter4.html">
    <link rel="next" href="chapter6.html">
  </head>

Ultimately, for the @about="" to play the role of a guard, it needs to
be placed as 'close' as possible to the entries you are trying to
protect (the <link> and <meta> elements inside <head>), and that means
having the implied @about="" on <head>.
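
To make the 'guard' argument concrete, here is a toy model of subject
resolution in Python. It is only a sketch under simplifying
assumptions, not the real RDFa processing sequence: it handles just
@about, @typeof and @rel/@href, the function names are invented for
this example, and the <link> elements are self-closed only because the
XML parser used here needs well-formed input. The implied_about
argument names the elements that are treated as if they carried
@about="".

```python
import itertools
import xml.etree.ElementTree as ET

DOC = "<>"  # the 'current document'

HEAD = (
    "<title>Chapter 5</title>"
    '<link rel="prev" href="chapter4.html"/>'
    '<link rel="next" href="chapter6.html"/>'
)
TYPE_ON_HTML = f'<html typeof="bibio:book"><head>{HEAD}</head></html>'
TYPE_ON_HEAD = f'<html><head typeof="bibio:book">{HEAD}</head></html>'


def triples(markup, implied_about=()):
    """Toy subject resolution (an assumption-laden sketch, not RDFa proper)."""
    bnodes = (f"_:b{n}" for n in itertools.count())
    out = []

    def walk(el, parent_subject):
        if "about" in el.attrib:            # explicit @about always wins
            subject = f"<{el.get('about')}>"
        elif el.tag in implied_about:       # act as if about="" were present
            subject = DOC
        elif "typeof" in el.attrib:         # @typeof alone mints a bnode
            subject = next(bnodes)
        else:                               # otherwise inherit from the parent
            subject = parent_subject
        if "typeof" in el.attrib:
            out.append((subject, "rdf:type", el.get("typeof")))
        if "rel" in el.attrib and "href" in el.attrib:
            out.append(
                (subject, f"some-ns:{el.get('rel')}", f"<{el.get('href')}>")
            )
        for child in el:
            walk(child, subject)

    walk(ET.fromstring(markup), DOC)
    return out


def link_subjects(markup, implied_about=()):
    """Subjects that the prev/next triples end up attached to."""
    return {s for s, p, _ in triples(markup, implied_about) if p != "rdf:type"}


print(link_subjects(TYPE_ON_HTML))                          # {'_:b0'}
print(link_subjects(TYPE_ON_HTML, implied_about={"html"}))  # {'<>'}
print(link_subjects(TYPE_ON_HEAD, implied_about={"html"}))  # {'_:b0'}
print(link_subjects(TYPE_ON_HEAD, implied_about={"head"}))  # {'<>'}
```

Only the last variant keeps the prev/next triples on the 'current
document' wherever the author happens to put the @typeof.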

The same logic goes for <body>, by the way. This is because 'vanilla'
(pre-RDFa) HTML also allowed metadata to be added to the body of a
document; the metadata we expressed above could also be marked up in
the following way:

    <h1>Chapter 5</h1>
    <a rel="prev" href="chapter4.html">Prev</a>
    |
    <a rel="next" href="chapter6.html">Next</a>

Once again, adding @typeof to <body> would throw all of this markup
out, causing the predicates to become attached to the 'new' bnode
instead of the 'current document'.

I don't think there is any doubt that a solution to this problem was
needed, since otherwise the triples about a document would have simply
'disappeared' the moment you indicated what type a document was. But
whether the solution we have is the right one is difficult to tell.

One minor variation of the technique we decided on would have been to
say that <head> and <body> simply set the subject to the current
document during processing, and that's that, i.e., without using an
implied @about. However, I think it was generally felt that having an
'implied @about' was a better solution, since it gave authors the
ability to override the value, if they wanted to, by providing an
explicit @about.
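
The difference between the two variants can be sketched in a few
lines of Python. This is only my reading of the two options, not spec
text: the function name, the variant labels and the example.org URI
are all made up for illustration.

```python
DOC = "<>"  # the 'current document'


def head_subject(attrs, variant):
    """Subject set by <head> (or <body>) under two hypothetical rule variants."""
    if variant == "fixed":
        # Variation: the element always sets the subject to the document,
        # so an explicit @about has no way of changing it.
        return DOC
    if variant == "implied-about":
        # Act as if about="" were present, unless @about is given explicitly.
        return f"<{attrs.get('about', '')}>"
    raise ValueError(f"unknown variant: {variant}")


# Hypothetical author markup: <head about="http://example.org/book" ...>
explicit = {"about": "http://example.org/book", "typeof": "bibio:book"}
plain = {"typeof": "bibio:book"}

print(head_subject(plain, "fixed"))             # <>
print(head_subject(plain, "implied-about"))     # <>
print(head_subject(explicit, "fixed"))          # <>
print(head_subject(explicit, "implied-about"))  # <http://example.org/book>
```

The two variants agree until the author supplies an explicit @about;
only the 'implied @about' variant lets that value through.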

A completely different approach would have been to simply say that the
<html>, <head> and <body> elements are not allowed to take RDFa
attributes, and that if they do, those attributes are ignored.
However, I think at the time we felt there might be other scenarios
where you would want to put attributes on those three elements, so
having the implied @about seemed to be the smallest change we could
make to the processing rules whilst still solving the problem.

I did a quick search in the old mailing-lists to find something that
might capture the essence of our discussion back then, just in case
anybody wanted to re-read! I think [2] is the best post since it links
to a lot of the prior discussions that show something of what we were
trying to grapple with.

Regards,

Mark

[1] <http://www.w3.org/TR/html4/struct/links.html#h-12.1.2>
[2] <http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2007Oct/0252.html>

Received on Sunday, 20 November 2011 12:00:23 UTC