RE: RDFa and its relationship to XHTML

On Thu, 2006-06-08 at 17:56 +0100, Mark Birbeck wrote:
> Hi Karl/Dan,
> 
> (Hope you don't mind me rolling your email in with this one, Dan. :)
> 
> > On 06-06-08 at 22:01, Mark Birbeck wrote:
> > > RDFa can therefore be used *today* for simple metadata structures 
> > > (rel="tag", for example), but provides many mechanisms to get more 
> > > advanced, should you need it. By being built on HTML metadata 
> > > principles it takes a generic approach--i.e., any language can be 
> > > marked up and interpreted by any RDFa parser, without having to know
> > > anything about the language being parsed.
> > 
> > Could you give a document which
> > 
> > 	1. uses a subset of possible RDFa features
> > 	2. is a valid HTML 4.01 or XHTML 1.0 document
> > 	   http://validator.w3.org/
> > 
> > I think that would help to have concrete test cases for everyone.
> 
> Sure. The following page uses a subset of RDFa, and validates as XHTML 1.0
> Strict:
> 
>   <http://www.w3.org/>
> 
> Near the bottom of the page you'll see this:
> 
>   <a
>    rel="Copyright"
>    href="/Consortium/Legal/ipr-notice#Copyright"
>    shape="rect"
>   >Copyright</a>
> 
> which is perfectly 'correct' RDFa.

What RDF triples does it produce? What code can I use to see?

I'm pretty sure the author didn't know that it produced any RDF triples.

I just installed Elias's RDFa parser (see
http://torrez.us/archives/2006/06/05/453/) from
http://svn.rdflib.net/trunk/ revision 783.

It produces, among other things:

<> <http://www.w3.org/Copyright>
<http://www.w3.org/Consortium/Legal/copyright-documents> .

and I'm sure the author didn't mean to use http://www.w3.org/Copyright
as an RDF property name.
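
To make the mapping concrete, here is a minimal sketch of what seems to be
going on. It is not Elias's code; RelTripleSketch and rel_triples are
hypothetical names, and I'm assuming an unqualified @rel token is simply
resolved against the base URI as if it were a relative reference. The
object follows the href in the snippet you quoted, so it need not match
the parser output above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RelTripleSketch(HTMLParser):
    """Hypothetical helper: one triple per @rel token on <a> or <link>."""

    def __init__(self, base):
        super().__init__()
        self.base = base
        self.triples = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("a", "link"):
            return
        attrs = dict(attrs)
        rel, href = attrs.get("rel"), attrs.get("href")
        if not rel or not href:
            return
        for token in rel.split():
            # Assumption: a bare rel token is treated like a relative URI,
            # so "Copyright" lands in the namespace of the page itself.
            predicate = urljoin(self.base, token)
            self.triples.append(
                (self.base, predicate, urljoin(self.base, href)))


def rel_triples(markup, base):
    """Return (subject, predicate, object) tuples found in markup."""
    parser = RelTripleSketch(base)
    parser.feed(markup)
    parser.close()
    return parser.triples


SNIPPET = '''<a
   rel="Copyright"
   href="/Consortium/Legal/ipr-notice#Copyright"
   shape="rect"
  >Copyright</a>'''

for s, p, o in rel_triples(SNIPPET, "http://www.w3.org/"):
    print(f"<{s}> <{p}> <{o}> .")
```

Under that assumption the predicate comes out as
http://www.w3.org/Copyright, a URI nobody minted on purpose.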

> Unfortunately, this page also uses @rel in a slightly 'awkward' way. Each
> news item is marked up as follows:
> 
>   <a
>    rel="details"
>    title="Experts Share Perspectives on Web Standards at
>      Fundamentos Web 2006"
>    href="/News/2006#item101"
>    shape="rect"
>   >News archive</a>
> 
> @rel is being used here to indicate to a GRDDL transform that the value in
> the @href attribute is the URI of an rss:item, i.e., the *subject* of a
> statement. This technique of ignoring existing semantics provided by the
> document in favour of some specific, context-based interpretation, is not
> dissimilar to the approach taken by microformats.

Huh? Ignoring semantics provided by the document? Those _are_ the
semantics of the document. I know because I personally negotiated that
markup with Susan Lesch and company who write it.

The rel="details" says that the relationship between http://www.w3.org/
(or some part of it) and /News/2006#item101 is "details".

In a sense, it's saying:

<http://www.w3.org/#item23> <http://www.w3.org/#details>
<http://www.w3.org/News/2006#item101>.

The GRDDL transformation just relates that to a more standard RDF idiom,
using the RSS vocabulary.

> RDFa takes a different approach, in that it tries to preserve the hooks that
> HTML has for semantic information, but of course builds on them since we
> need more. So in the case of @rel we have the following definition in HTML
> 4.01:
> 
>   This attribute describes the relationship from the
>   current document to the anchor specified by the href
>   attribute. The value of this attribute is a
>   space-separated list of link types. [1]

Exactly; the relationship between http://www.w3.org/ and
/News/2006#item101 is "details".

> In RDF terms, @rel is playing the role of a predicate (or list of
> predicates) between the document as subject, and the value in the @href
> attribute as object. RDFa did not 'invent' the use of @rel and @rev, but
> rather documented how it should be interpreted by a parser that is looking
> for RDF triples.
> 
> RDFa uses this predicate 'hook' to good effect, by recommending mark-up such
> as this:
> 
>   This document is licensed under a
>   <a rel="license" 
>     href="http://creativecommons.org/licenses/by-sa/2.0/">
>       Creative Commons License
>   </a>
> 
> This example is not changing the HTML 4.01 meaning, in that @rel provides a
> predicate for the document. But defining what triples are generated when
> parsing this is incredibly powerful

Well, in the Copyright case above, it seems to be _too_ powerful;
i.e., the mapping seems to generate triples that the author didn't mean.

>  (it's what Ben once described as
> "bridging the clickable and semantic webs"), and you don't need to go
> anywhere else to know what this means.
> 
> Now, we know very well that you will often want to make statements about
> things that aren't in the document, and that is why we introduced @about. Of
> course it's an _extension_ to XHTML, but we think it is a minor one to make
> given the benefits that it gives us. The alternative is to have absolutely
> no knowledge of the meaning of statements without reading one or more
> external GRDDL transforms. This means that:
> 
>   * you always have to look elsewhere for the meaning;
> 
>   * you have to ignore the HTML meaning of @rel
>     because it has now been hijacked to be as vague
>     as @class;

No, there's no hijacking going on.

>   * as microformats are finding now, you may find you
>     get vocabularies not playing well together.
> 
> So the full mark-up for an RSS news item on the W3C home page (ready for
> GRDDL) is as follows:

A full example would give a media type and any relevant namespace
declarations. I'd like it to be in HTTP space so that I can
GET it with running code. That's why I suggested an attachment.


> <div id="x20050714a" class="item">
>   <h3>Experts Share Perspectives on Web Standards ...</h3>
> 
>   <p>
>     <span class="date">2006-06-06:</span>
>     The W3C Spanish Office...
> 
>     <span class="archive">
>       (
>         <a
>          rel="details"
>          title="Experts Share Perspectives on Web Standards at
>            Fundamentos Web 2006"
>          href="/News/2006#item101"
>          shape="rect"
>         >News archive</a>
>       )
>     </span>
>   </p>
> </div>

When I put that markup in
  /home/connolly/src/rdflib/trunk/,rdfa.html
I get one triple out of Elias's parser:

<> <file://home/connolly/src/rdflib/trunk/details> <file://home/News/2006#item101> .

That's sorta close to what I sketched above...

<http://www.w3.org/#item23> <http://www.w3.org/#details>
<http://www.w3.org/News/2006#item101>.

If we correct for base URI, we get

<http://www.w3.org/> <http://www.w3.org/details>
<http://www.w3.org/News/2006#item101>.

but the subject is wrong and the # before details is missing.

The W3C webmaster-in-chief, TimBL, is never going to make an
RDF property called http://www.w3.org/details , I can assure you ;-)
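
The base URI arithmetic can be checked with ordinary reference
resolution. This is only a sketch using Python's urllib.parse.urljoin.
The FILE_BASE value is my guess at what the parser was handed (it
reproduces the file: URIs above, with "home" treated as the authority);
it is not taken from the parser source.

```python
from urllib.parse import urljoin

# Assumption: a base of this shape yields the file: triple shown above.
FILE_BASE = "file://home/connolly/src/rdflib/trunk/,rdfa.html"
HTTP_BASE = "http://www.w3.org/"


def resolve_pair(base, rel_token, href):
    """Resolve a rel token and an href against the same base URI."""
    return urljoin(base, rel_token), urljoin(base, href)


# What the parser gave me from the local file.
print(resolve_pair(FILE_BASE, "details", "/News/2006#item101"))

# Corrected for base URI: the predicate is still a sibling of the page.
print(resolve_pair(HTTP_BASE, "details", "/News/2006#item101"))

# Only a fragment reference gives the '#details' form I sketched.
print(resolve_pair(HTTP_BASE, "#details", "/News/2006#item101"))
```

So a bare "details" can never come out as http://www.w3.org/#details
under plain relative resolution; the '#' has to come from somewhere.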


> In this mark-up the @rel attribute (set to "details") as defined by HTML
> 4.01, *should be* establishing a relationship between the current document
> and the external news story, but GRDDL is overriding this, and in fact is
> making the value in the @href the *subject* of the statements.
> 
> RDFa proposes that the @about attribute provides us with a *generic* way of
> addressing these extremely common use cases, such as RSS feeds. This
> particular example might look like this:
> 
> <div
>  id="x20050714a" class="item"
>  about="http://www.w3.org/News/2006#item101"

That'll get a thumbs down from the HTML validation service, no?
I don't even think it's allowed by the text of the XHTML 1.0
spec without being namespace qualified.

>   <h3>Experts Share Perspectives on Web Standards ...</h3>
> 
>   <p>
>     <span class="date">2006-06-06:</span>
>     The W3C Spanish Office...
> 
>     <span class="archive">
>       (
>         <a
>          rel="link"
>          title="Experts Share Perspectives on Web Standards at
>            Fundamentos Web 2006"
>          href="/News/2006#item101"
>          shape="rect"
>         >News archive</a>
>       )
>     </span>
>   </p>
> </div>
> 
> As you can see, the @about attribute sets the context for subsequent @rel
> and @rev values. (Note also that the @rel value changed from "details" to
> "link", since it can now represent what it *really* is in RSS, a predicate
> called 'link' with a value of the item's URI.)
> 
> Of course I'm ignoring the namespace issues because they become relevant
> only when you want to be sure that your statements are globally unique. And
> we've picked quite a complicated example, which will also need the @property
> attribute from RDFa.

I'm happy for you to pick any example you like.

But please pick an example where
 (a) the document conforms to one of the XHTML 1.x specs;
     preferably, it gets a thumbs-up from the validation service
 (b) there's some code that generates RDF triples from it
 (c) the triples were meant by the author

You brought up rel="tag", so I'm particularly interested in an example
using that idiom.

> But that does not affect the key point which is that RDFa's use of @rel and
> @rev is *already standard*,

No, I don't agree that producing

<> <http://www.w3.org/Copyright>
<http://www.w3.org/Consortium/Legal/copyright-documents> .

from rel="Copyright" is already standardized.

>  and what's more *preserves* rather than
> overriding that standard.
> 
> So, to sum up:
> 
> 1. RDFa encourages @rel and @rev to be used as intended by HTML, and not to
> overload this attribute, or ignore its semantics. Its interpretation should
> be as a predicate on the current document, with @href providing the resource
> that is the object.

As explained above, I don't agree. RDFa seems to interpret @rel and @rev
in a way that is not what XHTML 1.x authors intend.

> 2. RDFa suggests also that namespaces are used to qualify the predicates
> provided by @rel and @rev, if your statements need to be part of RDF-world.
> 
> These two steps would still allow a document to be validated.

"Would"? Under what circumstances? I'm asking for an example that
does validate, not one that would validate.

> 3. RDFa also proposes that the @about attribute is a convenient way to
> 'qualify' or 'scope' the use of @rel and @rev, that builds on the spirit of
> the original HTML specification of those attributes.

I'm struggling to see how that's relevant to your claim that...
| RDFa can therefore be used *today* for simple metadata structures
| (rel="tag", for example)

> 4. RDFa proposes the use of @property to provide predicates for text items,
> such as rss:title in the above example.
> 
> These steps take us out of standard HTML territory, but are easily added
> with XHTML 1.1 modules.

I look forward to an example test case and running code.

> Regards,
> 
> Mark
> 
> [1] <http://www.w3.org/TR/REC-html40/struct/links.html#adef-rel>
> 

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E

Received on Thursday, 8 June 2006 18:00:54 UTC