Re: RDFa in HTML issues wiki page created from Julian Reschke on 2009-05-26 (public-html@w3.org from May 2009)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Tue, 26 May 2009 14:32:42 +0200
To: Mark Birbeck <mark.birbeck@webbackplane.com>
CC: Sam Ruby <rubys@intertwingly.net>, Manu Sporny <msporny@digitalbazaar.com>, Philip Taylor <pjt47@cam.ac.uk>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, HTML WG <public-html@w3.org>
Message-ID: <4A1BE16A.5040201@gmx.de>

Mark Birbeck wrote:
> The 'architecture' is that the RDFa parser 'receives' values from the
> CURIE-processing step. So if you want to create a generic RDFa parser,
> the 'pluggable' language-specific part would go into the
> CURIE-processing code.
> 
> (And all that involves is loading a list of predefined tokens at the
> beginning of processing -- see below.)

That, for instance, rules out an XSLT 1.0 based processor, because it 
will never see the original doctype, and thus will not know which list 
of predefined terms to load.

But I think this is beside the point: what we need is a robust 
processing model that works with all languages of the HTML family and 
returns predictable results. Having separate sets of predefined terms 
defeats this, as do different parsing rules for @rel, for that matter.

> ...
> I think 'workaround' is the wrong term.
> ...

Misunderstanding.

With "workaround" I was referring to the proposal to have

   xmlns:http="http:"

to make the CURIE processor happy when it sees HTTP URIs.
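
To illustrate why that declaration has the effect it does, here is a 
minimal sketch of prefix-based CURIE expansion. The function name 
expand_curie and the prefixes dictionary are made up for this example 
and are not taken from any particular RDFa implementation; real 
processors also handle cases (safe CURIEs, the empty prefix) that are 
left out here.

```python
def expand_curie(value, prefixes):
    """Expand 'prefix:reference' using the in-scope prefix mappings.

    Hypothetical helper: splits at the first colon and concatenates the
    mapped IRI with the reference part.
    """
    prefix, sep, reference = value.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"no prefix mapping for {value!r}")
    return prefixes[prefix] + reference


# Mappings as they might be collected from xmlns:* attributes.
prefixes = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "http": "http:",  # the workaround: maps the 'http' prefix to itself
}

# An HTTP URI is split into prefix 'http' and reference '//example.org/me',
# so the expansion reproduces the original URI.
expanded = expand_curie("http://example.org/me", prefixes)
```

Without the "http" entry, the same call would fail with an unknown 
prefix, which is the situation the workaround is meant to paper over.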

> But that isn't the end of the story -- that's just the foundation.
> 
> If you look at it from a processor point of view, it might be a little
> easier to see.
> 
> Imagine an RDFa processor starts up, and the first thing it does is to
> load a CURIE processor, which it then initialises with a set of
> predefined tokens. In HTML-based languages that list might be 'next',
> 'prev' and so on.
> 
> Now, imagine also during the course of processing, some other list of
> tokens is added. These might come from @profile, or some other
> extension mechanism yet to be invented; tokens loaded in this way
> would override the language-specific tokens.
> 
> In this way we've allowed the host language to define a few defaults
> that are useful in their domain, but we've also left open the
> possibility that we can provide a mechanism for adding more tokens.
> 
> The domain-specific tokens are important, because through them we
> maintained backwards-compatibility with @rel/@rev in HTML/XHTML. But
> by making it part of the more generic mechanism, we haven't stifled
> the possibilities for other solutions.
> ...

That's an interesting proposal. Has anybody implemented that? What do 
you plan to do about the removal of "@profile" from HTML5?
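
For reference, the layering described above could be sketched roughly 
like this. It is only an illustration under assumptions: the class 
TokenTable and its methods are invented for the example, the defaults 
are limited to the two tokens mentioned ('next' and 'prev', mapped into 
the XHTML vocabulary namespace), and the override IRI is a placeholder.

```python
from collections import ChainMap

XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"


def host_language_tokens():
    """Predefined tokens a host language might supply (illustrative subset)."""
    return {name: XHTML_VOCAB + name for name in ("next", "prev")}


class TokenTable:
    """Hypothetical token lookup: later layers override earlier ones."""

    def __init__(self, defaults):
        self._layers = ChainMap(dict(defaults))

    def add_tokens(self, tokens):
        # A new child map is searched first, so it shadows the defaults.
        self._layers = self._layers.new_child(dict(tokens))

    def resolve(self, token):
        # Returns None for tokens that no layer defines.
        return self._layers.get(token)


table = TokenTable(host_language_tokens())
before = table.resolve("next")

# Tokens loaded later (from @profile or some other mechanism) override.
table.add_tokens({"next": "http://example.org/terms#next"})
after = table.resolve("next")
```

Note that the result for "next" depends on which layers happened to be 
loaded, which is exactly the predictability question raised earlier.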

>> Any new syntax will have to compete for followers with existing systems like
>> RDFa, DC-HTML, RDFa, or (gasp) "microdata". So I personally think it makes
>> more sense to get RDFa specified for HTML the way it is (using xmlns-based
>> prefixes).
> 
> I'm not talking about a new syntax for defining a prefix, but about
> how to provide additional _tokens_. So it's not really about
> 'competing'.

It's a new mechanism for providing shorthand notations for URIs. I 
agree it doesn't use prefixes, but that really doesn't change the argument.

> I want to be able to do something along these lines:
> 
>   <html
>    token="
>     Person=http://xmlns.com/foaf/0.1/Person
>     title=http://xmlns.com/foaf/0.1/title
>     fn=http://xmlns.com/foaf/0.1/name
>    "
>   >
>     <head>
>       <title>Ivan's homepage</title>
>     </head>
>     <body>
>       <div about="http://www.ivan-herman.net/me"
>        typeof="Person"
>       >
>         <span property="title">Dr</span>
>         <span property="fn">Ivan Herman</span>
>       </div>
>     </body>
>   </html>
> 
> Of course the tokens would ideally be in an external file, which would
> mean that interest-groups could create and share tokens without
> needing the central registry that is generally discussed.
> ...

I'm not convinced about the "ideally in an external file" part. The 
semantics of the document break as soon as the link to the external 
file is broken; furthermore, if you put it at a well-known URI, the 
usual scalability and stability problems come up.
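
By contrast, with the inline form from the example the mapping can be 
recovered from the document alone. A rough sketch, assuming the 
proposed token attribute holds whitespace-separated name=IRI pairs (the 
attribute is only a proposal, and parse_tokens is a name made up here):

```python
def parse_tokens(attr_value):
    """Parse a hypothetical token="..." attribute into a name -> IRI dict.

    Assumes whitespace-separated 'name=IRI' pairs, split at the first '='.
    """
    tokens = {}
    for pair in attr_value.split():
        name, sep, iri = pair.partition("=")
        if not sep or not name or not iri:
            raise ValueError(f"malformed token definition: {pair!r}")
        tokens[name] = iri
    return tokens


# The attribute value from the example markup.
tokens = parse_tokens("""
    Person=http://xmlns.com/foaf/0.1/Person
    title=http://xmlns.com/foaf/0.1/title
    fn=http://xmlns.com/foaf/0.1/name
""")

# typeof="Person" and property="fn" would then resolve locally.
person_iri = tokens["Person"]
fn_iri = tokens["fn"]
```

An external token file would replace the string literal with a network 
fetch, and every consumer of the document would depend on that fetch 
succeeding.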

So, as I said before: adding a new level of indirection here is IMHO the 
wrong approach. Either use full IRIs, or use a prefix notation (which 
should not rely on out-of-band information).

BR, Julian

Received on Tuesday, 26 May 2009 12:33:24 UTC