Re: @rel syntax in RDFa (relevant to ISSUE-60 discussion), was: Using XMLNS in link/@rel from Henri Sivonen on 2009-03-01 (public-rdf-in-xhtml-tf@w3.org from March 2009)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Sun, 1 Mar 2009 16:54:36 +0200
To: Ben Adida <ben@adida.net>
Cc: Julian Reschke <julian.reschke@gmx.de>, Mark Nottingham <mnot@mnot.net>, HTMLWG WG <public-html@w3.org>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, public-xhtml2@w3.org, "www-tag@w3.org WG" <www-tag@w3.org>
Message-Id: <EFD8164E-31AA-4BD7-8127-F5E0E434DB32@iki.fi>
On Feb 28, 2009, at 01:10, Ben Adida wrote:

> Julian Reschke wrote:
>> I think it would be easier to convince them if you wouldn't have
>> unilaterally changed the semantics for the rel attribute (note that I
>> have less problems with CURIEs in *new* attributes).
>
> Well, for one, the RDFa task force is a joint effort of the Semantic Web
> Deployment *and* the XHTML2 WGs, which was previously the HTML WG. Our
> work began before the HTML5 group had anything to do with W3C.

Note that when the XHTML2 WG was previously chartered as "HTML WG"[1],  
despite its name, it wasn't chartered to develop any new vocabularies  
for text/html or any new flavor of HTML. Instead, it was chartered  
with the assumption that new work would all be XML-based.

We now know that the demise of text/html was greatly exaggerated, so
it's now necessary to evolve HTML and XHTML in such a way that a
unified vocabulary works with both text/html and
application/xhtml+xml. Hence, previous XML-only bets were misplaced.
That's tough, but things need to be adjusted for the dual
serialization situation. (For example, ARIA was also first developed
with the assumption that it would exist in an all-XML world, but it
has now been adjusted to fit the model of dual serialization with a
single above-infoset vocabulary.)

(Curiously, the only part of the *current* charter of the XHTML2 WG
that talks about HTML as opposed to XHTML is the part about RDFa, even
though at the time of *that* charter, an entirely different WG was
chartered for HTML stuff.)

> So I don't think we did anything rogue

It's easy to get a different perception. The RDFa REC mentions HTML in
an introductory sentence but then proceeds to define RDFa for XHTML
only (and in a flavor of XHTML that under a previous Note was not
supposed to be served as text/html). Yet, when RDFa is deployed in
text/html, community sites like rdfa.info cheer[2].

I think mnot found the elephant he was looking for here (third  
paragraph):
http://lists.w3.org/Archives/Public/www-tag/2009Feb/0277.html

> or unilateral.

Again, it's easy to get a different perception. In
<http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2008-August/015913.html>,
when HTML5 had had something to do with the W3C for about 18
months, you said:
"Though I do think you should consider RDFa attributes in HTML5, I  
didn't mean to start this thread just yet (we're in the middle of our  
transition to Proposed Rec at W3C for RDFa in XHTML 1.1)."

Even if you didn't mean it, it sure looks a lot like getting RDFa to  
REC first for application/xhtml+xml and then presenting it as a done  
deal for text/html as well.

> Also, I think you're missing an important detail: @rel had *no*
> semantics, it was all free-form, without any recommended interpretation
> (except for pre-defined link types). So even interpreting it as a URI
> involves "adding semantics." We added the URI semantic interpretation,
> with CURIE syntax, and we ensured that our approach preserved the
> existing pre-defined link types.

I think this argument is a distraction. Prepending "http:" to a string
doesn't magically bring about semantics. For example, even though
rel=canonical wasn't developed through any standards process, I think
it clearly has semantics: it has an effect in search engines, and the
semantics are even documented. On the other hand, you don't have any
semantics if I invent a URI such as
<http://hsivonen.iki.fi/2009/03/01/you-do-not-what-semantics-i-meant>.

> I've yet to see a real problem with this rather careful decision,  
> which we made and vetted through the normal W3C process.

The decisions weren't vetted through the W3C Process for text/html.  
Instead, the spec that made it to REC was developed in a context where  
it was reasonable for a reviewer to assume that XML-only technology  
was being proposed. I think Process-wise, RDFa in text/html doesn't  
occupy any moral high ground. (HTML5 probably doesn't, either. After  
all, it went outside the W3C entirely for a while.)

- -

Now, to put an actual technical proposal in here:

I suggest changing RDFa to use full IRIs instead of CURIEs. Then, I
suggest making it a conformance requirement for rel in both text/html
and application/xhtml+xml that a rel token either MUST NOT contain a
colon, or MUST be an absolute IRI and MUST NOT start with the string
"http://www.iana.org/assignments/relation/". Authors SHOULD NOT mint
relation IRIs that differ only in case.

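As a rough illustration, one reading of the conformance requirement
above could be checked like this in Python. The function name
is_conforming_rel_token is hypothetical, and the "absolute IRI" test is
only an approximation (it checks for a scheme followed by a colon and
some further text); it is not a full IRI validator.

```python
import re

# Prefix reserved by the proposal for colon-less (registered) tokens.
IANA_PREFIX = "http://www.iana.org/assignments/relation/"

# Assumption: approximate "absolute IRI" as "scheme ':' something".
# The scheme syntax follows the generic URI grammar; nothing else is validated.
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*:.+")


def is_conforming_rel_token(token: str) -> bool:
    """Return True if a single rel token passes the sketched author-side check."""
    if ":" not in token:
        # Colon-less tokens are plain keywords such as "stylesheet".
        return True
    if token.startswith(IANA_PREFIX):
        # Authors must not spell out the reserved prefix themselves.
        return False
    return _SCHEME_RE.match(token) is not None
```

Under this sketch, "stylesheet" and "http://example.org/rel/foo" (an
invented example IRI) pass, while a token that spells out the IANA
prefix, or one like ":foo" that has a colon but no scheme, does not.
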
I suggest that processing requirements for text/html rel tokens,  
application/xhtml+xml rel and the Link header be the following for  
compatibility with all of Atom, browsers and RDF technologies:

For products that don't use an RDF data model:
Replace letters A-Z with a-z in each token, then compare tokens code  
point for code point with a well-known token. (For example, turn  
"StyleSheet" into "stylesheet", use "stylesheet" as the hard-coded  
string you compare with in software.)

For products that do use an RDF data model:
If the token contains a colon: Use the token as an IRI.
If the token does not contain a colon: Prepend
"http://www.iana.org/assignments/relation/" to the token and use the
resulting string as an IRI.

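The two processing paths above could be sketched in Python roughly as
follows. The function names are hypothetical, and splitting the
attribute value with str.split() is a simplification (it splits on any
whitespace, not only the space characters HTML defines). Note that the
RDF path, as described, does not lowercase tokens, so this sketch does
not either.

```python
# Prefix prepended to colon-less tokens on the RDF path.
IANA_PREFIX = "http://www.iana.org/assignments/relation/"

# Translation table for A-Z -> a-z only; str.lower() would also change
# non-ASCII letters, which the described comparison does not ask for.
_ASCII_LOWER = str.maketrans(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz"
)


def ascii_lowercase(token: str) -> str:
    """Replace letters A-Z with a-z and leave everything else untouched."""
    return token.translate(_ASCII_LOWER)


def has_well_known_token(rel_value: str, well_known: str) -> bool:
    """Non-RDF path: compare each lowercased token with a hard-coded keyword."""
    return any(ascii_lowercase(t) == well_known for t in rel_value.split())


def rel_tokens_to_iris(rel_value: str) -> list:
    """RDF path: use colon tokens as IRIs, prefix the rest with IANA_PREFIX."""
    return [t if ":" in t else IANA_PREFIX + t for t in rel_value.split()]
```

For instance, has_well_known_token("Alternate StyleSheet", "stylesheet")
is true, and rel_tokens_to_iris("next http://example.org/r") yields the
IANA-prefixed IRI for "next" followed by the second token unchanged
(http://example.org/r is an invented example).
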
This proposal doesn't involve profile in any way, because in practice
profile is very rarely involved in processing.

[1] http://www.w3.org/2002/05/html/charter
[2] http://rdfa.info/2009/01/29/whitehousegov-uses-rdfa/
-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Sunday, 1 March 2009 14:55:24 UTC