[whatwg] Micro-data/Microformats/RDFa Interoperability Requirement from Ian Hickson on 2009-05-08 (public-whatwg-archive@w3.org from May 2009)

From: Ian Hickson <ian@hixie.ch>
Date: Fri, 8 May 2009 18:57:44 +0000 (UTC)
Message-ID: <Pine.LNX.4.62.0905081833080.7824@hixie.dreamhostps.com>
On Thu, 7 May 2009, Manu Sporny wrote:
> 
> That's certainly not what the WHATWG blog stated just 20 days ago for
> rel="license" [...]

The WHATWG blog is an open platform on which anyone can post, and content 
is not vetted for correctness. Mark can sometimes make mistakes. Feel free 
to post a correction. :-)


> and the spec doesn't seem to clearly outline the difference in 
> definition either (at least, that's not my reading of the spec):
> 
> http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#link-type-license
> http://www.whatwg.org/specs/web-apps/current-work/multipage/history.html#link-type-tag

Actually I just looked at the rel-tag faq and found that it disagrees with 
what Tantek had told me, so (assuming the faq is normative or that the 
rel-tag spec does mention this somewhere that I didn't find) the specs do 
match here.

For rel-license, the HTML5 spec defines the value to apply to the content 
and not the page as a whole. This is a recent change to match actual 
practice and I will be posting about this shortly.


> > The RDFa specification is very confusing to me (e.g. I don't 
> > understand how the normative processing model is separate from the 
> > section "RDFa Processing in detail"), so I may be misinterpreting 
> > things, but as far as I can tell:
> > 
> >   <html xmlns="http://www.w3.org/1999/xhtml">
> >    <head>
> >     <base href="http://example.com/"/>
> >     <link about="http://example.net/"
> >           rel="dc.author" 
> >           href="http://a.example.org/"/>
> >    ...
> > 
> > ...will result in the following triple:
> > 
> >    <http://example.net/> <http://example.com/dc.author> <http://a.example.org/> .
> 
> Two corrections:
> 
> The first is that an RDFa processor would not generate this triple.

My apologies, I misinterpreted 5.4.4. Use of CURIEs in Specific Attributes 
to mean that rel="" was a relative-uri-or-curie attribute. (5.4.4. Use of 
CURIEs in Specific Attributes says it's "link-type-or-curie", but 5.4.3. 
General Use of CURIEs in Attributes doesn't list that as a possibility and 
at the end says that rel="" is an exception only insofar as it supports 
specific link types as well, which I interpreted differently.)
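
As an illustration only: the misreading described above amounts to resolving the rel="" value against the document base as if it were a relative URI. The sketch below reproduces that (incorrect) derivation with Python's standard library. The variable names are hypothetical, and this is not how an RDFa processor behaves.

```python
from urllib.parse import urljoin

base = "http://example.com/"   # from the <base href> in the quoted markup
rel_value = "dc.author"        # the rel="" value in the quoted markup

# Mistaken reading: treat rel="" as a relative URI and resolve it
# against the base. An RDFa processor would not do this.
predicate = urljoin(base, rel_value)
print(predicate)
```

Under that reading the predicate comes out as the URI shown in the quoted triple, which is why the triple looked plausible.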


> > For example, it would be somewhat presumptuous of RDFa to prevent any 
> > future version of HTML from being able to use the word "resource" as 
> > an attribute name. What if we want to extend the forms features to 
> > have an XForms "datatype" compatibility layer; why should we not be 
> > able to use the "datatype" and "typeof" attributes?
> 
> As long as their legacy nature was preserved, and those uses didn't 
> create ambiguity in RDFa processors and semantic equivalence was 
> ensured, I don't see why they shouldn't be re-used.

Ah, ok. If such attributes are re-used, I suppose it should be possible to 
re-use them in a way that doesn't conflict with RDFa (e.g. by triggering 
the non-curie-non-uri behaviour for property="" or by having authors who 
want RDFa compatibility use xmlns:http="http:" declarations or some such).

Noted.


> > Surely this is what namespaces were intended for.
> 
> Uhh, what sort of namespaces are we talking about here? xmlns-style 
> namespaces?

The idea of XML Namespaces was to allow people to extend vocabularies 
with new features without clashing with older features by putting the 
new names in new namespaces. It seems odd that RDFa, a W3C technology for 
an XML vocabulary, didn't use namespaces to do it.


> >>> For example, the way that "n:next" and "next" can end up being 
> >>> equivalent in RDFa processors despite being different per HTML rules 
> >>> (assuming an "n" namespace is appropriately declared).
> >>
> >> If they end up being equivalent in RDFa, the RDFa author did so 
> >> explicitly when declaring the 'n' prefix to the default prefix 
> >> mapping and we should not second-guess the author's intentions.
> > 
> > My only point is that it is not compatible with HTML4 and HTML5, 
> > because they end up with different results in the same situation (one 
> > can treat two different values as the same, while the other can treat 
> > two different values as different).
> 
> It is only not compatible with HTML5 if this community chooses for it to 
> not be compatible with HTML5. Do you agree or disagree that we shouldn't 
> second-guess the author's intentions if they go out of their way to 
> declare a mapping for 'n'?

I don't think that's a relevant question. My point is that it is possible 
in RDFa to write two strings that have different semantics in HTML4 and 
yet have the same semantics in RDFa. This means RDFa is not compatible 
with HTML4.
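
A minimal sketch of the "n:next" versus "next" case, assuming the "n" prefix is declared as the XHTML vocabulary URI. The function names and the simplified expansion rules are hypothetical, written only to show the mismatch; they are not taken from either specification.

```python
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"

def html_link_type(token):
    """HTML-style reading: a link type is just the token itself,
    compared case-insensitively. A colon has no special meaning."""
    return token.lower()

def rdfa_expand(token, prefixes):
    """Simplified RDFa-style reading: expand 'prefix:reference' through
    the declared prefixes; map a bare keyword into the XHTML vocabulary."""
    if ":" in token:
        prefix, reference = token.split(":", 1)
        return prefixes[prefix] + reference
    return XHTML_VOCAB + token.lower()

# Assumed declaration: xmlns:n="http://www.w3.org/1999/xhtml/vocab#"
prefixes = {"n": XHTML_VOCAB}

same_in_html = html_link_type("n:next") == html_link_type("next")
same_in_rdfa = rdfa_expand("n:next", prefixes) == rdfa_expand("next", prefixes)
print(same_in_html, same_in_rdfa)  # prints: False True
```

The two values are distinct strings under the HTML-style comparison but expand to one URI under the RDFa-style one.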


> > Another example would be:
> > 
> >   <html xmlns="http://www.w3.org/1999/xhtml">
> >    <head about="">
> >     <link rel="stylesheet alternate next" href="...">
> >     ...
> > 
> > ...which in RDFa would cause the following triples to be created:
> > 
> >    <> <http://www.w3.org/1999/xhtml/vocab#stylesheet> <...> .
> >    <> <http://www.w3.org/1999/xhtml/vocab#alternate> <...> .
> >    <> <http://www.w3.org/1999/xhtml/vocab#next> <...> .
> > 
> > ...but according to HTML4/5, is really only two relations (an 
> > alternative stylesheet and the next document).
> 
> That's a very strained argument. The contents of @rel are supposed to be 
> LinkTypes, which are space-separated keywords:
> 
> http://www.w3.org/TR/html4/types.html#type-links
> 
> It just so happens that when you use "alternate" and "stylesheet" 
> together that the browser is supposed to recognize that an alternate 
> stylesheet exists.
>
> I've always thought that this was an abuse of the rel attribute - there 
> should have been an "alternate-stylesheet" LinkType, but what's done is 
> done.
> 
> AFAIK, there is nothing in the HTML4 or HTML5 spec that states that for 
> rel="alternate stylesheet" that there is only one relation. There are 
> three relationships because there are three separate LinkTypes 
> specified.

It's very clear in HTML5. In HTML4 it's there but about as vague as the 
rest of HTML4. What HTML4 says is academic though since this is 
well-established implementation practice.

Whether it is inconvenient or not, the fact remains that legacy 
implementations (and HTML5, and arguably HTML4) require processing here 
that is not semantically equivalent to RDFa's.
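
To make the difference concrete, here is a rough sketch of the two readings of rel="stylesheet alternate next". Both functions are hypothetical simplifications, not either specification's algorithm: one emits a triple per token, the other folds "alternate" into "stylesheet" as browsers do. The href is left as "..." because it is elided in the quoted markup.

```python
XHTML_VOCAB = "http://www.w3.org/1999/xhtml/vocab#"

def rdfa_style_triples(subject, rel, href):
    """One triple per space-separated token (simplified RDFa-style reading)."""
    return [(subject, XHTML_VOCAB + token.lower(), href) for token in rel.split()]

def html_style_relations(rel):
    """Browser-style reading: 'alternate' next to 'stylesheet' modifies
    the stylesheet relation instead of being a relation of its own."""
    tokens = {token.lower() for token in rel.split()}
    relations = []
    if "stylesheet" in tokens:
        if "alternate" in tokens:
            relations.append("alternative stylesheet")
        else:
            relations.append("stylesheet")
        tokens -= {"stylesheet", "alternate"}
    relations.extend(sorted(tokens))
    return relations

rel = "stylesheet alternate next"
triples = rdfa_style_triples("", rel, "...")
relations = html_style_relations(rel)
print(len(triples), relations)  # prints: 3 ['alternative stylesheet', 'next']
```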


> > Browser vendors would not accept having to resolve prefixes in 
> > attribute values as part of processing link relations.
> 
> Why not?

You would have to ask them. I tend not to argue with implementor feedback. 
If they tell me they won't do something, I don't tell them to do it.

This by the way is the complaint regarding "QNames in content" and "QNames 
in attribute values". CURIEs don't change this, which is why you'll find a 
lot of people just say that CURIEs are no different to QNames.


> They differ in:
> 
> - The location that they can be used.

The complaint with QNames is when they are used in places they're not 
supposed to be used, like content or attribute values.


> - Whether or not they are required to be validated.

QNames in content or attribute values are typically not required to be 
validated either.


> The only similarity they have is that they can be expanded to full URIs.

That's the complaint. (Specifically, that they can do so in a dynamic 
fashion -- e.g. what happens if a user changes a prefix="" or xmlns:*="" 
attribute value dynamically from script? Do all the RDFa triples change 
too?)


> > The problem with QNames in attributes is that they require the 
> > attribute processor to have information from the namespace processor, 
> > and as far as I can tell this continues to exist in RDFa.)
> 
> If that's really the problem, why don't you just have a prefix processor 
> that the attribute processor relies on and drop the namespace processor 
> entirely?

There are two fundamental problems. One is having the prefix processor be 
used both by code that is in the XML parser and by code that is very 
separate from the XML parser (e.g. based on DOM code); implementors find 
combining the two to be highly impractical. The other is that the prefix 
processor has to handle dynamic changes to prefixes.

These are the problems with QNames-in-content and QNames-in-attributes and 
they apply equally to CURIEs.
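
The dynamic-change problem can be sketched as follows. The dictionary standing in for an element, the "dc" prefix binding, and the replacement URI are all assumptions made for illustration; a real processor would read prefixes from the DOM.

```python
def expand_curie(curie, prefixes):
    """Expand 'prefix:reference' using whatever mapping is in scope right now."""
    prefix, reference = curie.split(":", 1)
    return prefixes[prefix] + reference

# Hypothetical stand-in for an element with an in-scope prefix declaration.
element = {
    "prefixes": {"dc": "http://purl.org/dc/elements/1.1/"},
    "rel": "dc:creator",
}

before = expand_curie(element["rel"], element["prefixes"])

# A script rewrites the prefix declaration; the rel value is untouched.
element["prefixes"]["dc"] = "http://example.com/other#"

after = expand_curie(element["rel"], element["prefixes"])
print(before)
print(after)
```

The attribute value never changes, yet its expansion does, so anything that cached the first result is now stale unless the processor tracks prefix mutations.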

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 8 May 2009 11:57:44 UTC