Re: Request to publish HTML+RDFa (draft 3) as FPWD

On Sep 22, 2009, at 01:01, Shane McCarron wrote:

> Okay, I understand what you are looking for.  I think that your  
> suggested text is correct when talking about the DOM and Infoset  
> processing.  But the processing rules in section 5.5 are not written  
> from a DOM or Infoset perspective - at least not exclusively nor  
> intentionally.  We really, really, really were talking about the  
> syntax and then the extraction of data from structures that conform  
> to that syntax.

I think it's a fundamental spec writing error to specify RDFa  
processing in terms of syntax as opposed to defining it in terms of  
the data structure abstractions that HTML parsers and XML processors  
output.

An HTML parser or an XML processor sees the bits that come from the  
wire. Assuming that you intended an RDFa processor to be layered on  
top of an HTML parser or an XML processor, the RDFa processor never  
gets to see the bits on the wire. It gets to see the output of the  
HTML parser or the XML processor. Therefore, it's wrong to define the  
behavior of the RDFa processor in terms of bits on the wire and it  
would be correct to define it in terms of the output data structure  
abstractions of HTML parsers (namespace-aware DOM) or XML processors  
(the Infoset). (DOM Level 3 defines a mapping between the DOM and the  
Infoset, so you can avoid some duplication there.)
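
To make the layering concrete, here is a minimal sketch in Python using
the standard library's xml.dom.minidom as a stand-in XML processor. The
function name property_values and the two sample documents are
hypothetical, made up for illustration; this is not the extraction
algorithm from the draft, only a demonstration that code layered on a
parser sees the parsed tree and not the bytes.

```python
from xml.dom.minidom import parseString

XHTML_NS = "http://www.w3.org/1999/xhtml"


def property_values(markup):
    """Return (local name, property value) pairs read from the parsed tree.

    Hypothetical helper: it only looks at the DOM that the XML processor
    produced, never at the original characters.
    """
    doc = parseString(markup)  # namespace-aware parse
    found = []
    for el in doc.getElementsByTagNameNS(XHTML_NS, "*"):
        if el.hasAttribute("property"):
            found.append((el.localName, el.getAttribute("property")))
    return found


# Two different serializations: different quoting, a different namespace
# prefix, and a character reference inside the attribute value.
doc_a = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<span property="dc:title">T</span></html>'
)
doc_b = (
    "<h:html xmlns:h='http://www.w3.org/1999/xhtml'>"
    "<h:span property='dc:&#116;itle'>T</h:span></h:html>"
)

# The layered code cannot tell the two inputs apart.
assert property_values(doc_a) == property_values(doc_b)
```

The syntactic differences between the two inputs are gone by the time
the layered code runs, which is why rules written against the syntax
don't describe anything such a processor can actually observe.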

Alternatively, if an RDFa processor were defined to operate on the  
bits on the wire, RDFa shouldn't give the impression that it's layered  
on top of XML or HTML. Instead, it should define everything from the  
bits on the wire up and conspicuously warn implementors that they  
shouldn't try to use off-the-shelf XML processors or HTML parsers.  
(But that would be fundamentally bad, too.)

I don't support the publication of HTML+RDFa as an FPWD in the HTML  
WG, because I think HTML WG deliverables shouldn't have such  
fundamental spec writing errors.

Henri Sivonen

Received on Tuesday, 22 September 2009 06:56:29 UTC