Re: Request to publish HTML+RDFa (draft 3) as FPWD from Jonas Sicking on 2009-09-22 (public-html@w3.org from September 2009)

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 22 Sep 2009 00:26:46 -0700
To: Henri Sivonen <hsivonen@iki.fi>
Cc: Shane McCarron <shane@aptest.com>, Maciej Stachowiak <mjs@apple.com>, Manu Sporny <msporny@digitalbazaar.com>, HTMLWG WG <public-html@w3.org>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>
Message-ID: <63df84f0909220026y4d64890cr375b6c0e42a367fd@mail.gmail.com>

On Mon, Sep 21, 2009 at 11:55 PM, Henri Sivonen <hsivonen@iki.fi> wrote:
> On Sep 22, 2009, at 01:01, Shane McCarron wrote:
>
>> Okay, I understand what you are looking for.  I think that your suggested
>> text is correct when talking about the DOM and Infoset processing.  But the
>> processing rules in section 5.5 are not written from a DOM or Infoset
>> perspective - at least not exclusively nor intentionally.  We really,
>> really, really were talking about the syntax and then the extraction of data
>> from structures that conform to that syntax.
>
> I think it's a fundamental spec writing error to specify RDFa processing in
> terms of syntax as opposed to defining it in terms of the data structure
> abstractions that HTML parsers and XML processors output.
>
> An HTML parser or an XML processor sees the bits that come from the wire.
> Assuming that you intended an RDFa processor to be layered on top of an HTML
> parser or an XML processor, the RDFa processor never gets to see the bits
> on the wire. It gets to see the output of the HTML parser or the XML
> processor. Therefore, it's wrong to define the behavior of the RDFa
> processor in terms of bits on the wire and it would be correct to define it
> in terms of the output data structure abstractions of HTML parsers
> (namespace-aware DOM) or XML processors (the Infoset). (DOM Level 3 defines
> a mapping between the DOM and the Infoset, so you can avoid some duplication
> there.)
>
> Alternatively, if an RDFa processor were defined to operate on the bits on
> the wire, RDFa shouldn't give the impression that it's layered on top of XML
> or HTML. Instead, it should define everything from the bits on the wire up
> and conspicuously warn implementors that they shouldn't try to use
> off-the-shelf XML processors or HTML parsers. (But that would be
> fundamentally bad, too.)
>
> I don't support the publication of HTML+RDFa as an FPWD in the HTML WG,
> because I think HTML WG deliverables shouldn't have such fundamental spec
> writing errors.

Well, I think it's fine to define processing in terms of syntax, if a
WG so chooses. So I have no problem with the Semantic Web
Deployment/XHTML2 WGs publishing a spec defining RDFa-in-XHTML in
terms of XML syntax.

However, this means that processing is defined only for XML documents,
not for DOMs or HTML documents. That is why Manu's draft is needed: to
define processing for at least HTML documents, and ideally also DOMs.

As far as DOM processing goes, I realized that there is another thing
that seems unspecified: whether prefix resolution happens at parse
time or when data is extracted. If it's intended to happen at parse
time, the implementation has to store which prefixes were in scope
at parse time for each element. If it happens when data is queried,
prefix mappings might change when a node is moved in the DOM.

I suspect it's intended that prefix mapping happens when data is
queried, as that doesn't require additional changes to a DOM
implementation. However, it seems like something that needs to be
specified.
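
To make the difference concrete, here is a rough sketch in Python using
the standard library's xml.dom.minidom. It is only an illustration, not
what any draft specifies: the resolve_prefix() and demo() helpers, the
"ex" prefix, and the example.org URIs are all made up, and it assumes
prefixes are declared with xmlns:-style attributes that are visible in
the DOM.

```python
from xml.dom import minidom


def resolve_prefix(element, prefix):
    """Query-time lookup: walk up the ancestors for xmlns:<prefix>."""
    name = "xmlns:" + prefix
    node = element
    # Stop once we step above the root element (the Document node).
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute(name):
            return node.getAttribute(name)
        node = node.parentNode
    return None


def demo():
    # Hypothetical document: two subtrees bind "ex" to different URIs.
    doc = minidom.parseString(
        '<root>'
        '<a xmlns:ex="http://example.org/a#"><span property="ex:name"/></a>'
        '<b xmlns:ex="http://example.org/b#"/>'
        '</root>'
    )
    span = doc.getElementsByTagName("span")[0]
    b = doc.getElementsByTagName("b")[0]

    # Parse-time model: remember the mapping that was in scope
    # before any script touches the tree.
    stored_uri = resolve_prefix(span, "ex")

    # Move the node under a parent with a different binding.
    b.appendChild(span)

    # Query-time model: resolve again against the current ancestors.
    queried_uri = resolve_prefix(span, "ex")
    return stored_uri, queried_uri
```

After the move, the stored value still points at the mapping from <a>,
while the query-time lookup picks up the mapping from <b>, so the same
property="ex:name" expands to two different URIs depending on which
model is used.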

As with other issues, this seems solvable by defining processing for DOMs.

/ Jonas

Received on Tuesday, 22 September 2009 07:27:46 UTC