Re: rdfa parsing problem

From: Ivan Herman <ivan@w3.org>
Date: Sat, 23 May 2009 06:41:54 +0200
Message-ID: <4A177E92.8090403@w3.org>
To: Philip Taylor <pjt47@cam.ac.uk>
CC: public-rdf-in-xhtml-tf@w3.org, Meri Kovach <meri.kovach@gmail.com>

Thanks a lot Philip. And thanks for the pointer, it may help in future 
if similar circumstances arise.

The bad news (for me :-) is that the change the parser makes to the DOM 
tree does not really explain the outcome of Meri's test, so there might 
be a bug in the RDFa distiller :-( I will have to check.

Thanks again

Ivan

Philip Taylor wrote:
> Ivan Herman wrote:
>> Meri Kovach wrote:
>>>  <table>
>>>     <tr>
>>>     <span about="#2105555" typeof="foaf:Person">
>>>        <td>1</td>
>>>        <td><span property="foaf:firstName">Meri</span></td>
>>>        <td><span property="foaf:familyName">Kovac</span></td>
>>>    </span>
>>>    </tr>
>>>  </table>
>> [...]
>> The way the distiller works is that if it hits
>> an XML (not XHTML!) error, it then switches (if allowed in the command
>> arguments) to an HTML5 parser and attempts to run the code through that
>> one, too. I guess (but I do not know) that the HTML5 parser attempts to
>> make some sense of the erroneous code and I would presume it will
>> simply remove the <span> element from the DOM tree it produces.
> 
> That guess is almost right - with invalid input like this, the HTML5 
> parser moves the <span> to just before the <table>, so it's equivalent to:
> 
>   <span about="#2105555" typeof="foaf:Person"></span>
>   <table>
>     <tr>
>       <td>1</td>
>       ...
>     </tr>
>   </table>
> 
> (You can test how html5lib parses HTML into a DOM tree using 
> <http://james.html5.org/parsetree.html>)
> 
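
To make the effect of the reordering concrete, here is a minimal sketch that compares the two tree shapes quoted above. It does not run html5lib or the distiller: both snippets are parsed as plain XML with Python's xml.dom.minidom, purely to inspect which elements are descendants of which. The <div> wrapper around the reordered markup and the helper name properties_under_subject are assumptions made for this illustration (the wrapper only gives the XML parser a single root), and the "..." in Philip's version is filled in with the rows from Meri's original markup.

```python
from xml.dom import minidom

# Meri's markup as written: the <span about=...> wraps the <td> cells.
ORIGINAL = """<table>
  <tr>
    <span about="#2105555" typeof="foaf:Person">
      <td>1</td>
      <td><span property="foaf:firstName">Meri</span></td>
      <td><span property="foaf:familyName">Kovac</span></td>
    </span>
  </tr>
</table>"""

# The equivalent tree Philip describes: the <span> sits empty before the
# <table>. The <div> root is a hypothetical wrapper, added only so that the
# XML parser sees a single root element.
REORDERED = """<div>
  <span about="#2105555" typeof="foaf:Person"></span>
  <table>
    <tr>
      <td>1</td>
      <td><span property="foaf:firstName">Meri</span></td>
      <td><span property="foaf:familyName">Kovac</span></td>
    </tr>
  </table>
</div>"""


def properties_under_subject(markup):
    """List the @property values found on descendants of any element
    that carries an @about attribute."""
    doc = minidom.parseString(markup)
    found = []
    for element in doc.getElementsByTagName("span"):
        if element.hasAttribute("about"):
            # Element.getElementsByTagName only searches descendants.
            for inner in element.getElementsByTagName("span"):
                if inner.hasAttribute("property"):
                    found.append(inner.getAttribute("property"))
    return found


print(properties_under_subject(ORIGINAL))
print(properties_under_subject(REORDERED))
```

In the reordered tree the two property-bearing <span> elements are no longer descendants of the element carrying about and typeof. The sketch shows only that structural difference; it says nothing about which triples a given RDFa processor actually emits for either tree.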

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

