Re: rdfa parsing problem

Philip Taylor wrote:
> Ivan Herman wrote:
>> Meri Kovach wrote:
>>>  <table>
>>>     <tr>
>>>     <span about="#2105555" typeof="foaf:Person">
>>>        <td>1</td>
>>>        <td><span property="foaf:firstName">Meri</span></td>
>>>        <td><span property="foaf:familyName">Kovac</span></td>
>>>    </span>
>>>    </tr>
>>>  </table>
>> [...]
>> The way the distiller works is that if it hits
>> an XML (not XHTML!) error, it then switches (if allowed in the command
>> arguments) to an HTML5 parser and attempts to run the code through that
>> one, too. I guess (but I do not know) that the HTML5 parser attempts to
>> make some sense of the erroneous code and I would presume it will
>> simply remove the <span> element from the DOM tree it produces.
> 
> That guess is almost right - with invalid input like this, the HTML5
> parser moves the <span> to just before the <table>, so it's equivalent to:
> 
>   <span about="#2105555" typeof="foaf:Person"></span>
>   <table>
>     <tr>
>       <td>1</td>
>       ...
>     </tr>
>   </table>
> 

Which gives a perfectly valid explanation of why the generated RDFa is
what Meri saw: the subject set via @about is, in effect, disjoint from
the table that holds the rest of the content...
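
To make the "disjoint" point concrete, here is a rough sketch in Python
(standard library only). It is not the distiller's code and not a real
RDFa processor: it applies just one simplified rule, namely that a
@property takes its subject from the nearest enclosing @about. The
function name simple_subjects, the wrapping <div>, and the two cells
filled in where Philip wrote "..." (taken from Meri's original markup)
are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Meri's markup as written, read as a plain XML tree.
AS_WRITTEN = """
<div>
  <table>
    <tr>
      <span about="#2105555" typeof="foaf:Person">
        <td>1</td>
        <td><span property="foaf:firstName">Meri</span></td>
        <td><span property="foaf:familyName">Kovac</span></td>
      </span>
    </tr>
  </table>
</div>
"""

# The tree Philip describes after HTML5 parsing: the <span> now sits,
# empty, in front of the <table>. The <div> wrapper is hypothetical and
# only gives the fragment a single root element.
AFTER_HTML5 = """
<div>
  <span about="#2105555" typeof="foaf:Person"></span>
  <table>
    <tr>
      <td>1</td>
      <td><span property="foaf:firstName">Meri</span></td>
      <td><span property="foaf:familyName">Kovac</span></td>
    </tr>
  </table>
</div>
"""


def simple_subjects(xml_text, base=""):
    """Return (subject, property, text) for every @property element.

    Simplification: the subject is the nearest ancestor-or-self @about,
    falling back to `base` (standing in for the document itself).
    """
    triples = []

    def walk(elem, subject):
        # An @about on this element replaces the inherited subject.
        subject = elem.get("about", subject)
        prop = elem.get("property")
        if prop is not None:
            triples.append((subject, prop, (elem.text or "").strip()))
        for child in elem:
            walk(child, subject)

    walk(ET.fromstring(xml_text), base)
    return triples


if __name__ == "__main__":
    print(simple_subjects(AS_WRITTEN))
    print(simple_subjects(AFTER_HTML5))
```

With the first tree both properties are attached to #2105555; with the
second they fall back to the document-level subject, because the
element carrying @about no longer contains them.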

Ivan

> (You can test how html5lib parses HTML into a DOM tree using
> <http://james.html5.org/parsetree.html>)
> 

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Monday, 25 May 2009 10:29:20 UTC