- From: Philip Taylor <pjt47@cam.ac.uk>
- Date: Sun, 24 May 2009 20:29:31 +0100
- To: Maciej Stachowiak <mjs@apple.com>
- CC: Shelley Powers <shelleyp@burningbird.net>, Sam Ruby <rubys@intertwingly.net>, Manu Sporny <msporny@digitalbazaar.com>, RDFa mailing list <public-rdf-in-xhtml-tf@w3.org>, HTML WG <public-html@w3.org>
Maciej Stachowiak wrote:
> [...]
>
> 2) An offline processor written in Python may treat XHTML served as
> text/html as XML, since there are so many off-the-shelf XML parsing
> libraries and the script author may be unaware of the off-the-shelf
> HTML5 parsers now available.

Some do that even when their authors are aware of off-the-shelf HTML5
parsers - e.g. pyRdfa ignores any Content-Type and always tries parsing
with an XML parser first, and if that fails then it (optionally) falls
back to html5lib.

> If there is any difference between
> text/html and application/xml processing rules for the same document,
> this will almost certainly result in divergence in at least some cases.
> Thus, we need to do at least one of ensuring identical processing, or
> make it very clear that text/html must never be processed as XML by an
> RDFa processor.

We've already failed at ensuring identical processing, because of
parsing differences - e.g. if I write

  <p about="..." />
  <span property="..."> ... </span>

then in XML it parses to sibling elements, but in HTML it parses to
parent/child (because the trailing slash is ignored). If some processors
unconditionally parse text/html content with an XML parser, they'll give
different results to processors that correctly use a text/html parser,
which results in a lack of interoperability and is therefore bad.

-- 
Philip Taylor
pjt47@cam.ac.uk
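
The sibling versus parent/child difference described above can be reproduced with a rough sketch using only the Python standard library. Several things here are assumptions for illustration and not part of the original message: the wrapper <div> (added so the XML parser has a single root), the attribute values "#me" and "name", and the names MARKUP, SlashIgnoringBuilder, xml_parent_of_span and html_parent_of_span. The HTML side is not a conforming HTML5 parser such as html5lib; it only imitates the one rule at issue, namely that a trailing slash on a non-void start tag is ignored.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample markup; the <div> wrapper and attribute values are
# assumptions added so the snippet is well-formed XML.
MARKUP = '<div><p about="#me" /> <span property="name">Philip</span></div>'

# HTML void elements never stay open, so they are never pushed on the stack.
VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


def xml_parent_of_span(markup):
    """Parse as XML and return the tag name of the <span>'s parent."""
    root = ET.fromstring(markup)
    for parent in root.iter():
        for child in parent:
            if child.tag == "span":
                return parent.tag
    return None


class SlashIgnoringBuilder(HTMLParser):
    """Simplified stand-in for text/html parsing (not html5lib).

    It tracks a stack of open elements and treats '<p ... />' exactly
    like '<p ...>', i.e. the trailing slash is ignored.
    """

    def __init__(self):
        super().__init__()
        self.stack = []
        self.span_parent = None

    def handle_starttag(self, tag, attrs):
        if tag == "span" and self.span_parent is None:
            self.span_parent = self.stack[-1] if self.stack else None
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        # The self-closing slash has no effect: the element stays open.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        # Pop open elements up to and including the matching one.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def html_parent_of_span(markup):
    """Parse with the simplified HTML rules and return the <span>'s parent."""
    builder = SlashIgnoringBuilder()
    builder.feed(markup)
    builder.close()
    return builder.span_parent


if __name__ == "__main__":
    print("XML parent of <span>: ", xml_parent_of_span(MARKUP))
    print("HTML parent of <span>:", html_parent_of_span(MARKUP))
```

Under these assumptions the XML parse reports the wrapper element as the parent of the <span> (so <p> and <span> are siblings), while the slash-ignoring parse reports <p> as the parent. An RDFa processor walking the two trees would therefore attach the property to different subjects.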
Received on Sunday, 24 May 2009 19:30:15 UTC