Re: ISSUE-41/ACTION-97 decentralized-extensibility

From: Sam Ruby <rubys@intertwingly.net>
Date: Fri, 23 Oct 2009 06:06:56 -0400
Message-ID: <4AE18040.3040005@intertwingly.net>
To: Julian Reschke <julian.reschke@gmx.de>
CC: Tony Ross <tross@microsoft.com>, "public-html@w3.org" <public-html@w3.org>

Julian Reschke wrote:
> Sam Ruby wrote:
>> ...
>> In this case (issue-41/action-97), the simpler questions are:
>> 1) Can everybody live with the parsing rules that are specified in the 
>> current HTML5 draft?  (If not, what needs to change?)
>> ...
> I think it would be good to investigate whether HTML and XHTML parsing 
> rules can be aligned somewhat more.
> Right now the parser puts HTML elements already into the XHTML 
> namespace, and does similar things with MathML and SVG.
> Beyond that, the DOM it produces is inconsistent with what an XML parser 
> would produce for a similar-looking document. Can we do better?

Here is a (work in progress) list of differences, many of which concern 
aspects other than the DOM:


> I realize that there is some broken HTML content out there which uses 
> xmlns:* attributes, but doesn't expect them to have an impact on the 
> DOM. The question here is: how many namespace URIs does this affect? 
> Could we just exclude the big offenders (Word HTML export?) from 
> processing?

My recollection is that the biggest problem was xmlns="".  As to Word 
export, the biggest problem is finding attributes with names that start 
with o:, but with no declaration for the namespace.
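
To make those two cases concrete, here is a rough sketch using only the 
Python standard library. It is an illustration under stated assumptions, 
not the HTML5 parsing algorithm: html.parser is a plain tokenizer standing 
in for "an HTML parser", xml.etree.ElementTree stands in for "an XML 
parser", and the markup samples and helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

XHTML_NS = "http://www.w3.org/1999/xhtml"


class StartTagCollector(HTMLParser):
    """Records (tag, attrs) for every start tag the tokenizer sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def html_start_tags(markup):
    """Tokenize markup as HTML and return the start tags with attributes."""
    collector = StartTagCollector()
    collector.feed(markup)
    collector.close()
    return collector.seen


def xml_accepts(markup):
    """Return True if a namespace-aware XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Case 1 (hypothetical sample): an o: attribute with no xmlns:o declaration.
word_like = '<p o:title="x">text</p>'
xml_ok = xml_accepts(word_like)            # the XML parser rejects the unbound prefix
html_view = html_start_tags(word_like)     # HTML side: just an attribute named "o:title"

# Case 2 (hypothetical sample): xmlns="" on a descendant element.
nested = '<html xmlns="%s"><body><p xmlns="">text</p></body></html>' % XHTML_NS
xml_tags = [el.tag for el in ET.fromstring(nested).iter()]  # <p> leaves the XHTML namespace
html_nested = html_start_tags(nested)      # HTML side: xmlns is an ordinary attribute
```

On the XML side the first sample is a fatal error and the second moves 
the p element out of the XHTML namespace; on the HTML side both are just 
attributes with no effect on the element's namespace.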

> BR, Julian

- Sam Ruby
Received on Friday, 23 October 2009 10:07:33 UTC
