Re: ISSUE-41/ACTION-97 decentralized-extensibility

Julian Reschke wrote:
> Sam Ruby wrote:
>> ...
>> In this case (issue-41/action-97), the simpler questions are:
>> 1) Can everybody live with the parsing rules that are specified in the 
>> current HTML5 draft?  (If not, what needs to change?)
>> ...
> I think it would be good to investigate whether HTML and XHTML parsing 
> rules can be aligned somewhat more.
> Right now the parser puts HTML elements already into the XHTML 
> namespace, and does similar things with MathML and SVG.
> Beyond that, the DOM it produces is inconsistent with what an XML parser 
> would produce for a similarly looking document. Can we do better?

Here is a (work in progress) list of differences, many of which concern 
things other than the DOM:

> I realize that there is some broken HTML content out there which uses 
> xmlns:* attributes, but doesn't expect them to have an impact on the 
> DOM. The question here is: how many namespace URIs does this affect? 
> Could we just exclude the big offenders (Word HTML export?) from 
> processing?

My recollection is that the biggest problem was xmlns="".  As to Word 
export, the biggest problem is finding attributes with names that start 
with o:, but with no declaration for the namespace.
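
To make both cases concrete, here is a rough sketch in Python. The 
assumptions: xml.etree.ElementTree stands in for "an XML parser", and the 
standard library's html.parser stands in for a tolerant HTML tokenizer (it 
is not the HTML5 parsing algorithm, so it only approximates what the draft 
specifies). The attribute name o:spid is made up for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Case 1: xmlns="" undeclares the default namespace for an XML parser,
# so the <div> ends up in no namespace at all.
doc = f'<html xmlns="{XHTML_NS}"><body><div xmlns="">x</div></body></html>'
root = ET.fromstring(doc)
tags = [el.tag for el in root.iter()]

# Case 2: Word-style markup with an o: prefix that is never declared
# (hypothetical attribute name). A namespace-aware XML parser rejects it.
word_like = '<p o:spid="x1">text</p>'
try:
    ET.fromstring(word_like)
    xml_result = "parsed"
except ET.ParseError as exc:
    xml_result = f"ParseError: {exc}"


class AttrCollector(HTMLParser):
    """Collects (name, value) pairs from every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        self.attrs.extend(attrs)


# A tolerant HTML tokenizer just keeps "o:spid" as an opaque attribute name.
collector = AttrCollector()
collector.feed(word_like)
collector.close()

print(tags)
print(xml_result)
print(collector.attrs)
```

In the first case the html and body elements come back in the XHTML 
namespace while the div does not; in the second the XML parser fails on 
the unbound prefix while the HTML tokenizer reports a plain attribute 
whose name happens to contain a colon.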

> BR, Julian

- Sam Ruby

Received on Friday, 23 October 2009 10:07:33 UTC