Re: View Source from David Woolley on 2009-03-19 (public-html@w3.org from March 2009)

From: David Woolley <forums@david-woolley.me.uk>
Date: Thu, 19 Mar 2009 09:01:21 +0000
To: www-svg@w3.org
CC: public-html@w3.org
Message-ID: <49C209E1.8050206@david-woolley.me.uk>

Lachlan Hunt wrote:

> 
> Yes, it would be different from how browsers have to interpret it.  But 
> from the perspective of a web developer, in the vast majority of cases 
> where an end tag is omitted as in the example I gave, the omission is 
> accidental and having a tool or browser feature designed to help web 
> developers clean up their own markup would be much more useful if it 
> behaved as I described, rather than naively serialising the DOM created 
> by the HTML5 parsing algorithm.

As I understand it, the HTML5 algorithm attempts to second-guess the 
author's intent, so where it re-opens an element repeatedly, it is 
because it believes that the author either didn't understand that 
elements must nest or didn't understand that they were trying to span 
elements not allowed by the content model.

Compared with "view canonical source", the parser has the added 
constraint that it should try to fix up the document without lookahead. 
However, the author will judge the correctness of their code by how it 
presents. Early insertion of the end tag changes that presentation, so 
it will not produce the intended result.
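
To make the difference concrete, here is a minimal sketch in Python. It 
is a toy model, not the HTML5 parsing algorithm: the token format, the 
FORMATTING set and the fix_up() function are all hypothetical names made 
up for illustration, and only <p> plus a few inline elements are handled. 
It serialises the same tag soup twice, once re-opening the unclosed 
inline element in each new paragraph and once closing it early.

```python
# Toy model only: this is NOT the HTML5 tree-construction algorithm.
# Tokens are (kind, value) pairs; kind is "start", "end" or "text".
FORMATTING = {"b", "i", "em", "strong"}  # hypothetical subset


def fix_up(tokens, reopen):
    """Serialise tokens, treating each <p> start tag as ending the
    previous paragraph.

    reopen=True  : unclosed formatting elements are re-opened in the
                   next paragraph (the recovery-style behaviour).
    reopen=False : they are closed at the paragraph boundary and stay
                   closed (the "early end tag" fixup).
    Tokens this toy does not understand are silently ignored.
    """
    out = []
    in_p = False
    fmt = []  # formatting elements currently open, outermost first
    for kind, value in tokens:
        if kind == "text":
            out.append(value)
        elif kind == "start" and value == "p":
            # Close open formatting elements, innermost first.
            out.extend(f"</{name}>" for name in reversed(fmt))
            if in_p:
                out.append("</p>")
            out.append("<p>")
            in_p = True
            if reopen:
                out.extend(f"<{name}>" for name in fmt)
            else:
                fmt = []
        elif kind == "start" and value in FORMATTING:
            out.append(f"<{value}>")
            fmt.append(value)
        elif kind == "end" and fmt and fmt[-1] == value:
            out.append(f"</{value}>")
            fmt.pop()
    # End of input: close whatever is still open.
    out.extend(f"</{name}>" for name in reversed(fmt))
    if in_p:
        out.append("</p>")
    return "".join(out)


# Tag soup equivalent to: <p><b>one<p>two<p>three
soup = [
    ("start", "p"), ("start", "b"), ("text", "one"),
    ("start", "p"), ("text", "two"),
    ("start", "p"), ("text", "three"),
]

reopened = fix_up(soup, reopen=True)
closed_early = fix_up(soup, reopen=False)
print(reopened)
print(closed_early)
```

With re-opening, all three paragraphs stay bold, which is what the 
author has been seeing on screen. With the early end tag, only the first 
paragraph is bold, so the "cleaned up" source no longer presents the 
same way.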

In terms of encouraging valid HTML, HTML5 should be using your proposed 
fixup scheme, but in its mission to interpret tag soup HTML more or less 
as the author has come to expect, both "view canonical source" and the 
DOM need to match the tag soup recovery strategy.

-- 
David Woolley
Emails are not formal business letters, whatever businesses may want.
RFC1855 says there should be an address here, but, in a world of spam,
that is no longer good advice, as archive address hiding may not work.

Received on Thursday, 19 March 2009 09:02:15 UTC