DTD driven browsers and real world HTML

If Amaya is to move towards being a browser rather than an editor, it 
needs to be much more tolerant of real-world HTML than is possible with 
a strict DTD-based parser.

For example, this afternoon at least, the main page of 
http://www.altavista.com/ contained this illegal sequence of elements:

TABLE
FORM
INPUT
TBODY

In its error recovery, Amaya deleted FORM and INPUT (validator.w3.org 
seems to close the table instead).  The result was that it was 
impossible to make a search!

(Lynx has had to go as far as having two parsers, a DTD-based one and 
one that processes each tag's semantics in isolation, in order to be 
both correct and usable.)
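
To make the difference concrete, here is a rough Python sketch of the two 
recovery strategies.  It is not Amaya's or Lynx's actual code: the content 
model in TABLE_CHILDREN is a simplified assumption, the names TagCollector 
and surviving_tags are made up for illustration, and the sample markup is 
hypothetical rather than the real AltaVista page.  The strict mode drops 
any element that is not allowed directly inside TABLE; the tolerant mode 
handles each tag on its own and keeps it.

```python
from html.parser import HTMLParser

# Assumed, simplified content model: elements allowed directly inside TABLE.
TABLE_CHILDREN = {"caption", "col", "colgroup", "thead", "tfoot", "tbody", "tr"}
# Elements with no end tag, so they are never pushed on the open-element stack.
VOID = {"input", "br", "img", "hr", "col"}


class TagCollector(HTMLParser):
    """Records start tags; in strict mode drops illegal children of TABLE."""

    def __init__(self, strict):
        super().__init__()
        self.strict = strict
        self.stack = []  # currently open elements that were kept
        self.kept = []   # start tags that survived error recovery

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        if self.strict and parent == "table" and tag not in TABLE_CHILDREN:
            return  # strict recovery: delete the offending element
        self.kept.append(tag)
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Ignore end tags for elements that were dropped or never opened.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def surviving_tags(markup, strict):
    """Return the start tags that remain after parsing `markup`."""
    parser = TagCollector(strict)
    parser.feed(markup)
    parser.close()
    return parser.kept


# Hypothetical markup with the same TABLE, FORM, INPUT, TBODY ordering.
SAMPLE = (
    '<TABLE><FORM ACTION="/search"><INPUT NAME="q">'
    "<TBODY><TR><TD>x</TD></TR></TBODY></FORM></TABLE>"
)

if __name__ == "__main__":
    print("strict:  ", surviving_tags(SAMPLE, strict=True))
    print("tolerant:", surviving_tags(SAMPLE, strict=False))
```

In strict mode the FORM and INPUT vanish, which is how the search box ends 
up unusable; in tolerant mode they survive even though the nesting is 
wrong, and the page stays usable.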

-- 
David Woolley - Office: David Woolley <djw@bts.co.uk>
BTS             Home: <david@djwhome.demon.co.uk>
Wallington      TQ 2887 6421
England         51  21' 44" N,  00  09' 01" W (WGS 84)

Received on Tuesday, 19 January 1999 16:32:54 UTC