- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Wed, 11 Nov 2009 11:44:33 -0500
- To: John Cowan <cowan@ccil.org>
- Cc: James Graham <jgraham@opera.com>, Boris Zbarsky <bzbarsky@mit.edu>, Henri Sivonen <hsivonen@iki.fi>, David Carlisle <davidc@nag.co.uk>, public-html@w3.org
On Wed, Nov 11, 2009 at 10:41 AM, John Cowan <cowan@ccil.org> wrote:
> Most programmers *want* draconian error handling of their code.

Only if the error is either impossible to recover from sensibly (e.g., a segfault), or possible to reliably find at authoring time (e.g., a parse error), or if the code is being run in a testing environment. In production code, most programmers do not want draconian error handling where automatic recovery might be possible. Thus why, for instance, assert() is usually a no-op in production builds.

When documents are being constructed dynamically by scripts, it's not possible to reliably find errors at authoring time. A small bug in your script might create a misplaced quotation mark, and in XML this means the entire document becomes unusable -- even though auto-closing the attribute will almost certainly leave you with a usable document (albeit perhaps with one or a few elements mangled). This means your site crashes for no good reason. That's undesirable from any perspective.

Note that in contrast, source code for programs is almost never constructed dynamically. It's written by hand, or sometimes generated by a tool from another language and then immediately compiled (or interpreted). Syntax errors can be caught immediately, before production, so draconian error handling is fine in that case. The same is true for hand-authored XML documents, but many (most?) web pages are not hand-authored.

On Wed, Nov 11, 2009 at 11:21 AM, John Cowan <cowan@ccil.org> wrote:
> James Graham scripsit:
>
>> See section 9.2 "Parsing HTML Documents" [1]
>
> "This section only applies to user agents, data mining tools, and
> conformance checkers."
>
> 9.1 is the section for documents.

9.2 specifies an algorithm that all user agents must apply to process a given input stream. There are no fatal errors anywhere in the section -- given any input, the algorithm will always produce a well-defined output tree that must be processed normally. There are parse errors, but recovery behavior is specified. (IIRC, UAs are allowed to abort on errors, but if they don't they must follow the recovery behavior specified.)

This is precisely saying that "every single possible sequence of bytes is a[n] HTML5 document with a fixed interpretation". It may not be a *valid* HTML5 document, but it does have a fixed interpretation. Therefore HTML5 is not fragile as XML is; authoring errors will never result in a fatal error.
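
To make the misplaced-quotation-mark case concrete, here is a minimal sketch in Python. The helper names (build_link, is_well_formed_xml) and the sample markup are hypothetical, made up for illustration. It only shows the XML side of the argument: a standard XML parser (here the standard library's xml.etree.ElementTree) rejects the whole document when a generating script drops one closing quote. It does not implement the HTML5 parsing algorithm.

```python
import xml.etree.ElementTree as ET


def build_link(href: str, label: str) -> str:
    # Hypothetical template helper with a deliberate bug: the closing
    # quotation mark of the href attribute is missing.
    return f'<p><a href="{href}>{label}</a></p>'


def is_well_formed_xml(markup: str) -> bool:
    # An XML parser must treat a well-formedness error as fatal, so
    # ElementTree raises ParseError instead of returning a partial tree.
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


good = '<p><a href="/faq">FAQ</a></p>'
bad = build_link("/faq", "FAQ")

print(is_well_formed_xml(good))  # True
print(is_well_formed_xml(bad))   # False
```

The bug here only shows up in the generated output, so it cannot be caught at authoring time the way a syntax error in hand-written source can.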
Received on Wednesday, 11 November 2009 16:53:06 UTC