- From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
- Date: Tue, 14 Jul 2009 16:03:15 +0200
- To: Leif Halvard Silli <lhs@malform.no>
- Cc: Doug Schepers <schepers@w3.org>, "public-html@w3.org" <public-html@w3.org>
Leif Halvard Silli wrote:
> Lachlan Hunt On 09-07-14 14.32:
>
>> Doug Schepers wrote:
>>> To meet this need, I propose a new attribute, 'parsing', which, when
>>> placed on the document root, defines the type of parsing which a UA must
>>> use when parsing the document...
>>
>> and nor is it clear which parsing rules would need to be followed to
>> achieve this. There are 2 of possibilities I can think of...
>
> Indeed, this is not clear. But it seems most fruitful to say that
> xhtml+xml rules should apply.

Actually, I disagree with that. If this proposal is about introducing
XML parsing for text/html, then it really doesn't gain anything over
using real XML, or at least using content negotiation to send
application/xhtml+xml to browsers that support it and text/html
otherwise. It seems to me that this proposal would only be useful if it
was some form of stricter HTML parsing, though I'm still not convinced
that this couldn't be addressed by a user-controllable feature in
browsers.

>> What happens if the parser encounters an error prior to parsing the
>> root element, and continues normally, but then later reaches the root
>> element and sees parsing=strict. e.g. Given the following erroneous
>> input:
>>
>> <!DOCTYPE html x>
>> <html parsing=strict>
>> ...
>>
>> Should the browser remember that it previously encountered the error
>> and retroactively abort?
>
> If the feature was linked to the media type, namely to the a new
> authoring media type, then the UA would be able to catch it without any
> reparsing.

The current suggestion is that this would be an attribute on the root
element in the document, which is not directly linked to the media type.
Are you suggesting that we instead do this with, for example, a new
media type parameter for text/html? e.g.

Content-Type: text/html;charset=UTF-8;parsing=strict

And if we do that, and also apply XML parsing, then we really haven't
gained anything over real XHTML.
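To illustrate why a media type parameter would avoid the reparsing problem, here is a rough Python sketch of how a UA could pick the parsing mode from the header alone, before reading any markup. The 'parsing' parameter is hypothetical (it is not a registered text/html parameter), and the function name and the "lenient" default are my own assumptions for illustration only.

```python
from email.message import Message


def parsing_mode(content_type_header: str) -> str:
    """Return the hypothetical 'parsing' parameter of a text/html response.

    Falls back to "lenient" (an assumed name for ordinary error-tolerant
    HTML parsing) when the parameter is absent or the type is not text/html.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    if msg.get_content_type() != "text/html":
        return "lenient"
    # get_param reads parameters from the Content-Type header by default.
    return str(msg.get_param("parsing", "lenient")).lower()
```

Since the mode is known before the first byte of the document is tokenised, an error in the DOCTYPE could be treated as fatal immediately, with no need to remember it and abort retroactively.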
(If we do that, but just apply strict HTML parsing, then it could
conceivably work, but it still suffers from the practical deployment
problems that I mentioned.)

>> Then, due to a bug in their CMS, some pages become non-well-formed due
>> to some user input that wasn't properly sanitised. The affected pages
>> would then break in the browsers that do support this new parsing
>> mode, but continue to work fine in those that don't. So I share
>> Maciej's concern about this triggering "a race to the bottom and
>> neuter the feature".
>
> This, again, is yet another reason to place this option in CSS, and, by
> default, link it to a new media type for authoring tools.

Declaring this in CSS wouldn't work. The parser would have to parse the
HTML, find the <link> or <style> elements, stop and wait for the CSS
parser to finish parsing the CSS, and check whether it found a parsing
declaration in there. If it did, the parser would then have to start
reparsing the document from the beginning with strict error handling
enabled.

>> Personally, I think a better solution could be for browsers to allow
>> developers to turn on this parsing mode manually for the sites they
>> test, without needing to specify any attribute, or simply report the
>> parse errors in their error console.
>
> Allowing authors/users to switch the media type identity of the UA would
> solve the problem.

If you want XML parsing, authors already have the ability to set the
media type sent by their server, or at least their testing server, or to
simply change the file extension. There is also at least one Firefox
extension that allows users to override media types.

https://addons.mozilla.org/en-US/firefox/addon/3207

-- 
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/
Received on Tuesday, 14 July 2009 14:04:21 UTC