- From: James Graham <jg307@cam.ac.uk>
- Date: Mon, 28 Jan 2008 11:25:44 +0000
- To: temp17@staldal.nu
- CC: public-html-comments@w3.org
Note: this is my own opinion and I do not speak for the HTML-WG.

temp17@staldal.nu wrote:
>>> Why is this syntax [the traditional non-XML syntax] recommended?
>>
>> AIUI, because of wider support in UAs, because the syntax is more
>> forgiving, and because most authors use it already.
>
> Wider support in UAs is a valid argument. But I don't understand why
> more forgiving syntax is an advantage.

Because the fail-on-error behavior of XML is user hostile in the sense
that it requires clients to fail gracelessly leaving the end user -- who
is in no position to fix the problem -- with an unintelligible error
message (e.g. the YSoD in Firefox) and potentially, since the site is
inaccessible, no way to report the problem [1]. In addition, the vast
majority of CMS's in use today are not designed to ensure that content
they send over the wire is XML-well-formed in all circumstances, so it
is exceptionally hard to ensure that users never experience a YSoD.
Indeed my experience is that almost all the sites I visit that serve XML
have been caught out at one time or another.

>>> Why not recommend the XML syntax instead?
>>
>> Why should it be recommended instead?
>
> Because it is an advantage to be able to process HTML documents with XML
> tools. And it's easier to parse.

This is not strictly true. Since HTML5 specifies parsing behavior for
text/html there have been several interoperable, highly robust,
libraries developed for parsing HTML. So, when you need to parse
something, you simply choose an XML library for XML content or choose an
HTML library for HTML content. Trying to do anything else (e.g. use
regular expressions) is a mistake that will lead to problems. Once you
have the content in a tree-like structure it is generally possible to
serialize as either HTML or XML as you prefer.
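
As a rough illustration of the parse-then-serialize idea above, here is a
minimal Python sketch using only the standard library. It is an assumption
for illustration, not something from the original message: html.parser is a
lenient tokenizer and does not implement the HTML5 parsing algorithm, so a
real pipeline would use a spec-conforming HTML parser library instead. The
names TreeBuilder, html_to_tree, VOID and the synthetic "root" wrapper
element are all hypothetical.

```python
# Sketch only: html.parser stands in for a real HTML5 parser here.
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Elements that never take an end tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TreeBuilder(HTMLParser):
    """Hypothetical helper: builds an ElementTree from text/html input."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("root")  # synthetic wrapper (assumption)
        self.stack = [self.root]        # currently open elements

    def handle_starttag(self, tag, attrs):
        # Attribute values are None for valueless attributes.
        elem = ET.SubElement(self.stack[-1], tag,
                             {k: v or "" for k, v in attrs})
        if tag not in VOID:
            self.stack.append(elem)

    def handle_endtag(self, tag):
        # Close back to the nearest matching open element;
        # stray end tags with no match are ignored.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        parent = self.stack[-1]
        if len(parent):  # text that follows a child element
            last = parent[-1]
            last.tail = (last.tail or "") + data
        else:
            parent.text = (parent.text or "") + data


def html_to_tree(markup):
    builder = TreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


# Input that is fine as text/html but not well-formed XML.
tree = html_to_tree("<p>Unclosed paragraph<br>with <b>bold text</p>")

# The same tree can be written out in either syntax.
as_xml = ET.tostring(tree, encoding="unicode", method="xml")
as_html = ET.tostring(tree, encoding="unicode", method="html")
```

Once the tree exists, XML tools can work on it (or on as_xml) directly, and
the result can be written back out with the HTML serializer.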
So it's totally possible to have a pipeline that looks like:

                   html parser              html serializer
text/html content ------------> XML tools --------------> text/html content

Planet Venus [2] does something like this

[1] Arguably the XML spec does leave scope for fixing up the problem at
the application layer, but then the benefits of XML are lost.

[2] http://www.intertwingly.net/code/venus/

-- 
"Eternity's a terrible thought. I mean, where's it all going to end?"
 -- Tom Stoppard, Rosencrantz and Guildenstern are Dead
Received on Monday, 28 January 2008 11:26:08 UTC