- From: <noah_mendelsohn@us.ibm.com>
- Date: Tue, 20 Jan 2009 15:17:12 -0500
- To: elharo@metalab.unc.edu
- Cc: www-tag <www-tag@w3.org>
Elliotte Harold wrote:
> Henri Sivonen wrote:
> > If you consider black box-distinguishable conformance, what's the
> > difference between the XML parser signaling an error and handing the
> > rest of the stream to the application which hands it to another non-XML
> > parser to continue and a parser signaling the first WF error and
> > continuing with the rest of the stream itself?
>
> The application knows more about what it's finding than the XML parser
> does. The XML parser only knows XML. If you hand it something that isn't
> XML, it knows not what to do with it.

I think it's valuable to consider some use cases that go beyond "user browses to HTML page". While there are many pros and cons to trying to make an XHTML, i.e. a variant of HTML that's conforming XML, one of the advantages is the possibility of using general-purpose XML tooling on the same documents. So Henri is right, I think, that if your black box is the simple use case of browsing a page, there's little if any difference for users in how the parsing and error recovery is layered inside the browser.

Consider, though, a different use case, in which some of the same XHTML documents are to be stored in an XML database and their attributes and other data used as the subjects of queries. Now you have an interesting tension. The database will presumably deal only with well-formed XML documents, which means that the messier content that browsers deal with won't work in the database, at least not in the obvious way. On the other hand, the positive value of the layering becomes a bit clearer. The XML specification describes the subset of documents that will work in tools like the XML database. Conforming XML parsers will accept those documents and reject others (though, as Elliotte points out, nothing prevents those parsers from handing the input text up to a browser, which may still decide to render it).

So, the value of the layering is not primarily for the browsing-only scenario.
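[The layering being discussed can be sketched in a few lines of Python. This is an illustration, not anything from the thread: a strict, conforming XML parser either accepts a well-formed document or signals an error at the first well-formedness violation, and the application may then hand the text to a lenient tag-soup parser. The function and class names are invented for the example.]

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

class LenientCollector(HTMLParser):
    """Tag-soup fallback: records start tags instead of rejecting input."""
    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

def load_document(text):
    try:
        # Strict path: what an XML database would accept.
        root = ET.fromstring(text)
        return ("xml", root.tag)
    except ET.ParseError:
        # First WF error: signal it, then hand the text to a
        # browser-style lenient parser instead of giving up.
        collector = LenientCollector()
        collector.feed(text)
        return ("tag-soup", collector.tags)

print(load_document("<p>well-formed</p>"))   # ('xml', 'p')
print(load_document("<p>unclosed<br>soup"))  # ('tag-soup', ['p', 'br'])
```

[The point of the sketch: the same input either lands in the "works with generic XML tooling" subset or it doesn't, and the layering makes that boundary observable to the application.]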
It's to give you the opportunity of using HTML documents with a lot of additional tools and in additional scenarios. Now, whether that's worth designing for is a good debate, and I won't be surprised if Henri takes the position: no, I'd rather do without that capability and focus mainly on making HTML work for browsing. I do think this is the right way to ask the question, though. XML may add some sorts of value to HTML in small ways when all you're doing is browsing, but I don't think that's the right "black box" to consider. The question is: how much trouble is it worth to design a language that works with a wide range of existing XML tools? I'm not taking a strong position on what the answer should be, as I can really see both sides; I do think it's probably the right question.

(And yes, one can also debate whether the world would have been a better place if HTML error handling had been stricter from the start, and less junky HTML were out there, but that train has mostly left the station, I think.)

Noah

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------

Elliotte Harold <elharo@metalab.unc.edu>
Sent by: www-tag-request@w3.org
01/11/2009 06:08 PM
Please respond to elharo

To: www-tag <www-tag@w3.org>
cc: (bcc: Noah Mendelsohn/Cambridge/IBM)
Subject: Re: Comments on HTML WG face to face meetings in France Oct 08
Received on Tuesday, 20 January 2009 20:17:57 UTC