- From: Elliotte Harold <elharo@metalab.unc.edu>
- Date: Wed, 01 Nov 2006 07:25:35 -0500
- To: www-tag@w3.org
Vincent.Quint@inrialpes.fr wrote:
> All,
>
> On 24 October 2006, the TAG has accepted a new issue
> TagSoupIntegration-54:
>
> Is the indefinite persistence of 'tag soup' HTML consistent with a
> sound architecture for the Web? If so, what changes, if any, to
> fundamental Web technologies are necessary to integrate 'tag soup'
> with SGML-valid HTML and well-formed XML?
>
> It is now part of the TAG issues list. Refer to the list for more
> details and to track future progress:
>

A straw man:

1. Tag soup isn't going away, no matter what we say.

2. Tag soup is a good idea in itself. It expands ease of authoring, and means readers do not encounter errors they are not responsible for and cannot fix.

3. However, error recovery causes browser interoperability problems. This is partially what XML's draconian error handling was designed to solve.

4. Well-formed, valid XHTML is very useful for machine processing, including JavaScript.

5. To resolve this conflict, we need a means of converting tag soup into valid XHTML that is invisible to a typical end user.

Here is what I propose:

The W3C should define a process by which *any* arbitrary byte stream can be converted into valid XHTML, no matter what. This would act as a filter on incoming data. Browsers in normal operation would be expected to apply this filter before rendering a document, constructing a DOM, or doing pretty much anything else with a page that purports to be HTML. Other tools could use this as well.

This process must be fully deterministic. That is, two independent implementations must always produce the same XHTML version, modulo insignificant details like white space inside tags or quotes around attribute values. This must be both a specification *and* a normative reference implementation.
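To make the shape of such a filter concrete, here is a minimal sketch in Python, built on the standard library's html.parser. It is not TagSoup and not any W3C-defined algorithm; the names SoupFilter and soup_to_xml, the UTF-8 decoding, the recovery rules, and the wrapping <div> root are all assumptions made for illustration. It also aims only at well-formed XML, not at DTD-valid XHTML, which would take considerably more work.

```python
import re
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never have content; emitted in self-closed form.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}
# Conservative subset of XML names (assumption: anything else is dropped).
NAME = re.compile(r"[a-z_][a-z0-9._-]*\Z")
# Characters that XML 1.0 does not permit anywhere in a document.
BAD_CHARS = re.compile(
    "[^\x09\x0a\x0d\x20-\ud7ff\ue000-\ufffd\U00010000-\U0010ffff]")


class SoupFilter(HTMLParser):
    """Hypothetical filter: collects well-formed markup from tag soup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []     # pieces of the output document
        self.stack = []   # names of currently open elements

    def handle_starttag(self, tag, attrs):
        if not NAME.match(tag):
            return  # unusable element name: drop the tag, keep its text
        seen = {}
        for name, value in attrs:
            # First occurrence wins; valueless attributes get their own name.
            if NAME.match(name) and name not in seen:
                seen[name] = name if value is None else value
        # Sorting the attributes keeps the output deterministic.
        text = "".join(f" {n}={quoteattr(v)}" for n, v in sorted(seen.items()))
        if tag in VOID:
            self.out.append(f"<{tag}{text} />")
        else:
            self.out.append(f"<{tag}{text}>")
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray end tag: ignore it
        # Close every element opened since the matching start tag.
        while True:
            name = self.stack.pop()
            self.out.append(f"</{name}>")
            if name == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))


def soup_to_xml(data: bytes) -> str:
    """Convert any byte string into a well-formed XML document."""
    parser = SoupFilter()
    # Assumption: treat input as UTF-8 and replace undecodable bytes.
    parser.feed(data.decode("utf-8", errors="replace"))
    parser.close()
    while parser.stack:  # close whatever the author left open
        parser.out.append(f"</{parser.stack.pop()}>")
    return BAD_CHARS.sub("", "<div>" + "".join(parser.out) + "</div>")
```

For example, soup_to_xml(b"<p>one<b>bold<i>both</b> tail<br>AT&T") closes the dangling <i> before </b>, self-closes <br>, escapes the bare ampersand, and closes <p> at end of input. The choices made here (first duplicate attribute wins, stray end tags are ignored, comments and declarations are dropped) are exactly the kind of detail a normative specification would have to pin down so that independent implementations agree.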
Fortunately we have at least one existence proof of such a product and it is called, obviously enough, TagSoup:

http://home.ccil.org/~cowan/XML/tagsoup/

I understand there is also work along these lines going on in the HTML 5 community, though I am not intimately familiar with it.

-- 
Elliotte Rusty Harold elharo@metalab.unc.edu
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/
Received on Wednesday, 1 November 2006 12:26:02 UTC