- From: Norman Walsh <Norman.Walsh@Sun.COM>
- Date: Thu, 02 Nov 2006 10:08:10 -0500
- To: www-tag@w3.org
- CC: John Cowan <cowan@ccil.org>
- Message-ID: <87zmb9kht1.fsf@nwalsh.com>
/ Mike Schinkel <mikeschinkel@gmail.com> was heard to say:
| The list newbie in me is curious; why not go ahead and simplify it and
| instead fully define it to *include* white space inside tags and quotes
| around attributes? It would make comparisons of output between different
| parsers easier.

Because there's nowhere in any of the common models to store that
information. It's always been regarded as insignificant (as have
attribute order and a few other things). You simply can't distinguish
between <span class="foo"></span> and <span class='foo' ></span>.

Extending the internal models to include this information would be
impractical. And wrong. Both of those lexical forms represent an empty
element called "span" with a single attribute called "class" with the
value "foo". That's all there is there.

|>> Fortunately we have at least one existence proof of
|>> such a product and it is called, obviously enough,
|>> TagSoup: http://home.ccil.org/~cowan/XML/tagsoup/
|
| I read this page and have questions.
|
| "TagSoup also includes a command-line processor
| that reads HTML files and can generate either
| clean HTML or well-formed XML that is a close
| approximation to XHTML."
|
| 1.) Why a "generate ... a close approximation to XHTML?" Doesn't it need to
| "generate XHTML?"

I wonder if John reads this list. John?

My guess is that it has to do with rules that XHTML imposes but that
aren't easy to deduce from a random stream of tags, but I could be
wrong.

| 2.) Secondly (and you may not know this and maybe I shouldn't even be asking
| on the list, but...) how do I use TagSoup on a Windows machine?

Download a Java VM and you should be able to run the TagSoup jar
without any trouble. You can get a VM from
http://java.sun.com/javase/downloads/index.jsp

(Note that I'm employed by Sun Microsystems, so it can hardly be seen
as a surprise that I'd recommend that one; I'm sure there are others.)

                                        Be seeing you,
                                          norm

-- 
Norman Walsh
XML Standards Architect
Sun Microsystems, Inc.
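
The point about the two span forms can be checked with a short sketch. It assumes Python's standard xml.etree.ElementTree as a stand-in for "one of the common models" (the message names no particular parser), and summarize() is a hypothetical helper introduced only for this illustration. It parses both lexical forms and compares what the resulting tree retains.

```python
import xml.etree.ElementTree as ET

# The two lexical forms from the message: different quote characters,
# and extra white space before the closing ">" in the second one.
double_quoted = ET.fromstring('<span class="foo"></span>')
single_quoted = ET.fromstring("<span class='foo' ></span>")


def summarize(element):
    """Hypothetical helper: collect what the parsed tree keeps for an element."""
    return (element.tag, dict(element.attrib), element.text, len(element))


# Both forms yield an empty "span" element with one attribute, class="foo".
print(summarize(double_quoted))
print(summarize(single_quoted))
print(summarize(double_quoted) == summarize(single_quoted))

# Attribute order is treated the same way: the attribute maps compare equal.
first = ET.fromstring('<a x="1" y="2"/>')
second = ET.fromstring('<a y="2" x="1"/>')
print(first.attrib == second.attrib)
```

Once parsing is done, nothing in the tree records which quote character was used or whether white space appeared inside the tag, so a comparison at this level cannot tell the two forms apart.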
Received on Thursday, 2 November 2006 15:08:37 UTC