- From: Norman Walsh <Norman.Walsh@Sun.COM>
- Date: Thu, 02 Nov 2006 10:08:10 -0500
- To: www-tag@w3.org
- CC: John Cowan <cowan@ccil.org>
- Message-ID: <87zmb9kht1.fsf@nwalsh.com>
/ Mike Schinkel <mikeschinkel@gmail.com> was heard to say:
| The list newbie in me is curious; why not go ahead and simplify it and
| instead fully define it to *include* white space inside tags and quotes
| around attributes? It would make comparisons of output between different
| parsers easier.
Because there's nowhere in any of the common models to store that
information. It's always been regarded as insignificant (as have
attribute order and a few other things). Once the markup has been
parsed, you simply can't distinguish between <span class="foo"></span>
and <span class='foo' ></span>.
Extending the internal models to include this information would
be impractical. And wrong.
Both of those lexical forms represent an empty element called "span"
with a single attribute called "class" with the value "foo". That's
all there is there.
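
A minimal sketch of that point, assuming Python's standard
xml.etree.ElementTree as a stand-in for "the common models" (the
library choice and the helper name model_of are illustrative only;
neither comes from this thread). It parses both lexical forms and
compares what the resulting model actually holds:

```python
import xml.etree.ElementTree as ET


def model_of(markup):
    """Hypothetical helper: return what the parsed model keeps for one element."""
    elem = ET.fromstring(markup)
    # Name, attributes, text content, and number of child elements.
    # Quote style and white space inside the tag are not stored anywhere.
    return (elem.tag, dict(elem.attrib), elem.text, len(elem))


double_quoted = model_of('<span class="foo"></span>')
single_quoted = model_of("<span class='foo' ></span>")

# The two lexical forms yield the same model.
print(double_quoted == single_quoted)

# Attribute order is treated the same way: the attribute maps compare equal.
first = model_of('<a x="1" y="2"/>')
second = model_of('<a y="2" x="1"/>')
print(first == second)
```

Both comparisons should come out true, since the parser never hands
the quote characters or the in-tag white space to the model.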
|>> Fortunately we have at least one existence proof of
|>> such a product and it is called, obviously enough,
|>> TagSoup: http://home.ccil.org/~cowan/XML/tagsoup/
|
| I read this page and have questions.
|
| "TagSoup also includes a command-line processor
| that reads HTML files and can generate either
| clean HTML or well-formed XML that is a close
| approximation to XHTML."
|
| 1.) Why a "generate ... a close approximation to XHTML?" Doesn't it need to
| "generate XHTML?"
I wonder if John reads this list. John? My guess is that it has to do
with rules that XHTML imposes but that aren't easy to deduce from a
random stream of tags, but I could be wrong.
| 2.) Secondly (and you may not know this and maybe I shouldn't even be asking
| on the list, but...) how do I use TagSoup on a Windows machine?
Download a Java VM and you should be able to run the TagSoup jar
without any trouble. You can get a VM from
http://java.sun.com/javase/downloads/index.jsp (Note that I'm employed
by Sun Microsystems, so it can hardly be seen as a surprise that I'd
recommend that one; I'm sure there are others.)
Be seeing you,
norm
--
Norman Walsh
XML Standards Architect
Sun Microsystems, Inc.
Received on Thursday, 2 November 2006 15:08:37 UTC