Re: HTML Parsing Step? from Norman Walsh on 2007-07-03 (public-xml-processing-model-wg@w3.org from July 2007)

From: Norman Walsh <ndw@nwalsh.com>
Date: Tue, 03 Jul 2007 11:14:22 -0400
To: public-xml-processing-model-wg@w3.org
Message-ID: <87odita59t.fsf@nwalsh.com>

/ Alex Milowski <alex@milowski.org> was heard to say:
| On 7/3/07, Norman Walsh <ndw@nwalsh.com> wrote:
|> / Alex Milowski <alex@milowski.org> was heard to say:
|> | At the end of our e-mail discussion in May I suggested we have a separate
|> | step for parsing HTML.  I still think this is a good idea.  Anyone else?
|>
|> So this is the equivalent of "tidy" not the equivalent of "tagsoup",
|> right?
|
| I don't understand this question.
|
| Tidy and Tagsoup clean up HTML.

You're right. Brain cramp. I was thinking that tidy had knowledge of
the HTML vocabulary (that img and hr are empty, for example) whereas
tagsoup just cleaned up not-well-formed XML. But that's not the case.
So never mind.

                                        Be seeing you,
                                          norm

-- 
Norman Walsh <ndw@nwalsh.com> | You must not think me necessarily
http://nwalsh.com/            | foolish because I am facetious, nor
                              | will I consider you necessarily wise
                              | because you are grave.--Sydney Smith

Received on Tuesday, 3 July 2007 15:14:35 UTC