Re: HTML Parser problem

From: Shashank Kavishwar <skavishwar@frontbridge.com>
Date: Thu, 10 Mar 2005 10:56:53 -0800
To: <www-lib@w3.org>
Cc: <bancroft@america.net>
Message-ID: <04a901c525a2$efecf260$a801a8c0@internal.bigfish.com>
Well, the '<' character can appear in HTML without being part of a tag.
For example, I recently saw a signature like this:


Direct Line
web site

Shouldn't a '<' character that is not part of an HTML tag be kept as-is
during parsing? Basically, I think a '<' character in the HTML that is
not part of a tag should not cause the parse function to skip the rest
of the HTML after it.

The Perl HTML::Parser can do this, and parses such HTML correctly.
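
To make the behavior I am asking for concrete, here is a minimal sketch in
Python. It is not libwww code and not how HTML::Parser is implemented; the
function name tokenize, the TAG_RE pattern, and the sample input are all
made up for illustration. The assumption is that a '<' only opens a tag
when it is followed by a letter, '/', '!' or '?' and a closing '>' is
found before the next '<'; any other '<' is kept as ordinary text.

```python
import re

# Assumption for this sketch: a tag is '<', then a letter, '/', '!' or '?',
# then anything except '<' or '>', then '>'. Any other '<' is plain text.
TAG_RE = re.compile(r"<[A-Za-z/!?][^<>]*>")


def tokenize(html):
    """Split html into ("tag", s) and ("text", s) tokens without losing input."""
    tokens = []
    pos = 0
    for match in TAG_RE.finditer(html):
        if match.start() > pos:
            # Everything between two tags is text, including a stray '<'.
            tokens.append(("text", html[pos:match.start()]))
        tokens.append(("tag", match.group(0)))
        pos = match.end()
    if pos < len(html):
        # Trailing text after the last tag is kept as well.
        tokens.append(("text", html[pos:]))
    return tokens


if __name__ == "__main__":
    # Hypothetical input, not the signature from the message above.
    for kind, value in tokenize("<p>a < b</p><p>rest</p>"):
        print(kind, repr(value))
```

The point of the sketch is only that the text after the stray '<' still
comes out as tokens instead of being dropped. It is deliberately naive: an
attribute value containing '>' would confuse it, and it knows nothing about
comments or script content.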


Received on Thursday, 10 March 2005 18:57:18 UTC
