
Re: HTML Parser problem

From: Howard Cole <howard.cole@selestial.com>
Date: Fri, 11 Mar 2005 10:00:50 +0000
Message-ID: <42316C52.4010101@selestial.com>
To: www-lib@w3.org

Not speaking authoritatively, but I think this is badly formed HTML: any
'<' or '>' character that is not part of a tag should be written as
"&lt;" or "&gt;" respectively.
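
A minimal sketch of that escaping, assuming Python's standard html module
is used to produce the markup (this is only an illustration and has nothing
to do with libwww itself; the variable names are made up):

```python
import html

# The signature line from the quoted message, as plain text.
signature = "<')))><"

# quote=False escapes only &, < and >, leaving the apostrophe alone.
escaped = html.escape(signature, quote=False)

print(escaped)  # &lt;')))&gt;&lt;
```

Markup written this way contains no bare '<' outside of real tags, so a
strict parser has nothing to trip over.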
Howard Cole
www.selestial.com

Shashank Kavishwar wrote:

> Well, the ‘<’ character can appear in HTML without being part of a tag.
> For example, I recently saw a signature like this:
>
><')))><
>
>Name
>
>Title
>
>Company
>
>Direct Line
>
>Mobile
>
>e-mail:
>
>web site
>
> 
>
> Shouldn’t the ‘<’ character, if it is not part of an HTML tag, be kept
> as it is while parsing? Basically, I think a ‘<’ character that is not
> part of a tag should not cause the parse function to skip the rest of
> the HTML after it.
>
> The Perl HTML::Parser has the capability to do this, and will parse 
> the HTML correctly.
>
>
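
On the tolerant behaviour described in the quoted message, here is a rough
sketch that uses Python's standard html.parser purely as a stand-in (it is
neither libwww's HTML parser nor Perl's HTML::Parser). It treats a '<' that
does not start a tag as ordinary text and carries on with the rest of the
input. The names TextCollector and extract_text are invented for this
example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers character data and ignores tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Stray '<' characters arrive here as ordinary text.
        self.chunks.append(data)


def extract_text(markup):
    """Return all text content of markup, concatenated."""
    parser = TextCollector()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


# Unescaped signature followed by more content, as in the quoted message.
sample = "<p><')))><</p><p>Name</p>"
text = extract_text(sample)
print(text)
```

With this approach the text after the signature ("Name" here) is still
reached, even though the input is not well formed.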
Received on Friday, 11 March 2005 14:05:43 GMT
