
Re: libtidy-php: does not clean invalid character refs

From: Felix Natter <felix.natter@smail.inf.fh-bonn-rhein-sieg.de>
Date: Tue, 30 Jan 2007 10:50:19 +0100
To: Krzysztof Gorzelak <krzysztof@uno.pl>
Cc: html-tidy@w3.org
Message-Id: <1170150620.5234.9.camel@localhost.localdomain>

On Tue, 2007-01-30 at 10:41 +0100, Krzysztof Gorzelak wrote:
> ----- Original Message ----- 
> From: "Felix Natter" <felix.natter@smail.inf.fh-bonn-rhein-sieg.de>
> To: <html-tidy@w3.org>
> Sent: Tuesday, January 23, 2007 3:21 PM
> Subject: libtidy-php: does not clean invalid character refs
> 
> 
> > And the output still contains invalid character references because
> > this still shows when I run the command-line tidy over the result:
> >
> > line 1567 column 31 - Warning: replacing invalid character code 145
> > line 1573 column 31 - Warning: replacing invalid character code 145
> > line 1579 column 33 - Warning: replacing invalid character code 145
> > line 1712 column 45 - Warning: <a> attribute with missing trailing quote mark
> > line 1771 column 28 - Warning: replacing invalid character code 136
> >
> > How can I get libtidy under PHP to fix these "invalid character
> > messages"?
> >
> > I am using libtidy 20050415-1 on debian sarge with php-tidy 1.2 (which
> > seems to be no longer maintained).
> 
> I'm using the "Multibyte String" library (function mb_convert_encoding)
> to prepare my webpage for tidy cleaning...
> 
Thanks for the reply!

Unfortunately, mb_convert_encoding does not eliminate these problems.
I call it like this:
$out = mb_convert_encoding($text, "UTF-8", "UTF-8");
and tidy still reports the same warnings.

Do you have another idea on how to fix this?
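
A note on why the call above has no effect, plus one direction that might
help. Converting from UTF-8 to UTF-8 leaves valid input unchanged, and a
numeric reference such as &#145; is plain ASCII text, so
mb_convert_encoding never touches it. The sketch below assumes the warnings
come from numeric references in the range 128-159 that were meant as
Windows-1252 characters (145 would be a left single quote, 136 a circumflex
accent). fix_cp1252_refs is a hypothetical helper, not part of tidy or
php-tidy, and it needs a PHP version with closures (5.3 or later).

```php
<?php
// Hypothetical helper: rewrite numeric character references in the
// 128-159 range to the Unicode code points that Windows-1252 assigns
// to those byte values. All other references are left untouched.
function fix_cp1252_refs($html)
{
    // Windows-1252 byte value => Unicode code point.
    // 129, 141, 143, 144 and 157 are undefined in Windows-1252 and
    // are deliberately absent, so they pass through unchanged.
    $map = array(
        128 => 0x20AC, 130 => 0x201A, 131 => 0x0192, 132 => 0x201E,
        133 => 0x2026, 134 => 0x2020, 135 => 0x2021, 136 => 0x02C6,
        137 => 0x2030, 138 => 0x0160, 139 => 0x2039, 140 => 0x0152,
        142 => 0x017D, 145 => 0x2018, 146 => 0x2019, 147 => 0x201C,
        148 => 0x201D, 149 => 0x2022, 150 => 0x2013, 151 => 0x2014,
        152 => 0x02DC, 153 => 0x2122, 154 => 0x0161, 155 => 0x203A,
        156 => 0x0153, 158 => 0x017E, 159 => 0x0178,
    );

    return preg_replace_callback(
        // Group 1 captures hex digits (&#x91;), group 2 decimal (&#145;).
        '/&#(?:x([0-9a-f]+)|([0-9]+));/i',
        function ($m) use ($map) {
            $code = ($m[1] !== '') ? hexdec($m[1]) : (int) $m[2];
            if (isset($map[$code])) {
                // Emit a decimal reference so the result does not
                // depend on the document's byte encoding.
                return '&#' . $map[$code] . ';';
            }
            return $m[0];
        },
        $html
    );
}

// Example: clean the markup before handing it to tidy.
$text = "It&#145;s a test";
$out  = fix_cp1252_refs($text);
```

If the assumption holds, running this over the markup before tidy sees it
should leave no references in the 128-159 range for tidy to warn about. If
the input instead contains raw bytes in that range, this helper would not
help and the input encoding would need to be declared or converted.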

Thanks!

-- 
Felix Natter <felix.natter@smail.inf.fh-brs.de>
Received on Tuesday, 30 January 2007 09:49:37 GMT
