- From: Krzysztof Gorzelak <krzysztof@uno.pl>
- Date: Tue, 30 Jan 2007 10:51:00 +0100
- To: "Felix Natter" <felix.natter@smail.inf.fh-bonn-rhein-sieg.de>
- Cc: <html-tidy@w3.org>
----- Original Message -----
From: "Felix Natter" <felix.natter@smail.inf.fh-bonn-rhein-sieg.de>
To: "Krzysztof Gorzelak" <krzysztof@uno.pl>
Sent: Tuesday, January 30, 2007 10:33 AM
Subject: Re: libtidy-php: does not clean invalid character refs

>> > And the output still contains invalid character references because
>> > this still shows when I run the command-line tidy over the result:
>> >
>> > line 1567 column 31 - Warning: replacing invalid character code 145
>> > line 1573 column 31 - Warning: replacing invalid character code 145
>> > line 1579 column 33 - Warning: replacing invalid character code 145
>> > line 1712 column 45 - Warning: <a> attribute with missing trailing quote mark
>> > line 1771 column 28 - Warning: replacing invalid character code 136
>> >
>> > How can I get libtidy under PHP to fix these "invalid character
>> > messages"?
>> >
>> > I am using libtidy 20050415-1 on debian sarge with php-tidy 1.2 (which
>> > seems to be no longer maintained).
>>
>> I'm using "Multibyte String" library ( function mb_convert_encoding ) to
>> prepare my webpage for tidy cleaning...
>
> Thanks for the reply!
>
> Unfortunately mb_convert_encoding does not eliminate these problems.
> I do this:
> $out = mb_convert_encoding($text, "UTF-8", "UTF-8");
> and still get the same problems in tidy.
>
> Do you have another idea on how to fix this?
>

You can try iconv library (character set conversion facility) or just
functions utf8_encode() & utf8_decode()...

There is also a nice article about internationalization:
http://www.phpwact.org/php/i18n/charsets

Bonne journee!

Krzysztof Gorzelak
krzysztof@uno.pl
http://www.uno.pl
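
A minimal sketch of the iconv suggestion, under one loud assumption that the thread does not confirm: that the page contains raw Windows-1252 bytes. Codes 145 and 136 from the tidy warnings correspond to 0x91 (left single quote) and 0x88 (circumflex) in that code page. The helper name cp1252_to_utf8 and the sample string are made up for illustration.

```php
<?php
// Hypothetical helper, not part of tidy or php-tidy.
// ASSUMPTION: $text really is Windows-1252 (CP1252); the thread never
// states the actual source encoding.
function cp1252_to_utf8(string $text): string
{
    // iconv() returns false when it meets a byte it cannot convert.
    $out = iconv('CP1252', 'UTF-8', $text);
    if ($out === false) {
        throw new RuntimeException('iconv could not convert the input');
    }
    return $out;
}

// Sample input with CP1252 "smart quotes": 0x91 and 0x92.
$raw  = "It\x92s \x91quoted\x92";
$utf8 = cp1252_to_utf8($raw);

// preg_match with the /u modifier only succeeds on valid UTF-8.
var_dump(preg_match('//u', $utf8) === 1);
```

This differs from mb_convert_encoding($text, "UTF-8", "UTF-8"), which declares the input to be UTF-8 already and so cannot reinterpret Windows-1252 bytes. If the page instead contains literal numeric references such as &#145;, a byte-level conversion like this would not touch them, and they would need to be rewritten separately.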
Received on Tuesday, 30 January 2007 09:51:12 UTC