- From: Felix Natter <felix.natter@smail.inf.fh-bonn-rhein-sieg.de>
- Date: Tue, 30 Jan 2007 11:40:38 +0100
- To: Krzysztof Gorzelak <krzysztof@uno.pl>
- Cc: html-tidy@w3.org
On Tue, 2007-01-30 at 10:51 +0100, Krzysztof Gorzelak wrote:
> ----- Original Message -----
> From: "Felix Natter" <felix.natter@smail.inf.fh-bonn-rhein-sieg.de>
> To: "Krzysztof Gorzelak" <krzysztof@uno.pl>
> Sent: Tuesday, January 30, 2007 10:33 AM
> Subject: Re: libtidy-php: does not clean invalid character refs
>
>
> >> > And the output still contains invalid character references because
> >> > this still shows when I run the command-line tidy over the result:
> >> >
> >> > line 1567 column 31 - Warning: replacing invalid character code 145
> >> > line 1573 column 31 - Warning: replacing invalid character code 145
> >> > line 1579 column 33 - Warning: replacing invalid character code 145
> >> > line 1712 column 45 - Warning: <a> attribute with missing trailing quote mark
> >> > line 1771 column 28 - Warning: replacing invalid character code 136
> >> >
> >> > How can I get libtidy under PHP to fix these "invalid character
> >> > messages"?
> >> >
> >> > I am using libtidy 20050415-1 on debian sarge with php-tidy 1.2 (which
> >> > seems to be no longer maintained).
> >>
> >> I'm using "Multibyte String" library ( function mb_convert_encoding ) to
> >> prepare my webpage for tidy cleaning...
> >
> > Thanks for the reply!
> >
> > Unfortunately mb_convert_encoding does not eliminate these problems.
> > I do this:
> > $out = mb_convert_encoding($text, "UTF-8", "UTF-8");
> > and still get the same problems in tidy.
> >
> > Do you have another idea on how to fix this?
> >
>
> You can try iconv library (character set conversion facility) or just
> functions utf8_encode() & utf8_decode()... There is also a nice article
> about internationalization: http://www.phpwact.org/php/i18n/charsets
Now I tried this:
$out = iconv("UTF-8", "UTF-8//IGNORE", $text);
and this:
$out = utf8_decode($text);
$out = utf8_encode($out);
Neither of them seems to work. Do you have any other hints?
I uploaded the problematic file here:
http://www2.inf.fh-brs.de/~fnatte2s/Adenauer.html
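
A possible explanation for why neither conversion changes anything, stated
as an assumption since nothing in this thread confirms it: if the file
contains numeric character references such as &#145; (as the subject line
suggests) rather than raw bytes, those references are plain ASCII text, so a
UTF-8 to UTF-8 conversion passes them through unchanged. Codes 145 and 136
both fall in the 128-159 range, which Windows-1252 uses for punctuation such
as curly quotes. The sketch below rewrites such references before the text is
handed to tidy. fix_cp1252_refs is a hypothetical helper name, not part of
tidy or php-tidy, and it relies on the mbstring extension.

```php
<?php
// Hypothetical helper (not part of tidy or php-tidy).
// Assumption: the "invalid character code" warnings come from decimal
// references like &#145; whose codes are really Windows-1252 byte values.
function fix_cp1252_refs($html)
{
    return preg_replace_callback(
        '/&#(\d+);/',
        function ($m) {
            $code = (int) $m[1];
            if ($code < 128 || $code > 159) {
                // Leave every reference outside the 128-159 range untouched.
                return $m[0];
            }
            // Treat the code as a single Windows-1252 byte and convert it
            // to the corresponding UTF-8 character.
            return mb_convert_encoding(chr($code), 'UTF-8', 'Windows-1252');
        },
        $html
    );
}
```

It would be called as $out = fix_cp1252_refs($text); before passing $out to
tidy. Hexadecimal references (&#x91;) are not handled by this sketch, and a
few byte values in that range have no Windows-1252 assignment, so the result
for those depends on mbstring.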
Thanks!
--
Felix Natter <felix.natter@smail.inf.fh-brs.de>
Received on Tuesday, 30 January 2007 10:39:59 UTC