Re: Tidy outputs no HTML-Entities

From: Eric Bleinagel <e.bleinagel@sinnerschrader.com>
Date: Tue, 15 Feb 2005 09:25:48 +0100
To: "'Bjoern Hoehrmann'" <derhoermi@gmx.net>
Cc: <html-tidy@w3.org>
Message-ID: <002801c51337$f3d4dd70$5945a8c0@lynxa>

Hi Björn,
thanks for your answer. I'm still trying.

I found out that the application uses TidyATL.dll. At
http://users.rcn.com/creitzel/tidy.html#comatl Charles Reitzel mentions
problems with character encoding. I didn't really understand it, as my
English is not the best and I'm not really a programmer. Could this
be the answer?

I tried the following.
The application was given this source code:

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body>
ü &uuml;
</body>
</html>

And this tidy-config:

output-xhtml:yes
char-encoding:ascii


And I got this output:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="content-type" content="text/html; charset=us-ascii" />
<title></title>
</head>
<body>
ü ü
</body>
</html>


OK, so Tidy apparently does not recognize the meta tag and assumes I want
ASCII. In the config file I tell it again to output ASCII, and in the output
Tidy writes in the meta tag that it used ASCII and not Latin-1 (ISO-8859-1).
But then the body should contain '&uuml; &uuml;' and not 'ü ü', shouldn't it?

So this combination cannot be correct, or did I misunderstand
something?
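
To illustrate the output I would expect, here is a minimal Python sketch. It is not Tidy's actual implementation, and the function name to_ascii_with_entities is made up for this example. It replaces every non-ASCII character with a named HTML entity where one exists, and with a numeric character reference otherwise, so the result contains only ASCII bytes:

```python
from html.entities import codepoint2name


def to_ascii_with_entities(text):
    """Return text with every non-ASCII character written as an entity."""
    parts = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII passes through unchanged.
            parts.append(ch)
        elif cp in codepoint2name:
            # Named entity, e.g. U+00FC becomes &uuml;
            parts.append("&" + codepoint2name[cp] + ";")
        else:
            # No named entity known: fall back to a numeric reference.
            parts.append("&#" + str(cp) + ";")
    return "".join(parts)


body = to_ascii_with_entities("ü ü")
# The result can be encoded as US-ASCII without errors.
body.encode("ascii")
```

With this approach a document declared as charset=us-ascii really does contain only ASCII characters, which is what I expected from the configuration above.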

Again, thanks in advance for any comments.

Eric




-----Original Message-----
From: Bjoern Hoehrmann [mailto:derhoermi@gmx.net]
Sent: Friday, 11 February 2005 14:30
To: Eric Bleinagel
Cc: html-tidy@w3.org
Subject: Re: Tidy outputs no HTML-Entities


* Eric Bleinagel wrote:
>Tidy is running on a server within a CMS (as the last instance, the 
>output is coming directly from Tidy); I have no idea what version it 
>is. I can access Tidy only via changing the config-file.

I suspect this is a misconfiguration on the server; as you point out, it
works fine using your configuration file. For example, the server might
invoke Tidy via something that corresponds to

  % tidy -config example.cfg -latin1 ...

such that the -latin1 option overrides the configuration file. I would
suggest you contact whoever is running the server.

>input-encoding:ascii

Note that your input file is not actually US-ASCII encoded; you should use
latin1 here.
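
As a rough illustration of why the input is not US-ASCII, the following Python sketch assumes the file stores 'ü' as the single Latin-1 byte 0xFC (the variable names are only for this example). That byte lies outside the 7-bit ASCII range, so it cannot be read as ASCII, while reading it as Latin-1 recovers the character:

```python
# 'ü' as it would be stored in an ISO-8859-1 (Latin-1) file: one byte, 0xFC.
raw = "ü".encode("latin-1")

# US-ASCII only covers bytes 0x00 to 0x7F, so decoding this byte fails.
try:
    raw.decode("ascii")
    is_valid_ascii = True
except UnicodeDecodeError:
    is_valid_ascii = False

# Decoding with the encoding the file really uses gives back the character.
decoded = raw.decode("latin-1")
```

Here is_valid_ascii ends up False and decoded is 'ü' again, which is why the input encoding should be declared as latin1 rather than ascii.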
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Tuesday, 15 February 2005 08:45:06 GMT