Re: Error in HTML Tidy Beta

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Fri, 12 Oct 2001 01:04:45 +0200
To: "Tapani Raikkonen" <tapani.raikkonen@tiivistekeskus.fi>
Cc: <html-tidy@w3.org>
Message-ID: <qo8cst4esb03t401e1q4p1m4p5b8cbgtkq@4ax.com>
* Tapani Raikkonen wrote:
> I noticed a problem with HTML Tidy Beta when using mailto: and
> Scandinavian characters. It's not rendering right. Example:
> <!-- EMAIL -->
> <div id="mail"><a
> href="mailto:tapani@raikkonen.com?Subject=Lähetä postia Tapsalle"><img
> src="kuvat/mailbox.gif" alt="Postia Tapsalle" border="0" width="27"
> height="32" /></a></div>
> This is what this looks like after using Tidy Beta:
> <!-- EMAIL -->
> <div id="mail"><a href=
> "mailto:tapani@raikkonen.com?Subject=L%C3%A4het%C3%A4%20postia%20Tapsalle"><img
> src="kuvat/mailbox.gif" alt="Postia Tapsalle" border="0" width="27" height=
> "32" /></a></div>
> No browser or e-mail program (Windows platform) can read this right.

Well, Tidy gives you a rather lengthy explanation for this. Your URI is
invalid if it contains non-ASCII characters like 'ä'; if it works anyway,
that's just by accident, not because any specification says it has to.
You have 4 options:

  * Don't use non-ASCII characters in URIs.
  * Live with the recommended UTF-8/URI escaping
    (see e.g. http://www.w3.org/International/O-URL-and-ident.html).
  * Use another character encoding like ISO-8859-1 for the escaping,
    e.g. ...subject=L%E4het%E4%20postia... (ideally this matches your
    document's overall character encoding; it may be a good idea to
    switch to UTF-8 there as well).
  * Use '--fix-uri no' to stop Tidy from fixing your URIs.
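
To make the difference between the UTF-8 and ISO-8859-1 escapings concrete,
here is a minimal Python sketch. It is not how Tidy does it internally; it
only assumes that the escaping amounts to converting the text to bytes in
the chosen encoding and percent-encoding every byte outside the unreserved
ASCII set. The helper name escape_subject is made up for illustration.

```python
from urllib.parse import quote

# Subject text taken from the example in the original message.
SUBJECT = "Lähetä postia Tapsalle"


def escape_subject(text: str, encoding: str = "utf-8") -> str:
    """Percent-encode `text` after converting it to bytes in `encoding`.

    Hypothetical helper for illustration only, not part of Tidy.
    """
    # safe="" means reserved characters such as '/' are escaped as well.
    return quote(text, safe="", encoding=encoding)


utf8_form = escape_subject(SUBJECT)                  # one 'ä' becomes two escaped bytes
latin1_form = escape_subject(SUBJECT, "iso-8859-1")  # one 'ä' becomes one escaped byte

print("mailto:tapani@raikkonen.com?Subject=" + utf8_form)
print("mailto:tapani@raikkonen.com?Subject=" + latin1_form)
```

The first line printed carries the same escaped subject that Tidy produced;
the second carries the ISO-8859-1 form from the third option. Which of the
two a given mail client decodes correctly is a separate question that this
sketch does not answer.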

Tidy is required to escape URIs the way it does by various specifications,
especially HTML 4 and the Character Model (http://www.w3.org/TR/charmod/).
I am sorry if this causes any trouble (I haven't checked this for mailto:
URIs), but non-ASCII characters are invalid in URIs and you shouldn't have
used them.

Björn Höhrmann { mailto:bjoern@hoehrmann.de } http://www.bjoernsworld.de
am Badedeich 7 } Telefon: +49(0)4667/981028 { http://bjoern.hoehrmann.de
25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/
Received on Thursday, 11 October 2001 19:05:51 UTC