Re: Error in HTML Tidy Beta

* Tapani Raikkonen wrote:
> I noticed a problem with HTML Tidy Beta when using mailto: and
> Scandinavian characters. It's not rendering right. Example:
> 
> <!-- EMAIL -->
> <div id="mail"><a
> href="mailto:tapani@raikkonen.com?Subject=Lähetä postia Tapsalle"><img
> src="kuvat/mailbox.gif" alt="Postia Tapsalle" border="0" width="27"
> height="32" /></a></div>
> 
> This is what this looks like after using Tidy Beta:
> 
> <!-- EMAIL -->
> <div id="mail"><a href=
> "mailto:tapani@raikkonen.com?Subject=L%C3%A4het%C3%A4%20postia%20Tapsalle"><img
> src="kuvat/mailbox.gif" alt="Postia Tapsalle" border="0" width="27" height=
> "32" /></a></div>
> 
> No browser or e-mail program (Windows platform) can read this right.

Well, Tidy gives you a rather longish explanation for this. Your URI is
invalid if it contains non-ASCII characters like 'ä'; if it works
anyway, that is just by accident, not because any specification says it
has to. You have four choices:

  * Don't use non-ASCII characters in URIs.
  * Live with the recommended UTF-8/URI escaping
    (see e.g. http://www.w3.org/International/O-URL-and-ident.html).
  * Escape the URI using another character encoding like ISO-8859-1
    (ideally the same encoding as your document overall, though it may
    be a good idea to switch to UTF-8 there as well),
    e.g. ...subject=L%E4het%E4%20postia...
  * Use '--fix-uri no' to stop Tidy fixing your URIs.
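
To make the difference between the second and third choice concrete,
here is a minimal sketch in Python (not part of Tidy, and the helper
name escape_subject is made up for illustration) that percent-escapes
the subject text once via UTF-8 and once via ISO-8859-1, using the
standard library's urllib.parse.quote:

```python
from urllib.parse import quote


def escape_subject(text, encoding="utf-8"):
    """Percent-escape text for use in a URI query component.

    The text is first converted to bytes in the given character
    encoding; every byte that is not an unreserved ASCII character
    is then written as %XX.
    """
    # safe="" means reserved characters such as "/" are escaped too.
    return quote(text, safe="", encoding=encoding)


subject = "Lähetä postia Tapsalle"

# UTF-8 escaping: 'ä' becomes the two bytes C3 A4.
print(escape_subject(subject))
# prints L%C3%A4het%C3%A4%20postia%20Tapsalle

# ISO-8859-1 escaping: 'ä' becomes the single byte E4.
print(escape_subject(subject, "iso-8859-1"))
# prints L%E4het%E4%20postia%20Tapsalle
```

The first result is the same string Tidy produced in the example
above. Which of the two a given mail program decodes correctly is a
separate question that this sketch does not answer.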

Tidy is required to escape URIs the way it does by various
specifications, especially HTML 4 and http://www.w3.org/TR/charmod/.
I am sorry if this causes any trouble (I haven't checked this for
mailto: URIs), but non-ASCII characters are invalid in URIs and you
shouldn't have used them.

regards,
-- 
Björn Höhrmann { mailto:bjoern@hoehrmann.de } http://www.bjoernsworld.de
am Badedeich 7 } Telefon: +49(0)4667/981028 { http://bjoern.hoehrmann.de
25899 Dagebüll { PGP Pub. KeyID: 0xA4357E78 } http://www.learn.to/quote/

Received on Thursday, 11 October 2001 19:05:51 UTC