W3C home > Mailing lists > Public > html-tidy@w3.org > January to March 2003

Re: Can Tidy stop changing web addresses (URIs/URLs)?

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Fri, 24 Jan 2003 17:11:06 +0100
To: Michael Jennings <mjennings@myrealbox.com>
Cc: html-tidy@w3.org
Message-ID: <3e56620a.193791827@smtp.bjoern.hoehrmann.de>

* Michael Jennings wrote:
>How do I get HTML Tidy to stop changing web addresses (URI/URL)?
>Fix-URI does not work:

Please send sample code, config options and detailed information on what
version of HTML Tidy you use (e.g., where did you get it from, what does
`tidy -v` tell you). I am not able to reproduce any problem with that
function. --fix-uri enables you to control how Tidy deals with invalid
characters in URIs, e.g.

  <a href = 'björn.html'>...</a>

is such a invalid URI reference, with enabled --fix-uri Tidy will change
it to

  <a href = 'bj%C3%B6rn.html'>...</a>

There is another option --fix-backslash that turns invalid references

  <a href = 'http:\\www.example.org\'>...</a>


  <a href = 'http://www.example.org/'>...</a>
A third config option --quote-ampersand specifies how Tidy deals with
invalid entity references in the document, e.g.

  <a href = 'http://www.example.org/?foo=bar&baz=foo'>...</a>

will be cleaned up to

  <a href = 'http://www.example.org/?foo=bar&amp;baz=foo'>...</a>

if the option is set to a true value. By default, all these options are
enabled, and that's necessary in order get a valid document.

>It is NEVER correct to change a web address.

Tidy does not change URIs, it turns invalid URIs or invalid
representations of URIs into valid ones, and it does so in a way
consistent with the HTML 4.01 Recommendation.

>That's because a web site 
>designer can use any string of characters to express a method of 
>accessing a database, for example.

Sure, but that individual must properly escape this string in order to
get a valid (HTML representation of a valid) URI reference. Tidy will
not change the document, if it does not encounter any problems.
Received on Friday, 24 January 2003 11:10:44 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 21:38:53 UTC