Re: Bug + fix for illegal ampersands and character entities

From: Alexander Biron <biron@ifh.de>
Date: Mon, 19 Feb 2001 10:44:53 +0100 (MET)
To: Bertilo Wennergren <bertilow@chello.se>
cc: <html-tidy@w3.org>
Message-ID: <Pine.HPX.4.31.0102191028110.8514-100000@ceres.ifh.de>

Hi Bertilo,

On Sat, 17 Feb 2001, Bertilo Wennergren wrote:

> Randy Waki:
> > 4-Aug-2000 Tidy's handling of illegal ampersands such as "id=1&lang=en"
> > is inconsistent with browsers.
> Which browsers? Please demonstrate code that uses the correct "&amp;"
> and that breaks in a named browser.
> If you can't do that, then just correct the code, as Tidy wants
> you to, and be done with it.

There is no question in this case that using the correct code would be
the best choice. But:

Tidy has turned from a pure pretty-printer into a multipurpose tool, one
of whose tasks is error-fixing. Yes, not only error-warning but
error-fixing. So, when we discuss tidy's error-fixing features, we have
to assume that the input code is illegal/wrong/whatever-you-call-it.

Now, in the above example of "id=1&lang=en", my mind tells me that the
author probably should have written "id=1&amp;lang=en". But perhaps I am
wrong, perhaps the author (human or program) did actually want to code
something which should be written as "id=1&lang;=en".
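
To make the ambiguity concrete, here is a minimal sketch of the two
candidate repairs. It uses Python's standard html module purely as a
stand-in decoder; it is not Tidy's code, and the variable names are
only illustrative.

```python
import html

raw = "id=1&lang=en"  # the illegal input from the bug report

# Repair A: assume the "&" was meant literally and escape it.
repair_a = raw.replace("&", "&amp;")

# Repair B: assume "&lang" was an entity reference missing its semicolon.
repair_b = raw.replace("&lang", "&lang;")

# Decode both repairs the way an HTML5-style parser would.
decoded_a = html.unescape(repair_a)  # keeps a literal ampersand
decoded_b = html.unescape(repair_b)  # "&lang;" becomes a left angle bracket

print(repr(repair_a), "->", repr(decoded_a))
print(repr(repair_b), "->", repr(decoded_b))
```

Both repairs yield legal markup, but they decode to different strings,
so a tool cannot pick one without guessing the author's intent.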

This is a classic example of how automated error-fixing is doomed to fail
in at least some cases. So the question with respect to error-fixing is:
which option should tidy use? I would support Randy (fix it the way the
two large browsers interpret it) for a very simple reason: The most
common way of HTML debugging is not validating, but browser-testing (by
the author and by the users). So if the intended effect is not achieved
in browsers, the error is likely to be fixed soon. If the browsers
interpret the error in the way the author wants it to be interpreted,
the error will persist - until tidy wants to fix it.

Cheers alex          Alexander Biron

"All science is either physics or stamp collecting."
Ernest Rutherford (Nobel Prize in chemistry 1908)

  /"\  ASCII ribbon campaign
  \ /  ---------------------   	http://www-zeuthen.desy.de/~biron/
   X     against HTML mail  	Tel (+49)33762-7-7516
  / \      and postings     	mailto:biron@ifh.de
Received on Monday, 19 February 2001 04:45:05 UTC