Re: Flagging & in URL in HTML 4.01 transitional type.

On Fri, 8 Jun 2001, Lloyd Wood wrote:

> On Fri, 8 Jun 2001, Jim Correia wrote:
> 
> > On 12:11 PM 6/8/01 Mike Heins <mheins@redhat.com> wrote:

(is that Mike Heins as in minivend?)

> > > Since every browser in the world must tolerate &,

Tolerate - fine.  But it introduces serious ambiguity, which browsers
have to error-correct one way or another.

>	 my opinion is that
> > > this is an artificially created tempest in a teapot, created by the
> > > failure of the validation suite writer to provide a "pedantic" mode.

The validator is "pedantic" by definition.  That is its purpose.

Several years ago, the old webtechs validator used a hack that would
allow unescaped ampersands in URLs (specifically, it would escape
them internally, before feeding the document to sgmls).  This was
IMO a Bad Thing, since it did much to spread confusion, including
perhaps yours.
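
For illustration only, here is a rough sketch of what such a pre-escaping
hack might look like. This is not the webtechs code, which I am not quoting
from; the function name, the attribute list and the regular expressions are
all my own assumptions, and it only handles double-quoted attribute values.

```python
import re

# Hypothetical simplification: only double-quoted href/src/action values.
URL_ATTR = re.compile(r'\b(href|src|action)="([^"]*)"', re.IGNORECASE)

# An ampersand that does not already begin a semicolon-terminated reference.
BARE_AMP = re.compile(r'&(?!#?\w+;)')


def escape_url_ampersands(markup: str) -> str:
    """Rewrite bare '&' as '&amp;' inside URL-valued attributes only."""
    def fix(match):
        name, value = match.group(1), match.group(2)
        fixed = BARE_AMP.sub("&amp;", value)
        return '%s="%s"' % (name, fixed)
    return URL_ATTR.sub(fix, markup)


print(escape_url_ampersands('<a href="script.pl?foo=bar&copy=true">x</a>'))
```

The point is that the parser then sees valid markup, so the author never
learns that the document as written was ambiguous.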

> >     <http://www.example.com/script.pl?foo=bar&copy=true>
> >
> > can be interpreted as the copyright symbol, which is not what you
> > intended.

Not only can, but will.  And we're not talking some obscure browser
here either: we're talking (AFAIK) every modern browser.
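
A small sketch of the ambiguity, using Python's html module as a stand-in
for a browser's error-correction. That substitution is an assumption on my
part: html.unescape does not know about attribute context, and real browsers
may differ in detail.

```python
import html

url = "http://www.example.com/script.pl?foo=bar&copy=true"

# Unescaped in the markup: "&copy" is taken as an entity reference
# even though the trailing semicolon is missing.
as_parsed = html.unescape(url)

# Escaped as the validator demands: "&amp;" decodes back to a literal "&".
escaped = html.escape(url, quote=True)
round_trip = html.unescape(escaped)

print(as_parsed)
print(escaped)
print(round_trip)
```

Only the escaped form survives the round trip with the "copy" parameter
intact.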

> In which case, you moan at the browser writer for not insisting on
> the trailing semicolon of &copy;

Browsers do lots of error-correction, and we don't (usually) complain.

>	,or for trying to pass an unescaped
> copyright symbol in a GET request.

No, they do URL-escape any characters that require it.
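
As a sketch of what that URL-escaping looks like for the copyright sign,
using Python's urllib.parse.quote. Which character encoding a given browser
applies is an assumption here; both Latin-1 and UTF-8 are shown.

```python
from urllib.parse import quote

copyright_sign = "\u00a9"

# Percent-encode the single Latin-1 byte 0xA9.
latin1 = quote(copyright_sign, encoding="latin-1")

# Percent-encode the two-byte UTF-8 sequence (quote's default encoding).
utf8 = quote(copyright_sign)

print(latin1)
print(utf8)
```

Either way, what goes out in the GET request is plain ASCII.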

> Yes, the fact that the forms authors didn't do much reading is a
> problem in principle. But it's rarely a problem in practice.

AIUI the URL-encoding scheme predates the expression of HTML as SGML.

-- 
Nick Kew

Received on Saturday, 9 June 2001 06:53:34 UTC