Re: Erroneous 'unescaped &' warning message from CGI urls

From: Bob Long <bob@oblong.com.au>
Date: Tue, 20 Feb 2001 21:20:22 +1000
Message-ID: <004b01c09b2f$1eab87e0$0200a8c0@bob>
To: "html-tidy" <html-tidy@w3.org>

----- Original Message -----
From: "Martyn J Shaw" <mjshaw@uclan.ac.uk>
To: "bob" <bob@oblong.com.au>; "html-tidy" <html-tidy@w3.org>
Sent: Tuesday, February 20, 2001 8:36 PM
Subject: Re: Erroneous 'unescaped &' warning message from CGI urls


> Bob
>
> I've seen this argument before somewhere.  Is the full answer that the
> HTML 4.01 spec says that the href attribute takes a URI as a value, that
> a URI is of type CDATA, and that a user agent should replace character
> entities in CDATA sections?
>
> Martyn Shaw

From comp.infosystems.www.authoring.html message by Jukka Korpela:
http://groups.google.com/groups?q=group:comp.infosystems.www.authoring.html+insubject:character+insubject:references+author:jukka&hl=en&lr=&safe=off&rnum=1&seld=924924029&ic=1

[Subject: "Character references" explained; Date: 16/Feb/2001.]

... "On the other hand, especially when URLs with query parts, such as
http://www.server.example/cgi-bin/x.pl?foo=bar&copy=42
occur in HTML documents, you need to take into account that
an HTML parser (a correct one at least) will recognize anything
that starts with & and a letter as an entity reference, even if
it is not terminated by a semicolon. In the example case,
&copy is taken as an entity reference that denotes &#169; which
in turn denotes the copyright sign, which is not allowed in a URL,
but I digress. To prevent this, replace that & by &amp; which is
an entity reference that denotes &#38; that denotes the & character
as such (i.e. as not to be taken as a constituent of a character
reference or an entity reference)."
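
To make the point above concrete, here is a rough sketch in Python using the
standard library's html module. It is only an illustration: html.unescape
follows the HTML5 list of named references rather than the SGML rule described
in the quote, and a browser may treat attribute values differently, so take it
as an approximation of what a parser can do with the unescaped URL. The
variable names are mine.

```python
import html

# URL taken from the quoted example above.
url = "http://www.server.example/cgi-bin/x.pl?foo=bar&copy=42"

# Decoding the raw URL as if it were HTML text: "&copy" is read as an
# entity reference even though no semicolon follows it.
mangled = html.unescape(url)

# Escaping the ampersand before placing the URL in an attribute value.
escaped = html.escape(url, quote=True)
anchor = f'<a href="{escaped}">some text</a>'

# Decoding the escaped form gives back the URL that was intended.
round_trip = html.unescape(escaped)

print(mangled)
print(anchor)
print(round_trip == url)
```

The escaped form is what belongs in the HTML source; the user agent turns
&amp; back into a plain & before the URL is requested.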

Bob Long

> >>> Bob Long <bob@oblong.com.au> 02/18/01 11:27am >>>
> ----- Original Message -----
> From: "Chris Hamer-Hodges" <chh@delcam.com>
>
>
> > ---BUG REPORT---
> >
> > If I add a link to a CGI script that takes more than one argument
> > eg.
> > <a href="cgi-bin/cgi-script?arg1=value1&arg2=value2">some text</a>
> > I get a warning from Tidy saying: Warning: unescaped & or unknown entity
> > "&arg2"
> >
> > -------------------------
> >
> > Otherwise great utility :-)
> >
> > Chris HH
>
> Tidy is correct. You should really use &amp; rather than simply &.
>
> Bob Long
Received on Tuesday, 20 February 2001 06:20:36 GMT
