Re: Ampersands in URLs: Validator conflicts with RFC2396?

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 24 Oct 2002 00:34:56 +0000 (GMT)
To: Maarten de Boer <mdeboer@iua.upf.es>
Cc: "liam@htmlhelp.com" <liam@htmlhelp.com>, "www-validator@w3.org" <www-validator@w3.org>, "gerald@w3.org" <gerald@w3.org>, "sogm@kmt.hku.nl" <sogm@kmt.hku.nl>, ot@w3.mag.keio.ac.jp
Message-ID: <Pine.LNX.4.21.0210240030210.29501-100000@dhalsim.dreamhost.com>

On Wed, 23 Oct 2002, Maarten de Boer wrote:
>
> The w3c validator tells me about the invalid use of an & in a
> URI, just as described in the FAQ.
> 
> http://www.htmlhelp.com/tools/validator/problems.html#amp
> 
> However, I find this explanation rather dubious, when I compare it
> to the "Uniform Resource Identifiers (URI): Generic Syntax" RFC2396

The error has absolutely nothing to do with RFC2396. An ampersand in HTML
text or in an attribute value has to be escaped as &amp;, since & means
something special _to HTML_.

For example:

   <abbr title="Dungeons&Dragons">

...is invalid, because the & tells the parser that the next word (Dragons)
is the name of an entity, which isn't the case.

Instead it has to be written as:

   <abbr title="Dungeons&amp;Dragons">

It's just that ampersands occur in URIs more often, so the error is hit
more often in the context of URIs.
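
As a rough illustration of the escaping described above, here is a minimal
Python sketch. It assumes the standard library's html.escape is an
acceptable stand-in for escaping by hand; the helper name attr_value and
the example.org URI are made up for the example and appear nowhere in the
validator or the FAQ.

```python
from html import escape


def attr_value(raw: str) -> str:
    """Escape raw text for use inside a double-quoted HTML attribute.

    Hypothetical helper: it only wraps html.escape, which replaces
    &, <, > and (with quote=True) the quote characters.
    """
    return escape(raw, quote=True)


# The abbr example: the bare & becomes &amp;
title = attr_value("Dungeons&Dragons")

# A URI with a query string (example.org is a placeholder). The URI
# itself is unchanged; only its spelling inside the HTML source is.
href = attr_value("http://example.org/search?q=dragons&page=2")

print(f'<abbr title="{title}">')
print(f'<a href="{href}">')
```

The point is that the escaping is applied to the attribute value as HTML
source text, whatever that value happens to be, so a URI gets no special
treatment.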

The problem is most obvious if the text after the & character is something
like "gt", as in:

   foo&gt

...because that is then EXACTLY equivalent to:

   foo>

...which is not the same as:

   foo&amp;gt
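
The same difference can be seen by decoding the two spellings. This is
only a sketch: it uses Python's html.unescape, which follows the HTML5
character reference rules rather than the SGML rules the validator
applies, so treat it as an approximation of what a browser or parser
would do. The variable names are made up for the example.

```python
from html import unescape

# Unescaped: "&gt" is read as a reference to the "gt" entity, even
# though the trailing semicolon is missing.
decoded_unescaped = unescape("foo&gt")

# Escaped: "&amp;" decodes to a literal "&", and the "gt" that follows
# stays as plain text.
decoded_escaped = unescape("foo&amp;gt")

print(decoded_unescaped)
print(decoded_escaped)
```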

HTH,
-- 
Ian Hickson                                      )\._.,--....,'``.    fL
"meow"                                          /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 23 October 2002 20:35:07 GMT
