
Re: HTML entities and the validator...

From: David Dorward <david@dorward.me.uk>
Date: Sun, 24 Apr 2011 14:15:53 +0100
Message-Id: <A1CAA7AC-F119-4FDC-9348-F503A1009001@dorward.me.uk>
To: public-qa-dev Dev <public-qa-dev@w3.org>, "www-validator@w3.org Community" <www-validator@w3.org>

On 23 Apr 2011, at 14:48, sierkb@gmx.de wrote:
> Question: is there a definition anywhere stating whether, in HTML (specifically HTML 4.01) and/or its parent, SGML, it is allowed and valid to shorten an entity (like &amp;) to "&;" (ampersand + semicolon) inside a URL in a given <a href="URL">, using &; instead of &amp;, so that the W3C Markup Validator is right NOT to label it as an error and to let it pass as valid?

& followed by a non-name character is treated the same as &amp;
; is not a name character
&; is thus the same as &amp;; (a literal ampersand followed by a literal semicolon)

In a query string, most (but not all) systems will let you separate key=value pairs with either & or ;[1] so

?foo=bar&baz=ping is usually treated the same way as ?foo=bar;baz=ping BUT is still a distinct URI
?foo=bar&;baz=ping is thus treated the same way as ?foo=bar&&baz=ping and ?foo=bar;;baz=ping (two separators with an empty pair between them)

So:

* It is not a validity error
* Since &; is not the same as &, it is still an authoring error (the URI being linked to is not the one the author intended to link to)
* If you are looking to avoid typing character references out, and the server side process supports it, use ; instead of (not as well as) &.
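
The points above can be sketched in a few lines of Python. This is only an illustration under some assumptions: html.unescape follows HTML5 character reference rules rather than the SGML/HTML 4.01 rules the validator uses (the two agree on leaving "&;" as a literal ampersand plus semicolon), and split_pairs is a hypothetical stand-in for a server-side process that accepts both & and ; as separators.

```python
import html
import re


def split_pairs(query):
    """Hypothetical server-side splitter: accepts '&' and ';' as separators
    and ignores the empty pairs produced by doubled separators."""
    return [tuple(pair.split("=", 1)) for pair in re.split(r"[&;]", query) if pair]


# href attribute values as they might appear in the HTML source.
as_written = "?foo=bar&;baz=ping"
as_intended = "?foo=bar&amp;baz=ping"

# Decode character references to get the URI the link actually points to.
uri_written = html.unescape(as_written)    # "&;" is not a reference, so it stays as-is
uri_intended = html.unescape(as_intended)  # "&amp;" decodes to "&"

print(uri_written)                  # ?foo=bar&;baz=ping
print(uri_intended)                 # ?foo=bar&baz=ping
print(uri_written == uri_intended)  # False: two distinct URIs

# A lenient server would still see the same key=value pairs in both.
print(split_pairs(uri_written[1:]) == split_pairs(uri_intended[1:]))  # True
```

So the two links may well behave the same on a lenient server, but they are different URIs, and a server that only splits on & would see a key named ";baz" rather than "baz".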

[1] http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2

-- 
David Dorward
http://dorward.me.uk
Received on Sunday, 24 April 2011 13:18:32 GMT
