Re: HTML entities and the validator...

On 23 Apr 2011, at 14:48, sierkb@gmx.de wrote:
> Question: is there, by any means, anywhere, a definition, if in HTML (concrete: HTML 4.01) and/or it's parent, SGML, it's allowed and valid to shorten an entity (like &amp;) to "&;" (Ampersant + Semikolon) in a given <a href="URL"> with &; instead of &amp;" within an URL construct, so that the W3C Markup Validator is right, in NOT labeling it as an error and let passing it as valid?

& followed by a non-name character is treated the same as &amp;
; is not a name character
&; is thus the same as &amp;;
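A quick way to see this is to decode the three spellings of the attribute value. This is only a sketch: it uses Python's html.unescape, which follows HTML5 rules rather than the SGML rules behind HTML 4.01, but the two agree for this input. The example query string is an assumption chosen for illustration.

```python
import html

# Decode three spellings of the same href value.
# html.unescape applies HTML5 character reference rules (an assumption
# here: HTML 4.01/SGML gives the same answer for these inputs).
short_form = html.unescape("?foo=bar&;baz=ping")      # "&;" as typed by the author
long_form = html.unescape("?foo=bar&amp;;baz=ping")   # "&amp;;" written out in full
intended = html.unescape("?foo=bar&amp;baz=ping")     # what the author meant

# "&;" is left alone because ";" cannot start an entity name,
# so it decodes to the same string as "&amp;;" does.
print(short_form == long_form)   # the two spellings agree
print(short_form == intended)    # but neither is the intended value
```

The first two values come out identical, and both still contain the stray semicolon that the intended value lacks.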

In a query string, most (but not all) systems will let you separate key=value pairs with either & or ;[1], so:

?foo=bar&baz=ping is usually treated the same way as ?foo=bar;baz=ping BUT is still a distinct URI
?foo=bar&;baz=ping is thus treated the same way as ?foo=bar&&baz=ping and ?foo=bar;;baz=ping
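As a rough illustration of a lenient server-side parser, the sketch below splits on both separators and skips empty segments. parse_pairs is a hypothetical helper written for this example, not the behaviour of any particular server or library; real systems differ (some accept only &, some keep empty pairs).

```python
import re


def parse_pairs(query):
    """Split a query string on '&' or ';' and skip empty segments.

    Hypothetical helper: models a lenient parser, not a specific server.
    """
    pairs = []
    for segment in re.split(r"[&;]", query):
        if not segment:
            continue  # "&;", "&&" and ";;" each leave an empty segment
        key, _, value = segment.partition("=")
        pairs.append((key, value))
    return pairs


queries = [
    "foo=bar&baz=ping",
    "foo=bar;baz=ping",
    "foo=bar&;baz=ping",
    "foo=bar&&baz=ping",
    "foo=bar;;baz=ping",
]
results = [parse_pairs(q) for q in queries]

# Five distinct query strings (so five distinct URIs) ...
print(len(set(queries)))
# ... that this lenient parser reads as the same key=value pairs.
print(all(r == results[0] for r in results))
```

A parser that accepts only & as a separator would instead read ";baz" as the second key in the &; case, which is why the result depends on the server side process.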

So:

* It is not a validity error
* Since &; is not the same as &amp;, it is an error (since the URI being linked to is not the one the author intended to link to)
* If you are looking to avoid typing character references out, and the server side process supports it, use ; instead of (not as well as) &.

[1] http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2

-- 
David Dorward
http://dorward.me.uk

Received on Sunday, 24 April 2011 13:16:26 UTC