- From: David Dorward <david@dorward.me.uk>
- Date: Tue, 24 Apr 2007 23:15:21 +0100
- To: www-html@w3.org
Mike S wrote:
> The W3C validator (using HTML 4.01 Transitional) says that a & in a URL
> should be encoded as &amp;. I don't think that this should be required.

& should be encoded as &amp; except in attribute values which represent URLs? Please, no! Simplicity is a virtue, and exceptions are the enemy of simplicity.

> For one thing, I like to keep my code neat with as few entities as
> possible, and having to encode &'s all the time doesn't really help
> that.

Your options include using an authoring tool that does it for you, or using semi-colons instead (most form data parsing libraries I've encountered respect the advice of HTML 4.01: http://www.w3.org/TR/html4/appendix/notes.html#h-B.2.2).

> Another (more important) reason is that an entity is not recognized as
> an entity unless it starts with &, and ends with a semicolon.

If I remember correctly, that is not true. The semi-colon is optional where the reference is followed by a non-name character. So ?foo=bar&amp=12 is an HTML representation of ?foo=bar&=12. I'm not a big fan of this and would rather the semi-colon were required (as it is in XML based languages) for the reasons mentioned above (simplicity).

> A URL such
> as the one in <a href="somepage.php?foo=1&copy=2"> has the string
> '&copy' in it, but it has no trailing semicolon and therefore should not
> recognized as an entity in a browser. (I just tested this in Firefox,
> and it does indeed convert &copy to a copyright symbol, but I see this
> as incorrect behavior as the HTML spec itself states that "In SGML, it
> is possible to eliminate the final ";" after a character reference in
> some cases (e.g., at a line break or immediately before a tag).", and
> inside an attribute value is not a line break or before a tag.)

Those "some cases" include, I believe, "if the next character is a non-name character such as an equals sign". The example was just that, not a complete list of circumstances.

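
To make the point concrete, here is a rough sketch in Python. It is only an illustration: the helper name escape_href is made up for this example, and Python's html.unescape follows the HTML5 rules for text content rather than the SGML rules discussed here, so treat the decoding half as an approximation of what a lenient browser does with a semicolon-less reference.

```python
import html


def escape_href(url: str) -> str:
    """Hypothetical helper: escape a URL for use inside an HTML attribute.

    html.escape turns & into &amp; (and escapes quotes when quote=True),
    which is what the validator asks for.
    """
    return html.escape(url, quote=True)


raw_url = "somepage.php?foo=1&copy=2"

# Escaped form, safe to place inside href="...".
escaped = escape_href(raw_url)
print(escaped)  # somepage.php?foo=1&amp;copy=2

# Decoding the escaped form gives back the URL the author intended.
print(html.unescape(escaped))  # somepage.php?foo=1&copy=2

# Decoding the unescaped form treats "&copy" as a character reference
# even though the semicolon is missing, so the URL is silently changed.
print(html.unescape(raw_url))  # somepage.php?foo=1©=2

# The same leniency applies to "&amp" followed by a non-name character.
print(html.unescape("?foo=bar&amp=12"))  # ?foo=bar&=12
```

Escaping every & the same way, URL or not, means the author never has to think about which parameter names happen to collide with entity names.
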
> (If I'm wrong, and it is actually legal to omit the ; after an entity,
> then perhaps it should be required to stop confusion like this?)

It is in XHTML.

-- 
David Dorward <http://dorward.me.uk/>
Received on Tuesday, 24 April 2007 22:15:35 UTC