- From: Mike Meyer <mwm@contessa.phone.net>
- Date: Tue, 23 Jan 1996 08:29:03 PST
- To: www-html@w3.org
> >2) Codes or names -must- be used to replace characters which would otherwise
> >be interpreted as mark-up. There are four [<>&"], and they conform to ISO
> >standards for their codes and names. Other codes or names from 8859-1 may
> >be used to avoid similar confusion, e.g., [/\-_].
>
> Your phrase "otherwise be interpreted as mark-up" is the key, but it's also
> ambiguous. As far as I understand (and you may have meant this), only < needs
> to always be replaced by its entity (&lt;). The others [>&"] only need to be
> replaced by their entities (&gt; &amp; and &quot;) if they're inside a tag.

Not quite. & needs to be replaced outside a tag too if it's followed by a
name character or a '#', in any context where entities are recognized, which
means most contexts in HTML. For instance, to present the string '&amp' you
need to use '&amp;amp'.

Greater than should only occur inside a tag if it's inside of quotes. Some
browsers incorrectly terminate the tag early if they see a '>' inside of
quotes and so require the replacement, so the replacement is a good idea
most of the time. Other browsers don't require the quotes and require the
replacement as well. Depending on a browser interpreting an illegal
construct a specific way is a bad idea.

Similar comments apply about double quotes, except that more browsers don't
recognize single quotes as a valid mechanism for quoting attribute values.

If you want a simple phrasing that catches all required instances, "< and &
anywhere they occur, and > and " inside of double-quoted attribute values"
would do.

	<mike
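
The simple phrasing above can be sketched in a few lines of Python. This is only an illustration, not anything from the thread: the function names escape_text and escape_attr are made up for the example, and it replaces every '&' rather than only those followed by a name character or '#', which is stricter than required but never wrong. It also always writes the terminating ';' on each entity.

```python
def escape_text(s: str) -> str:
    """Escape character data outside of tags (hypothetical helper).

    '&' is replaced first so that the '&' introduced by '&lt;' is not
    escaped a second time.
    """
    return s.replace("&", "&amp;").replace("<", "&lt;")


def escape_attr(s: str) -> str:
    """Escape a value that will sit inside a double-quoted attribute.

    Applies the text rule, then also replaces '>' and '"', since some
    browsers end the tag or the value early when they see them.
    """
    return escape_text(s).replace(">", "&gt;").replace('"', "&quot;")


if __name__ == "__main__":
    print(escape_text("AT&T sells <widgets>"))
    print('<img alt="%s">' % escape_attr('a "quoted" word > here'))
```

Escaping '&' unconditionally trades a little verbosity for not having to decide whether the next character could start an entity name.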
Received on Tuesday, 23 January 1996 11:56:47 UTC