- From: Laurens Holst <lholst@students.cs.uu.nl>
- Date: Wed, 25 May 2005 20:26:22 +0200
- To: Matthew Wilson <Matthew.Wilson@YourMove.co.uk>
- Cc: www-html-editor@w3.org
Matthew Wilson wrote:
> I'm not sure whether or not this is being sent to the correct e-mail address, but it was listed in the XHTML errata.
>
> After looking through the character entity sets for XHTML I noticed the following lines in http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent ;
>
> <!ENTITY quot "&#34;"> <!-- quotation mark, U+0022 ISOnum -->
> <!ENTITY amp "&#38;#38;"> <!-- ampersand, U+0026 ISOnum -->
> <!ENTITY lt "&#38;#60;"> <!-- less-than sign, U+003C ISOnum -->
> <!ENTITY gt "&#62;"> <!-- greater-than sign, U+003E ISOnum -->
>
> It occurred to me that the middle two lines should probably read;
>
> <!ENTITY amp "&#38;"> <!-- ampersand, U+0026 ISOnum -->
> <!ENTITY lt "&#60;"> <!-- less-than sign, U+003C ISOnum -->
>
> As "&#38;#38;" and "&#38;#60;" are surely invalid character codes and hex 26 = decimal 38, likewise hex 3C = decimal 60 - is this correct?

No, the existing declarations are correct, because neither the & nor the < can be written without using entities. Quote from the XML spec on predefined entities (which also applies to DTDs in general):

  If the entities lt or amp are declared, they MUST be declared as internal entities whose replacement text is a character reference to the respective character (less-than sign or ampersand) being escaped; the double escaping is REQUIRED for these entities so that references to them produce a well-formed result. If the entities gt, apos, or quot are declared, they MUST be declared as internal entities whose replacement text is the single character being escaped (or a character reference to that character; the double escaping here is OPTIONAL but harmless). For example:

  <!ENTITY lt "&#38;#60;">
  <!ENTITY gt "&#62;">
  <!ENTITY amp "&#38;#38;">
  <!ENTITY apos "&#39;">
  <!ENTITY quot "&#34;">

http://www.w3.org/TR/REC-xml/#sec-entexpand

~Grauw

-- 
Ushiko-san! Kimi wa doushite, Ushiko-san!!
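
The double escaping described in the reply can be checked with a small sketch using Python's standard xml.etree.ElementTree parser. The entity name x, the doc element, and the helper names expand and is_well_formed are made up for illustration (a hypothetical name is used instead of lt or amp because parsers may treat the predefined names specially). This is only a minimal demonstration under those assumptions, not part of the XHTML DTDs.

```python
import xml.etree.ElementTree as ET


def expand(entity_value: str) -> str:
    """Declare a hypothetical entity 'x' with the given literal value,
    reference it once in content, and return the resulting text."""
    doc = (
        '<!DOCTYPE doc [ <!ENTITY x "' + entity_value + '"> ]>'
        "<doc>a &x; b</doc>"
    )
    return ET.fromstring(doc).text


def is_well_formed(entity_value: str) -> bool:
    """Report whether referencing the entity yields well-formed content."""
    try:
        expand(entity_value)
        return True
    except ET.ParseError:
        return False


# Double escaped: the replacement text is the character reference "&#60;"
# (or "&#38;"), which is resolved to plain data when the entity is referenced.
print(expand("&#38;#60;"))
print(expand("&#38;#38;"))

# Single escaped: the replacement text is a bare "<" or "&", which is then
# re-parsed as markup at the point of reference and rejected.
print(is_well_formed("&#60;"))
print(is_well_formed("&#38;"))

# For ">" a single level of escaping is enough.
print(expand("&#62;"))
```

The single-escaped forms fail because the character reference is already resolved when the declaration is read, so the entity's replacement text ends up containing the raw markup character.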
Received on Wednesday, 25 May 2005 18:26:18 UTC