Re: "Special characters for XHTML"

Matthew Wilson wrote:
> I'm not sure whether or not this is being sent to the correct e-mail address, but it was listed in the XHTML errata.
> 
> After looking through the character entity sets for XHTML I noticed the following lines in http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent ;
> 
> <!ENTITY quot    "&#34;"> <!--  quotation mark, U+0022 ISOnum -->
> <!ENTITY amp     "&#38;#38;"> <!--  ampersand, U+0026 ISOnum -->
> <!ENTITY lt      "&#38;#60;"> <!--  less-than sign, U+003C ISOnum -->
> <!ENTITY gt      "&#62;"> <!--  greater-than sign, U+003E ISOnum -->
> 
> It occurred to me that the middle two lines should probably read;
> 
> <!ENTITY amp     "&#38;"> <!--  ampersand, U+0026 ISOnum -->
> <!ENTITY lt      "&#60;"> <!--  less-than sign, U+003C ISOnum -->
> 
> As "&#38;#38;" and "&#38;#60;" are surely invalid character codes and hex 26 = decimal 38, likewise hex 3C = decimal 60- is this correct?

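No, the declarations in the file are correct as they stand. Neither & 
nor < can appear literally in XML content, so the replacement text of 
those two entities must itself be a character reference.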

Quote from the XML spec on these entities (it applies to any XML DTD, 
the XHTML ones included):

If the entities lt or amp are declared, they MUST be declared as 
internal entities whose replacement text is a character reference to the 
respective character (less-than sign or ampersand) being escaped; the 
double escaping is REQUIRED for these entities so that references to 
them produce a well-formed result. If the entities gt, apos, or quot are 
declared, they MUST be declared as internal entities whose replacement 
text is the single character being escaped (or a character reference to 
that character; the double escaping here is OPTIONAL but harmless). For 
example:

<!ENTITY lt     "&#38;#60;">
<!ENTITY gt     "&#62;">
<!ENTITY amp    "&#38;#38;">
<!ENTITY apos   "&#39;">
<!ENTITY quot   "&#34;">

http://www.w3.org/TR/REC-xml/#sec-entexpand
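
If you want to see the difference for yourself, here is a minimal sketch 
in Python, assuming the standard xml.etree.ElementTree module (which 
uses Expat and expands entities declared in the internal subset). The 
entity name "e" and the helper expand() are made up for the test; I 
avoid the names lt and amp themselves because a parser may treat those 
as predefined and never look at the declaration.

```python
import xml.etree.ElementTree as ET


def expand(entity_value: str) -> str:
    """Declare a hypothetical entity `e` with the given literal value,
    reference it once in content, and return the resulting text."""
    doc = (
        "<!DOCTYPE p [\n"
        f'  <!ENTITY e "{entity_value}">\n'
        "]>\n"
        "<p>a &e; b</p>"
    )
    return ET.fromstring(doc).text


# Double-escaped, as in xhtml-special.ent: the declaration stores the
# replacement text "&#60;", which is parsed again when &e; is referenced
# and only then becomes a literal "<" in character data.
print(expand("&#38;#60;"))  # a < b
print(expand("&#38;#38;"))  # a & b

# ">" needs no second level of escaping.
print(expand("&#62;"))  # a > b

# Single-escaped "<": the replacement text is a bare "<", which is then
# parsed as the start of markup, so the document is not well-formed.
try:
    expand("&#60;")
except ET.ParseError as err:
    print("not well-formed:", err)
```

The last case is the proposed "corrected" declaration, and it is the one 
that fails: the character reference is resolved when the declaration is 
read, so the replacement text is already a bare "<" by the time the 
entity is referenced.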


~Grauw

-- 
Ushiko-san! Kimi wa doushite, Ushiko-san!!

Received on Wednesday, 25 May 2005 18:26:18 UTC