XHTML: errors in "HTMLspecialx.ent"

This pertains to the character set

<!ENTITY % HTMLspecial PUBLIC
   "-//W3C//ENTITIES Special//EN//HTML"
   "http://www.w3.org/TR/xhtml1/DTD/HTMLspecialx.ent">

which is declared and referenced in the "strict", "transitional", and
"frameset" XHTML DTDs.

This character entity set contains the following entity definitions:

<!ENTITY amp     "&#38;"> <!--  ampersand, U+0026 ISOnum -->

<!ENTITY lt      "&#60;"> <!--  less-than sign, U+003C ISOnum -->

These two declarations produce replacement texts that are not well-formed
according to clause 2.4, "Character Data and Markup", of the current XML
specification (http://www.w3.org/TR/REC-xml).

When the XML processor reads the declarations, it resolves the character
references in the entity values, so the replacement texts of "amp" and "lt"
become the literal characters "&" and "<" respectively. These are not
acceptable replacement texts: whenever "&amp;" or "&lt;" is referenced, the
processor substitutes the bare "&" or "<" and parses it as if it had
appeared literally in the document. XML does not allow "&" and "<" as data
characters except where "&amp;" and "&lt;" would not be recognized anyway,
namely in comments, processing instructions, and CDATA sections.
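
The failure described above can be reproduced with an ordinary parser. The
following is a minimal sketch, assuming Python's standard
xml.etree.ElementTree module. The entity name "mylt" and the element name
"r" are hypothetical; a non-predefined name is used on the assumption that
some parsers treat declarations of "lt" and "amp" themselves specially.

```python
import xml.etree.ElementTree as ET


def parses(entity_value: str) -> bool:
    """Return True if a document that references the entity is well-formed."""
    # "mylt" and "r" are illustrative names, not taken from the XHTML DTDs.
    doc = (
        '<!DOCTYPE r [<!ENTITY mylt "%s">]>'
        "<r>a &mylt; b</r>" % entity_value
    )
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


# Single-escaped value: the replacement text is a bare "<".
print(parses("&#60;"))
# Same problem with a bare "&" as the replacement text.
print(parses("&#38;"))
```

Both calls are expected to report a document that is not well-formed,
because the bare character is parsed as the start of markup or of a
reference once the entity is expanded.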

Accordingly, these entities must be redefined as

<!ENTITY amp    "&#38;&#38;" > <!--  ampersand, U+0026 ISOnum -->

<!ENTITY lt      "&#38;&#60;"> <!--  less-than sign, U+003C ISOnum --> 

These redefinitions give the entities replacement texts that are themselves
character references ("&#38;" and "&#60;"). That is, the character
references (not the data characters) are supplied when the entities are
referenced, and the processor then resolves them to data characters. See
clause 4.6, "Predefined Entities", and the commentary on it on pages
229-230 of Bob DuCharme's "XML: The Annotated Specification"
(Prentice-Hall, 1999).
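
The effect of the double escaping given in clause 4.6 of the XML
specification ("&#38;#38;" and "&#38;#60;") can be checked in the same
spirit. This is a sketch only, again assuming Python's standard
xml.etree.ElementTree; the entity name "e" and the element name "r" are
hypothetical stand-ins for "amp" and "lt" in a real DTD.

```python
import xml.etree.ElementTree as ET


def expand(entity_value: str) -> str:
    """Declare entity "e" with the given value and return its expansion."""
    # "e" and "r" are illustrative names, not taken from the XHTML DTDs.
    doc = '<!DOCTYPE r [<!ENTITY e "%s">]><r>&e;</r>' % entity_value
    return ET.fromstring(doc).text


# The replacement text is the character reference "&#60;" (or "&#38;"),
# which the parser resolves to a data character only after expansion.
print(expand("&#38;#60;"))
print(expand("&#38;#38;"))
```

Because the replacement text is a character reference rather than the bare
character, the reference expands to character data and the document remains
well-formed.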

Donald Gignac 

Received on Tuesday, 1 June 1999 04:49:44 UTC