- From: Shane P. McCarron <shane@aptest.com>
- Date: Thu, 06 Jul 2000 10:37:59 -0500
- To: Gerald Oskoboiny <gerald@w3.org>, Paul McGarry <paulm@opentec.com.au>, www-validator@w3.org
"Shane P. McCarron" wrote: > > Gerald Oskoboiny wrote: > > There are some cases where the ampersands don't need to be > > escaped, like: <p>foo & bar</p>, or <a href="foo&_bar"> > > > > I don't think I agree. In SGML, an ampersand always introduces an > entity reference. If you want to actually use an ampersand, you are > required to use &. I don't see any way around this requirement. Okay... The XML specification is pretty clear on this, and is available on-line at http://www.w3.org/TR/REC-xml it says: The ampersand character (&) and the left angle bracket (<) may appear in their literal form only when used as markup delimiters, or within a comment, a processing instruction, or a CDATA section. They are also legal within the literal entity value of an internal entity declaration; see "4.3.2 Well-Formed Parsed Entities". If they are needed elsewhere, they must be escaped using either numeric character references or the strings "&" and "<" respectively. The right angle bracket (>) may be represented using the string ">", and must, for compatibility, be escaped using ">" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section. From this I conclude that any use of an ampersand in the PCDATA sections of a document, or in other words in the text of a document, must be to introduce a general entity reference. This is true in all instances where most people might use it. The exception would be a CDATA section (<[CDATA[ stuff ]]>). You might use a CDATA section to delimit javascript code in a document so that it is not processed by the XML processor, for example. -- Shane P. McCarron phone: +1 763 786-8160 ApTest fax: +1 763 786-8180 mobile: +1 612 799-6942 e-mail: shane@aptest.com
Received on Thursday, 6 July 2000 11:38:09 UTC