Unescaped XML Ampersands Incorrectly Validated

This is not well-formed, but the validator passes it:



The issue is that the source contains an unescaped ampersand ("&"),
which the validator does not report as an error.
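
For comparison, here is a minimal sketch of how a non-SP XML parser treats a bare ampersand. It uses Python's standard xml.etree.ElementTree module (backed by expat). The two sample strings and the helper name is_well_formed are made up for illustration; they are not the page from this report, and this is not how the validator itself works.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample documents (not the page from this report).
BAD = "<p>Tom & Jerry</p>"        # bare ampersand: not well-formed XML
GOOD = "<p>Tom &amp; Jerry</p>"   # ampersand escaped as &amp;


def is_well_formed(source: str) -> bool:
    """Return True if the parser accepts the string without a fatal error."""
    try:
        ET.fromstring(source)
    except ET.ParseError:
        # expat reports a bare "&" as a well-formedness (fatal) error.
        return False
    return True


if __name__ == "__main__":
    for doc in (BAD, GOOD):
        verdict = "well-formed" if is_well_formed(doc) else "not well-formed"
        print(repr(doc), "->", verdict)
```

A parser of this kind rejects the first string outright rather than downgrading the problem to a warning.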

This issue was originally noted here:


Masayasu Ishikawa commented:

This is one of the known limitations of SP-derived SGML/XML parsers.
"Real" XML processors can easily catch this kind of fatal error,
e.g. the CSS Validator does catch such errors.

Further IRC discussion:

<xover> Uhm. What's the problem?
<xover> The unescaped ampersand?
<xover> That's an artificial constraint imposed only by the prose of
   the XML REC and inexpressible in a DTD or SGML AFAICT.
<xover> And since OpenSP doesn't allow us to treat it as an error, we
   do the best we can by emitting a warning instead.
<sbp> nonetheless, it's a constraint
<deltab> um, what is?
<sbp> ampersands must be escaped as &amp; in XML PCDATA
<deltab> yes, as they must anywhere
<xover> "anywhere" (almost) in XML. Not in SGML.
<deltab> where not in SGML?
<xover> SGML allows the & to appear bare anywhere it is unambiguous.
- Swhack, 2004-02-29 21:00

Please let me know whether this is an appropriate bug to enter
into the database at <http://www.w3.org/Bugs/Public/>. (It would also
be appreciated if the validator.w3.org feedback page were more
bug-report oriented!)


Sean B. Palmer, <http://purl.org/net/sbp/>
"phenomicity by the bucketful" - http://miscoranda.com/

Received on Sunday, 29 February 2004 16:23:01 UTC