
Re: [VE][394] Error Message Feedback

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Sun, 28 May 2006 00:33:53 +1000
Message-ID: <44786351.1030206@lachy.id.au>
To: Nick Kew <nick@webthing.com>
CC: henri aruküla <henriarukyla@gmail.com>, www-validator@w3.org

Nick Kew wrote:
> On Saturday 27 May 2006 00:09, you wrote:
>> I tried to validate my site and i had problem with "&" symbol.
>> I fixed that by $return = str_replace( '&', '&amp;', $return );
>> But now i get errors:
>>
>> http://www.dix.pri.ee/index.php?option=com_content&task=blogcategory&id=13&Itemid=34&lang=est
>>
>> I think that, the semicolon is exists end of the &amp;
>>
>> What shall i do to fix it?

The first error shows this fragment:

...&amp;Itemid=33&lang=est"

The ampersand before lang also needs to be escaped as &amp;.  As it
stands, the document is not well-formed XML.
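
A rough sketch of the difference, in Python rather than the PHP the site
uses (the markup strings and the helper name is_well_formed are made up
for illustration, modelled on the fragment above). It feeds the
half-escaped link and a fully escaped one to an XML parser:

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical raw URL, modelled on the fragment the validator reported.
raw_url = "index.php?option=com_content&Itemid=33&lang=est"

def is_well_formed(markup: str) -> bool:
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False

# Half-escaped, as in the reported fragment: the '&' before lang is raw.
broken = '<a href="index.php?option=com_content&amp;Itemid=33&lang=est">x</a>'

# Escape the raw URL exactly once, at the point it is written into markup.
fixed = '<a href="' + escape(raw_url, quote=True) + '">x</a>'

print(is_well_formed(broken))  # False
print(is_well_formed(fixed))   # True
```

Escaping the whole URL in one place, just before output, avoids both
missed ampersands and double escaping.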

But since the document is being incorrectly served as text/html, HTML
rules apply in practice.  Under HTML rules, &lang is a valid entity
reference to the character '〈' (U+2329, left-pointing angle bracket),
so you are actually depending on non-conformant browser behaviour for
the link to work as intended.

http://www.w3.org/TR/html401/sgml/entities.html
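
If you want to check the mapping without reading the entity list, here
is a small Python sketch. It assumes Python's html.entities.name2codepoint
table, which covers the HTML 4.01 entity names:

```python
from html.entities import name2codepoint  # HTML 4.01 entity names
import unicodedata

# Look up the code point that the entity name "lang" stands for.
cp = name2codepoint["lang"]
print(f"U+{cp:04X}")
print(unicodedata.name(chr(cp)))
```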

> There's a serious bug in the validator here: it proclaims the page
> valid and highlights the unterminated entity refs as warnings
> rather than errors.

That's just one of the validator's many limitations with XML, which stem
from its origin as an SGML-based validator.

>  That would be correct for an HTML page (under SGML rules).

While it would be technically valid under SGML rules, the fragment would
actually be equivalent to the following:

...&amp;Itemid=33〈=est"

That is clearly not what was intended, though it's also not how browsers
actually handle it.
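
To make the SGML reading concrete, here is a simplified Python sketch
(not how the validator or any browser is implemented). The function
sgml_expand and its regular expression are my own illustration: the
function expands every known HTML 4.01 entity reference, whether or not
it ends in a semicolon, and so yields the parsed value of the fragment.
For contrast, Python's html.unescape, which follows the later HTML5
rules and is only a rough stand-in for browser behaviour, leaves the
unterminated &lang alone:

```python
import re
from html import unescape
from html.entities import name2codepoint  # HTML 4.01 entity names

# Under SGML rules a reference ends at the first character that cannot be
# part of the name, so the trailing ';' is optional in this pattern.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.-]*);?")

def sgml_expand(text: str) -> str:
    """Expand known entity references, terminated or not (simplified)."""
    def repl(match):
        name = match.group(1)
        if name in name2codepoint:
            return chr(name2codepoint[name])
        return match.group(0)  # unknown name: leave it untouched
    return ENTITY_REF.sub(repl, text)

fragment = "...&amp;Itemid=33&lang=est"

print(sgml_expand(fragment))  # &lang becomes U+2329
print(unescape(fragment))     # &lang=est is left as written
```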

-- 
Lachlan Hunt
http://lachy.id.au/
Received on Saturday, 27 May 2006 14:34:16 GMT
