Re: Validator error from Jukka K. Korpela on 2014-10-26 (www-validator@w3.org from October 2014)

From: Jukka K. Korpela <jkorpela@cs.tut.fi>
Date: Mon, 27 Oct 2014 00:29:57 +0200
To: Roman Grinyov <w3lifer@gmail.com>, www-validator@w3.org
Message-ID: <544D75E5.9030602@cs.tut.fi>

2014-10-26 22:38, Roman Grinyov wrote:

> Validation of this page: http://websnippets.ru/article.php?id=30; an
> error occurs on line 33. However, there is no apparent reason for it.

This seems to be an odd bug in the validator. The data contains the 
correct character reference &gt; with no hidden control characters. I 
tried to isolate the problem and noticed that deleting everything after 
the <ul> element on lines 30–36 except the end tag </article> makes the 
page validate. It is a mystery how any content there can make the 
validator reject the correct character reference on line 33.

I even tried deleting the &gt; reference. Then the validator issues an 
error message about &lt;. OK, let’s delete that too. Now it complains 
about &amp;. After removing that as well, I get
“Error: & did not start a character reference. (& probably should have 
been escaped as &amp;.)”
with no reference to any line number.

OK, one more test: starting from the original document and deleting just 
the <textarea> element containing some sample HTML code as data (with 
“<” properly encoded as “&lt;”), the document passes.
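
The delete-and-revalidate steps above could be automated roughly as follows. This is only a sketch: `validates` is a hypothetical callable standing in for a real validator run, and `fake_validates` is a stand-in made up for the demonstration. Removing a whole line can also break the markup in other ways, so the result is a hint, not a diagnosis.

```python
def find_culprit_lines(lines, validates):
    """Return 1-based numbers of lines whose removal alone makes the document pass.

    `validates` is a hypothetical callable: it takes the document text and
    returns True when the validator reports no errors.
    """
    culprits = []
    for i in range(len(lines)):
        # Build a copy of the document with line i left out.
        candidate = lines[:i] + lines[i + 1:]
        if validates("\n".join(candidate)):
            culprits.append(i + 1)
    return culprits


# Stand-in validator for demonstration only: rejects any text containing "&w3".
def fake_validates(text):
    return "&w3" not in text


doc = ["<p>ok &gt; fine</p>", "<p>10 | &w3 </p>", "<p>end</p>"]
print(find_culprit_lines(doc, fake_validates))
```

With a real validator behind `validates`, each call is one validation run, so this costs one run per line of the document.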

The culprit appears to be on line 48:

  &lt;p&gt;10 строка | &w3 &lt;/p&gt;

Validating this line in isolation, with a minimal document around it, 
results in a correct message that points to the “&w3” construct.
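
When the validator points at the wrong line, a stray ampersand like this can also be located with a plain text scan. The following is a minimal sketch, assuming that anything other than `&name;`, `&#123;` or `&#x1F;` after an ampersand is suspicious. It is a deliberately strict heuristic, not the validator's own algorithm, and the names `STRAY_AMP` and `find_stray_ampersands` are made up for this example.

```python
import re

# An "&" is accepted only when followed by "name;", "#digits;" or "#xhex;".
# Everything else is reported. This is a simplification of the HTML rules.
STRAY_AMP = re.compile(
    r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)"
)


def find_stray_ampersands(text):
    """Return (line_number, column, snippet) for each suspicious '&'."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in STRAY_AMP.finditer(line):
            snippet = line[m.start():m.start() + 10]
            hits.append((lineno, m.start() + 1, snippet))
    return hits


sample = (
    "&lt;p&gt;9 &amp; more&lt;/p&gt;\n"
    "&lt;p&gt;10 строка | &w3 &lt;/p&gt;"
)
for hit in find_stray_ampersands(sample):
    print(hit)
```

On the two-line sample, only the second line is reported, and the snippet starts at the “&w3” construct; the correct references &lt;, &gt; and &amp; are left alone.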

The bug in the validator is that it does not report this properly at all 
in the given context but instead flags completely correct character 
references *before* it as erroneous.

The bug is reproducible at http://validator.nu too.

Yucca

Received on Sunday, 26 October 2014 22:30:28 UTC