Validator-Bug with: <div>:1:&nbsp;</div>

Hi

I think we found a serious bug in the validator. As far as I
understand, the following document is valid:

-------------------------------------------
http://www.manuelmoser.de/stuff/validator/notvalid1.html
-------------------------------------------
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>blah</title>
</head>
<body>
<div>:1:&nbsp;</div>
</body>
</html>
-------------------------------------------

But the validator complains about the document with the following
message:

-------------------------------------------
Validation Output: 1 Error
 Line 1, Column 14: &nbsp;</div>.
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/x
-------------------------------------------


Some testing has shown that the line number reported for this error
seems to be fairly random (it depends on the code). I even got an error
at line 34 in a document with 18 lines of code. The third line of the
message usually shows an unrelated line, or some tags after the line
where I see the problem. This gave me some hints, and I was able to
reduce the problem to this line:

<div>:1:&nbsp;</div>

What seems to matter for the error is that there are two colons with
a number in between, followed by a &nbsp;.

Some examples:

Removing the number makes the document valid:

-------------------------------------------
http://www.manuelmoser.de/stuff/validator/valid1.html
-------------------------------------------
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>blah</title>
</head>
<body>
<div>::&nbsp;</div>
</body>
</html>
-------------------------------------------

Changing the number to a letter makes the document valid:

-------------------------------------------
http://www.manuelmoser.de/stuff/validator/valid2.html
-------------------------------------------
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>blah</title>
</head>
<body>
<div>:a:&nbsp;</div>
</body>
</html>
-------------------------------------------

Removing the &nbsp; makes the document valid:

-------------------------------------------
http://www.manuelmoser.de/stuff/validator/valid3.html
-------------------------------------------
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>blah</title>
</head>
<body>
<div>:1:</div>
</body>
</html>
-------------------------------------------
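
For reference, the four test cases above can be summarized in a small
Python sketch. The regular expression only describes the trigger as
observed (colon, number, colon, &nbsp;); it is not taken from the
validator's source. The names TRIGGER, CASES and matches_trigger are
made up for illustration. Only the single digit 1 was tested, so
treating multi-digit numbers the same way is an assumption.

```python
import re

# Hypothetical description of the observed trigger, NOT the validator's logic:
# a colon, one or more digits, a colon, then the &nbsp; entity.
# Assumption: multi-digit numbers behave like the single digit that was tested.
TRIGGER = re.compile(r":\d+:&nbsp;")

# The <div> line from each of the four test documents in this report.
CASES = {
    "notvalid1.html": "<div>:1:&nbsp;</div>",  # reported as an error
    "valid1.html": "<div>::&nbsp;</div>",      # number removed
    "valid2.html": "<div>:a:&nbsp;</div>",     # number replaced by a letter
    "valid3.html": "<div>:1:</div>",           # &nbsp; removed
}


def matches_trigger(fragment: str) -> bool:
    """Return True if the fragment contains the observed trigger pattern."""
    return TRIGGER.search(fragment) is not None


if __name__ == "__main__":
    for name, fragment in CASES.items():
        print(name, matches_trigger(fragment))
```

Only the line from notvalid1.html matches the pattern, which agrees
with the validator results described above.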

If someone can confirm this bug, we should file it in Bugzilla.

Manuel Moser

Received on Friday, 27 July 2007 15:30:40 UTC