Re:  (was: link check) from Martin Duerst on 2001-07-03 (www-validator@w3.org from July 2001)

From: Martin Duerst <duerst@w3.org>
Date: Tue, 03 Jul 2001 23:44:42 +0900
To: Frank Ellermann <Frank.Ellermann@t-online.de>, Terje Bless <link@tss.no>, Tim Bagot <tsb-w3-validator-0004@earth.li>, Hugo Haas <hugo@w3.org>
Cc: www-validator@w3.org
Message-Id: <4.2.0.58.J.20010703233350.03d6a930@sh.w3.mag.keio.ac.jp>

Hello Frank,

Using &#128; and &#131;, with or without an explicit charset,
is wild abuse. These are not what you mean, regardless
of the charset. &#128; is an undefined control character.
&#131; is NBH (no break here) (see
http://www.unicode.org/charts/PDF/U0080.pdf). All numeric
character references refer to Unicode, since HTML 2.0,
even if some older browsers don't do that correctly.
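
To make that concrete, here is a small Python sketch (not anything
the validator does). It assumes the old browser reads the numbers
128 and 131 as windows-1252 byte values, which is my guess at where
the euro and florin glyphs come from; correct_reference is just a
name made up for this illustration.

```python
import unicodedata

# Numeric character references name Unicode code points, so &#128; and
# &#131; are U+0080 and U+0083, both C1 control characters (category "Cc").
for number in (128, 131):
    print(number, hex(number), unicodedata.category(chr(number)))

# Assumption: the intended glyphs come from reading 0x80 and 0x83 as
# windows-1252 bytes. Decoding them gives the code points that the
# references should have used instead.
def correct_reference(byte_value):
    char = bytes([byte_value]).decode("cp1252")
    return "&#%d;" % ord(char)

print(correct_reference(0x80))  # euro sign, U+20AC
print(correct_reference(0x83))  # f with hook, U+0192
```

Under that assumption, the references to write are &#8364; and
&#402; (or &euro; and &fnof; where the entities are supported).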

Actually, although the character numbers in the
&#x80; - &#x9F; range are, strictly speaking, legal in XML
(see http://www.w3.org/TR/REC-xml#NT-Char), I'm thinking
about checking for them in the validator, because using them
(as something that they are not) is a very frequent mistake.
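
Such a check could look roughly like the following Python sketch.
This is only an illustration of the idea, not the validator's code;
the names NCR and find_c1_references are invented here, and the
pattern also accepts an uppercase "X", which HTML allows but XML
does not.

```python
import re

# Matches decimal (&#128;) and hexadecimal (&#x80;) numeric character
# references.
NCR = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

def find_c1_references(text):
    """Return (reference, code point) pairs that fall in U+0080..U+009F."""
    hits = []
    for match in NCR.finditer(text):
        hex_digits, dec_digits = match.groups()
        code_point = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if 0x80 <= code_point <= 0x9F:
            hits.append((match.group(0), code_point))
    return hits

# Only the first two references are in the C1 range; &#8364; and &#160;
# are left alone.
print(find_c1_references("price: &#128;5, f: &#x83;, ok: &#8364; &#160;"))
```

A real check would report these as warnings rather than errors,
since the references are still legal XML.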

Regards,   Martin.

At 11:51 01/07/03 +0200, Frank Ellermann wrote:
>Hi Terje, Tim, and Hugo...
>
>thanks for your answers, now I'll know how to interpret
>this kind of check result (and I even managed to create a
>form doing this without further typing :-)
>
>I hope you do like these problems, because here's my next
>observation, now it's the XHTML-transitional-validator:
>
>Trying to find a workaround for &euro; and &fnof; with
>my (very) old browser I now abuse &#128; and &#131; and
>an explicit charset (instead of documenting the abuse).
>
>The XHTML-check doesn't comment on this practice.  Later I
>needed the same hack in another document, but a bug in
>my script generated a DOS-EOF character at the end (hex.
>1A, remember ? :-)  Of course the validator does not
>accept this... but it also complains about the 2nd of 2
>&#128; and the 1st of 2 &#131; !?!
>
>After removing the EOF-nonsense: *No errors found.  So
>a single character can have strange side effects in
>other parts of the checked document... intentionally ?
>
>                 Bye, Frank

Received on Tuesday, 3 July 2001 23:29:20 UTC