Re: &#128; (was: link check)

From: Martin Duerst (duerst@w3.org)
Date: Tue, Jul 03 2001

    Message-Id: <4.2.0.58.J.20010703233350.03d6a930@sh.w3.mag.keio.ac.jp>
    Date: Tue, 03 Jul 2001 23:44:42 +0900
    To: Frank Ellermann <Frank.Ellermann@t-online.de>, Terje Bless <link@tss.no>, Tim Bagot <tsb-w3-validator-0004@earth.li>, Hugo Haas <hugo@w3.org>
    From: Martin Duerst <duerst@w3.org>
    Cc: www-validator@w3.org
    Subject: Re: &#128; (was: link check)
    
    Hello Frank,
    
    &#128; and &#131;, with or without an explicit charset,
    are wild abuse. They do not denote the characters you mean,
    whatever the charset. &#128; is an undefined control character.
    &#131; is NBH (no break here) (see
    http://www.unicode.org/charts/PDF/U0080.pdf). All numeric
    character references refer to Unicode, since HTML 2.0,
    even if some older browsers don't do that correctly.
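
To make the difference concrete, here is a minimal Python sketch. It assumes the old browser in question treats the number as a windows-1252 byte, which is the usual source of this mix-up; the helper name `describe_reference` is hypothetical. It contrasts what the reference means in Unicode with what was probably intended, and gives the reference that should be written instead.

```python
import unicodedata

def describe_reference(number):
    """Compare the Unicode meaning of &#number; with the windows-1252 byte reading."""
    # What the reference actually denotes: the Unicode code point itself.
    unicode_char = chr(number)
    # What was probably intended: the same number read as a windows-1252 byte.
    try:
        intended = bytes([number]).decode("cp1252")
    except (UnicodeDecodeError, ValueError):
        # Not a single byte, or a byte that Python's cp1252 codec leaves undefined.
        intended = None
    return {
        "unicode_category": unicodedata.category(unicode_char),
        "intended": intended,
        "correct_reference": f"&#{ord(intended)};" if intended else None,
    }

for n in (128, 131):
    print(n, describe_reference(n))
```

For 128 the Unicode category is `Cc` (a control character), while the windows-1252 reading is the euro sign, U+20AC, so the portable reference is `&#8364;`. For 131 the intended character is U+0192, written `&#402;`.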
    
    Actually, although the character numbers in the
    &#x80; - &#x9F; range are, strictly speaking, legal in XML
    (see http://www.w3.org/TR/REC-xml#NT-Char), I'm thinking
    about checking for them in the validator, because using them
    (as something that they are not) is a very frequent mistake.
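
A check of this kind could look roughly like the Python sketch below. It is not the validator's code; the names `is_xml_char` and `check_references` and the message texts are made up for illustration. It applies the XML 1.0 Char production to each numeric character reference and, for references that are legal but fall in the x80 - x9F range, only warns.

```python
import re

# Numeric character references: decimal (&#128;) or hexadecimal (&#x80;).
NCR = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

def is_xml_char(cp):
    """XML 1.0 Char production: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]."""
    return (cp in (0x9, 0xA, 0xD)
            or 0x20 <= cp <= 0xD7FF
            or 0xE000 <= cp <= 0xFFFD
            or 0x10000 <= cp <= 0x10FFFF)

def check_references(text):
    """Return (reference, message) pairs for suspicious or illegal references."""
    findings = []
    for match in NCR.finditer(text):
        hex_digits, dec_digits = match.groups()
        cp = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if not is_xml_char(cp):
            findings.append((match.group(0), "error: not a legal XML character"))
        elif 0x80 <= cp <= 0x9F:
            findings.append((match.group(0),
                             "warning: C1 control character, probably not what was meant"))
    return findings

for finding in check_references("price: 5 &#128;, f: &#131;, ok: &#8364;, bad: &#x1A;"):
    print(finding)
```

With this input the sketch warns about `&#128;` and `&#131;`, accepts `&#8364;`, and reports `&#x1A;` as an error, since x1A lies outside the Char production.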
    
    Regards,   Martin.
    
    At 11:51 01/07/03 +0200, Frank Ellermann wrote:
    >Hi Terje, Tim, and Hugo...
    >
    >thanks for your answers, now I'll know how to interpret
    >this kind of check result (and I even managed to create a
    >form doing this without further typing :-)
    >
    >I hope you do like these problems, because here's my next
    >observation, now it's the XHTML-transitional-validator:
    >
    >Trying to find a workaround for &euro; and &fnof; with
    >my (very) old browser I now abuse &#128; and &#131; and
    >an explicit charset (instead of documenting the abuse).
    >
    >The XHTML-check doesn't comment on this practice.  Later I
    >needed the same hack in another document, but a bug in
    >my script generated a DOS-EOF character at the end (hex.
    >1A, remember ? :-)  Of course the validator does not
    >accept this... but it also complains about the 2nd of 2
    >&#128; and the 1st of 2 &#131; !?!
    >
    >After removing the EOF-nonsense: *No errors found.  So
    >a single character can have strange side effects in
    >other parts of the checked document... intentionally ?
    >
    >                 Bye, Frank