Re: Non-SGML Char Refs

From: Thanasis Kinias (tkinias@optimalco.com)
Date: Sun, Jul 15 2001


    Date: Sun, 15 Jul 2001 20:46:44 -0700
    From: Thanasis Kinias <tkinias@optimalco.com>
    To: Bjoern Hoehrmann <derhoermi@gmx.net>, Martin Duerst <duerst@w3.org>
    Cc: tkinias@optimalco.com, "'www-validator@w3.org'" <www-validator@w3.org>, www-html@w3.org
    Message-id: <01071520464400.12933@localhost.localdomain>
    Subject: Re: Non-SGML Char Refs
    
    On Sunday 15 July 2001 16:07, Bjoern Hoehrmann wrote:
    > * Martin Duerst wrote:
    > >At 04:32 01/06/05 +0200, Bjoern Hoehrmann wrote:
    > >>* Thanasis Kinias wrote:
    > >> >The validator complains about "non-SGML character" references (e.g.,
    > >> > &#147; instead of the correct &#8220;) only when validating as XHTML. 
    > >> > That implies that &#147; and the other Microsoft characters from
    > >> > decimal 128-159 (hex 80-9f) _are_ valid in HTML.
    > >>
    > >>They are, they just refer to non-printing control characters.
    >
    > The other way round, valid XML, invalid HTML.
    
    That's not what the validator says.  Check <http://www.asu.edu/>, for example
    (which uses &#149; for bullets).  If you validate it as XHTML 1.0 Transitional,
    you get (among myriad other errors) a series of "Error: reference to
    non-SGML character" messages.  Validated as HTML 4.01, the same page draws
    no error for the &#149; character references.
    
    Whatever the situation is with non-HTML XML or with XHTML, in HTML 4 and
    earlier these character references should be reported as errors, because the
    SGML declaration for HTML forbids them.
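
To illustrate the kind of check being asked for, here is a minimal sketch (not the validator's actual code) that scans markup for numeric character references in the decimal 128-159 range and proposes a replacement. The function name `suspicious_char_refs` is hypothetical, and the suggested replacement rests on an assumption: that the author meant the Windows-1252 character at that byte value, as with &#147; standing in for &#8220;.

```python
import re

# Numeric character references, decimal (&#147;) or hexadecimal (&#x93;).
CHAR_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def suspicious_char_refs(markup):
    """Return (reference, suggestion) pairs for references to 128-159.

    The suggestion is None when no replacement can be proposed.
    """
    findings = []
    for match in CHAR_REF.finditer(markup):
        hex_digits, dec_digits = match.groups()
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        if 128 <= code <= 159:
            try:
                # Assumption: the author meant the Windows-1252 character
                # stored at this byte value.
                intended = bytes([code]).decode("cp1252")
                suggestion = "&#%d;" % ord(intended)
            except UnicodeDecodeError:
                # A few byte values (e.g. 0x81) are undefined in Windows-1252.
                suggestion = None
            findings.append((match.group(0), suggestion))
    return findings
```

Run against a fragment such as `&#149; item`, this reports the pair `("&#149;", "&#8226;")`, that is, the bullet reference and its Unicode equivalent. References outside 128-159, such as &#8220;, are left alone.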
    
    -- 
    Thanasis Kinias
    Optimal LLC
    Scottsdale, Arizona, USA