
Re: Non-SGML Char Refs

From: Thanasis Kinias <tkinias@optimalco.com>
Date: Fri, 08 Jun 2001 09:21:58 -0700
To: Martin Duerst <duerst@w3.org>, Bjoern Hoehrmann <derhoermi@gmx.net>
Cc: "'www-validator@w3.org'" <www-validator@w3.org>
Message-id: <01060809215801.05845@localhost.localdomain>
On Tuesday 05 June 2001 01:02, Martin Duerst wrote:
> At 04:32 01/06/05 +0200, Bjoern Hoehrmann wrote:
> >* Thanasis Kinias wrote:
> > >The validator complains about "non-SGML character" references (e.g.,
> > > &#147; instead of the correct &#8220;) only when validating as XHTML. 
> > > That implies that &#147; and the other Microsoft characters from
> > > decimal 128-159 (hex 80-9f) _are_ valid in HTML.
> >
> >They are, they just refer to non-printing control characters.
> No, sorry, they are not. See
> http://www.w3.org/TR/REC-html40/sgml/sgmldecl.html
>           BASESET  "ISO Registration Number 177//CHARSET

Funny, I quoted this exactly in my original post.  Great minds must think 
alike, eh Martin?

> Actually, these code positions are valid (though rather useless)
> in XML, but they are invalid in HTML. So I'm not sure what the
> result is for XHTML.

The intent of my original post (which was admittedly not entirely clear) was 
to find out why the validator does exactly the opposite of this: it accepts 
these character references in HTML4 but complains about them in XHTML. (WDG's 
validator, BTW, complains about them under HTML4 DTDs, too.)
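
To make the "&#147; instead of the correct &#8220;" case concrete, here is a minimal sketch that maps a reference in the 128-159 range to the code point it presumably meant. It assumes the author of the page intended the Windows-1252 ("Microsoft") reading of those numbers; the function name suggest_reference is hypothetical and is not part of any validator.

```python
def suggest_reference(n: int) -> str:
    """Return the numeric character reference that should replace &#n;.

    Assumption: values 128-159 were meant as Windows-1252 characters.
    """
    if not 128 <= n <= 159:
        # Outside the range under discussion; leave the reference alone.
        return f"&#{n};"
    try:
        # Decode the single byte as Windows-1252 to get the intended character.
        ch = bytes([n]).decode("cp1252")
    except UnicodeDecodeError:
        # A few positions (129, for example) are unassigned in Windows-1252.
        raise ValueError(f"&#{n}; has no Windows-1252 meaning") from None
    return f"&#{ord(ch)};"


print(suggest_reference(147))  # the left double quotation mark case above
```

For 147 this yields &#8220;, which matches the correction given in the original post.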

I don't think these can be valid code positions in XML, because an XML doc is 
also an SGML doc, so if SGML disallows them XML must also, no?
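
One way to test this question is to compare the two character repertoires directly. The sketch below is not the validator's logic. The XML ranges are transcribed from the Char production of XML 1.0, and the HTML ranges are my reading of the DESCSET in the HTML 4 SGML declaration cited above; treat both tables as assumptions and check them against the specifications. The names used here are hypothetical.

```python
# Code point ranges (inclusive) allowed by the XML 1.0 Char production.
XML_CHAR_RANGES = [
    (0x9, 0xA), (0xD, 0xD), (0x20, 0xD7FF),
    (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]

# Code point ranges (inclusive) NOT marked UNUSED in the HTML 4 DESCSET,
# as transcribed from the SGML declaration (an assumption to verify).
HTML4_CHAR_RANGES = [
    (9, 10), (13, 13), (32, 126), (160, 55295), (57344, 1114111),
]


def in_ranges(cp: int, ranges) -> bool:
    """True if code point cp falls inside any (low, high) pair."""
    return any(low <= cp <= high for low, high in ranges)


for cp in (147, 8220):
    print(cp,
          "XML:", in_ranges(cp, XML_CHAR_RANGES),
          "HTML4:", in_ranges(cp, HTML4_CHAR_RANGES))
```

If the tables are right, 147 falls inside the XML ranges but outside the HTML 4 ones, while 8220 is inside both. XML documents are governed by their own SGML declaration rather than the HTML 4 one, which would explain how the two can differ.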

At any rate, the validator is producing erroneous output for HTML4, and maybe 
for XHTML as well.

Thanasis Kinias
Vice President & Manager of Information Systems
Optimal LLC
Scottsdale, Arizona, USA
Received on Friday, 8 June 2001 12:22:10 UTC
