Re: Extraneous Characters in Netscape Display

Paul Grosso (paul@arbortext.com)
Thu, 25 May 95 11:00:09 BST


Date: Thu, 25 May 95 11:00:09 BST
From: paul@arbortext.com (Paul Grosso)
Message-Id: <9505251000.AA13809@texcel.no.texcel.no>
To: www-html@www10.w3.org
Subject: Re: Extraneous Characters in Netscape Display

> From: "Alexander, Larry" <lalexander@acad.com>
> 
> I have been creating Web pages using the netscape rules file in HoTMetaL 
> Pro. Problems appear when I have pages with special entities. Each file 
> begins with:
> 
> <!DOCTYPE HTML PUBLIC "-//Netscape Corp.//DTD HTML plus Tables//EN" 
> "html-net.dtd"
> [
> <!ENTITY pound CDATA "">
> ]>
> 
> When using NCSA Mosaic v2B4 everything looks fine. However, with Spry Air 
> or Netscape, I see:
> 
> ]>
> 
> at the head of each page.
> 
> Is HoTMetaL doing something illegal with the !ENTITY declaration?

The document type declaration--including the optional 'internal subset' 
(the part from [ to ] inclusive)--is valid SGML.

However, many browsers probably do not handle the optional internal
subset.  It looks like some browsers blindly look for the first > to
end the doctype declaration.

Not handling the internal subset is probably understandable for non-SGML
browsers.  However, it would be better [serious understatement!] if 
browsers that don't handle the internal subset would at least skip over
it properly.
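
To make "skip over it properly" concrete, here is a rough sketch in
Python.  The function names are made up for illustration, and the
"naive" version is only my guess at what the affected browsers do, not
anything taken from their code.  The second function is a
simplification, not a real SGML parser: it tracks quoted literals and
the [ ... ] internal subset, but ignores SGML comments and parameter
entity references.

```python
def naive_doctype_end(text):
    """Index just past the first '>' (the behaviour guessed at above)."""
    return text.index(">") + 1


def doctype_end(text):
    """Index just past the '>' that really closes the declaration.

    Tracks quoted literals and the [ ... ] internal subset so that a
    '>' inside either one does not end the declaration early.
    """
    quote = None       # the open quote character while inside a literal
    in_subset = False  # True between '[' and ']'
    for i, ch in enumerate(text):
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "[":
            in_subset = True
        elif ch == "]":
            in_subset = False
        elif ch == ">" and not in_subset:
            return i + 1
    raise ValueError("unterminated document type declaration")


# The declaration from the quoted message, followed by a made-up element.
doc = (
    '<!DOCTYPE HTML PUBLIC "-//Netscape Corp.//DTD HTML plus Tables//EN"\n'
    '"html-net.dtd"\n'
    '[\n'
    '<!ENTITY pound CDATA "">\n'
    ']>\n'
    '<TITLE>Test</TITLE>\n'
)

# The naive scan stops at the '>' of the ENTITY declaration, so the
# stray "]>" is left over and gets treated as document text.
print(repr(doc[naive_doctype_end(doc):]))

# The subset-aware scan leaves only the document proper.
print(repr(doc[doctype_end(doc):]))
```

A browser that has no use for the entity declarations could discard
everything up to the index returned by the second function and carry
on from there.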

I also note, however, that the Public Identifier you show is not one
of the valid choices given in the HTML 2.0 spec.  From the 95 March 31
spec by Dan:

  4.  Document Structure Elements

   To identify information as an HTML document conforming to this
   specification, each document should start with the prologue:

       <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">

       Note: If the body of a text/html body part does not begin
       with a document type declaration, an HTML user agent should
       infer the above document type declaration.

[There are other valid Public Identifiers listed later in the spec, but 
none like the one you show above.]  So, while valid SGML, the document 
you show is not valid HTML.  As such, though you can expect your document
to work with SGML browsers (given that you make your DTD and whatever
style information that's needed available), you cannot necessarily expect
your document to work with non-SGML, HTML-compliant browsers.  Unless I
misunderstand the HTML 2.0 spec, an HTML browser has every right to see 
your doctype statement and say, "oops, this isn't HTML, so I won't attempt
to display this."  

While it might be reasonable behavior for a very forgiving browser to
say "I don't recognize the Public Id in this doctype declaration, so
I'll skip over it and pretend it wasn't there and instead infer the
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN"> declaration and
continue in that manner with this document," I don't think it's safe
to assume that will always happen.
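
The two behaviors can be sketched as follows, again in Python with
hypothetical names.  The set of known Public Identifiers holds only the
one quoted from the spec above (the spec lists others), and the regular
expression is a shortcut that only recognizes a PUBLIC declaration at
the very start of the text; treat it as an illustration of the choice a
browser faces, not as how any real browser works.

```python
import re

# Only the identifier quoted above; the spec lists further valid ones.
KNOWN_PUBLIC_IDS = {"-//IETF//DTD HTML 2.0//EN"}
DEFAULT_PUBLIC_ID = "-//IETF//DTD HTML 2.0//EN"

# Shortcut pattern: <!DOCTYPE HTML PUBLIC "..." at the start of the text.
PUBLIC_ID_RE = re.compile(
    r'<!DOCTYPE\s+HTML\s+PUBLIC\s+"([^"]*)"', re.IGNORECASE
)


def effective_public_id(text, forgiving=True):
    """Return the Public Identifier the browser will act on.

    With forgiving=True an unrecognized identifier is ignored and the
    default is inferred; with forgiving=False the document is refused.
    """
    match = PUBLIC_ID_RE.match(text.lstrip())
    if match is None:
        # No recognizable declaration: infer the default, as in the
        # spec's note.
        return DEFAULT_PUBLIC_ID
    public_id = match.group(1)
    if public_id in KNOWN_PUBLIC_IDS:
        return public_id
    if forgiving:
        return DEFAULT_PUBLIC_ID
    raise ValueError("not an HTML document: " + public_id)


netscape = '<!DOCTYPE HTML PUBLIC "-//Netscape Corp.//DTD HTML plus Tables//EN">'
print(effective_public_id(netscape))  # forgiving: falls back to the default
try:
    effective_public_id(netscape, forgiving=False)
except ValueError as err:
    print("refused:", err)            # strict: declines to display
```

Both branches are defensible, which is the reason an author cannot
count on the forgiving one.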


paul

Paul Grosso
VP Research, ArborText, Inc.
  and
Chief Technical Officer, SGML Open

Email: paul@arbortext.com