Re: assorted HTML and SGML questions

Terry Allen (terry@ora.com)
Sun, 19 Nov 1995 07:19:36 PST


Message-Id: <199511191519.HAA19793@rock.west.ora.com>
In-Reply-To: jbw@cs.bu.edu (Joe Wells)
To: jbw@cs.bu.edu (Joe Wells), www-html@w3.org

| Q: (("text/html" Internet Media Type)) Does text/html forbid including the
|    SGML declaration (<!SGML ...>)?  I know it forbids including a document
|    type declaration subset, but the standard is unclear on whether the
|    SGML declaration is allowed.

I don't see where it is forbidden, but of course it isn't needed.

| Q: ((Internet Media Types for SGML)) Since the text/html Internet media
|    type forbids including a DTD subset, what media type should one use if
|    one wishes to transmit an HTML document with a DTD subset via HTTP?  Is
|    there something like a text/sgml media type defined anywhere?

Not forbidden.  The standard says:

   HTML user agents may support other document types. In particular,
   they may support other formal public identifiers, or other document
   types altogether. They may support an internal declaration subset
   with supplemental entity, element, and other markup declarations.

I don't see any language that would classify an HTML document amplified by
an internal subset as not HTML (interestingly).  text/sgml is being
defined in the MIMESGML WG.  See draft-ietf-mime-sgml-00.txt or its
successor.
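
To make "an HTML document amplified by an internal subset" concrete, here
is a minimal sketch in Python.  The sample document, the entity name "wg",
and the function name internal_subset are all made up for illustration.
The regular expression is a rough approximation, not an SGML parser: it
assumes the subset contains no "]" of its own (no marked sections, for
instance).

```python
import re
from typing import Optional

# Rough pattern for a document type declaration with an optional internal
# declaration subset between "[" and "]".  Approximation only: a subset
# that itself contains "]" will confuse it.
DOCTYPE_RE = re.compile(
    r"<!DOCTYPE\s+(?P<name>[A-Za-z][\w.-]*)"  # document type name
    r"(?P<external>[^\[>]*)"                    # external identifier, if any
    r"(?:\[(?P<subset>.*?)\]\s*)?>",            # optional internal subset
    re.IGNORECASE | re.DOTALL,
)

# Hypothetical sample: an HTML 2.0 document with one extra entity declared
# in the internal subset.
WITH_SUBSET = """<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN" [
  <!ENTITY wg "MIMESGML working group">
]>
<TITLE>Example</TITLE>
<P>See the &wg; for text/sgml.
"""

WITHOUT_SUBSET = """<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<TITLE>Example</TITLE>
<P>Plain HTML.
"""


def internal_subset(document: str) -> Optional[str]:
    """Return the internal declaration subset, or None if there is none."""
    match = DOCTYPE_RE.search(document)
    if match is None or match.group("subset") is None:
        return None
    return match.group("subset").strip()


print(internal_subset(WITH_SUBSET))
print(internal_subset(WITHOUT_SUBSET))
```

A user agent that does not support internal subsets could use a check of
this kind to decide whether to hand the document to a fuller SGML parser.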

| Q: ((HTML and Empty P Elements)) What are the semantics of an empty P
|    element in HTML?  The standard doesn't really seem to deal with this.
|    There are *lots* of documents on the net with *lots* of empty P
|    elements.  Is it reasonable for a user agent to issue a warning that
|    this is bad HTML?

No, Ps without content are valid HTML.  The meaning of a P without 
content is that it's a paragraph without any words in it ...
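
If a user agent still wants to mention empty Ps as a matter of style
rather than validity, something like the following Python sketch would
count them.  It is an approximation under stated assumptions: it uses
Python's html.parser rather than an SGML parser, the BLOCK_TAGS list is my
own rough guess at which start tags end an open P, and the class and
function names are invented for this example.

```python
from html.parser import HTMLParser

# Assumption: a rough list of block-level tags whose start tag implicitly
# ends an open P (the P end tag is omissible).
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "ul", "ol", "dl",
              "dir", "menu", "pre", "blockquote", "form", "hr", "address"}


class EmptyParagraphCounter(HTMLParser):
    """Counts P elements with no text and no child elements."""

    def __init__(self):
        super().__init__()
        self.in_p = False
        self.has_content = False
        self.empty_count = 0

    def _close_p(self):
        # Record the open paragraph, if any, and mark it closed.
        if self.in_p and not self.has_content:
            self.empty_count += 1
        self.in_p = False

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._close_p()          # a block start tag ends the open P
        elif self.in_p:
            self.has_content = True  # e.g. IMG or BR counts as content
        if tag == "p":
            self.in_p = True
            self.has_content = False

    def handle_endtag(self, tag):
        if tag == "p":
            self._close_p()

    def handle_data(self, data):
        # Whitespace-only text does not count as content here.
        if self.in_p and data.strip():
            self.has_content = True


def count_empty_paragraphs(markup):
    counter = EmptyParagraphCounter()
    counter.feed(markup)
    counter.close()      # flush any buffered text
    counter._close_p()   # a P still open at end of input
    return counter.empty_count


print(count_empty_paragraphs("<P>Hello<P><P>  <P>World</P><P></P>"))
```

The result is a count to report, not an error: the documents are still
valid HTML.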

| Q: ((SGML Mixed Content)) I'm not sure if I understand the mixed content
|    rules properly.  Let me state what I guess the rules are so that you
|    can tell me if I got it right or wrong.  Here is what I think the rules
|    are:

Don't guess; buy a good book on SGML.  See ftp://ftp.ifi.uio.no/pub/SGML/ .

| Q: ((SGML LITLEN)) Is the SGML limit on attribute value lengths (LITLEN)
|    applied to the attribute value after parsing and entity replacement or
|    before?

In practice, it is ignored by most tools, which is a good thing.  For
details, see that book you're going to buy, or spend a day with the Handbook.
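
To show why the before/after distinction in the question matters, here is
a small Python sketch that measures an attribute value both ways.  It does
not say which measurement the standard intends.  Assumptions: the limit of
1024 is a placeholder to be checked against the SGML declaration actually
in use, html.unescape (which uses Python's HTML entity table, not the
document's DTD) stands in for entity replacement, and the function names
are invented for this example.

```python
import html
from typing import Tuple

# Assumption: placeholder limit; read the real LITLEN from the SGML
# declaration that applies to the document.
ASSUMED_LITLEN = 1024


def literal_lengths(raw_value: str) -> Tuple[int, int]:
    """Return (length as written, length after entity replacement)."""
    return len(raw_value), len(html.unescape(raw_value))


def exceeds_limit(raw_value: str,
                  limit: int = ASSUMED_LITLEN) -> Tuple[bool, bool]:
    """Return (too long as written, too long after replacement)."""
    before, after = literal_lengths(raw_value)
    return before > limit, after > limit


# 300 ampersands written as entity references: 1500 characters as written,
# 300 after replacement, so the two readings of the limit disagree.
value = "&amp;" * 300
print(literal_lengths(value))
print(exceeds_limit(value))
```

A tool that enforces the limit at all has to pick one of the two
measurements; values like the one above are where the choice shows.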

| Q: ((HTML PRE Containing FORM)) RFC 1866 says this:
|    
|      For example, a <PRE> element may contain a <FORM> element, ...
|    
|    This doesn't make any sense because it contradicts the DTD given in the
|    same document.  What's the story here?

A DTD is code+prose, so the statement above belongs to the DTD just as the
code does.  Where the two conflict, I take the code to be normative.


Good questions.  We don't see their like much around here.

Regards,

-- 
Terry Allen  (terry@songline.com), Online Books Editor, Songline Studios
               affiliated with O'Reilly & Associates, Inc.   
A Davenport Group sponsor.  See http://www.ora.com/davenport/README.html
 "Laid across a map of the US, Indonesia would stretch from coast to coast."