assorted HTML and SGML questions

Joe Wells (jbw@cs.bu.edu)
Sat, 18 Nov 1995 22:30:14 -0500


From: jbw@cs.bu.edu (Joe Wells)
Date: Sat, 18 Nov 1995 22:30:14 -0500
To: www-html@w3.org
Subject: assorted HTML and SGML questions
Message-Id: <SatNov18222818EST1995jbw@cs.bu.edu>

Hi, HTML and SGML gurus,

I've got some more questions whose answers I haven't been able to find in
my WWW browsing.  Some of these questions are about HTML, some are about
SGML, and some are about HTML as an SGML document type.

Q: (("text/html" Internet Media Type)) Does text/html forbid including the
   SGML declaration (<!SGML ...>)?  I know it forbids including a document
   type declaration subset, but the standard is unclear on whether the
   SGML declaration is allowed.
   
Q: ((Internet Media Types for SGML)) Since the text/html Internet media
   type forbids including a DTD subset, what media type should one use if
   one wishes to transmit an HTML document with a DTD subset via HTTP?  Is
   there something like a text/sgml media type defined anywhere?
   
Q: ((HTML and Empty P Elements)) What are the semantics of an empty P
   element in HTML?  The standard doesn't really seem to deal with this.
   There are *lots* of documents on the net with *lots* of empty P
   elements.  Is it reasonable for a user agent to issue a warning that
   this is bad HTML?
   
Q: ((SGML Mixed Content)) I'm not sure if I understand the mixed content
   rules properly.  Let me state what I guess the rules are so that you
   can tell me if I got it right or wrong.  Here is what I think the rules
   are:
     
     * If a content model contains #PCDATA anywhere, then the entire
       element has "mixed content".
     * If an element does _not_ have mixed content, then a sequence of
       characters between two tags that is composed solely of whitespace
       (SPACE, TAB, RS, RE) is ignored.  Otherwise, the whitespace is
       treated as ordinary data characters and must correspond to an
       occurrence of #PCDATA in the content model.
   
   Is this right?
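
   A minimal Python sketch of the two rules exactly as guessed above, to
   make the question concrete.  It encodes the guess, not a confirmed
   reading of the SGML standard.  The function names and the two example
   content models are illustrative assumptions, and RS and RE are taken
   to be line feed and carriage return.

```python
# Separator characters named in the guess: SPACE, TAB, RS, RE.
# Assumption: RS is line feed and RE is carriage return.
WHITESPACE = set(" \t\n\r")


def has_mixed_content(content_model):
    """Rule 1 as guessed: #PCDATA anywhere in the model means mixed content."""
    return "#PCDATA" in content_model.upper()


def classify_run(content_model, text):
    """Rule 2 as guessed: classify a run of characters found between two tags.

    Returns "ignored" for a whitespace-only run in an element without mixed
    content, and "data" for everything else.
    """
    whitespace_only = all(ch in WHITESPACE for ch in text)
    if whitespace_only and not has_mixed_content(content_model):
        return "ignored"
    return "data"


def run_is_allowed(content_model, text):
    """A "data" run must correspond to #PCDATA, so it needs a mixed model."""
    kind = classify_run(content_model, text)
    return kind == "ignored" or has_mixed_content(content_model)


# Hypothetical content models, for illustration only.
LIST_MODEL = "(LI)+"
TEXT_MODEL = "(#PCDATA | A | EM)*"

print(classify_run(LIST_MODEL, "\n  "))
print(classify_run(TEXT_MODEL, "\n  "))
print(run_is_allowed(LIST_MODEL, "hello"))
```

   Under this guess, the same whitespace-only run is dropped inside the
   first model but counts as data inside the second, and non-whitespace
   text inside the first model has no #PCDATA to correspond to.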
   
Q: ((SGML LITLEN)) Is the SGML limit on attribute value lengths (LITLEN)
   applied to the attribute value after parsing and entity replacement or
   before?
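
   To illustrate why the answer matters, here is a small Python sketch
   that measures one attribute value literal both ways.  It does not say
   which measurement SGML uses.  The entity table, the limit of 16, and
   the function names are hypothetical values chosen for the example.

```python
import re

# Hypothetical entity declarations and length limit, for illustration only.
ENTITIES = {"amp": "&", "longname": "x" * 20}
LIMIT = 16


def expand(literal):
    """Replace references of the form &name; with their replacement text.

    Unknown references are left untouched.
    """
    return re.sub(
        r"&(\w+);",
        lambda m: ENTITIES.get(m.group(1), m.group(0)),
        literal,
    )


def lengths(literal):
    """Return (length before entity replacement, length after)."""
    return len(literal), len(expand(literal))


before, after = lengths("a&longname;b")
print(before, before <= LIMIT)
print(after, after <= LIMIT)
```

   With these made-up values the literal is within the limit as written
   but over it once the entity is replaced, so the two interpretations
   give different verdicts for the same document.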
   
Q: ((HTML PRE Containing FORM)) RFC 1866 says this:
   
     For example, a <PRE> element may contain a <FORM> element, ...
   
   This doesn't make any sense because it contradicts the DTD given in the
   same document.  What's the story here?
   
Q: ((HTML INPUT and SELECT Attributes)) Why is the SIZE attribute of the
   INPUT element specified to have type CDATA while the SIZE attribute of
   the SELECT element is specified to have type NUMBER?  Is this to allow
   dimension units to be specified?  The standard doesn't say.

Thanks for any help you can give me.

-- 
Joe Wells <jbw@cs.bu.edu>