Re: SGML, HTML, and whitespace [was: 24 Oct Release of HTML 4.0 spec]

Dan Connolly wrote:

[All changes below will appear in the next draft.]

> I think that adding section numbers to such XXX references
> (and in the process, checking the HTML 4 spec against
> the XXX spec for consistency) would be cost-effective
> in improving the quality of the HTML 4 spec. Please
> do it if you can find the time before 31 Oct; otherwise,
> maybe before the HTML 4 REC. For example,
> I suggest the spec read:
>         "SGML rules for white space and line
>         breaks (cf. [ISO8879] section 7.6.1) ..."

I am adding this to our TODO list.

> [Re: sgmltut.src]
> "A complete discussion of SGML parsing, e.g. the
> mapping of a sequence of characters to a sequence of tags and data,
> is left to the SGML standard. This section is only a summary. "
> 
> That text appears in 3.1, but applies to all of section 3,
> esp. elements, attributes, ...

Ok, I'm generalizing this statement to read:

   A complete discussion
   of SGML is left to the SGML standard. The following sections only
   provide summary information.


> Please subordinate the sections "HTML syntax " and
> "How to read the HTML DTD" under "Introduction to SGML"

I don't really agree and would like to discuss this more.

> "SGML (and HTML) rules for white space characters and line breaks
> allow authors to write legible documents with white space and extra
> lines that will not be rendered by a user agent. "
> 
> That just totally muddies the waters. There are SGML rules
> about ignored record start/record end characters (cf. ISO8879,
> section 7.6.1), and then there are HTML rules (suggestions,
> actually) about collapsing whitespace during rendering.
> 
> The distinction between normative and non-normative information
> is crucial, and has been repeatedly stressed by members of
> the WG. We MUST not lose it in our efforts to make the spec
> more readable.
> 
> The HTML rules MUST NOT appear in this non-normative section.

Then I shall remove this section from the SGML tutorial and keep
the existing section in text.src, which discusses SGML's
treatment of newlines and HTML's treatment of white space
separately.
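
To make the distinction concrete, here is a rough Python sketch that
keeps the two steps apart. It is only an illustration under simplifying
assumptions: the function names are hypothetical, the first step covers
only the common case of a line break right after a start tag or right
before an end tag (not the full record-boundary rules of [ISO8879]
7.6.1), and the second step models the suggested rendering behaviour,
not a parsing rule.

```python
import re

def drop_sgml_line_breaks(markup: str) -> str:
    """Parsing step (simplified): ignore a line break that immediately
    follows a start tag or immediately precedes an end tag."""
    # Line break right after a start tag; "</" is excluded by requiring a letter.
    markup = re.sub(r"(<[A-Za-z][^>]*>)\r?\n", r"\1", markup)
    # Line break right before an end tag.
    markup = re.sub(r"\r?\n(</[A-Za-z][^>]*>)", r"\1", markup)
    return markup

def collapse_white_space(text: str) -> str:
    """Rendering step: collapse each run of white space into one space."""
    return re.sub(r"[ \t\r\n\f]+", " ", text)

sample = "<P>\nThomas is   watching\nTV.\n</P>"
parsed = drop_sgml_line_breaks(sample)    # line breaks next to the tags are gone
rendered = collapse_white_space(parsed)   # remaining runs become single spaces
print(repr(parsed))
print(repr(rendered))
```

Keeping the two functions separate mirrors the point above: the first
belongs to SGML parsing, the second to what a user agent does when it
renders the text.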

> And I notice 10.1 still says "must", as in "Thus, the following
> two examples must be rendered identically: ". Change it to "should".

Also gone.

> More notes as I review this section:
> 
> "SGML applications conforming to [ISO8879] are expected to recognize
> a number of features that aren't widely supported by HTML user
> agents. "
> 
> That seems to confuse the term "SGML application" (which means
> something more like "SGML profile" than "SGML implementation")
> with "SGML system" which means "SGML implementation".
> 
> In short, please s/applications/systems/.

I have already moved this material to appendix/notes.src, and
made your new suggested changes there.

> Another:
> 
> "Document Type Declaration Subset"
> 
> Strike this whole subsection, and change the "should"
> in "8.2 HTML version information" to "must" per my
> earlier message.

This is currently in global.src and reads as you indicate.

> "3.3.3 Element definitions"
> 
> Change it to "Element Declarations"
> 
> And it confuses "element" with "element type." An element
> declaration declares an element type; an element is
> a particular start-tag/content/end-tag sequence within
> the instance, not a class of things.
> 
> One way to effect this change without sounding too pedantic
> is to *not* refer to elements at all in 3.3.3, but only
> to element names, content models, etc. For example,
> 
> s/The element being defined is UL/The element name is UL/
> 
> Occasionally you have to speak of element types. But
> we can still be kinder, gentler without being imprecise,
> a la:
> 
> s!both the start tag <UL> and the
>      end tag </UL> for this element !both the start tag <UL> and the
>      end tag </UL> for this type of element !

I will go through the tutorial (and if I have time, the spec)
and clean this up as you indicate.
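
As a small illustration of the element type / element distinction, the
Python sketch below reads one element declaration (one element type)
and then counts the elements of that type in a document instance. It is
a sketch only: the regular expression handles just simple declarations
of the form shown, and the names ElementDeclaration,
parse_element_declaration and count_elements are hypothetical.

```python
import re
from typing import NamedTuple

class ElementDeclaration(NamedTuple):
    name: str                  # element name, e.g. "UL"
    start_tag_omissible: bool  # "O" in the first minimization slot
    end_tag_omissible: bool    # "O" in the second minimization slot
    content_model: str         # e.g. "(LI)+"

# Matches: <!ELEMENT name (-|O) (-|O) content-model [-- comment --]>
DECL = re.compile(
    r"<!ELEMENT\s+(\S+)\s+([-O])\s+([-O])\s+(.+?)\s*(?:--.*?--\s*)?>",
    re.S,
)

def parse_element_declaration(text: str) -> ElementDeclaration:
    """Read the name, tag minimization and content model of one declaration."""
    match = DECL.search(text)
    if match is None:
        raise ValueError("not a simple element declaration")
    name, start, end, model = match.groups()
    return ElementDeclaration(name, start == "O", end == "O", model)

def count_elements(instance: str, name: str) -> int:
    """Count the start tags for the given element name in an instance."""
    return len(re.findall(r"<" + re.escape(name) + r"\b", instance, re.I))

decl = parse_element_declaration("<!ELEMENT UL - - (LI)+ -- unordered list -->")
doc = "<UL><LI>one</UL><P>text</P><ul><li>two</ul>"
print(decl)
print(count_elements(doc, decl.name))
```

One declaration gives one element type; the instance above contains two
elements of that type.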

Thank you,

Ian

-- 
Ian Jacobs / 401 Second Ave. #19G / New York, NY 10010 USA
Tel/Fax: (212) 684-1814
Email: ibjacobs@panix.com

Received on Wednesday, 29 October 1997 09:28:01 UTC