
RE: FW: I-D ACTION:draft-connolly-text-html-00.txt

From: Larry Masinter <masinter@parc.xerox.com>
Date: Mon, 4 Oct 1999 13:06:35 PDT
To: <Jukka.Korpela@hut.fi>
Cc: <www-html@w3.org>
Message-ID: <000b01bf0ea3$f5f81e80$8c67010d@copper.parc.xerox.com>
In regard to
> >   ftp://ftp.parc.xerox.com/pub/masinter/draft-connolly-text-html-02.txt
which says:
> > It is intended to obsolete the previous IETF
> > documents defining HTML, including RFC 1866, RFC 1867, RFC 1980,
> > RFC 1942 and RFC 2070.

Jukka wrote:

> But simply obsoleting those RFCs would be going backwards, unless
> essential information from them is carefully incorporated into
> the HTML specification by the W3C. There is a large number of issues
> where those RFCs address points which are not addressed or have been
> formulated vaguely in HTML 4.0 (or HTML 4.01).

It is now my belief that most of these points were intentionally
left more vague; see below.

>   ... For example,
>   - RFC 1866 makes the useful requirement that EM and STRONG must
>     be rendered as distinctly from each other and from normal text.

In fact, since implementations vary on whether EM and STRONG are
rendered distinctly, authors should not rely on there being
a distinction. This might form a 'note' in HTML 4.01.

>   - RFC 1866 makes some vague notes on nested text-level elements,
>     whereas HTML 3.2 suggests that "user agents should do their best to
>     respect nested emphasis" - and HTML 4.0 ignores the whole issue,
>     unless I'm missing something;
>     this is _not_ just a presentational question - the basic problem
>     is whether e.g. in <strong>...<em>...</em>...</strong>
>     the text inside the EM element is just emphasized as compared with
>     normal text or _relatively_ emphasized with respect to the content
>     of its parent element

Again, since there are no guarantees and implementations vary,
it is more appropriate to note this. I'm hoping that an explicit
note to this effect can be added to HTML 4.01 before it becomes
a recommendation.

>   - RFC 1866 describes, under "6. Characters, Words, and Paragraphs",
>     TEXTAREA as preformatted text, which corresponds to actual
>     implementations; the corresponding discussion in HTML 4.0 spec
>     explicitly says that PRE is the only exception to the collapse
>     of white space

HTML 4.01 should probably note this change more explicitly.

> - RFC 2070 makes, with some handwaving, serious attempts at
>   formulating requirements and recommendations on having
>   character encoding ("charset") information handled properly;
>   as an example, it makes the following important point (in 1.2.2):
>       To ensure interoperability and proper support for at least ISO-
>       8859-1 in an environment where character encoding schemes other
>       than ISO-8859-1 are present, user agents MUST correctly interpret
>       the charset parameter accompanying an HTML document received from
>       the network.
>   But HTML 4.0 does not seem to make any such requirement, or even
>   a suggestion. In fact, it explicitly says: "This specification does not
>   mandate which character encodings a user agent must support." 
>   ( http://www.w3.org/TR/REC-html40/charset.html#h-5.2 )   
>   (Note: Not even US-Ascii is required to be supported!)

Well, if there's a requirement for various charset parameters,
it could be in draft-connolly-text-html rather than in the HTML
specification itself. My impression is that the ISO-8859-1 requirement
only applies to HTTP in any case. See section 19.3 of RFC 2616. 
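
To illustrate what "correctly interpret the charset parameter" could involve
for a document fetched over HTTP, here is a minimal Python sketch. It is not
taken from any of the specifications discussed; the function names are
hypothetical, and the ISO-8859-1 fallback is an assumption based on the
HTTP/1.1 default for "text" media types that carry no charset parameter.

```python
from email.message import Message

# Assumption: HTTP/1.1 treats "text" media types without an explicit
# charset parameter as ISO-8859-1.
HTTP_DEFAULT_CHARSET = "iso-8859-1"


def charset_of(content_type: str) -> str:
    """Return the charset named in a Content-Type value, lowercased.

    Falls back to the assumed HTTP default when no charset is given.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset() or HTTP_DEFAULT_CHARSET


def decode_body(body: bytes, content_type: str) -> str:
    """Decode an entity body using the charset from its Content-Type.

    Raises LookupError if the charset label is unknown to Python.
    """
    return body.decode(charset_of(content_type))
```

A user agent that ignored the parameter and always assumed ISO-8859-1 would
misread any document labeled with a different encoding, which is the
interoperability failure the quoted RFC 2070 text is concerned with.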

> - in menus created with a set of radio buttons or with a select element,
>   there is great confusion between different specs (incl. RFC 1866),
>   and HTML 4.0 doesn't clear it up - au contraire, it increases
>   the vagueness; see http://www.hut.fi/u/jkorpela/forms/choices.html#app

This is something that's worth clarifying in HTML 4.01, if only
to note the ambiguity.

> - for file input, RFC 1867 is the only extensive description, and
>   HTML 4.0 makes references to it in a vague manner - something between
>   informative and normative it seems; anyway, it is obvious from the note
>   at http://www.w3.org/TR/REC-html40/interact/forms.html#h-
>   that the HTML 4.0 specification was not written to be a standalone
>   description of all aspects of file input;

I agree that this is a problem, and hope that substitute wording can
be added to HTML 4.01.

>   nasty question of the day: what does the maxlength attribute mean
>   in an <input type="file">
>   a) by RFC 1867
>   b) by HTML 4.0
>   c) in current implementations?

Good question, but what's the answer?

> I suppose this contains more examples than would really be needed.

Actually, it would be great to have a complete list as feedback to
the "Proposed Recommendation".

> Wouldn't it be better to specify, as an interim solution, to make
> it possible to redefine the text/html media type in a reasonable time,
> that the HTML-related RFCs are _not_ obsoleted? On the contrary,
> they should be listed as sources of additional information, to be
> used in issues where they relate to the meanings of constructs
> defined in HTML 4.0(1).

That something is obsoleted doesn't mean that it is no longer
a reference to what the specification _used to be_. The RFC doesn't
go away. But I wouldn't recommend that anyone use the RFCs, either
as guidelines for authoring or as guidelines for implementing.

If the HTML 4.01 document is inadequate, then we can and
should fix it. There's not much time, so your help would be

Received on Monday, 4 October 1999 16:06:34 UTC
