
Editorial comments (last ones ;)

From: Gerald Oskoboiny <gerald@w3.org>
Date: Wed, 10 Dec 1997 19:40:56 -0500 (EST)
To: www-html-editor@w3.org
Message-ID: <Pine.SOL.3.96.971210193658.20591L-100000@anansi>

The ToC says:

   14.Style Sheets - Controlling the presentation of an HTML document

Is this what we want to say? You can't "control" the presentation of
an HTML document; style sheets are just suggestions, aren't they?

Maybe better would be:

   14.Style Sheets - Adding presentation hints to an HTML document

?

Next, there are a couple of typos in /struct/text.src:

    notes on line brakes</a> in the appendix.
    notes on line brakes</a> in the appendix.

"brakes" should be "breaks" (2 occurrences).


"B.3.1 Search robots"
http://www.w3.org/MarkUp/Group/9712/PR-html40-971205/appendix/notes.html#h-B.3.1

This section talks about the robots.txt file.

I didn't review this very carefully, but one thing I noticed
is that it says:

    There must be exactly one "User-agent" field.

This isn't quite right. There can be more than one "User-agent" field
in the robots.txt file, just not more than one per "record". That's
not clear with the current wording.
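
To make the file-versus-record distinction concrete, here is a minimal
sketch in Python. It assumes records are separated by blank lines and
that "#" starts a comment; the sample file, its robot name and its
paths are made up for illustration, and the helper names are
hypothetical. It is not a full robots.txt parser.

```python
# Hypothetical sample file: two records, each with one User-agent field.
ROBOTS_TXT = """\
# sample robots.txt (contents made up for illustration)
User-agent: webcrawler
Disallow:

User-agent: *
Disallow: /tmp/
Disallow: /logs/
"""


def split_records(text):
    """Split robots.txt text into records (lists of non-empty lines)."""
    records, current = [], []
    for raw in text.splitlines():
        if not raw.strip():
            # A blank line ends the current record, if there is one.
            if current:
                records.append(current)
                current = []
            continue
        # Drop any comment; a comment-only line is skipped and is
        # not treated as a record boundary.
        line = raw.split("#", 1)[0].strip()
        if line:
            current.append(line)
    if current:
        records.append(current)
    return records


def user_agent_counts(text):
    """Return the number of User-agent fields found in each record."""
    return [
        sum(1 for line in record if line.lower().startswith("user-agent:"))
        for record in split_records(text)
    ]


counts = user_agent_counts(ROBOTS_TXT)
print(counts)       # one count per record
print(sum(counts))  # total User-agent fields in the whole file
```

For this sample, each record has one User-agent field while the file
as a whole has two, which is the case the current wording seems to
rule out.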

But mainly, I'm surprised to see this documented in the HTML spec
at all. Why is it there? What if something else is incorrect?
Wouldn't it be better to cite an official source instead?

Maybe you didn't because there are no good official sources? :(
I can't find an up-to-date Internet-Draft or anything, but
some good sources are:

    http://www.kollar.com/robots.html
    http://info.webcrawler.com/mak/projects/robots/norobots-rfc.html
    http://info.webcrawler.com/mak/projects/robots/robots.html

Further down on that same page, in "Robots and the META element",
it says:

    The list of terms in the content is ALL, INDEX, NOFOLLOW, NOINDEX.

but those terms aren't defined anywhere; what do they mean?

Including a description of what they mean would be quite long,
but it doesn't seem to make sense to include them but not define
them.
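
For what it's worth, here is a rough sketch in Python of how a robot
might interpret these terms. The meanings are an assumption taken from
the usual robots META convention, not from the HTML 4.0 draft: INDEX
and NOINDEX say whether the page may be indexed, FOLLOW and NOFOLLOW
say whether its links may be followed, ALL means both are allowed and
NONE means neither. FOLLOW and NONE are not in the list quoted above,
and the function name and the default of "allowed" are also
assumptions.

```python
def parse_robots_meta(content):
    """Map a robots META content value to (may_index, may_follow).

    Assumed semantics (from the common robots META convention, not
    from the HTML 4.0 draft): terms are comma-separated and
    case-insensitive, and anything not mentioned defaults to allowed.
    """
    may_index, may_follow = True, True
    for term in content.split(","):
        term = term.strip().upper()
        if term == "ALL":
            may_index, may_follow = True, True
        elif term == "NONE":
            may_index, may_follow = False, False
        elif term == "INDEX":
            may_index = True
        elif term == "NOINDEX":
            may_index = False
        elif term == "FOLLOW":
            may_follow = True
        elif term == "NOFOLLOW":
            may_follow = False
        # Unknown terms are ignored in this sketch.
    return may_index, may_follow


print(parse_robots_meta("NOINDEX, NOFOLLOW"))
print(parse_robots_meta("INDEX, NOFOLLOW"))
```

Even a short table along these lines in the spec would answer the
"what do they mean?" question.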

You could cite one of the URLs above for the explanation, but I
guess there's no guarantee those URLs will be there forever.
(The first URL I gave has a good explanation of this stuff.)

Sorry, I don't know what to suggest as the solution to these
problems. I don't think this section belongs in this specification
at all, but I assume you had some reason to put it there.

Next,

"9.3.4 Preformatted text: The PRE element"
http://www.w3.org/MarkUp/Group/9712/PR-html40-971205/struct/text.html#h-9.3.4

IMG is excluded from PRE:

    <!ENTITY % pre.exclusion "IMG|OBJECT|BIG|SMALL|SUB|SUP">

Which makes the following type of page invalid:

    http://ugweb.cs.ualberta.ca/~gerald/validate/lib/

(this is a directory index generated by Apache or NCSA; it must be
in use in millions of pages on the Web.) This is a safe, reasonable
use of IMG within PRE.
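
To illustrate what the exclusion catches, here is a small sketch in
Python using the standard html.parser module. The set of element names
comes from the pre.exclusion entity quoted above; the class name and
the sample markup (a made-up directory-index line with an icon) are
hypothetical. It only tracks PRE nesting and is not a real validator.

```python
from html.parser import HTMLParser

# Element names from the pre.exclusion entity quoted above.
PRE_EXCLUSION = {"img", "object", "big", "small", "sub", "sup"}


class PreExclusionChecker(HTMLParser):
    """Record excluded elements that appear inside a PRE element."""

    def __init__(self):
        super().__init__()
        self.pre_depth = 0
        self.violations = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lowercase.
        if tag == "pre":
            self.pre_depth += 1
        elif self.pre_depth and tag in PRE_EXCLUSION:
            self.violations.append(tag)

    def handle_endtag(self, tag):
        if tag == "pre" and self.pre_depth:
            self.pre_depth -= 1


# Made-up line in the style of a server-generated directory index.
SAMPLE = '<PRE><IMG SRC="folder.gif" ALT="[DIR]"> <A HREF="lib/">lib/</A></PRE>'

checker = PreExclusionChecker()
checker.feed(SAMPLE)
checker.close()
print(checker.violations)
```

The icon IMG is the only thing flagged here; the anchor and text in
the same PRE are fine.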

I still think it's a mistake to make this type of thing invalid,
but I've tried to get this changed a number of times in the past
(in HTML 3.2 as well), with no luck. Maybe I should just give up?

That's all I have. I think this may be the last you hear from me
on HTML 4.0!

This specification is a wonderful piece of work (both the spec
itself and the tools you created to maintain it); great job!

Gerald
-- 
Gerald Oskoboiny            <gerald@w3.org>  +1 617 253 2920
System Administrator, W3C   http://www.w3.org/People/Gerald/
World Wide Web Consortium, MIT Laboratory for Computer Science
545 Technology Square, Room NE43-353  Cambridge MA 02139 USA
Received on Wednesday, 10 December 1997 19:41:14 GMT
