- From: Adam M. Costello <amc@cs.berkeley.edu>
- Date: Sat, 27 Dec 1997 23:23:40 -0800 (PST)
- To: www-html-editor@w3.org
Text indented 4 spaces is mine. Text indented 8 spaces is quoted
from the spec. Unindented section headings provide context for the
subsequent comments.
Many of the comments point out typos. Some point out confusing,
misleading, or imprecise parts of the spec, and suggest
clarifications or additions (unless I was baffled).
Sorry I didn't look at the spec when it was still a Proposed
Recommendation, but the semester just ended.
AMC
2.1.3 Relative URIs
Relative URIsare resolved to full URIs using a base URI.
         ^^^^^^^
Should be "URIs are".
3.3.3 Element declarations
A few HTML element types use an additional SGML feature to
exclude elements from content model.
                 ^^^^^^^^^^^^^^^^^^
Should be "from a content model" or "from content models".
3.3.4 Attribute declarations
In HTML, boolean attributes may be appear in minimized form --
                                ^^
Remove.
6.3 Text strings
For introductory about attributes,
Reword.
7.4.4 Meta data
The meaning of a property and the set of legal values for
that property should be defined in a reference lexicon
called profile.
^^^^^^^^^^^^^^
Should be "called a profile", right?
8.1 Specifying the language of content: the lang attribute
<P><Q lang="en">"Her super-powers were the result of
                ^
Remove the quotation mark.
8.2.4 Overriding the bidirectional algorithm: the BDO element
One reason for this may be that the MIME standard ([RFC2045],
[RFC1556]) favors visual order, i.e., that right-to-left
character sequences are inserted right-to-left in the byte
stream.
I don't think this means what was intended. My best-effort
interpretation of "right-to-left character sequences are inserted
right-to-left in the byte stream" is that the rightmost character
appears first in the byte stream. But that is the opposite of RFC
1556 visual directionality, which requires the leftmost character
to appear first in the byte stream. I strongly recommend using
phrases like "leftmost character first" and avoiding phrases like
"right-to-left in the byte stream", because byte streams do not have
a left and right, only an earlier and later.
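To make the "leftmost character first" wording concrete, here is a minimal Python sketch. It assumes a run made up only of right-to-left letters (no digits, punctuation, or embedded left-to-right text), and both the three-letter Hebrew sample and the helper name to_visual_order are hypothetical illustrations, not taken from the spec or from RFC 1556.

```python
# Hypothetical sample: alef, bet, gimel in logical (reading) order.
# When displayed right-to-left, alef is the rightmost glyph and
# gimel is the leftmost glyph.
logical = "\u05d0\u05d1\u05d2"

def to_visual_order(rtl_run: str) -> str:
    """Reorder a pure right-to-left run so the leftmost displayed
    character comes first (earliest) in the stream."""
    return rtl_run[::-1]

visual = to_visual_order(logical)

# In visual order the earliest character in the stream is the
# leftmost one on the display (gimel), not the first one read (alef).
first_in_stream = visual[0]
```

Describing both orders in terms of which character comes earlier in the stream avoids any talk of a "left" or "right" end of the byte stream.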
8.2.5 Character references for directionality and joining control
Mirrored character glyphs. In general, the bidirectional
algorithm does not mirror character glyphs but leaves them
unaffected. An exception are characters such as parentheses (see
[UNICODE], table 4-7).
Although the Unicode character names and example glyphs are
available online, the text of the Unicode standard is not, so I wish the HTML
spec would elaborate a bit on the mirroring of parentheses. If
the characters ( and ) were called "open parenthesis" and
"close parenthesis", I could understand why their appearance would
depend on the directionality of the text. But they're called "left
parenthesis" and "right parenthesis", so I don't see why they would
ever be mirrored. In right-to-left text, you would obviously begin
a parenthetical with a right parenthesis, and end it with a left
parenthesis, correct?
9.1 White space
authors should not rely on user agents to render white space
immediately after a start tag or immediately before an end tag.
What about the converse? Should authors also not rely on user
agents *not* to render whitespace immediately after a start tag?
For example, may authors assume that these will be rendered the
same:
<li>foo
<li> foo
or should authors always use the first form?
9.2.1 Phrase elements: EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE,
ABBR, and ACRONYM
The HTML 2.0 spec contains more description and examples for these
elements. I think they should have been retained.
<ABBR lang="es" title="Doña">Doña</ABBR>
The title is identical to the content.
9.3.4 Preformatted text: The PRE element
width = number [CN]
This attribute provides a hint to visual user agents about
the desired width of the formatted block.
By definition, preformatted text already has a width, which can be
determined by scanning it and noticing the length of the longest
line. Maybe you mean that this attribute provides a hint, not about
the width of the text, but about the width of the window for which
the text was formatted.
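The "width the text already has" can be computed directly. The following is only a sketch, assuming one column per character (no tabs, no wide or combining characters); the function name natural_width is made up for illustration.

```python
def natural_width(preformatted: str) -> int:
    """Return the length of the longest line, i.e. the width the
    preformatted text already has regardless of any width attribute."""
    return max((len(line) for line in preformatted.splitlines()), default=0)

sample = "name   qty\napples   3\nkiwis   12"
width = natural_width(sample)
```

A width attribute smaller than this value could not describe the text itself, which is why it reads more naturally as a hint about the window the text was formatted for.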
When handling preformatted text, visual user agents:
May leave white space intact.
May render text with a fixed-pitch font.
May disable automatic word wrap.
Shouldn't each "may" be "should"? Authors usually depend on these
for vertical alignment.
11.2.4 Column groups: the COLGROUP and COL elements
The table in this example contains six columns. The first one
does not belong to an explicit column group.
But later:
<TABLE>
<COLGROUP>
<COL width="30">
<COLGROUP>
<COL width="30">
<COL width="0*">
<COL width="2*">
<COLGROUP align="center">
<COL width="1*">
<COL width="3*" align="char" char=":">
<THEAD>
<TR><TD> ...
...rows...
</TABLE>
And then:
We have set the value of the align attribute in the second
column group to "center".
It looks like the text and the example do not agree.
11.4.2 Categorizing cells
In order to determine, for example, the costs of meals on 25
August, the user agent must know which table cells refer to
"Meals" (all of them)
^^^^^^^^^^^^^^^^^^^^^
No, only cells in the Meals column refer to meals. Maybe you meant
"which table cells refer to "Expenses" (specifically, Meals)".
12.1.1 Visiting a linked resource
Note that the hrefattribute in each source anchor
              ^^^^^^^^^^^^^
Insert a space.
12.1.2 Other link relationships
Links that express other types of relationships have one or more
link type specified in their source anchor.
       ^^                               ^^
These nouns should be plural.
13.2 Including an image: the IMG element
User agents must render alternate next when they cannot support
                                  ^^^^
Should be "text".
13.3.2 Object initialization: the PARAM element
Any number of PARAM elements may appear in the content of an
OBJECT or APPLETelement,
               ^^
Insert a space.
13.3.4 Object declarations and instantiations
<P><OBJECT declare id="tribune" ...
<PARAM name="font" valuetype="object" value="#tribune">
Is the pound sign supposed to be there? Section 13.3.2 said:
object: The value specified by value is an identifier that
refers to an OBJECT declaration in the same document. The
identifier must be the value of the id attribute set for the
declared OBJECT element.
That suggests to me that the PARAM element should have
value="tribune", with no pound sign.
13.6.1 Client-side image maps: the MAP and AREA elements
usemap = uri [CT]
This attribute associates an image map with an element. The
image map is defined by a MAP element. The value of usemap
must match the value of the name attribute of the associated
MAP element.
Since the value of the usemap attribute is a URI, it should be
permissible to refer to a MAP element from another document. None
of the examples do this. Is it allowed?
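If usemap really is a URI, a user agent would presumably have to separate the document part from the fragment before looking for the MAP element. Here is a small sketch using urldefrag from Python's standard urllib.parse module; the function name split_usemap and the example URIs are hypothetical, and whether the cross-document case is permitted is exactly the open question.

```python
from urllib.parse import urldefrag

def split_usemap(usemap: str) -> tuple:
    """Split a usemap value into (document URI, MAP name).
    An empty document URI means "this document"."""
    document, fragment = urldefrag(usemap)
    return document, fragment

same_doc = split_usemap("#map1")
other_doc = split_usemap("http://www.example.com/maps.html#map1")
```

The first call yields an empty document part; the second yields a non-empty one, which is the case none of the spec's examples cover.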
By the way, the idea of allowing the shape and coords attributes in
A elements is brilliant!
13.7 Visual presentation of images, objects, and applets
All IMG and OBJECT attributes that concern visual alignment and
presentation have been deprecated in favor of style sheets.
This is imprecise. Some of the attributes mentioned in 13.7 are not
deprecated (width, height), some of them are deprecated but don't
say so (vspace, hspace, align), and some of them say that they're
deprecated (border). I suggest removing the above sentence and
inserting explicit "deprecated" indications wherever appropriate.
14.2.3 Header style information: the STYLE element
The title attribute appears in the DTD but is not mentioned in the
text. Later, in section 14.4, there is an example of the title
attribute of a LINK element, but not of a STYLE element. This
leaves the reader unconfident about the use of the title attribute
with the STYLE element.
14.3.2 Specifying external style sheets
For example, to set the preferred style sheet to "compact" (see
the preceding example),
Actually, the previous example used "Compact", and the title
attribute is case sensitive. Since the subsequent examples use
"compact", perhaps the first one should be changed to match.
17.3 The FORM element
The value is a space- and/or comma-delimited list of charset
values.
This attribute specifies a comma-separated list of content types
Throughout the spec, some attribute values are space-separated, some
are comma-separated, and some are space- and/or comma-separated. Is
there a simple rule that one can memorize, rather than consulting
the spec every time? If so, this rule should be stated somewhere.
17.4 The INPUT element
readonly (readonly) #IMPLIED -- for text and passwd --
                                             ^^^^^^
Should be "password" (in the actual DTD too).
17.10 Adding structure to forms: the FIELDSET and LEGEND elements
/samp
This must be a typo at the very end of the section.
17.11.2 Access keys
accesskey = character [CN]
How is this case neutral? Doesn't it have to be either case
sensitive or case insensitive? Am I allowed to have one control
with an accesskey of "C" and another with an accesskey of "c"? (I
vote no.)
By the way, shouldn't the spec say that no two controls in the same
document should have the same accesskey?
We recommend that authors include the access key in label text
or wherever the access key is to apply. User agents should
render the value of an access key in such a way as to emphasize
its role and to distinguish it from other characters (e.g., by
underlining it).
I think this should be more precise. Maybe you mean:
We recommend that authors include the access key in the contents
of the A, AREA, BUTTON, LABEL, or LEGEND element, or in the value
attribute of the INPUT element of type submit, reset, or button.
User agents should render the first occurrence of the access key
(using case-insensitive matching) in such a way as to emphasize
its role and to distinguish it from other characters (e.g., by
underlining it).
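The matching rule suggested above could look like the following Python sketch. It assumes plain ASCII label text (case mapping can change string lengths for some other characters), and the function name accesskey_index is hypothetical.

```python
def accesskey_index(label_text: str, accesskey: str) -> int:
    """Return the index of the first occurrence of the access key in
    the label text, ignoring case, or -1 if it does not occur."""
    return label_text.lower().find(accesskey.lower())

# The user agent would underline label_text[i] when i is not -1.
i_cancel = accesskey_index("Cancel", "c")
i_save = accesskey_index("Save as", "A")
i_none = accesskey_index("OK", "x")
```

Under this rule an accesskey of "C" and an accesskey of "c" select the same character, which is one more reason not to allow both in one document.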
17.12.2 Read-only controls
The following elements support the readonly attribute: INPUT,
TEXT, PASSWORD, and TEXTAREA.
There are no such elements as TEXT and PASSWORD. You probably mean
INPUT elements of type text and password. I don't know whether you
mean to include all other types of INPUT as well.
17.13.4 Form content types
1. Control names and values are escaped. Space characters are
replaced by `+', and then reserved characters are escaped
as described in [RFC1738], section 2.2: Non-alphanumeric
characters are replaced by `%HH', a percent sign and two
hexadecimal digits representing the ASCII code of the
character. Line breaks are represented as "CR LF" pairs
(i.e., `%0D%0A').
This was lifted almost verbatim from the HTML 2.0 spec, but changing
"escaped: space" to "escaped. Space" adds confusion (by making the
first sentence seem like a separate step), as does removing the
"that is," before "non-alphanumeric" (making that sentence seem like
a separate step).
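Read as a single step, the escaping rule can be sketched with quote_plus from Python's standard urllib.parse module. This is an illustration rather than a reference implementation: quote_plus leaves the characters "_", ".", "-" and "~" unescaped, so it is slightly more lenient than "non-alphanumeric characters are replaced by %HH", and the function name encode_control is made up.

```python
from urllib.parse import quote_plus

def encode_control(name: str, value: str) -> str:
    """Encode one control as name=value in URL-encoded form style."""
    # quote_plus replaces each space with '+' and escapes other
    # reserved characters as %HH, all in one pass.
    return quote_plus(name) + "=" + quote_plus(value)

spaces = encode_control("full name", "A & B")
breaks = encode_control("note", "line1\r\nline2")
```

Here the space replacement and the %HH escaping are parts of one operation, which is how the HTML 2.0 wording presented them.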
The file name may be specified with the "filename" parameter of
the 'Content-Disposition: form-data' header, or, in the case of
multiple files, in a 'Content-Disposition: file' header of the
subpart.
The examples use 'Content-Disposition: attachment' in the subparts,
rather than 'Content-Disposition: file'. Are both correct? Is one
preferred?
18.2.2 Specifying the scripting language
It is also possible to specify the scripting language in each
SCRIPT element via the type attribute. In the absence of a
default scripting language specification, this attribute must be
set on each SCRIPT element.
This makes it sound like the type attribute is optional on SCRIPT
elements, but the DTD says it's required.
a name attribute takes precedence over a id if both are set.
                                       ^^^^
Should be "an id".
24.2.1 The list of characters
<!ENTITY not CDATA "&#172;" -- not sign = discretionary hyphen,
                                        ^^^^^^^^^^^^^^^^^^^^^^^
I suspect that's not supposed to be there. It should be removed in
HTMLlat1.ent too.
24.3.1 The list of characters
It would be very nice if the comment for each entity included the
Adobe standard glyph name, since this list of entities was taken
directly from the Adobe Symbol font. Each glyph name begins with a
slash. I think the mapping is given here:
http://www.ams.org/html-math/tr9573-symbols.html
But that page doesn't state explicitly that the slash-names are the
Adobe standard glyph names. The Adobe PostScript reference manual
would be the authoritative source.
<!ENTITY weierp CDATA "&#8472;" -- script capital P = power set
= Weierstrass p, U+2118 ISOamso -->
Is that considered a good mapping, or a compromise? I once looked
for a Unicode character matching this Symbol font glyph, and was not
satisfied with anything I found. If this is a compromise, there
should be a disclaimer to that effect.
24.4 Character entity references for markup-significant and
internationalization characters
Entities have also been added for the remaining characters
occurring in CP-1252 which do not occur in the HTMLlat1 or
HTMLsymbol entity sets. These all occur in the 128 to 159 range
within the cp-1252 charset.
What is CP-1252? It doesn't seem to be defined or referenced
anywhere. Also, either capitalize the second occurrence or
decapitalize the first.
Appendix A: Changes between HTML 3.2 and HTML 4.0
This appendix neglects to mention that the HTML 3.2 DTD allowed
%text in the content of BODY, but the HTML 4.0 DTD does not allow
%inline in the content of BODY. I think that's a noteworthy change.
A.3 Changes for accessibility
(see the longdesc attribute).
For some reason, "longdesc" is not a link in the hypertext spec, but
should be.
A.4 Changes for meta data
Authors may now specify profiles that provide explanations about
meta specified with the META or LINK elements.
^^^^
Should be "meta data".
A.9 Changes for forms
The readonly, allows authors to prohibit changes
    ^^^^^^^^^
Should be "readonly attribute".
Appendix B: Performance, Implementation, and Design Notes
Despite the appearance of words such as "must" and "should",
all requirements in this section appear elsewhere in the
specification.
Is that true of the requirement that "a line break immediately
following a start tag must be ignored, as must a line break
immediately before an end tag" (B.3.1 Line breaks)?
B.3.2 Specifying non-HTML data
Authors should therefore escape sequences "</" sequence within
the content.
Reword.
B.4 Notes on helping search engines index your Web site
You may help search engines by using the LINK element with
rel="begin" along with a TITLE, as in:
The section on link types recommended using rel=Start for this
purpose. Should authors use one, or the other, or both? Also, I
think you meant "title" (the attribute), not TITLE (the element).
The list of terms in the content is ALL, INDEX, NOFOLLOW,
NOINDEX. The name and the content attribute values are
case-insensitive.
This description is very incomplete, and leaves the reader with a
lot of uncertainty. Brief but complete documentation can be found
here:
http://info.webcrawler.com/mak/projects/robots/meta-user.html
By the way, both that page and a more complete and precise
specification of the robots.txt file are linked from:
http://info.webcrawler.com/mak/projects/robots/exclusion.html
You might want to have a reference to that page.
B.5.1 Design rationale
This can be altered by setting the width-TABLE attribute of the
TABLE element.                     ^^^^^^^^^^^
Should be "width".
B.5.2 Recommended Layout Algorithms
Rules for handling objects too large for column apply when the
explicit or implied alignment results in a situation where the
data exceeds the assigned width of the column.
"for column" should be "for a column". Which rules are being
referred to here?
The values for theframe attribute have been chosen to avoid
clashes with the rules, align and valign-COLGROUP attributes.
"theframe" should be "the frame", and "valign-COLGROUP" should be
"valign".
Received on Sunday, 28 December 1997 02:24:35 UTC