- From: Adam M. Costello <amc@cs.berkeley.edu>
- Date: Sat, 27 Dec 1997 23:23:40 -0800 (PST)
- To: www-html-editor@w3.org
Text indented 4 spaces is mine. Text indented 8 spaces is quoted from
the spec. Unindented section headings provide context for the
subsequent comments.

Many of the comments point out typos. Some point out confusing,
misleading, or imprecise parts of the spec, and suggest clarifications
or additions (unless I was baffled).

Sorry I didn't look at the spec when it was still a Proposed
Recommendation, but the semester just ended.

AMC

2.1.3 Relative URIs

        Relative URIsare resolved to full URIs using a base URI.
                 ^^^^^^^

    Should be "URIs are".

3.3.3 Element declarations

        A few HTML element types use an additional SGML feature to
        exclude elements from content model.
                         ^^^^^^^^^^^^^^^^^^

    Should be "from a content model" or "from content models".

3.3.4 Attribute declarations

        In HTML, boolean attributes may be appear in minimized form --
                                        ^^

    Remove.

6.3 Text strings

        For introductory about attributes,

    Reword.

7.4.4 Meta data

        The meaning of a property and the set of legal values for
        that property should be defined in a reference lexicon
        called profile.
        ^^^^^^^^^^^^^^

    Should be "called a profile", right?

8.1 Specifying the language of content: the lang attribute

        <P><Q lang="en">"Her super-powers were the result of
                        ^

    Remove the quotation mark.

8.2.4 Overriding the bidirectional algorithm: the BDO element

        One reason for this may be that the MIME standard ([RFC2045],
        [RFC1556]) favors visual order, i.e., that right-to-left
        character sequences are inserted right-to-left in the byte
        stream.

    I don't think this means what was intended. My best-effort
    interpretation of "right-to-left character sequences are inserted
    right-to-left in the byte stream" is that the rightmost character
    appears first in the byte stream. But that is the opposite of RFC
    1556 visual directionality, which requires the leftmost character
    to appear first in the byte stream.
    I strongly recommend using phrases like "leftmost character first"
    and avoiding phrases like "right-to-left in the byte stream",
    because byte streams do not have a left and right, only an earlier
    and later.

8.2.5 Character references for directionality and joining control

        Mirrored character glyphs. In general, the bidirectional
        algorithm does not mirror character glyphs but leaves them
        unaffected. An exception are characters such as parentheses
        (see [UNICODE], table 4-7).

    Although the Unicode character names and example glyphs are
    available online, the text of the spec is not, so I wish the HTML
    spec would elaborate a bit on the mirroring of parentheses. If the
    characters ( and ) were called "open parenthesis" and "close
    parenthesis", I could understand why their appearance would depend
    on the directionality of the text. But they're called "left
    parenthesis" and "right parenthesis", so I don't see why they would
    ever be mirrored. In right-to-left text, you would obviously begin
    a parenthetical with a right parenthesis, and end it with a left
    parenthesis, correct?

9.1 White space

        authors should not rely on user agents to render white space
        immediately after a start tag or immediately before an end
        tag.

    What about the converse? Should authors also not rely on user
    agents *not* to render whitespace immediately after a start tag?
    For example, may authors assume that these will be rendered the
    same:

        <li>foo
        <li> foo

    or should authors always use the first form?

9.2.1 Phrase elements: EM, STRONG, DFN, CODE, SAMP, KBD, VAR, CITE, ABBR, and ACRONYM

    The HTML 2.0 spec contains more description and examples for these
    elements. I think they should have been retained.

        <ABBR lang="es" title="Doña">Doña</ABBR>

    The title is identical to the content.

9.3.4 Preformatted text: The PRE element

        width = number [CN]
            This attribute provides a hint to visual user agents about
            the desired width of the formatted block.

    By definition, preformatted text already has a width, which can be
    determined by scanning it and noticing the length of the longest
    line. Maybe you mean that this attribute provides a hint, not about
    the width of the text, but about the width of the window for which
    the text was formatted.

        When handling preformatted text, visual user agents:

            May leave white space intact.
            May render text with a fixed-pitch font.
            May disable automatic word wrap.

    Shouldn't each "may" be "should"? Authors usually depend on these
    for vertical alignment.

11.2.4 Column groups: the COLGROUP and COL elements

        The table in this example contains six columns. The first one
        does not belong to an explicit column group.

    But later:

        <TABLE>
        <COLGROUP>
           <COL width="30">
        <COLGROUP>
           <COL width="30">
           <COL width="0*">
           <COL width="2*">
        <COLGROUP align="center">
           <COL width="1*">
           <COL width="3*" align="char" char=":">
        <THEAD>
        <TR><TD> ...
        ...rows...
        </TABLE>

    And then:

        We have set the value of the align attribute in the second
        column group to "center".

    It looks like the text and the example do not agree.

11.4.2 Categorizing cells

        In order to determine, for example, the costs of meals on 25
        August, the user agent must know which table cells refer to
        "Meals" (all of them)
        ^^^^^^^^^^^^^^^^^^^^^

    No, only cells in the Meals column refer to meals. Maybe you meant
    "which table cells refer to "Expenses" (specifically, Meals)".

12.1.1 Visiting a linked resource

        Note that the hrefattribute in each source anchor
                      ^^^^^^^^^^^^^

    Insert a space.

12.1.2 Other link relationships

        Links that express other types of relationships have one or more
        link type specified in their source anchor.
               ^^                               ^^

    These nouns should be plural.

13.2 Including an image: the IMG element

        User agents must render alternate next when they cannot support
                                          ^^^^

    Should be "text".

13.3.2 Object initialization: the PARAM element

        Any number of PARAM elements may appear in the content of an
        OBJECT or APPLETelement,
                       ^^

    Insert a space.

13.3.4 Object declarations and instantiations

        <P><OBJECT declare id="tribune"
        ...
        <PARAM name="font" valuetype="object" value="#tribune">

    Is the pound sign supposed to be there? Section 13.3.2 said:

        object: The value specified by value is an identifier that
        refers to an OBJECT declaration in the same document. The
        identifier must be the value of the id attribute set for the
        declared OBJECT element.

    That suggests to me that the PARAM element should have
    value="tribune", with no pound sign.

13.6.1 Client-side image maps: the MAP and AREA elements

        usemap = uri [CT]
            This attribute associates an image map with an element. The
            image map is defined by a MAP element. The value of usemap
            must match the value of the name attribute of the
            associated MAP element.

    Since the value of the usemap attribute is a URI, it should be
    permissible to refer to a MAP element from another document. None
    of the examples do this. Is it allowed?

    By the way, the idea of allowing the shape and coords attributes in
    A elements is brilliant!

13.7 Visual presentation of images, objects, and applets

        All IMG and OBJECT attributes that concern visual alignment and
        presentation have been deprecated in favor of style sheets.

    This is imprecise. Some of the attributes mentioned in 13.7 are not
    deprecated (width, height), some of them are deprecated but don't
    say so (vspace, hspace, align), and some of them say that they're
    deprecated (border). I suggest removing the above sentence and
    inserting explicit "deprecated" indications wherever appropriate.

14.2.3 Header style information: the STYLE element

    The title attribute appears in the DTD but is not mentioned in the
    text. Later, in section 14.4, there is an example of the title
    attribute of a LINK element, but not of a STYLE element. This
    leaves the reader unconfident about the use of the title attribute
    with the STYLE element.

14.3.2 Specifying external style sheets

        For example, to set the preferred style sheet to "compact" (see
        the preceding example),

    Actually, the previous example used "Compact", and the title
    attribute is case sensitive. Since the subsequent examples use
    "compact", perhaps the first one should be changed to match.

17.3 The FORM element

        The value is a space- and/or comma-delimited list of charset
        values.

        This attribute specifies a comma-separated list of content types

    Throughout the spec, some attribute values are space-separated,
    some are comma-separated, and some are space- and/or
    comma-separated. Is there a simple rule that one can memorize,
    rather than consulting the spec every time? If so, this rule should
    be stated somewhere.

17.4 The INPUT element

        readonly (readonly) #IMPLIED -- for text and passwd --
                                                     ^^^^^^

    Should be "password" (in the actual DTD too).

17.10 Adding structure to forms: the FIELDSET and LEGEND elements

        /samp

    This must be a typo at the very end of the section.

17.11.2 Access keys

        accesskey = character [CN]

    How is this case neutral? Doesn't it have to be either case
    sensitive or case insensitive? Am I allowed to have one control
    with an accesskey of "C" and another with an access key of "c"?
    (I vote no.)

    By the way, shouldn't the spec say that no two controls in the same
    document should have the same accesskey?

        We recommend that authors include the access key in label text
        or wherever the access key is to apply. User agents should
        render the value of an access key in such a way as to emphasize
        its role and to distinguish it from other characters (e.g., by
        underlining it).

    I think this should be more precise. Maybe you mean:

        We recommend that authors include the access key in the
        contents of the A, AREA, BUTTON, LABEL, or LEGEND element, or
        in the value attribute of the INPUT element of type submit,
        reset, or button.

        User agents should render the first occurrence of the access
        key (using case-insensitive matching) in such a way as to
        emphasize its role and to distinguish it from other characters
        (e.g., by underlining it).

17.12.2 Read-only controls

        The following elements support the readonly attribute: INPUT,
        TEXT, PASSWORD, and TEXTAREA.

    There are no such elements as TEXT and PASSWORD. You probably mean
    INPUT elements of type text and password. I don't know whether you
    mean to include all other types of INPUT as well.

17.13.4 Form content types

        1. Control names and values are escaped. Space characters are
           replaced by `+', and then reserved characters are escaped as
           described in [RFC1738], section 2.2: Non-alphanumeric
           characters are replaced by `%HH', a percent sign and two
           hexadecimal digits representing the ASCII code of the
           character. Line breaks are represented as "CR LF" pairs
           (i.e., `%0D%0A').

    This was lifted almost verbatim from the HTML 2.0 spec, but
    changing "escaped: space" to "escaped. Space" adds confusion (by
    making the first sentence seem like a separate step), as does
    removing the "that is," before "non-alphanumeric" (making that
    sentence seem like a separate step).

        The file name may be specified with the "filename" parameter of
        the 'Content-Disposition: form-data' header, or, in the case of
        multiple files, in a 'Content-Disposition: file' header of the
        subpart.

    The examples use 'Content-Disposition: attachment' in the subparts,
    rather than 'Content-Disposition: file'. Are both correct? Is one
    preferred?

18.2.2 Specifying the scripting language

        It is also possible to specify the scripting language in each
        SCRIPT element via the type attribute. In the absence of a
        default scripting language specification, this attribute must
        be set on each SCRIPT element.

    This makes it sound like the type attribute is optional on SCRIPT
    elements, but the DTD says it's required.

        a name attribute takes precedence over a id if both are set.
                                               ^^^^

    Should be "an id".

24.2.1 The list of characters

        <!ENTITY not CDATA "¬" -- not sign = discretionary hyphen,
                                           ^^^^^^^^^^^^^^^^^^^^^^^

    I suspect that's not supposed to be there. It should be removed in
    HTMLlat1.ent too.

24.3.1 The list of characters

    It would be very nice if the comment for each entity included the
    Adobe standard glyph name, since this list of entities was taken
    directly from the Adobe Symbol font. Each glyph name begins with a
    slash. I think the mapping is given here:

    http://www.ams.org/html-math/tr9573-symbols.html

    But that page doesn't state explicitly that the slash-names are the
    Adobe standard glyph names. The Adobe PostScript reference manual
    would be the authoritative source.

        <!ENTITY weierp CDATA "℘" -- script capital P = power set
                                     = Weierstrass p, U+2118 ISOamso -->

    Is that considered a good mapping, or a compromise? I once looked
    for a Unicode character matching this Symbol font glyph, and was
    not satisfied with anything I found. If this is a compromise, there
    should be a disclaimer to that effect.

24.4 Character entity references for markup-significant and internationalization characters

        Entities have also been added for the remaining characters
        occurring in CP-1252 which do not occur in the HTMLlat1 or
        HTMLsymbol entity sets. These all occur in the 128 to 159 range
        within the cp-1252 charset.

    What is CP-1252? It doesn't seem to be defined or referenced
    anywhere. Also, either capitalize the second occurrence or
    decapitalize the first.

Appendix A: Changes between HTML 3.2 and HTML 4.0

    This appendix neglects to mention that the HTML 3.2 DTD allowed
    %text in the content of BODY, but the HTML 4.0 DTD does not allow
    %inline in the content of BODY. I think that's a noteworthy change.

A.3 Changes for accessibility

        (see the longdesc attribute).

    For some reason, "longdesc" is not a link in the hypertext spec,
    but should be.

A.4 Changes for meta data

        Authors may now specify profiles that provide explanations about
        meta specified with the META or LINK elements.
        ^^^^

    Should be "meta data".

A.9 Changes for forms

        The readonly, allows authors to prohibit changes
            ^^^^^^^^^

    Should be "readonly attribute".

Appendix B: Performance, Implementation, and Design Notes

        Despite the appearance of words such as "must" and "should",
        all requirements in this section appear elsewhere in the
        specification.

    Is that true of the requirement that "a line break immediately
    following a start tag must be ignored, as must a line break
    immediately before an end tag" (B.3.1 Line breaks)?

B.3.2 Specifying non-HTML data

        Authors should therefore escape sequences "</" sequence within
        the content.

    Reword.

B.4 Notes on helping search engines index your Web site

        You may help search engines by using the LINK element with
        rel="begin" along with a TITLE, as in:

    The section on link types recommended using rel=Start for this
    purpose. Should authors use one, or the other, or both? Also, I
    think you meant "title" (the attribute), not TITLE (the element).

        The list of terms in the content is ALL, INDEX, NOFOLLOW,
        NOINDEX. The name and the content attribute values are
        case-insensitive.

    This description is very incomplete, and leaves the reader with a
    lot of uncertainty. Brief but complete documentation can be found
    here:

    http://info.webcrawler.com/mak/projects/robots/meta-user.html

    By the way, both that page and a more complete and precise
    specification of the robots.txt file are linked from:

    http://info.webcrawler.com/mak/projects/robots/exclusion.html

    You might want to have a reference to that page.

B.5.1 Design rationale

        This can be altered by setting the
        width-TABLE attribute of the TABLE element.
        ^^^^^^^^^^^

    Should be "width".

B.5.2 Recommended Layout Algorithms

        Rules for handling objects too large for column apply when the
        explicit or implied alignment results in a situation where the
        data exceeds the assigned width of the column.

    "for column" should be "for a column". Which rules are being
    referred to here?

        The values for theframe attribute have been chosen to avoid
        clashes with the rules, align and valign-COLGROUP attributes.

    "theframe" should be "the frame", and "valign-COLGROUP" should be
    "valign".

Received on Sunday, 28 December 1997 02:24:35 UTC