- From: Rob Burns <robburns1@mac.com>
- Date: Mon, 9 Oct 2006 22:12:52 -0500
- To: www-html-editor@w3.org
- Message-Id: <7BC50364-32FC-423B-96BE-3EE4D43DAE8F@mac.com>
Dear Editors:

The current draft of XHTML 2 represents a great leap forward for semantic authoring. Element after element in the draft improves the expressiveness of HTML markup. I have a list of comments on the draft. Many of these ideas may already have been discussed by the draft’s participants. In any event, please take them into consideration as the draft progresses towards Candidate Recommendation.

Sincerely,
Robert Burns

------------------------------------------

Attributes xml:id and id:
Including both the ‘id’ and ‘xml:id’ attributes introduces complexity without any clear benefit. Using ‘xml:id’ alone would improve the extensibility and interoperability of XHTML with other XML vocabularies. It would also avoid needless confusion among authors over these largely redundant attributes.

Finer-grained ‘cite’ element (adding a citation ‘type’ attribute):
By adding a ‘type’ attribute to the ‘cite’ element, authors could more clearly indicate the type of citation. Currently user agents typically apply italics to the ‘cite’ element, implying that it is meant only for book citations. A ‘type’ attribute would allow style sheets to select ‘cite’ elements by type and apply a greater variety of presentation to them. So for example ‘type = "(book-title | article-title | webpage-title | author | speaker | collection-title | <QName>)"’.

Subordinate Text Element:
To leverage the work on CSS3, an element for subordinate text, such as ‘subtext’, would be useful for authors. This is text that is parenthetical to the main text but may receive various presentational idioms. A ‘rank’ attribute could also be used to differentiate between levels of subordination. So for example, <subtext rank="0"> might be displayed as a parenthetical with subtext[rank="0"]:before { content: "("; } and subtext[rank="0"]:after { content: ")"; }, whereas <subtext rank="1"> would be displayed as an endnote or footnote in printed media, or as a popup or tooltip note in screen media.
The ‘rank’ attribute could simply be of type number, where zero and one would cover most needs, with the flexibility to extend the subordination of the text if authors needed it. Subordinate text elements might also be used to attach comments to an authored work.

Table Cell Elements:
Rather than continuing with separate elements for header (‘th’) and data (‘td’) cells, XHTML2 should introduce a new ‘tc’ element for a table cell. The distinction between a header cell and a data cell could then be expressed through a new attribute. For example, ‘cell = "(header | data | both)"’. The ‘tc’ element could be supported instead of the ‘td’ and ‘th’ elements, and those two should probably then be deprecated. This would allow authors to indicate explicitly that a cell is both a data cell and a header cell rather than leaving it merely implied.

Table ‘col’ and ‘colgroup’ horizontal-alignment:
Somewhere in the transition to stricter content models and XHTML, many user agents dropped support for HTML horizontal alignment within tables: in particular, the ability to define horizontal alignment with the ‘col’ and ‘colgroup’ elements and to align on a single character such as a decimal point (“.”). Presumably this was to be picked up by CSS as a presentational concern. However, over time the CSS recommendations also dropped support both for defining the presentation of tables from the ‘col’ and ‘colgroup’ elements and for presenting various meanings about a table in terms of columns and column groupings. I'm not sure whether XHTML2 should provide a backup mechanism for this, but it should be addressed somewhere.

A Data of Type Element:
A data element could work similarly to an XForms output element, except that rather than being dynamically generated, its static value would be the contents of the element. A ‘datatype’ attribute would identify the contents as a particular XSD primitive or derived data type.
By identifying the data type of the contents, style sheets could be used to change the display of the data. For example, <data datatype='float'>1000</data> might be styled as "one thousand", "1,000", "1000" or "1.0 × 10^3", depending on the style declaration selecting this data type. Other attributes might also be included to indicate ad hoc facets of the data. A ‘units’ attribute could also be included, with QNames drawn from SI, US or Imperial units and various calendars. Again the display of the units could be determined through style sheets: e.g., “millimeters” or “mm” or “m.m.”. Such an element would further extend the continuity of documents produced through W3C standards. Validators could also be extended to notify authors of invalid content (according to the ‘datatype’ attribute). DOM functions could ensure that operations were performed on data of comparable type and even ensure that units were respected.

A Proper Name Element:
Similar to a data element, XHTML2 should include a proper name element with types: person, place, organization, institution, etc. Proper names are an important semantic distinction in authoring documents that should not be left to generic elements and ad hoc solutions.

Lists:
The current semantics of lists could use some simplification. The semantic difference between unordered and ordered lists does not seem great enough to warrant separate elements. Rather, both seem to be maintained largely for legacy and presentational reasons. Perhaps a boolean attribute would be better for this distinction in the future. When order matters, ‘order="order"’ could be set on a list element. At the same time, the distinction between definition lists and the other lists seems to lie mostly in the use of the definition item. A more flexible (and I do not think any more cumbersome) approach would be to maintain a single unified list element and one list item element, and to allow the use of an optional definition term element at the beginning of any list item.
CSS sibling selectors would allow the presentation to meet the needs of either legacy presentation while enabling further and more flexible presentational idioms as well. I am not including the new ‘nl’ element in this discussion, which does seem semantically distinct enough to warrant its own element. However, the ‘ul’, ‘ol’ and ‘dl’ elements could all be merged into one element without losing any expressiveness.

Definition Lists:
In any event, if the ‘dl’ element is maintained separately, a ‘value’ attribute should be added to the definition list item, to be used similarly to the ‘value’ attribute on list item elements.

A Blockparagraph Element:
Rather than simply altering the content model of the ‘p’ element, I think it would be better to follow the pattern established in the distinction between ‘q’ versus ‘blockquote’ and ‘code’ versus ‘blockcode’ by adding a ‘blockparagraph’ element. This way paragraphs would share similar semantic differences with these other elements.

Caption Element Content Model:
By distinguishing between ‘p’ and ‘blockparagraph’ elements, it would also make sense to add the ‘p’ element to the content model of the caption element. A caption could then simply handle multiple paragraphs of text content. Without that I fear authors may misuse the ‘l’ or ‘separator’ elements or feel the need to reintroduce legacy elements to handle captions requiring multiple paragraphs. At the same time, excluding blockparagraphs from the caption element would keep captions relatively simple, as their semantics require.

Paraphrase Elements:
To associate newly authored content with one or a few sources, a paraphrase element would be useful, perhaps in both block and non-block forms. Like ‘q’ and ‘blockquote’, these elements could allow authors to associate the paraphrasing with specific sources through the ‘cite’ attribute.
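
As a rough illustration of the paraphrase idea, the sketch below shows how a tool might gather the sources named by ‘cite’ attributes across quotations and paraphrases alike. The element names ‘paraphrase’ and ‘blockparaphrase’ and the example.org URLs are assumptions made only for this example; the proposal above does not name the elements, and nothing like them is defined in the draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: 'paraphrase' and 'blockparaphrase' are assumed names,
# not elements defined in any XHTML 2 draft.
FRAGMENT = """
<section>
  <blockquote cite="http://example.org/source-a"><p>Quoted text.</p></blockquote>
  <blockparaphrase cite="http://example.org/source-b"><p>Restated text.</p></blockparaphrase>
  <p>As <paraphrase cite="http://example.org/source-c">one author argues</paraphrase>,
     and as <q>another put it</q>.</p>
</section>
"""

# Elements whose 'cite' attribute points at a source.
SOURCED_ELEMENTS = {"q", "blockquote", "paraphrase", "blockparaphrase"}


def collect_sources(xml_text):
    """Return (element name, cite value) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [
        (el.tag, el.get("cite"))
        for el in root.iter()
        if el.tag in SOURCED_ELEMENTS and el.get("cite") is not None
    ]


for name, cite in collect_sources(FRAGMENT):
    print(f"{name}: {cite}")
```

The point of the sketch is only that quoted and paraphrased material could be attributed through one uniform mechanism; the ‘q’ element without a ‘cite’ attribute is simply skipped.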

A Marker Element:
Add a marker element (e.g., <marker id='someNumber'/>) as a way to insert an empty element marker into a document where, unlike anchor and span, one wants to refer to a single point in the document rather than a range. This might be used presentationally, like the separator, for a page break or column break. Or it may have no presentation at all but serve as a “bookmark” within a document. I think there is a need for such a generic empty element: one whose default presentation has no display, but which instead serves as a marker within the document.

PCData and Mixed-Content:
The current draft is not entirely clear regarding content models, particularly for structural elements and the “Flow” content model. The draft indicates PCData within the content models of several elements that previously contained only child elements. It is not clear whether this is merely due to changes in the XML definitions of PCData, whether it relates only to whitespace, or whether this is a change in the content models of these elements. If this PCData does not mean only white-space characters, then its introduction into several content models is unwarranted. For example, the ‘section’ element shows a content model of (PCDATA | Flow)* while the prose says “This element defines content to be block-level…”. I think the ‘section’ element’s content model would be better as (Heading | Structural)*, which is closer to what the prose suggests. In addition, the ‘blockquote’ and ‘blockcode’ elements list content models of (PCDATA | Text | Heading | Structural | List)*, which appears to be equivalent to (PCDATA | Flow)*. In the case of ‘blockquote’, this tends to blur the distinction between the ‘blockquote’ and ‘q’ elements that was stricter in prior recommendations.

The ‘img’ Element:
While retaining the ‘img’ element for legacy reasons, it may be better to include it within the embedding collection as a sort of subclass of the object element.
In this way the ‘img’ element could include fall-back content, a standby element, and particularly a caption element. However, it could also keep the ‘alt’ attribute as an alternative fallback mechanism: a mechanism used only if the element contained no content. In this way authors who have been reluctant to switch to the ‘object’ element could still use the familiarly named ‘img’ element, but could eventually come to use the contents of the element for fall-back instead of the ‘alt’ attribute.
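
The fallback order suggested above (element content first, ‘alt’ only when the element is empty) can be sketched as follows. This is a minimal illustration under stated assumptions, not how any user agent is specified to behave: the function name, the sample file name and the sample text are invented for the example, and only textual content is considered, whereas a real user agent would render nested markup.

```python
import xml.etree.ElementTree as ET


def fallback_text(img_xml):
    """Fallback for an <img>: its text content if any, otherwise 'alt'."""
    img = ET.fromstring(img_xml)
    # itertext() walks the text of all descendants; collapse whitespace runs.
    content = " ".join("".join(img.itertext()).split())
    if content:
        return content
    # No content: fall back to the legacy 'alt' attribute (may be absent).
    return img.get("alt", "")


# Legacy authoring style: empty element, 'alt' supplies the fallback.
print(fallback_text('<img src="chart.png" alt="Sales chart"/>'))

# Content present: it takes precedence over 'alt'.
print(fallback_text(
    '<img src="chart.png" alt="Sales chart">'
    'A chart of <em>quarterly</em> sales.</img>'
))
```

With this ordering, existing documents that rely on ‘alt’ keep working, while richer fallback content wins as soon as an author supplies it.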
Received on Wednesday, 11 October 2006 11:28:23 UTC