Re: Comments on the draft dated 26 July 2006 from Steven Pemberton on 2006-10-11 (www-html-editor@w3.org from October to December 2006)

From: Steven Pemberton <steven.pemberton@cwi.nl>
Date: Wed, 11 Oct 2006 13:42:12 +0200
To: "Rob Burns" <robburns1@mac.com>, www-html-editor@w3.org
Message-ID: <op.tg89gmb9smjzpq@acer3010.lan>
Thanks for these detailed comments. We will consider them in due course.

Best wishes,

Steven Pemberton

On Tue, 10 Oct 2006 05:12:52 +0200, Rob Burns <robburns1@mac.com> wrote:

> Dear Editors:
>
> The current draft of XHTML 2 represents a great leap forward for
> semantic authoring. Element after element in the draft improves the
> expressibility of HTML markup. I have a list of comments on the
> draft. Many of these concepts may have already been discussed by the
> draft’s participants. In any event, please take them into
> consideration as this draft progresses towards candidate recommendation.
>
> Sincerely,
> Robert Burns
>
> ------------------------------------------
> Attributes xml:id and id:
> Including both the ‘id’ and ‘xml:id’ attributes introduces
> complexity without any clear benefit. Using ‘xml:id’ alone will
> improve the extensibility and interoperability of XHTML with other
> XML vocabularies. It will also avoid needless confusion among authors
> over these largely redundant attributes.
>
> Finer-grained ‘cite’ element (adding a citation ‘type’ attribute):
> By adding a ‘type’ attribute to the ‘cite’ element, authors could
> more clearly delineate the type of citation. Currently user-agents
> typically apply italics to the ‘cite’ element implying its use is
> meant only for book citations. However, a ‘type’ attribute would
> allow style sheets to select ‘cite’ elements and apply a greater
> variety of presentation to the finer-grained ‘cite’ element. For
> example: ‘type = "(book-title | article-title | webpage-title | author
> | speaker | collection-title | <QName>)"’.
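>
> A rough sketch of how such markup might look. The ‘type’ attribute and
> its values are only the ones proposed above, and the names and titles
> are invented for illustration:
>
> ```xml
> <p>
>   <cite type="author">A. N. Author</cite> discusses this in
>   <cite type="article-title">An Example Article</cite>, collected in
>   <cite type="book-title">An Example Anthology</cite>.
> </p>
> ```
>
> A style sheet could then select on the attribute value, for instance
> italicizing only the book title.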
>
> Subordinate Text Element:
> To leverage the work with CSS3, an element for subordinate text like
> ‘subtext’ would be useful for authors. This is text that is
> parenthetical to the main text, but may receive various
> presentational idioms. A ‘rank’ attribute to differentiate between
> levels of subordination could also be employed. So for example,
> <subtext rank="0"> might be displayed as a parenthetical with
> {before: content("("); after: content(")");}. Whereas <subtext
> rank="1"> would be displayed as an endnote or footnote in printed
> media or as a popup or tooltip note in screen media. The ‘rank’
> attribute could be simply of type number where zero and one would
> cover most needs, but with the flexibility to extend the
> subordination of the text if authors needed that. Subordinate text
> elements might also be used to attach comments to an authored work.
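>
> As an illustrative sketch only (the sentence content is made up, and
> ‘subtext’ and ‘rank’ are just the names suggested above), the two
> ranks might be used together like this:
>
> ```xml
> <p>
>   The committee met in March
>   <subtext rank="0">its first meeting that year</subtext>
>   and approved the budget<subtext rank="1">The vote was
>   unanimous.</subtext>.
> </p>
> ```
>
> Here the rank-0 text would presumably render inline as a parenthetical,
> while the rank-1 text would be moved out to a note.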
>
> Table Cell Elements:
> Rather than continuing separate elements for header (‘th’) and data
> (‘td’), XHTML2 should introduce a new ‘tc’ element for table cells.
> The distinction between a header cell and a data cell could then be
> expressed through a new attribute. For example, ‘cell="(header |
> data | both)"’. The ‘tc’ element could be supported instead of the
> ‘td’ and ‘th’ elements, and those two should probably then be
> deprecated. This would allow authors to explicitly indicate whether a
> cell was both a data cell and a header cell rather than leaving it
> merely implied.
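>
> A minimal sketch of a table written this way, assuming the ‘tc’
> element and ‘cell’ attribute exactly as proposed above (the table
> contents are invented):
>
> ```xml
> <table>
>   <tr>
>     <tc cell="header">Quarter</tc>
>     <tc cell="header">Revenue</tc>
>   </tr>
>   <tr>
>     <tc cell="both">Q1</tc>
>     <tc cell="data">1000</tc>
>   </tr>
> </table>
> ```
>
> The "Q1" cell is marked ‘both’ because it is a data value in the
> Quarter column and also serves as the header for its row.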
>
> Table ‘col’ and ‘colgroup’ horizontal-alignment:
> Somewhere in the transition to stricter content models and XHTML, many
> user-agents dropped support for HTML horizontal alignment within
> tables: in particular, the ability to define horizontal alignment
> with the ‘col’ and ‘colgroup’ elements and to align on a single
> character such as a decimal point (“.”). Presumably this was to be
> picked up by CSS as a presentational attribute. However, over time the
> CSS recommendations also dropped support both for defining the
> presentation of tables from the ‘col’ and ‘colgroup’ elements and for
> presenting various meanings about a table in terms of columns and
> column groupings. I'm not sure if XHTML2 should provide a backup
> mechanism for this, but it should be addressed somewhere.
>
> A Data of Type Element:
> A data element could work similarly to an XForms output element except
> that, rather than being dynamically generated, its static value would
> be the contents of the element. However, a
> ‘datatype’ attribute would identify the contents as a particular XSD
> primitive or derived data type. By identifying the data type of the
> contents, style sheets could be used to change the display of the data.
> For example, <data datatype='float'>1000</data> might be styled as
> either "one thousand" or "1,000" or "1000" or "1.0 × 10^3" depending
> on the style declaration selecting this data type. Other attributes
> might also be included to indicate ad hoc facets of the data. A
> ‘units’ attribute could also be included with QNames drawn from SI,
> US or Imperial units and various calendars. Again the display of the
> units could be determined through style sheets: e.g., “millimeters”
> or “mm” or “m.m.”. Such an element would further extend the
> continuity of documents produced through W3C standards. Validators
> could also be extended to notify authors of invalid content
> (according to the ‘datatype’ attribute). DOM functions could ensure
> operations were performed on data of comparable type and even ensure
> units were respected.
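>
> To illustrate, a sketch combining the two attributes. The ‘data’
> element, ‘datatype’ and ‘units’ are the names proposed above; the
> unit value "mm" and the sentence itself are assumptions made up for
> this example, since no unit vocabulary has been defined:
>
> ```xml
> <p>
>   The beam is <data datatype="float" units="mm">1000</data> long and
>   was installed on <data datatype="date">2006-10-10</data>.
> </p>
> ```
>
> A validator following this idea could flag content such as "soon"
> inside the second element, since it is not a valid XSD date.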
>
> A Proper Name Element:
> Similar to a data element, XHTML2 should include a proper name
> element with types: person, place, organization, institution, etc.
> Proper names are an important semantic distinction in authoring
> documents that should not be left to generic elements and ad hoc
> solutions.
>
> Lists:
> The current semantics of lists could use some simplification. The
> semantic difference between unordered and ordered lists does not seem
> great enough to warrant separate elements. Rather, both seem to be
> maintained largely for legacy and presentational reasons. Perhaps a
> boolean attribute would be better for this distinction in the future.
> When order matters, ‘order="order"’ could be set on a list element.
>
> At the same time, the distinction between definition lists and the
> other lists seems mostly in the use of the definition item. A more
> flexible (and I do not think any more cumbersome) approach would be
> to maintain a single unified list element and one list item element
> and allow the use of an optional definition term element at the
> beginning of any list item. CSS sibling selectors would allow the
> presentation to meet the needs of either legacy presentation while
> enabling further and more flexible presentational idioms as well.
>
> I am not including the new ‘nl’ in this discussion which does seem
> semantically distinct enough to warrant its own element. However, the
> ‘ul’, ‘ol’ and ‘dl’ elements could all be merged into one element
> without losing any expressibility.
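>
> A sketch of what a merged list might look like. The element name
> ‘list’ is purely hypothetical (no name is proposed above), and ‘dt’
> is assumed to be the optional definition term element; only the
> ‘order="order"’ attribute comes from the suggestion itself:
>
> ```xml
> <section>
>   <!-- order matters: takes the place of 'ol' -->
>   <list order="order">
>     <li>Preheat the oven.</li>
>     <li>Mix the ingredients.</li>
>   </list>
>
>   <!-- optional term at the start of each item: takes the place of 'dl' -->
>   <list>
>     <li><dt>XML</dt> Extensible Markup Language</li>
>     <li><dt>CSS</dt> Cascading Style Sheets</li>
>   </list>
> </section>
> ```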
>
> Definition Lists:
> In any event, if the ‘dl’ element is maintained separately, the
> ‘value’ attribute should be added to the definition list item to be
> used similarly to the ‘value’ attribute on list item elements.
>
> A Blockparagraph Element:
> Rather than simply altering the content model of the ‘p’ element, I
> think it would be better to follow the patterns established in the
> distinction between ‘q’ versus ‘blockquote’ and ‘code’ versus
> ‘blockcode’ by adding a ‘blockparagraph’ element. This way the
> paragraphs would share similar semantic differences with these other
> elements.
>
> Caption Element Content Model:
> By distinguishing between ‘p’ and ‘blockparagraph’ elements it would
> also make sense to add the ‘p’ element to the content model of the
> caption element. A caption could then simply handle multiple
> paragraphs of text content. Without that I fear authors may misuse the
> ‘l’ or ‘separator’ elements or feel the need to reintroduce legacy
> elements to handle captions requiring multiple paragraphs. At
> the same time, excluding blockparagraphs from the caption element
> will keep captions relatively simple as their semantics require.
>
> Paraphrase Elements:
> To associate newly authored content with one or a few sources, a
> paraphrase element would be useful: perhaps in both block and non-
> block forms. Like ‘q’ and ‘blockquote’ these elements could allow
> authors to associate the paraphrasing with specific sources through
> the ‘cite’ attribute.
>
> A Marker Element:
> Add a marker element (e.g., <marker id='someNumber'/>) as a way to
> insert an empty element marker into a document where, unlike anchor
> and span, one wants to refer to a single point in the document rather
> than a range. This might be used presentationally, like the
> separator, for a page-break or column-break. Or it may have no
> presentation at all but serve as a “bookmark” within a document. I
> think there is a need for such a generic empty element: one whose
> default presentation has no display, but instead serves as a marker
> within the document.
>
> PCData and Mixed-Content:
> The current draft is not entirely clear regarding content models:
> particularly for structural elements and the “Flow” content model.
> The prose indicates PCData within the content models of several
> elements that previously contained only child elements. It is not
> clear whether this is merely due to changes in the XML definitions of
> PCData; whether it relates only to whitespace; or whether this is a
> change in the content models of these elements.
>
> If this PCData does not mean only white-space characters then its
> introduction into several content models is unwarranted. For example,
> the ‘section’ element shows a content model of (PCDATA | Flow)* while
> the prose says “This element defines content to be block-level…”. I
> think the ‘section’ element’s content model would be better as
> (Heading | Structural)* which is closer to what the prose suggests.
>
> In addition, the ‘blockquote’ and ‘blockcode’ elements list content
> models of (PCDATA | Text | Heading | Structural | List)* which
> appears to be equivalent to (PCDATA | Flow)*. In the case of
> ‘blockquote’, this tends to blur the distinction between ‘blockquote’
> and ‘q’ elements that was stricter in prior recommendations.
>
> The ‘img’ Element:
> While retaining the ‘img’ element for legacy reasons, it may be
> better to include it within the embedding collection as a sort of
> subclass of the object element. In this way the ‘img’ element could
> include fall-back content, a standby element, and particularly a
> caption element. However it could also add the ‘alt’ attribute as an
> alternative fallback mechanism: a mechanism only used if the element
> contained no content. In this way authors who have been reluctant to
> switch to the ‘object’ element could still use the familiarly-named
> ‘img’ element, but could eventually come to use the contents of the
> element for fall-back instead of the ‘alt’ attribute.
Received on Wednesday, 11 October 2006 11:42:48 UTC