- From: Rob Burns <robburns1@mac.com>
- Date: Mon, 9 Oct 2006 22:12:52 -0500
- To: www-html-editor@w3.org
- Message-Id: <7BC50364-32FC-423B-96BE-3EE4D43DAE8F@mac.com>
Dear Editors:
The current draft of XHTML 2 represents a great leap forward for
semantic authoring. Element after element in the draft improves the
expressibility of HTML markup. I have a list of comments on the
draft. Many of these concepts may have already been discussed by the
draft’s participants. In any event, please take them into
consideration as this draft progresses towards candidate recommendation.
Sincerely,
Robert Burns
------------------------------------------
Attributes xml:id and id:
Including both the ‘id’ and ‘xml:id’ attributes introduces
complexity without any clear benefit. Using ‘xml:id’ alone will
improve the extensibility and interoperability of XHTML with other
XML vocabularies. It will also avoid needless confusion among authors
over these largely redundant attributes.
Finer-grained ‘cite’ element (adding a citation ‘type’ attribute):
By adding a ‘type’ attribute to the ‘cite’ element authors could
more clearly delineate the type of citation. Currently user-agents
typically apply italics to the ‘cite’ element, implying its use is
meant only for book citations. However, a ‘type’ attribute would
allow style sheets to select ‘cite’ elements and apply a greater
variety of presentation to the finer-grained ‘cite’ element. So for
example ‘type = "(book-title | article-title | webpage-title | author
| speaker | collection-title | <QName>)"’.
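As a rough illustration of how a ‘type’ attribute could drive presentation, the sketch below parses a small fragment and picks a style per citation type. The ‘type’ attribute is only the proposal above, and the sample titles, the STYLE_BY_TYPE table and the styled_citations function are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: 'type' on 'cite' is the proposed attribute,
# not something defined by the current draft.
FRAGMENT = """
<p>
  <cite type="book-title">Moby-Dick</cite> by
  <cite type="author">Herman Melville</cite>, discussed in
  <cite type="article-title">An Example Article</cite>.
</p>
"""

# Illustrative mapping from citation type to a presentation keyword.
STYLE_BY_TYPE = {
    "book-title": "italic",
    "article-title": "quoted",
    "author": "plain",
}


def styled_citations(xml_text):
    """Return (text, style) pairs for every 'cite' element in the fragment."""
    root = ET.fromstring(xml_text)
    return [
        # Untyped or unknown citations fall back to today's italic default.
        (cite.text, STYLE_BY_TYPE.get(cite.get("type"), "italic"))
        for cite in root.iter("cite")
    ]


for text, style in styled_citations(FRAGMENT):
    print(style, text)
```

A style sheet would express the same selection with attribute selectors; the table here only stands in for that.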
Subordinate Text Element:
To leverage the work with CSS3, an element for subordinate text like
‘subtext’ would be useful for authors. This is text that is
parenthetical to the main text, but may receive various
presentational idioms. A ‘rank’ attribute to differentiate between
levels of subordination could also be employed. So for example,
<subtext rank="0"> might be displayed as a parenthetical with
subtext[rank="0"]:before {content: "(";} and
subtext[rank="0"]:after {content: ")";}, whereas <subtext
rank="1"> would be displayed as an endnote or footnote in printed
media or as a popup or tooltip note in screen media. The ‘rank’
attribute could be simply of type number where zero and one would
cover most needs, but with the flexibility to extend the
subordination of the text if authors needed that. Subordinate text
elements might also be used to attach comments to an authored work.
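To make the two ranks concrete, here is a minimal sketch of how a renderer might treat them: rank 0 becomes an inline parenthetical and any higher rank is collected as a numbered note. The ‘subtext’ element and ‘rank’ attribute are the proposal above; the render function, the sample sentence and the "[n]" note marker are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup using the proposed 'subtext' element.
FRAGMENT = (
    '<p>Water boils at 100 degrees '
    '<subtext rank="0">at sea level</subtext> in an open pot.'
    '<subtext rank="1">Pressure cookers differ.</subtext></p>'
)


def render(xml_text):
    """Return (main_text, notes) for a paragraph containing 'subtext'."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    notes = []
    for child in root:
        body = child.text or ""
        if child.tag == "subtext":
            rank = int(child.get("rank", "0"))
            if rank == 0:
                # Rank 0: show inline as a parenthetical.
                parts.append("(" + body + ")")
            else:
                # Rank 1 and above: move to the notes, leave a marker.
                notes.append(body)
                parts.append("[%d]" % len(notes))
        else:
            parts.append(body)
        parts.append(child.tail or "")
    return "".join(parts), notes


main_text, notes = render(FRAGMENT)
print(main_text)
print(notes)
```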
Table Cell Elements:
Rather than continuing separate elements for header (‘th’) and data
(‘td’), XHTML2 should introduce a new ‘tc’ element for table cell.
The distinction between a header cell and a data cell could then be
expressed through a new attribute. For example, ‘cell="(header |
data | both)"’. The ‘tc’ element could be supported instead of the
‘td’ and ‘th’ elements, and those two should probably then be
deprecated. This would allow authors to explicitly indicate whether a
cell was both a data cell and a header cell rather than leaving it
merely implied.
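The migration from the legacy elements would be mechanical, as this small sketch suggests. It rewrites ‘th’ and ‘td’ into the proposed ‘tc’ element with a ‘cell’ attribute; the to_tc function and the LEGACY_TO_ROLE table are hypothetical names used only for this example. A cell that is both header and data cannot be detected automatically and would have to be marked cell="both" by the author.

```python
import xml.etree.ElementTree as ET

# Legacy element name -> value of the proposed 'cell' attribute.
LEGACY_TO_ROLE = {"th": "header", "td": "data"}


def to_tc(xml_text):
    """Rewrite 'th'/'td' elements as 'tc' with an explicit 'cell' role."""
    root = ET.fromstring(xml_text)
    for element in root.iter():
        role = LEGACY_TO_ROLE.get(element.tag)
        if role is not None:
            element.tag = "tc"
            element.set("cell", role)
    return ET.tostring(root, encoding="unicode")


print(to_tc("<tr><th>Year</th><td>2006</td></tr>"))
```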
Table ‘col’ and ‘colgroup’ horizontal-alignment:
Somewhere in the transition to stricter content models and XHTML many
user-agents dropped support for HTML horizontal alignment within
tables, in particular the ability to define horizontal alignment
with the ‘col’ and ‘colgroup’ elements and to align on a single
character such as a decimal point (‘.’). Presumably this was to be
picked up by CSS as a presentational attribute. However, over time the
CSS recommendations also dropped support both for defining presentation
of tables from the ‘col’ and ‘colgroup’ elements and for
presenting various meanings about a table in terms of columns and
column groupings. I'm not sure if XHTML2 should provide a backup
mechanism for this, but it should be addressed somewhere.
A Data of Type Element:
A data element could work similarly to an XForms output element except
that, rather than being dynamically generated, its static value would
be its content: in other words, the contents of the element. However, a
‘datatype’ attribute would identify the contents as a particular XSD
primitive or derived data type. By identifying the data type of the
contents stylesheets could be used to change the display of the data.
For example, <data datatype='float'>1000</data> might be styled as
either "one thousand" or "1,000" or "1000" or "1.0 × 10^3" depending
on the style declaration selecting this data type. Other attributes
might also be included to indicate ad hoc facets of the data. A
‘units’ attribute could also be included with QNames drawn from SI,
US or Imperial units and various calendars. Again the display of the
units could be determined through style sheets: e.g., “millimeters”
or “mm” or “m.m.”. Such an element would further extend the
continuity of documents produced through W3C standards. Validators
could also be extended to notify authors of invalid content
(according to the ‘datatype’ attribute). DOM functions could ensure
operations were performed on data of comparable type and even ensure
units were respected.
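The following sketch shows the idea of one typed value with several presentations. The ‘data’ element and ‘datatype’ attribute are the proposal above; the style names "grouped", "plain" and "scientific" and both function names are invented for this example and stand in for whatever a style declaration would select.

```python
import xml.etree.ElementTree as ET


def format_float(text, style):
    """Format the lexical value of a float according to a style keyword."""
    value = float(text)  # raises ValueError for invalid content
    if style == "grouped":
        return f"{value:,.0f}"
    if style == "plain":
        return f"{value:.0f}"
    if style == "scientific":
        mantissa, exponent = f"{value:.1e}".split("e")
        return f"{mantissa} × 10^{int(exponent)}"
    raise ValueError("unknown style: " + style)


def render_data(xml_text, style):
    """Render a hypothetical 'data' element; non-float content is left as-is."""
    element = ET.fromstring(xml_text)
    if element.get("datatype") != "float":
        return element.text or ""
    return format_float(element.text, style)


SAMPLE = "<data datatype='float'>1000</data>"
for style in ("grouped", "plain", "scientific"):
    print(render_data(SAMPLE, style))
```

The same conversion step is where a validator could report content that does not match the declared type, since float() rejects it.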
A Proper Name Element:
Similar to a data element, XHTML2 should include a proper name
element with types: person, place, organization, institution, etc.
Proper names are an important semantic distinction in authoring
documents that should not be left to generic elements and ad hoc
solutions.
Lists:
The current semantics of lists could use some simplification. The
semantic difference between unordered and ordered lists does not seem
great enough to warrant separate elements. Rather, both seem to be
maintained largely for legacy and presentational reasons. Perhaps a
boolean attribute would be better for this distinction in the future.
When order matters, ‘order="order"’ could be set on a list element.
At the same time, the distinction between definition lists and the
other lists seems mostly in the use of the definition item. A more
flexible (and I do not think any more cumbersome) approach would be
to maintain a single unified list element and one list item element
and allow the use of an optional definition term element at the
beginning of any list item. CSS sibling selectors would allow the
presentation to meet the needs of either legacy presentation while
enabling further and more flexible presentational idioms as well.
I am not including the new ‘nl’ in this discussion which does seem
semantically distinct enough to warrant its own element. However, the
‘ul’, ‘ol’ and ‘dl’ elements could all be merged into one element
without losing any expressiveness.
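As a sketch of the ordered/unordered half of this merge, the code below rewrites ‘ol’ and ‘ul’ into a single element carrying the boolean ‘order’ attribute. The element name ‘list’ is a placeholder I chose for the example, not a name proposed anywhere in the draft, and unify_lists is likewise hypothetical. Definition lists are left out because pairing terms with items needs more than a rename.

```python
import xml.etree.ElementTree as ET


def unify_lists(xml_text):
    """Rename 'ol' and 'ul' to one placeholder 'list' element.

    Ordered lists keep their meaning through order="order".
    """
    root = ET.fromstring(xml_text)
    for element in root.iter():
        if element.tag == "ol":
            element.tag = "list"
            element.set("order", "order")
        elif element.tag == "ul":
            element.tag = "list"
    return ET.tostring(root, encoding="unicode")


print(unify_lists(
    "<div><ol><li>Preheat</li></ol><ul><li>Salt</li></ul></div>"
))
```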
Definition Lists:
In any event, if the ‘dl’ element is maintained separately, the
‘value’ attribute should be added to the definition list item to be
used similarly to the ‘value’ attribute on list item elements.
A Blockparagraph Element:
Rather than simply altering the content model of the ‘p’ element, I
think it would be better to follow the patterns established in the
distinction between ‘q’ versus ‘blockquote’ and ‘code’ versus
‘blockcode’ by adding a ‘blockparagraph’ element. This way the
paragraphs would share similar semantic differences with these other
elements.
Caption Element Content Model:
By distinguishing between ‘p’ and ‘blockparagraph’ elements it would
also make sense to add the ‘p’ element to the content model of the
caption element. A caption could then simply handle multiple
paragraphs of text content. Without that I fear authors may misuse the
‘l’ or ‘separator’ elements or feel the need to reintroduce legacy
elements to handle captions requiring multiple paragraphs. At
the same time, excluding blockparagraphs from the caption element
will keep captions relatively simple as their semantics require.
Paraphrase Elements:
To associate newly authored content with one or a few sources, a
paraphrase element would be useful: perhaps in both block and non-
block forms. Like ‘q’ and ‘blockquote’ these elements could allow
authors to associate the paraphrasing with specific sources through
the ‘cite’ attribute.
A Marker Element:
Add a marker element (e.g., <marker id='someNumber'/>) as a way to
insert an empty element marker into a document where, unlike anchor
and span, one wants to refer to a single point in the document rather
than a range. This might be used presentationally, like the
separator, for a page-break or column-break. Or it may have no
presentation at all but serve as a “bookmark” within a document. I
think there is a need for such a generic empty element: one whose
default presentation has no display, but instead serves as a marker
within the document.
PCData and Mixed-Content:
The current draft is not entirely clear regarding content models:
particularly for structural elements and the “Flow” content model.
The prose indicates PCData within the content models of several
elements that previously contained only child elements. It is not
clear whether this is merely due to changes in the XML definitions of
PCData; whether it relates only to whitespace; or whether this is a
change in the content models of these elements.
If this PCData does not mean only white-space characters then its
introduction into several content models is unwarranted. For example
the ‘section’ element shows a content model of (PCDATA | Flow)* while
the prose says “This element defines content to be block-level…”. I
think the ‘section’ element’s content model would be better as
(Heading | Structural)* which is closer to what the prose suggests.
In addition, the ‘blockquote’ and ‘blockcode’ elements list content
models of (PCDATA | Text | Heading | Structural | List)* which
appears to be equivalent to (PCDATA | Flow)*. In the case of
‘blockquote’, this tends to blur the distinction between ‘blockquote’
and ‘q’ elements that was stricter in prior recommendations.
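The practical difference between the two readings can be checked mechanically. This sketch reports whether an element directly contains any non-whitespace character data, which a content model such as (Heading | Structural)* would forbid. The function name has_character_data is an assumption for the example, and the check looks only at direct content, not at descendants.

```python
import xml.etree.ElementTree as ET


def has_character_data(element):
    """True if the element directly contains non-whitespace text."""
    # Text before the first child element.
    if element.text and element.text.strip():
        return True
    # Text following each child element, still inside this element.
    return any(child.tail and child.tail.strip() for child in element)


clean = ET.fromstring("<section>\n  <h>Title</h>\n  <p>Body</p>\n</section>")
mixed = ET.fromstring("<section>loose text<p>Body</p></section>")
print(has_character_data(clean))
print(has_character_data(mixed))
```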
The ‘img’ Element:
While retaining the ‘img’ element for legacy reasons it may be
better to include it within the embedding collection as a sort of
subclass of the object element. In this way the ‘img’ element could
include fall-back content, a standby element, and particularly a
caption element. However it could also add the ‘alt’ attribute as an
alternative fallback mechanism: a mechanism only used if the element
contained no content. In this way authors who have been reluctant to
switch to the ‘object’ element could still use the familiarly-named
‘img’ element, but could eventually come to use the contents of the
element for fall-back instead of the ‘alt’ attribute.
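The fallback order described above is simple to state in code. In this sketch the element content wins when present and the ‘alt’ attribute is used only when the element is empty; the fallback_text function and the sample file name are hypothetical, and a ‘caption’ child is treated here simply as part of the content.

```python
import xml.etree.ElementTree as ET


def fallback_text(xml_text):
    """Pick the fallback for an 'img': its content first, then 'alt'."""
    img = ET.fromstring(xml_text)
    # Collapse whitespace so an element holding only blanks counts as empty.
    content = " ".join("".join(img.itertext()).split())
    return content if content else img.get("alt", "")


print(fallback_text('<img src="chart.png" alt="Sales chart"/>'))
print(fallback_text(
    '<img src="chart.png" alt="Sales chart">'
    "<caption>Sales by quarter</caption></img>"
))
```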
Received on Wednesday, 11 October 2006 11:28:23 UTC