Getting <l> right (Was: Unicode line and paragraph separators)

What is <l> supposed to be?

If I read the 20030131 draft at section 8.12, I see:

>   8.12. The l element
> 
>   The l element contains a sub-paragraph that represents a single
>   line of text.

[ERROR in draft: "single" is misspelled.]

The intended meaning would be clearer if "sub-paragraph" were replaced
by "sub-unit of a paragraph".  But what is meant by "line"?  Is that a
presentational concept?

>                 It is intended as a structured replacement for the
>   br element.

Because "br" is viewed as presentational?

>               It contains a piece of text that when visually
>   represented should start on a new line, and have a line break at
>   the end. Whether the line should wrap or not visually depends on
>   styling properties of the element.

In other words, so far the description of "l" is all about presentation.

  . . .

>   By retaining structure in text that has to be broken over lines,
>   you retain essential information about its makeup. This gives you
>   greater freedom with styling the content. For instance, line
>   numbers can be generated automatically from the stylesheet if
>   needed.

Aha!  Structure in text.  Then an example that is not bad but less
than good:

>   For instance, for a document with the following structure:
> 
>   <p class="program">
>   <l>program p(input, output);</l>
>   <l>begin</l>
>   <l>   writeln("Hello world");</l>
>   <l>end.</l>
>   </p>

Have we been told how "l" fits in the content model of "p"?  Would it
be OK if the preceding "p" instance had children other than "l"?
(I hope not.)

Rather than write the content model of "p" so that if "l" is a child,
then all other children must be "l", it would be better to ensure that
"l" is always suitably wrapped in a parent that specializes in "l"
children.

Tantek provides a good example:

>   <blockquote>
>    <l>Peace, peace, Mercutio, peace!</l>
>    <l>Thou talk'st of nothing.</l>
>   </blockquote>

"address" also comes quickly to mind as a suitable parent having
only "l" children.  In fact, "address" and "blockquote" should be
understood as list structures.
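
To make the "only l children" rule concrete, here is a rough sketch
(mine, not anything from the draft) of the check a validator would
have to make on a parent that specializes in "l".  The function name
is_line_container is made up for illustration, and the "mixed" sample
is invented to show the case I hope is disallowed.  It uses Python's
standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET


def is_line_container(elem):
    """Return True if elem holds one or more <l> children and nothing else.

    Hypothetical helper: it only illustrates the rule argued for above
    and is not taken from any XHTML 2.0 draft.
    """
    children = list(elem)
    if not children:
        return False
    # Text directly inside the parent, before the first child,
    # must be whitespace only.
    if elem.text and elem.text.strip():
        return False
    for child in children:
        if child.tag != "l":
            return False
        # Text that follows a child (its "tail") must be whitespace only.
        if child.tail and child.tail.strip():
            return False
    return True


# Tantek's example: every child is an <l>.
good = ET.fromstring(
    "<blockquote>"
    "<l>Peace, peace, Mercutio, peace!</l>"
    "<l>Thou talk'st of nothing.</l>"
    "</blockquote>"
)

# Invented counter-example: <l> mixed with text and another element.
mixed = ET.fromstring("<p>Some text <l>a line</l> and <em>more</em></p>")

print(is_line_container(good))   # True
print(is_line_container(mixed))  # False
```

In a real schema the same constraint would be stated declaratively in
the content model of the parent rather than checked by code.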

Also the program listing in 8.12 should be modeled as a list
structure.

Lists are among the most useful of all models for document markup.

Perhaps

  <list class="program">
  <l>program p(input, output);</l>
  <l>begin</l>
  <l>   writeln("Hello world");</l>
  <l>end.</l>
  </list>

or maybe

  <verbatimlist class="program">
  <l>program p(input, output);</l>
  <l>begin</l>
  <l>   writeln("Hello world");</l>
  <l>end.</l>
  </verbatimlist>

Please note that in suggesting "list" as a generic list structure I am
not suggesting that it be used to encompass present list structures
such as "ol", just as I would not suggest that "span" be used to
encompass present inline structures such as "em".

"verbatimlist" would be a specialized list keyed to the idea that
default presentation of its items (l's) should be in a fixed-width
font.

Finally, a non-bogus "l" has nothing at all to do with replacing "br".
"br" is still needed occasionally when one needs to punt -- such as
when a long head, whether the old "h1" or new "h", is susceptible
to a bad split.  It is simply not sensible to propose that the content
model for "h1" or "h" contain "l".

And while it is not sensible to allow "l" to be mixed in "p" with
children other than "l", it is sensible to permit "br" in "p" for
occasional use.

"br" should be left in XHTML forever.  Its description in the spec
should, however, say that its use is almost always both undesirable
and unnecessary.  The description should also point out that the
effect of cascading style might be that it is ignored altogether.

                                    -- Bill

Received on Saturday, 5 April 2003 22:38:58 UTC