Should <br> be removed from the HTML5 spec?

Following the correspondence on HTML5 bugs
#10828 <http://www.w3.org/Bugs/Public/show_bug.cgi?id=10828> and
#11211 <http://www.w3.org/Bugs/Public/show_bug.cgi?id=11211>,
I started to think that maybe a more radical approach is the way to go.

Quick summary:

1) The issue at hand is the current incompatibility in how <br> is handled:
       * The HTML4 spec (implemented by Gecko and Opera) says that <br> should
be "soft" (it does not introduce a bidi paragraph break).
       * The implementations in IE and WebKit (as well as common usage on
websites) assume a "hard" <br> (treated as a bidi paragraph separator).

2) Our suggestion, originally in #10828, was split into two bugs:
    * #10828 suggests that <br> should be "hard" by default (contrary to
HTML4).
    * #11211 suggests that some way should be provided to produce a "soft
<br>".

3) While #10828 seems to be on the road to acceptance, we are having trouble
with use cases for #11211:
    *  The "natural" use case (I assume, based on old HTML tutorials, that
this was the original purpose of <br>) is poems.
        However, while there are *a lot* of websites with poems, I could not
find an RTL poem that contains numbers and LTR words (I am sure such
        sites exist; I would appreciate a link if you find one).
        The same goes for mail addresses (they do include numbers and Latin
words, but the <br>'s are normally positioned in non-sensitive places).
    * A use case brought up by Adil (comment 18 on #10828): a site displaying
a newspaper page in a way that should match the actual printed page.
       This was criticized by Ian Hickson for being "a bit of an abuse of
HTML".

Now, considering the last point, the editor does have a point. These
linebreaks are a matter of display. However, the thing that troubles me is
that
*the same argument applies equally well to the "hard" <br>* (bug #10828).
As opposed to <p>, which reflects a logical partition of the text into DOM
nodes, <br> is a matter of textual display.
Since HTML5 aims to represent the pure logical structure of a document, maybe
the right place for such things is in the interaction between the contents
(text) and CSS.

Hence, the new suggestion:

(1) Add mandatory entities (temporary names):
    &ps; for U+2029 (paragraph separator, replacement for "hard" <br>)
    &ls; for U+2028 (line separator, replacement for "soft" <br>)

(2) Remove <br> from HTML5 spec.
    Add a comment saying that <br> was deprecated, stating explicitly that
when upgrading from HTML4, <br> should be replaced with &ps; if the intended
    use was a hard break, or with &ls; if the intended use was compliant
with the HTML4 spec (i.e. soft break).
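
To make the upgrade step concrete, here is a minimal Python sketch of what such a replacement could look like. It is only an illustration under stated assumptions: the names replace_br, PARAGRAPH_SEPARATOR and LINE_SEPARATOR are hypothetical, the regular expression is a naive matcher (it does not skip comments, scripts or <pre> content), and it emits the raw U+2029 / U+2028 characters because &ps; and &ls; are only proposed names, not existing entities.

```python
import re
import unicodedata

# Assumption: the raw characters stand in for the proposed (not yet existing)
# &ps; and &ls; entities.
PARAGRAPH_SEPARATOR = "\u2029"  # "hard" break
LINE_SEPARATOR = "\u2028"       # "soft" break

# Naive matcher for <br>, <br/>, <br /> and <BR>, with optional attributes.
# The \b keeps it from matching other tags that merely start with "br".
BR_TAG = re.compile(r"<br\b[^>]*>", re.IGNORECASE)


def replace_br(html: str, hard: bool = True) -> str:
    """Replace every <br> with U+2029 (hard) or U+2028 (soft)."""
    replacement = PARAGRAPH_SEPARATOR if hard else LINE_SEPARATOR
    return BR_TAG.sub(replacement, html)


def bidi_class(ch: str) -> str:
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)
```

In the Unicode character database, U+2029 has bidi class B (paragraph separator) while U+2028 has class WS (whitespace), which is the hard/soft distinction discussed above; bidi_class can be used to check this. Whether the author meant a hard or a soft break cannot be detected automatically, so the hard flag has to be chosen per document (or per break) by whoever does the upgrade.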

In a private discussion with Aharon, he said that this approach might, in
practice, work against our goal of reaching compatibility. I'm sure he can
explain this better than me.
The purpose of this post is to try to reach some consensus before posting
the idea in bugzilla.

 thanks for taking the time to read,
            Amit A.

On Thu, Oct 21, 2010 at 2:00 AM, Amit Aronovitch <aronovitch@gmail.com> wrote:

> (just an idea - not replying in bugzilla, because I'm not entirely sure it
> is a good one)
>
> Seems like the gatekeepers are willing to change the spec of <br> to become
> "hard" (like ie/webkit),
> but are reluctant to add a new attribute, or even a new element, which
> would make the soft break functionality available,
> suggesting &#x2028; as an alternative (i.e. adding to the spec a
> requirement that it should be supported).
>
> A possible compromise might be to add named entities, perhaps &sbr; (soft
> break) or &vbr; (visual break) for U+2028, and &br; for U+2029 (this might
> increase the chance that browser-makers would actually add support for these
> characters).
>
> Personally, I think that poetry and addresses alone (not to mention other
> use cases mentioned in the bug) are good enough use cases for adding the
> missing functionality as a 1st class citizen (element or attribute). At
> least it seems that it was important enough to be added to early versions of
> HTML.
>
> Amit
>
>

Received on Friday, 5 November 2010 08:04:48 UTC