[Bug 19067] New: i18n comment 4 : at least by default, <br> should constitute a bidi paragraph break

https://www.w3.org/Bugs/Public/show_bug.cgi?id=19067

           Summary: i18n comment 4 : at least by default, <br> should
                    constitute a bidi paragraph break
           Product: HTML WG
           Version: unspecified
          Platform: PC
               URL: http://www.w3.org/Bugs/Public/show_bug.cgi?id=10828
        OS/Version: Windows XP
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HTML5 spec
        AssignedTo: dave.null@w3.org
        ReportedBy: contributor@whatwg.org
         QAContact: public-html-bugzilla@w3.org
                CC: ian@hixie.ch, bzbarsky@mit.edu,
                    lachlan.hunt@lachy.id.au, mjs@apple.com,
                    ap@webkit.org, rubys@intertwingly.net,
                    fantasai.bugs@inkedblade.net, mike@w3.org,
                    public-html-wg-issue-tracking@w3.org,
                    public-html@w3.org, simonp@opera.com,
                    w3c@adambarth.com, annevk@annevk.nl, jonas@sicking.cc,
                    adrianba@microsoft.com, Ms2ger@gmail.com,
                    ayg@aryeh.name, eric@webkit.org, dbaron@dbaron.org,
                    franko@microsoft.com, public-i18n-bidi@w3.org,
                    ehsan@mozilla.com, adil@diwan.com,
                    cewcathar@hotmail.com, aharon.lists.lanin@gmail.com,
                    glenn@skynav.com, shachar@shemesh.biz,
                    leviw@chromium.org


This was cloned from bug 10828 as part of operation LATER convergence.
Originally filed: 2010-09-29 13:17:00 +0000
Original reporter: i18n bidi group <public-i18n-bidi@w3.org>

================================================================================
 #0   i18n bidi group                                 2010-09-29 13:17:48 +0000 
--------------------------------------------------------------------------------
Comment from the i18n review of:
http://dev.w3.org/html5/spec/

Comment 4
At http://www.w3.org/International/reviews/html5-bidi/
Editorial/substantive: S
Tracked by: AL

Location in reviewed document:
undefined [http://dev.w3.org/html5/spec/spec.html#contents]

Comment: This is part of the proposals made by the "Additional Requirements
for Bidi in HTML" W3C First Public Working Draft. For a full description of the
use cases, please see
http://www.w3.org/International/docs/html-bidi-requirements/#br-as-separator
Here is the proposal made there:

Support a new HTML element attribute, bidibreak=hard|soft. On a <br> element,
the "soft" value means that the <br> is to be treated as a UBA bidi class WS
(whitespace) character, as was required in HTML 4
[http://www.w3.org/TR/html4/struct/text.html#edef-BR]. The "hard" value means
that the <br> is to be treated as UBA bidi class B,
i.e. paragraph break. If neither is specified, the bidibreak attribute value is
inherited from the parent. Thus, when specified on an element other than <br>,
bidibreak serves to determine the behavior of descendant <br> elements. For the
root element, the default is "hard" (which, of course, spreads to every <br>
element in the document, unless an intervening element sets bidibreak
otherwise).
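
The inheritance rule described above could be sketched roughly as follows.
This is only an illustration in Python, not part of the proposal: the names
resolve_bidibreak and bidi_class_for_br, and the idea of passing the attribute
values as a list running from the root down to the <br>, are assumptions made
for the example.

```python
from typing import Optional, Sequence

ROOT_DEFAULT = "hard"  # default proposed above for the root element


def resolve_bidibreak(chain: Sequence[Optional[str]]) -> str:
    """Return the effective bidibreak value for the last element in `chain`.

    `chain` lists the bidibreak attribute of each element from the root
    down to the <br> itself; None means the attribute is not specified.
    (Hypothetical representation, chosen only for this sketch.)
    """
    effective = ROOT_DEFAULT
    for value in chain:
        if value in ("hard", "soft"):
            effective = value  # nearest specified ancestor (or self) wins
    return effective


def bidi_class_for_br(chain: Sequence[Optional[str]]) -> str:
    """Map the effective value to the UBA class the <br> would act as."""
    return "B" if resolve_bidibreak(chain) == "hard" else "WS"
```

For example, a <br> with no attribute inside an element carrying
bidibreak="soft" would resolve to "soft" and so act as class WS.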

Alternatively, if and only if all major browser makers reach unanimous
consensus that the default value for the root element should be "soft" and
commit to implementing it as such to the HTML WG prior to the new HTML
specification publication, that too would be fine.

When the author wants to use <br> just to wrap a line without adding bidi
separation, <br bidibreak="soft"> will do the trick.

Reasonable use cases for specifying bidibreak="soft" on non-<br> elements would
include an element containing poetry, as well as the root element of a document
that relies on the bidi behavior specified for <br> by HTML 4.

When <br> introduces a UBA paragraph break, the base direction of the new UBA
paragraph will be determined by the computed direction of the nearest ancestor
element whose bidi properties require its contents to be in a separate UBA
paragraph (or sequence of paragraphs), e.g. a block element or an element
directionally isolated by the ubi attribute (which is being proposed in a
separate bug). Furthermore, for every element between there and the <br> that
results in the creation of an embedding or override level, e.g. a <bdo> element
or any element with a dir attribute or a value other than "normal" for the
unicode-bidi CSS property, the corresponding embedding or override level is
re-introduced at the start of the new UBA paragraph (to be closed at the end of
the element or the UBA paragraph, whichever comes first).
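
As a rough plain-text analogy for the re-introduction of levels, the sketch
below splits a run of inline text at a hard <br> and re-opens every embedding
or override in the second paragraph using the Unicode formatting characters
(LRE, RLE, LRO, RLO, PDF). The function name split_at_hard_break and the
(kind, direction) pairs are assumptions made for the example, and it assumes
for simplicity that all the open elements end together after the second
paragraph.

```python
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Opening control character for each kind of embedding or override.
OPENERS = {
    ("embed", "ltr"): "\u202a",     # LRE
    ("embed", "rtl"): "\u202b",     # RLE
    ("override", "ltr"): "\u202d",  # LRO
    ("override", "rtl"): "\u202e",  # RLO
}


def split_at_hard_break(open_levels, before, after):
    """Split inline text at a hard <br> into two UBA paragraphs.

    `open_levels` lists the (kind, direction) pairs of the elements that
    are open at the <br>, outermost first (hypothetical representation).
    Each paragraph is returned as plain text with explicit formatting
    characters.
    """
    prefix = "".join(OPENERS[level] for level in open_levels)
    suffix = PDF * len(open_levels)
    first = prefix + before + suffix   # levels are closed at the paragraph end
    second = prefix + after + suffix   # and re-introduced after the break
    return first, second
```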
================================================================================
 #1   Maciej Stachowiak                               2010-09-29 16:36:18 +0000 
--------------------------------------------------------------------------------
If this really needs to be expressed in markup, perhaps a new element would be
better.

In particular, having a markup attribute that doesn't correspond to a CSS
property but still inherits and affects rendering of other elements is an
unusual pattern and would be awkward to implement.

Is there a Unicode character that creates a line break but has Unicode class WS
instead of B? If so, that would make it easier to define what happens for the
proposed "soft" line breaks.
================================================================================
 #2   CE Whitehead                                    2010-10-06 00:57:04 +0000 
--------------------------------------------------------------------------------
Hi, I see no reason not to have this element adopt -- in the specifications --
the behavior that it currently has in IE -- this works well for most use cases,
and so should probably be the default break behavior.

We can then add a soft break that would have -- in the specifications -- the
behavior that this element currently is specified as having.  

Best,

--C. E. Whitehead
cewcathar@hotmail.com
================================================================================
 #3   Aharon Lanin                                    2010-10-06 20:59:23 +0000 
--------------------------------------------------------------------------------
(In reply to comment #1)
> If this really needs to be expressed in markup, perhaps a new element would be
> better.
> 
> In particular, having a markup attribute that doesn't correspond to a CSS
> property but still inherits and affects rendering of other elements is an
> unusual pattern and would be awkward to implement.
> 
> Is there a Unicode character that creates a line break but has Unicode class WS
> instead of B? If so, that would make it easier to define what happens for the
> proposed "soft" line breaks.

The equivalent "soft" line break Unicode character is LINE SEPARATOR, U+2028.
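
The bidi classes involved can be checked against the Unicode character
database, for instance with Python's unicodedata module. This is just a quick
sketch; the CANDIDATES table and the bidi_classes helper are names made up for
the example.

```python
import unicodedata

# Characters that can force a line break in plain text.
CANDIDATES = {
    "LINE FEED": "\n",
    "CARRIAGE RETURN": "\r",
    "LINE SEPARATOR": "\u2028",
    "PARAGRAPH SEPARATOR": "\u2029",
}


def bidi_classes():
    """Return {name: bidi class} as reported by the Unicode database."""
    return {name: unicodedata.bidirectional(ch) for name, ch in CANDIDATES.items()}
```

Here LINE SEPARATOR is reported as WS, while LINE FEED, CARRIAGE RETURN and
PARAGRAPH SEPARATOR are all reported as B.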

Regarding doing this through a new element, it would get the job done, but I
have been warned that new elements are problematic in terms of support from
existing software (e.g. how would an existing browser know that the new element
does not need a closing tag?) and generally very hard to get in.
================================================================================
 #4   Maciej Stachowiak                               2010-10-06 22:51:49 +0000 
--------------------------------------------------------------------------------
(In reply to comment #3)
> (In reply to comment #1)
> > If this really needs to be expressed in markup, perhaps a new element would be
> > better.
> > 
> > In particular, having a markup attribute that doesn't correspond to a CSS
> > property but still inherits and affects rendering of other elements is an
> > unusual pattern and would be awkward to implement.
> > 
> > Is there a Unicode character that creates a line break but has Unicode class WS
> > instead of B? If so, that would make it easier to define what happens for the
> > proposed "soft" line breaks.
> 
> The equivalent "soft" line break Unicode character is LINE SEPARATOR, U+2028.
> 
> Regarding doing this through a new element, it would get the job done, but I
> have been warned that new elements are problematic in terms of support from
> existing software (e.g. how would an existing browser know that the new element
> does not need a closing tag?) and generally very hard to get in.

New global attributes are also hard to get in. And in this case, I think an
inheriting global attribute is not as clean an approach.

Question: does including U+2028, either as a literal unicode character or as a
numeric character reference, get the job done? Or does that character get
affected by whitespace collapsing?
================================================================================
 #5   fantasai                                        2010-10-07 09:22:07 +0000 
--------------------------------------------------------------------------------
I believe the LINE SEPARATOR character is not supposed to be affected by white
space handling in CSS unless the source document language defines it as
equivalent to a LINE FEED or SGML RECORD-START/END token or similar.

(I'll note that implementations don't currently support it very well, though.
It's usually either ignored or turned into boxes.)
================================================================================
 #6   Aharon Lanin                                    2010-10-11 08:17:22 +0000 
--------------------------------------------------------------------------------
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #1)
> > > If this really needs to be expressed in markup, perhaps a new element would be
> > > better.
> > > 
> > > In particular, having a markup attribute that doesn't correspond to a CSS
> > > property but still inherits and affects rendering of other elements is an
> > > unusual pattern and would be awkward to implement.
> > > 
> > > Is there a Unicode character that creates a line break but has Unicode class WS
> > > instead of B? If so, that would make it easier to define what happens for the
> > > proposed "soft" line breaks.
> > 
> > The equivalent "soft" line break Unicode character is LINE SEPARATOR, U+2028.
> > 
> > Regarding doing this through a new element, it would get the job done, but I
> > have been warned that new elements are problematic in terms of support from
> > existing software (e.g. how would an existing browser know that the new element
> > does not need a closing tag?) and generally very hard to get in.
> 
> New global attributes are also hard to get in. And in this case, I think an
> inheriting global attribute is not as clean an approach.
> 
> Question: does including U+2028, either as a literal unicode character or as a
> numeric character reference, get the job done? Or does that character get
> affected by whitespace collapsing?

As far as I am concerned, either bidibreak or a new element is fine, and I
would prefer to leave the choice up to the experts here.

Regarding LINE SEPARATOR, I guess what Maciej is proposing is a change in the
spec that explicitly says that it is to be treated as a (bidi-soft) line break
in all contexts and is not subject to whitespace collapsing. If so, the
PARAGRAPH SEPARATOR (U+2029) should be treated similarly: a bidi-hard line
break that is not subject to whitespace collapsing, i.e. exactly the same
effect as <br>. That's because these two characters are a pair introduced into
Unicode at the same time for the same reason: to provide unambiguous
alternatives to newline (and the other line break characters).

Such a solution would also be fine with me (as long as the <br> spec is changed
to make it bidi-hard - or the browser manufacturers achieve a unanimous
commitment to treat it as bidi-soft).

However, please note that http://unicode.org/reports/tr20/#Line currently says
the following about U+2028 and U+2029:

"Problems when used in markup: Including these characters in markup text does
not work where it would duplicate the existing markup commands for delimiting
paragraphs and lines."

It is up to the HTML experts here to judge whether starting to support these
characters in HTML contexts where appropriate mark-up can be used instead would
be in keeping with the spirit of HTML, given that apparently this was not
considered to be the case at some point in the past.
================================================================================
 #7   Ian 'Hixie' Hickson                             2010-10-12 10:36:29 +0000 
--------------------------------------------------------------------------------
Given how rarely <br> is allowed to be used (basically only in poems and
addresses), what's the use case here?
================================================================================
 #8   Aharon Lanin                                    2010-10-13 16:53:12 +0000 
--------------------------------------------------------------------------------
(In reply to comment #7)
> Given how rarely <br> is allowed to be used (basically only in poems and
> addresses), what's the use case here?

There is a huge gap between how <br> is supposed to be used and how it is used
in practice.
================================================================================
 #9   Ian 'Hixie' Hickson                             2010-10-13 18:21:55 +0000 
--------------------------------------------------------------------------------
Granted, but the idea when adding new features is to support use cases in
whatever way leads to best practices, not to add band-aids to help people
continue to write hard-to-maintain code.

What's the use case here?
================================================================================
 #10  Ian 'Hixie' Hickson                             2010-10-15 00:15:15 +0000 
--------------------------------------------------------------------------------
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the tracker issue; or you may create a tracker issue
yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Did Not Understand Request
Change Description: no spec change
Rationale: It's unclear what the use case is. Please describe the use case so
that this proposal can be properly evaluated.
================================================================================
 #11  Aharon Lanin                                    2010-10-18 11:50:25 +0000 
--------------------------------------------------------------------------------
I have heard that part of what HTML5 is about is bringing what the spec says
and what the browsers do more into alignment.

What IE and Webkit do for <br> is treat it as a bidi paragraph break, despite
the spec saying otherwise. This is not about to change, because what RTL users
expect from <br> is bidi paragraph separation.

For the same reason, Firefox regularly gets bug reports about its treatment of
<br> as bidi whitespace. As far as I understand, the developers there would
like to accede but don't want to do so as long as the spec says otherwise.

The result is a lack of interoperability that has lasted for many years and will
continue to last as long as <br> is specified to be bidi whitespace.

And <br> is used all the time. One instance of use is not even due to poorly
educated users, but to the automated translation of plain-text newlines into
mark-up. One example of that is Gmail, which generates a <br> every time one
enters a newline in a rich-text message. This does not even sound to me like
the abuse of <br>. How is Gmail to know whether the author meant the newline to
signify the end of a paragraph or simply the means to force the wrapping of a
line (as in when manually transforming a paragraph of text to which one is
replying into short lines with an > at the beginning of each)?

To leave things as they are is to perpetuate the current lack of
interoperability.
================================================================================
 #12  Ms2ger                                          2010-10-18 12:17:18 +0000 
--------------------------------------------------------------------------------
So, what you're saying is the following:

* Spec should say to always treat br as a bidi paragraph break
* IE and WebKit do this (test cases to prove this?)
* Gecko gets bug reports about its differing behaviour (link?)
* Opera follows Gecko

Is this correct? That sounds like a much saner solution than adding an
attribute.
================================================================================
 #13  Aharon Lanin                                    2010-10-18 14:31:06 +0000 
--------------------------------------------------------------------------------
Created attachment 925
Test case for whether a browser treats <br> as a UBA paragraph break. If it
does, the two arrows will point right. If not, they point left.
================================================================================
 #14  Aharon Lanin                                    2010-10-18 14:44:28 +0000 
--------------------------------------------------------------------------------
(In reply to comment #12)
> So, what you're saying is the following:
> 
> * Spec should say to always treat br as a bidi paragraph break
> * IE and WebKit do this (test cases to prove this?)
> * Gecko gets bug reports about its differing behaviour (link?)
> * Opera follows Gecko
> 
> Is this correct? That sounds like a much saner solution than adding an
> attribute.

This is correct, and is the core of what is being suggested.

However, given that the spec has up to now defined <br> as bidi whitespace, it
would seem that a line break with bidi whitespace semantics is apparently
useful enough to warrant some way of getting it. I do not feel comfortable
getting rid of it completely, without providing some opt-in way of getting it.
================================================================================
 #15  Maciej Stachowiak                               2010-10-18 19:14:00 +0000 
--------------------------------------------------------------------------------
(In reply to comment #12)
> So, what you're saying is the following:
> 
> * Spec should say to always treat br as a bidi paragraph break
> * IE and WebKit do this (test cases to prove this?)
> * Gecko gets bug reports about its differing behaviour (link?)
> * Opera follows Gecko
> 
> Is this correct? That sounds like a much saner solution than adding an
> attribute.

Seems like that could be accomplished simply by removing this line:

"A br element does not separate paragraphs for the purposes of the Unicode
bidirectional algorithm. [BIDI]"

That's arguably a separate request from a new mechanism that breaks the line
without acting as a paragraph break. For the new mechanism, is &#x2028; a
sufficient solution? While not as memorable as <br>, it nonetheless seems less
complicated than the bidibreak proposal.
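
For what it's worth, the numeric character reference does decode to LINE
SEPARATOR, which can be confirmed with a small Python sketch (the helper name
decode_soft_break is made up for the example). Note that this only shows which
character the reference stands for, not how any browser renders it.

```python
import html
import unicodedata


def decode_soft_break(reference="&#x2028;"):
    """Decode a character reference and report its name and bidi class."""
    ch = html.unescape(reference)
    return ch, unicodedata.name(ch), unicodedata.bidirectional(ch)
```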
================================================================================
 #16  Ehsan Akhgari [:ehsan]                          2010-10-18 21:38:37 +0000 
--------------------------------------------------------------------------------
(In reply to comment #7)
> Given how rarely <br> is allowed to be used (basically only in poems and
> addresses), what's the use case here?

Also, please note that the problem in comment 0 can also happen in these two
use cases.
================================================================================
 #17  Ian 'Hixie' Hickson                             2010-10-19 06:27:42 +0000 
--------------------------------------------------------------------------------
If this request is just to change the <br> element's definition to match IE,
then that is definitely something we can do. Should I just change the spec to
instead say "A br element must separate paragraphs for the purposes of the
Unicode bidirectional algorithm. [BIDI]" ?
================================================================================
 #18  Adil                                            2010-10-20 21:31:56 +0000 
--------------------------------------------------------------------------------
(In reply to comment #17)
> If this request is just to change the <br> element's definition to match IE,
> then that is definitely something we can do. Should I just change the spec to
> instead say "A br element must separate paragraphs for the purposes of the
> Unicode bidirectional algorithm. [BIDI]" ?

I would like to see <br> defined as paragraph separator by default. However,
this alone does not solve a specific use case that affects my work. I am
developing a web app that displays text extracted from a book or a newspaper in
a similar way to this site:
http://newspapers.nla.gov.au/ndp/del/article/1118868. 

The requirement is to match exactly the line breaks in the original document
regardless of the font width. The problem is, for mixed rtl-ltr text, I need to
insert a line break that is not a bidi paragraph break.

If <br> is redefined as a bidi paragraph break instead of a line-break then, in
this case, the <br> will give the wrong reordering for the broken line.
================================================================================
 #19  Simon Pieters                                   2010-10-21 06:37:03 +0000 
--------------------------------------------------------------------------------
Would it work to make *two* subsequent <br>s (possibly with whitespace-only
text nodes between) a bidi paragraph break?
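
One possible reading of that rule, sketched in Python over a flat list of
inline nodes. The BR marker, the classify_breaks name and the "hard"/"soft"
labels are assumptions made for the example, not anything defined in a spec.

```python
BR = object()  # hypothetical marker standing in for a <br> element


def classify_breaks(nodes):
    """Label each <br> in `nodes` as "hard" or "soft".

    `nodes` is a flat list of inline content: BR markers and text strings.
    A <br> is "hard" when it is in a run of two or more <br>s separated
    only by whitespace-only text; otherwise it is "soft".
    """
    labels = []
    run = []  # positions in `labels` of the <br>s in the current run

    def flush():
        kind = "hard" if len(run) >= 2 else "soft"
        for pos in run:
            labels[pos] = kind
        run.clear()

    for node in nodes:
        if node is BR:
            run.append(len(labels))
            labels.append(None)
        elif node.strip():  # non-whitespace text ends the run
            flush()
    flush()
    return labels
```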
================================================================================
 #20  Maciej Stachowiak                               2010-10-21 06:47:49 +0000 
--------------------------------------------------------------------------------
(In reply to comment #17)
> If this request is just to change the <br> element's definition to match IE,
> then that is definitely something we can do. Should I just change the spec to
> instead say "A br element must separate paragraphs for the purposes of the
> Unicode bidirectional algorithm. [BIDI]" ?

Absent the explicit requirement to the contrary, doesn't this follow from what
the rendering section says about <br> (since it renders as a newline character,
which is unicode class B)?
================================================================================
 #21  fantasai                                        2010-10-21 07:33:48 +0000 
--------------------------------------------------------------------------------
You could indeed let it be defined implicitly by the rendering section, if
that's what it says. However, given that previous versions of HTML defined <br>
as a soft break, and the bidi spec itself cites <br> as an example of a soft
break, it's probably better to make it explicit. :)
================================================================================
 #22  Aharon Lanin                                    2010-10-26 03:12:12 +0000 
--------------------------------------------------------------------------------
(In reply to comment #17)
> If this request is just to change the <br> element's definition to match IE,
> then that is definitely something we can do. Should I just change the spec to
> instead say "A br element must separate paragraphs for the purposes of the
> Unicode bidirectional algorithm. [BIDI]" ?

How about this: define <br> to be a bidi paragraph separator, but define <br
ubi> to be a "soft" line separator. This would seem to follow from a part of
ubi's definition, which is to make the element act on its surroundings as a
bidi-neutral character. That way, you don't have to add bidibreak, but we still
get a soft <br> when we want one.
================================================================================
 #23  Ian 'Hixie' Hickson                             2010-11-02 22:17:19 +0000 
--------------------------------------------------------------------------------
People. Please. Stop proposing solutions before the problem is clearly stated.

Is the use case in comment 18 the use case that this bug is about? It seems
different than the previously discussed problems. Is comment 11's second
paragraph the problem? That's a very clearly defined problem, is it the one for
which the bug was filed?

Could someone clearly state what the problem is that this bug is about and
avoid the temptation to discuss possible solutions?
================================================================================
 #24  Ian 'Hixie' Hickson                             2010-11-03 08:27:27 +0000 
--------------------------------------------------------------------------------
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the tracker issue; or you may create a tracker issue
yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Did Not Understand Request
Change Description: no spec change
Rationale: see comment 23. Awaiting clear problem description.
================================================================================
 #25  fantasai                                        2010-11-03 11:37:12 +0000 
--------------------------------------------------------------------------------
This bug seems to have been filed about two related issues:
  1. Current definition of <br> is incompatible with widespread usage and
     implementation. (Comment 11)
  2. Current definition of <br> solves real use cases and its behavior
     needs to be available in HTML. (Comment 18)
================================================================================
 #26  Aharon Lanin                                    2010-11-03 17:23:15 +0000 
--------------------------------------------------------------------------------
(In reply to comment #25)
> This bug seems to have been filed about two related issues:
>   1. Current definition of <br> is incompatible with widespread usage and
>      implementation. (Comment 11)
>   2. Current definition of <br> solves real use cases and its behavior
>      needs to be available in HTML. (Comment 18)

Yes, this is an excellent summary. We want <br>'s default behavior changed to
match widespread usage (comment 11), but still want some way to deal with use
cases like the one in comment 18.
================================================================================
 #27  Ian 'Hixie' Hickson                             2010-11-03 18:47:49 +0000 
--------------------------------------------------------------------------------
Please file a separate bug for the separate issue. Each bug should be about
exactly one issue.
================================================================================
 #28  Aharon Lanin                                    2010-11-03 22:02:41 +0000 
--------------------------------------------------------------------------------
(In reply to comment #27)

This bug is now purely about the need to make <br> a bidi paragraph break, at
least by default. Will file a separate bug for the need to be able to force a
line wrap with the bidi semantics of LINE SEPARATOR when necessary.
================================================================================
 #29  Aharon Lanin                                    2010-11-03 22:43:19 +0000 
--------------------------------------------------------------------------------
(In reply to comment #28)
> Will file a separate bug for the need to be able to force a
> line wrap with the bidi semantics of LINE SEPARATOR when necessary.

Filed as bug 11211.
================================================================================
 #30  Ian 'Hixie' Hickson                             2010-11-04 06:08:12 +0000 
--------------------------------------------------------------------------------
Gecko and Opera people: could you comment on whether you are ok with changing
how <br> works from what you currently do (no effect on bidi) to what WebKit
and IE do (treat <br> as a paragraph separator)?

Could you also comment on whether you would like linebreaks in <pre> to be
treated the same way? (I don't know that we currently define how those are
supposed to be processed from an HTML perspective.)
================================================================================
 #31  Boris Zbarsky                                   2010-11-04 06:13:46 +0000 
--------------------------------------------------------------------------------
Elika, do you recall what we decided here?
================================================================================
 #32  Anne                                            2010-11-04 13:09:56 +0000 
--------------------------------------------------------------------------------
Not having checked with anyone in particular I think we would be fine with
making that change. The rendering of <pre> should probably be defined entirely
by CSS as it matters for everything with white-space:pre.
================================================================================
 #33  fantasai                                        2010-11-04 13:27:12 +0000 
--------------------------------------------------------------------------------
Per CSS2.1, block boundaries and forced line breaks of bidi class B (everything
except LINE SEPARATOR) break the bidi paragraph. So line breaks in <pre> follow
the default UAX9 rules.
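
As a minimal sketch of that rule for preformatted text, the Python below
splits a string into bidi paragraphs at every character of bidi class B and
leaves LINE SEPARATOR (class WS) inside the paragraph. The function name
split_bidi_paragraphs is made up for the example, and treating CR LF as a
single break is an assumption of the sketch.

```python
import unicodedata


def split_bidi_paragraphs(text):
    """Split text into UBA paragraphs at characters of bidi class B."""
    paragraphs = []
    current = []
    i = 0
    while i < len(text):
        ch = text[i]
        if unicodedata.bidirectional(ch) == "B":
            # Treat CR LF as a single break rather than two.
            if ch == "\r" and text[i + 1 : i + 2] == "\n":
                i += 1
            paragraphs.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    paragraphs.append("".join(current))
    return paragraphs
```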
================================================================================
 #34  fantasai                                        2010-11-04 13:58:08 +0000 
--------------------------------------------------------------------------------
Wrt Gecko, here's a summary of our discussions on the topic:
  http://groups.google.com/group/mozilla.dev.tech.layout/msg/2f14fe783b737cec?

I'll note additionally that the CSSWG resolution to clarify CSS2.1 on this
point was after those discussions. This was Issue 145 here:
  http://wiki.csswg.org/spec/css2.1#issue-145
I would expect any CSSWG Members to have spoken up during those discussions if
the proposed behavior was a problem. :)
================================================================================
 #35  Ian 'Hixie' Hickson                             2010-11-05 01:02:36 +0000 
--------------------------------------------------------------------------------
I'll take that as a "yes".

Ok, I'll make the change described in comment 30.
================================================================================
 #36  contributor@whatwg.org                          2010-11-05 20:07:09 +0000 
--------------------------------------------------------------------------------
Checked in as WHATWG revision r5670.
Check-in comment: Update <br>'s bidi behavior to match WebKit and IE rather
than Gecko and Opera.
http://html5.org/tools/web-apps-tracker?from=5669&to=5670
================================================================================
 #37  Ian 'Hixie' Hickson                             2010-11-05 20:15:34 +0000 
--------------------------------------------------------------------------------
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the tracker issue; or you may create a tracker issue
yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: see diff given above
Rationale: see long discussion above
================================================================================
 #38  Aharon Lanin                                    2010-11-08 12:14:22 +0000 
--------------------------------------------------------------------------------
The change is great. It actually also addresses 10812, although where it uses
the term "newline", the precise terminology would have been (or at least used
to be) "line break", thus covering <CR>, <LF>, and combinations.
================================================================================
 #39  Aharon Lanin                                    2010-12-23 16:20:53 +0000 
--------------------------------------------------------------------------------
The reason that this bug was filed is expressed (by me) in comment 11:

"What IE and Webkit do for <br> is treat it as a bidi paragraph break, despite
the spec saying otherwise. This is not about to change, because what RTL users
expect from <br> is bidi paragraph separation."

Or, as fantasai summarized it in comment 25, "Current definition of <br> is
incompatible with widespread usage and implementation."

Well, it turns out that part of these statements is out of date. As I learned a
few days ago from Simon Montagu, IE did change, to an extent. While IE7 indeed
treated <br> as a bidi paragraph break, even in its standards mode, IE8 treats
it as bidi whitespace (i.e. per the HTML4 spec) - when in *its* standards mode.
IE8 continues to treat <br> as a bidi paragraph break in its quirks mode and
its IE7 compatibility mode.

I guess that this should not be surprising, given that IE's standards mode is
about following standards, and HTML4 is the current standard. However, it did
surprise me, since I thought I had tested this in IE8 before filing the bug. (I
wasn't careful to make sure I was in IE8 standards mode.)

Please note that the other part of the reason for filing this bug remains
unaffected: despite the HTML spec limiting <br> to esoteric uses like poetry
and addresses, most of the time that <br> is used, it is used as the HTML
equivalent of a plain text newline. When used that way in a bidi document, it
does not work as intended unless it is a bidi paragraph break.
================================================================================
 #40  Ian 'Hixie' Hickson                             2011-01-08 22:23:30 +0000 
--------------------------------------------------------------------------------
I'm confused as to why this bug is reopened. The original bug is fixed, no? Is
it reopened to revert the fix so that instead of being compatible with IE and
WebKit, we go back to being compatible with IE and Firefox?
================================================================================
 #41  Shachar Shemesh                                 2011-01-09 03:19:02 +0000 
--------------------------------------------------------------------------------
(In reply to comment #40)
> I'm confused as to why this bug is reopened. The original bug is fixed, no? Is
> it reopened to revert the fix so that instead of being compatible with IE and
> WebKit, we go back to being compatible with IE and Firefox?

From what I understood, the original bug said "please change what HTML4 says
should happen as it is incompatible with both what implementations do and what
happens in real life".

What Aharon is saying is that he was wrong about the first half of the
statement. Upon a re-check, current implementations do follow HTML4 when
working in 'standards' mode. This means, to me, that we should avoid causing
previously standard compliant behavior to suddenly become non-standard.

In other words, I believe this bug should, under these changed circumstances,
be marked "Invalid", and its solution reverted.

I should point out that, as far as I know, Aharon does not share this belief.
He thinks that the bug should still be fixed (i.e., the situation should remain
as it is), but he was honest enough to state that with the different state of
affairs, the discussion should be re-opened.

Shachar
================================================================================
 #42  CE Whitehead                                    2011-01-09 03:44:32 +0000 
--------------------------------------------------------------------------------
(In reply to comment #41)  Hi, will there be a way for a br element to still
sometimes constitute a bidi-paragraph break, although no longer by default?  

Thanks.

Best,

--C. E. Whitehead
cewcathar@hotmail.com 
> (In reply to comment #40)
> > I'm confused as to why this bug is reopened. The original bug is fixed, no? Is
> > it reopened to revert the fix so that instead of being compatible with IE and
> > WebKit, we go back to being compatible with IE and Firefox?
> From what I understood, the original bug said "please change what HTML4 says
> should happen as it is incompatible with both what implementations do and what
> happens in real life".
> What Aharon is saying is that he was wrong about the first half of the
> statement. Upon a re-check, current implementations do follow HTML4 when
> working in 'standards' mode. This means, to me, that we should avoid causing
> previously standard compliant behavior to suddenly become non-standard.
> In other words, I believe this bug should, under these changed circumstances,
> be marked "Invalid", and its solution reverted.
> I should point out that, as far as I know, Aharon does not share this belief.
> He thinks that the bug should still be fixed (i.e., the situation should remain
> as it is), but he was honest enough to state that with the different state of
> affairs, the discussion should be re-opened.
> Shachar
================================================================================
 #43  Shachar Shemesh                                 2011-01-09 04:10:32 +0000 
--------------------------------------------------------------------------------
(In reply to comment #42)
> (In reply to comment #41)  Hi, will there be a way for a br element to still
> sometimes constitute a bidi-paragraph break, although no longer by default?  
> 

To me, that seems broken. The whole point behind bidi break on <br> was to make
pages that would not consider BiDi "do the right thing". If you have a
non-default option, then the pages that would not consider BiDi still wouldn't,
and the pages that do can use <p>. The only real use I see for <br> as an
optional BiDi break is for applications such as greasemonkey, where the user
adds our hypothetical CSS (or whatever) to the <br> in order to fix a broken
page.

Shachar
================================================================================
 #44  Aharon Lanin                                    2011-01-09 08:36:27 +0000 
--------------------------------------------------------------------------------
I reopened the bug because I was the one who opened it, and it turns out that
part of the information based on which I opened it was incorrect. This deserved
to be brought to your attention, and let the chips fall where they may.

In my opinion things should stay as they currently are, i.e. with <br> defined
as a bidi paragraph break. As always, I admit that this is theoretically
inconsistent with the recommended use of <br>, for things like poetry and
addresses. However:

1. This is consistent with the way <br> is actually used, which is as the HTML
equivalent of a plain-text line break, which is most often actually a paragraph
break. Authors and applications that use <br> that way, e.g. when taking user
input into a contentEditable element, do so because it is a lot more convenient
to use than <p> or <div>, and (besides the bidi aspect, which most often is at
best a secondary concern) it works regardless of how the spec says <br> should
be used.

2. The bidi mis-ordering that is caused by the treatment as whitespace of a
line break that the author meant as a paragraph break is far worse than the
mis-ordering in the opposite case.
================================================================================
 #45  Shachar Shemesh                                 2011-01-09 08:46:43 +0000 
--------------------------------------------------------------------------------
(In reply to comment #44)
> 2. The bidi mis-ordering that is caused by the treatment as whitespace of a
> line break that the author meant as a paragraph break is far worse than the
> mis-ordering in the opposite case.

I should point out that this point is arguable. Also, the misordering done by
treating <br> as it should is fixable by placing an RLM/LRM (depending on the
desired paragraph direction) before and after the <br>, whereas the misordering
as a result of treating <br> as a paragraph break is not fixable at all.

Shachar
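
A minimal sketch of the RLM/LRM workaround described above, assuming the markup
is available as a string and that a regular expression is good enough to find
the <br> tags (a real tool would use an HTML parser). The function name
wrap_br_with_marks and its paragraph_dir parameter are hypothetical.

```python
import re

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK

# Matches <br>, <br/>, <br /> and <br ...attributes...>, case-insensitively.
# Assumption: no ">" appears inside an attribute value.
BR_TAG = re.compile(r"<br\b[^>]*>", re.IGNORECASE)


def wrap_br_with_marks(html: str, paragraph_dir: str) -> str:
    """Place a directional mark before and after every <br> in html.

    paragraph_dir is the desired paragraph direction, "ltr" or "rtl".
    """
    if paragraph_dir not in ("ltr", "rtl"):
        raise ValueError("paragraph_dir must be 'ltr' or 'rtl'")
    mark = RLM if paragraph_dir == "rtl" else LRM
    return BR_TAG.sub(lambda m: f"{mark}{m.group(0)}{mark}", html)


if __name__ == "__main__":
    # repr() makes the otherwise invisible marks visible in the output.
    print(repr(wrap_br_with_marks("first line<br>second line", "rtl")))
```

The marks only help when <br> is treated as bidi whitespace; they give the
text on each side of the break a strong character of the paragraph direction
next to it.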
================================================================================
 #46  Adil                                            2011-01-10 11:57:16 +0000 
--------------------------------------------------------------------------------
(In reply to comment #44)
> I reopened the bug because I was the one who opened it, and it turns that part
> of the information based on which I opened it was incorrect. This deserved to
> be brought to your attention, and let the chips fall where they may.
> 

Talking of falling chips - a strong determining factor should be what Microsoft
says. Has any attempt been made to contact them? I have contacts within their
bidi group and I can ask if you need.
================================================================================
 #47  Aharon Lanin                                    2011-01-10 14:46:35 +0000 
--------------------------------------------------------------------------------
(In reply to comment #46)
> (In reply to comment #44)
> > I reopened the bug because I was the one who opened it, and it turns that part
> > of the information based on which I opened it was incorrect. This deserved to
> > be brought to your attention, and let the chips fall where they may.
> > 
> 
> Talking of falling chips - a strong determining factor should be what Microsoft
> says. Has any attempt been made to contact them? I have contacts within their
> bidi group and I can ask if you need.

A good idea. I have only recently made contact with them, and have not asked
about this specifically. I will do so now, but you might as well too.
================================================================================
 #48  Ian 'Hixie' Hickson                             2011-01-13 19:40:48 +0000 
--------------------------------------------------------------------------------
I wouldn't be above making this a quirks vs standards thing, if it meant we
could make <br> work right... but that might be tilting at windmills.

Microsoft people: some of the earlier comments request your input.
================================================================================
 #49  Sam Ruby                                        2011-01-17 21:54:38 +0000 
--------------------------------------------------------------------------------
Reminder: - Jan 22, 2010 is the cutoff for escalating bugs for pre-LC
consideration - all issues in tracker, calls for proposal issued by this date.
Consequences of missing this date: any further escalations will be treated as a
Last Call comment.
================================================================================
 #50  Ian 'Hixie' Hickson                             2011-02-15 01:18:57 +0000 
--------------------------------------------------------------------------------
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are
satisfied with this response, please change the state of this bug to CLOSED. If
you have additional information and would like the editor to reconsider, please
reopen this bug. If you would like to escalate the issue to the full HTML
Working Group, please add the TrackerRequest keyword to this bug, and suggest
title and text for the tracker issue; or you may create a tracker issue
yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: 

Well as much as I want to change this, realistically it seems that
compatibility with IE quirks mode for <br> is going to be more important than
compatibility with its standards mode, and I doubt Microsoft are willing to
change their quirks mode.

So I guess this gets left as is, unless any of the browsers are willing to
actually change the spec to the more sensible model and compatibility be
damned. In particular, if WebKit is willing to change to match what the spec
used to say (that BR doesn't reset the bidi paragraph level) then that would be
a compelling argument to change the spec here.
================================================================================
 #51  Adrian Bateman [MSFT]                           2011-02-15 01:32:09 +0000 
--------------------------------------------------------------------------------
Sorry for the delay in reviewing this - I had to track down the correct people.

We think IE standards mode is the correct behaviour. <br> is intended to be a
line break and not a paragraph break. <p> is for paragraphs. The spec says "br
elements must not be used for separating thematic groups in a paragraph" and
further says that this is an abuse of the <br> element.

<br> should mean a line break within a paragraph and not be treated as a
paragraph break. We don't currently plan to change this behaviour in IE9.
================================================================================
 #52  CE Whitehead                                    2011-02-15 21:35:49 +0000 
--------------------------------------------------------------------------------
(In reply to comment #51)
> Sorry for the delay in reviewing this - I had to track down the correct people.
> We think IE standards mode is the correct behaviour. <br> is intended to be a
> line break and not a paragraph break. <p> is for paragraphs. The spec says "br
> elements must not be used for separating thematic groups in a paragraph" and
> further says that this is an abuse of the <br> element.
> <br> should mean a line break within a paragraph and not be treated as a
> paragraph break. We don't currently plan to change this behaviour in IE9.

Hi.  This is o.k. for me I guess.  I do wonder, however, would it be worthwhile
to have a non-default option for break [br] in css that behaved as a paragraph
break rather than as simply a line break?  (The problem with [p] [/p] for some
coders is it's considered best to close it and it is declared at the beginning
of the line not the end; however for poetry [br] only makes sense as a line and
not a paragraph break.)

Best,

--C. E. Whitehead
cewcathar@hotmail.com
================================================================================
 #54  Ian 'Hixie' Hickson                             2011-02-25 10:09:45 +0000 
--------------------------------------------------------------------------------
(In reply to comment #51)
> We think IE standards mode is the correct behaviour.

Would you change IE's quirks modes to the same behaviour?

The question is not really what the right behaviour is _in theory_, but what
behaviour browsers should apply to existing Web content. If you would not
change your quirks mode behaviour, then that is a pretty strong signal that you
think that what browsers need to do in practice is what IE's quirks modes do.
================================================================================
 #55  Adrian Bateman [MSFT]                           2011-02-25 14:20:15 +0000 
--------------------------------------------------------------------------------
No, we won't change IE's quirks mode. Quirks mode is supposed to be quirky -
not changing it is a pretty strong signal that we don't want pages written 10+
years ago to start looking different. On the other hand we don't think changing
standards mode is the right thing to do. Not changing standards mode is a
pretty strong signal that we think what standards mode does is the right
behaviour.
================================================================================
 #56  Anne                                            2011-02-25 14:50:46 +0000 
--------------------------------------------------------------------------------
Opera would like to minimize the differences between the various modes. And
although I cannot speak for other non-Microsoft vendors I believe they feel the
same. Introducing new quirks is not nice.
================================================================================
 #57  Adrian Bateman [MSFT]                           2011-02-25 16:14:00 +0000 
--------------------------------------------------------------------------------
Opera, Firefox and IE standards mode all have the same behaviour.
================================================================================
 #58  Boris Zbarsky                                   2011-02-25 17:07:26 +0000 
--------------------------------------------------------------------------------
For what it's worth, last I checked we were strongly considering changing the
Gecko behavior.  We just hadn't gotten to it yet.
================================================================================
 #59  fantasai                                        2011-03-23 02:40:47 +0000 
--------------------------------------------------------------------------------
Here's the implementation data from smontagu's tests:

Impl      <BR>     <PRE> CR/LF
===============================
IE7        PS          PS
IE8        LS          LS
IE9        LS          PS
Chrome9   PS/LS       PS/LS
Safari5   PS/LS       PS/LS
FF3.6      LS          LS
Opera11    LS          LS

WebKit's behavior is really weird. Whether the break is LS or PS seems to
depend on what type of content is near the break: if there is an *embedded*
element after the <br>, it's treated as LS (the RTL effect passes through the
<br>).

To summarize, the ideal behavior would be IE9's, i.e.
Ideal      LS          PS
The safest behavior is probably IE7's,
Safe       PS          PS
================================================================================
 #60  CE Whitehead                                    2011-03-24 16:41:31 +0000 
--------------------------------------------------------------------------------
(In reply to comment #59)
> Here's the implementation data from smontagu's tests:
> Impl      <BR>     <PRE> CR/LF
> ===============================
> IE7        PS          PS
> IE8        LS          LS
> IE9        LS          PS
> Chrome9   PS/LS       PS/LS
> Safari5   PS/LS       PS/LS
> FF3.6      LS          LS
> Opera11    LS          LS
> WebKit's behavior is really weird. Whether the break is LS or PS seems to
> depend on what type of content is near the break: if there is an *embedded*
> element after the <br>, it's treated as LS (the RTL effect passes through the
> <br>).
> To summarize, the ideal behavior would be IE9's, i.e.
> Ideal      LS          PS

O.k; under this p is the only way to get paragraph breaks; this does solve the
use case described in comment 18 (I assume br clear="all" which is for images
would at least force a hard break).
> The safest behavior is probably IE7's,
> Safe       PS          PS

I thought this behavior however was to be relegated to quirks mode only,
and that people who wanted to use break as a paragraph separator would have to
be in quirks mode from now on.  But I still have my question about br
clear="all"  (but I drop my request to have any other hard bidi break as you
all are right; people who use it would know to use the p element).

Best,

--C. E. Whitehead
cewcathar@hotmail.com


================================================================================
 #61  Levi Weintraub                                  2011-04-19 23:18:23 +0000 
--------------------------------------------------------------------------------
In tip of tree WebKit, we now treat br as a paragraph separator that clears all
state from Unicode control characters, but has no effect on state from
style/DOM (like dir=rtl).
================================================================================
 #62  Levi Weintraub                                  2011-04-19 23:19:17 +0000 
--------------------------------------------------------------------------------
In tip of tree WebKit, we now treat br as a paragraph separator that clears all
state from Unicode control characters, but has no effect on state from
style/DOM (like dir=rtl). This is not tied to quirks mode.
================================================================================


Received on Tuesday, 25 September 2012 22:03:51 UTC