[Bug 10828] i18n comment 4 : new attribute: bidibreak


--- Comment #6 from Aharon Lanin <aharon.lists.lanin@gmail.com> 2010-10-11 08:17:22 UTC ---
(In reply to comment #4)
> (In reply to comment #3)
> > (In reply to comment #1)
> > > If this really needs to be expressed in markup, perhaps a new element would be
> > > better.
> > > 
> > > In particular, having a markup attribute that doesn't correspond to a CSS
> > > property but still inherits and affects rendering of other elements is an
> > > unusual pattern and would be awkward to implement.
> > > 
> > > Is there a Unicode character that creates a line break but has Unicode class WS
> > > instead of B? If so, that would make it easier to define what happens for the
> > > proposed "soft" line breaks.
> > 
> > The equivalent "soft" line break Unicode character is LINE SEPARATOR, U+2028.
> > 
> > Regarding doing this through a new element, it would get the job done, but I
> > have been warned that new elements are problematic in terms of support from
> > existing software (e.g. how would an existing browser know that the new element
> > does not need a closing tag?) and generally very hard to get in.
> New global attributes are also hard to get in. And in this case, I think an
> inheriting global attribute is not as clean an approach.
> Question: does including U+2028, either as a literal unicode character or as a
> numeric character reference, get the job done? Or does that character get
> affected by whitespace collapsing?

As far as I am concerned, either bidibreak or a new element is fine, and I
would prefer to leave the choice up to the experts here.

Regarding LINE SEPARATOR, I guess what Maciej is proposing is a change in the
spec that explicitly says that it is to be treated as a (bidi-soft) line break
in all contexts and is not subject to whitespace collapsing. If so, the
PARAGRAPH SEPARATOR (U+2029) should be treated similarly: a bidi-hard line
break that is not subject to whitespace collapsing, i.e. exactly the same
effect as <br>. That's because these two characters are a pair introduced into
Unicode at the same time for the same reason: to provide unambiguous
alternatives to newline (and the other line break characters).
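
To make the proposed treatment concrete, here is a minimal Python sketch. It
is an illustration only, not spec text or any browser's behavior. The bidi
classes come from the Unicode Character Database via the standard unicodedata
module. The function names, the set of collapsible characters (space, tab, LF,
FF, CR), and the "soft"/"hard" labels are assumptions made for this example.

```python
import re
import unicodedata
from typing import List, Tuple

LINE_SEPARATOR = "\u2028"       # proposed: bidi-soft line break
PARAGRAPH_SEPARATOR = "\u2029"  # proposed: bidi-hard line break, like <br>


def bidi_class(ch: str) -> str:
    """Return the Unicode bidi class of a character (e.g. 'WS' or 'B')."""
    return unicodedata.bidirectional(ch)


# Assumption for this sketch: only these five characters collapse.
# U+2028 and U+2029 are deliberately left out of the set.
_COLLAPSIBLE = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Collapse runs of ordinary whitespace to one space, keeping separators."""
    return _COLLAPSIBLE.sub(" ", text)


def split_lines(text: str) -> List[Tuple[str, str]]:
    """Split text into (line, break_kind) pairs after collapsing.

    break_kind is 'soft' for U+2028, 'hard' for U+2029, and 'end' for the
    final line, which has no break after it.
    """
    parts = re.split("([\u2028\u2029])", collapse_whitespace(text))
    kinds = {LINE_SEPARATOR: "soft", PARAGRAPH_SEPARATOR: "hard"}
    lines = []
    # re.split with a capturing group alternates text and separators.
    for i in range(0, len(parts), 2):
        sep = parts[i + 1] if i + 1 < len(parts) else ""
        lines.append((parts[i], kinds.get(sep, "end")))
    return lines
```

Here bidi_class(LINE_SEPARATOR) gives 'WS' and bidi_class(PARAGRAPH_SEPARATOR)
gives 'B', which is the soft/hard distinction discussed above. A plain newline
is also class 'B', but in this sketch it collapses like any other whitespace.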

Such a solution would also be fine with me (as long as the <br> spec is changed
to make it bidi-hard - or the browser manufacturers achieve a unanimous
commitment to treat it as bidi-soft).

However, please note that http://unicode.org/reports/tr20/#Line currently says
the following about U+2028 and U+2029:

"Problems when used in markup: Including these characters in markup text does
not work where it would duplicate the existing markup commands for delimiting
paragraphs and lines."

It is up to the HTML experts here to judge whether starting to support these
characters in HTML contexts where appropriate mark-up can be used instead would
be in keeping with the spirit of HTML, given that apparently this was not
considered to be the case at some point in the past.

Received on Monday, 11 October 2010 08:17:25 UTC