
Re: [css-text] I18N-ISSUE-316: Line breaking defaults

From: Richard Ishida <ishida@w3.org>
Date: Wed, 22 Oct 2014 17:51:02 +0100
Message-ID: <5447E076.3050505@w3.org>
To: fantasai <fantasai.lists@inkedblade.net>, Koji Ishii <kojiishi@gluesoft.co.jp>
CC: "Phillips, Addison" <addison@lab126.com>, "CSS WWW Style (www-style@w3.org)" <www-style@w3.org>, www International <www-international@w3.org>
The i18n WG discussed this at 
http://www.w3.org/2014/10/16-i18n-minutes.html#item06 and concluded that 
there should be normative wording to say that UAX14 SHOULD be followed 
except in those specific cases where issues arise (we don't think there 
are many besides the kana characters, and mostly it's a question of 
encouraging those who haven't implemented UAX14 for a given set of 
characters to catch up with those browsers that do).  See the test results 
below for details.

ri


On 08/10/2014 20:07, Richard Ishida wrote:
> On 25/07/2014 19:52, fantasai wrote:
>> On 07/25/2014 07:22 PM, Richard Ishida wrote:
>>> On 25/05/2014 06:28, Koji Ishii wrote:
>>>> I’m very happy to hear feedback where existing implementations do
>>>> differently from UAX#14, so that we could examine each
>>>> issue and decide whether or how to fix them.
>>>
>>> That information is available as follows:
>>>
>>> For general characters:
>>>
>>> Line break, BA: Break after characters
>>> http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba#ba_space
>>>
>>>
>>> (good support on the whole, but some categories not or half-heartedly
>>> supported by Firefox and IE - seems like just a question
>>> of adding them to a list somewhere)
>>> [...]
>>> Hope that helps,
>>
>> Very nice summary, yes. :)
>>
>> One of the main problems is actually the handling of various punctuation
>> like slashes. A lot of these breaks need some amount of prioritization
>> in order to work correctly. See, for example, this bug:
>>
>>    https://bugzilla.mozilla.org/show_bug.cgi?id=389710
>>
>> We do normatively require the behavior defined for the following
>> categories:
>>    BK, CR, LF, CM, NL, SG, WJ, ZW, GL, CJ
>>
>> I think I'd be OK to include the restrictions for opening and closing
>> punctuation... however, since there are very real problems with simply
>> adopting the UAX14 pairs table, I don't want to normatively require
>> its implementation.
>>
>> As Koji says, adopting UAX14 wholesale would require a very detailed
>> review of UAX14, its compatibility with dumb line-breaking algorithms
>> like a pairs table without prioritization, and Web-compatibility. And
>> that is not a task we'd like to tackle right now.
>
> But what we're asking for is that the spec recommend that UAX14 be used
> as the default, ie. in lieu of any other considerations - we're not
> asking that browsers conform to it rigidly.  (We are also suggesting,
> remember, that there be clear indication that tailoring is needed for
> certain characters in certain scripts.)
>
> Falling back to UAX14 as a default would at least (a) prompt
> implementers to consider improving conformance to UAX14 for cases that
> are currently just ignored and not actually controversial, (b) provide
> some kind of consistency and predictability (ie. interop) going forward
> for characters that are not (yet) problematic, and possibly (c) prompt
> people to request special behaviour for particular characters in
> particular scripts - at least they'd be starting from a common base.
>
> For example, as I point out in my summary [1], there is currently
> a lack of interop in a subset of the 140-odd characters just at [2] and
> [3] that is unlikely to be controversial but which will be currently
> affecting support for content in a number of languages.  It would be
> good to ensure that Firefox and IE do the same as Chrome, Safari and
> Opera for these cases.
>
> At least while we wait for people to tell us that something different is
> needed for a given character we would see a consistent handling of that
> character.
>
> ri
>
>
>
> [1]
> http://lists.w3.org/Archives/Public/www-international/2014JulSep/0073.html
>
> [2]
> http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba
>
>
> [3]
> http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-gl-wj#gl
>
>
>
Received on Wednesday, 22 October 2014 16:51:34 UTC
