Re: [css-text] I18N-ISSUE-316: Line breaking defaults

On 25/07/2014 19:52, fantasai wrote:
> On 07/25/2014 07:22 PM, Richard Ishida wrote:
>> On 25/05/2014 06:28, Koji Ishii wrote:
>>> I’m very happy to hear feedback where existing implementations do
>>> differently from UAX#14, so that we could examine each
>>> issue and decide whether or how to fix them.
>>
>> That information is available as follows:
>>
>> For general characters:
>>
>> Line break, BA: Break after characters
>> http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba#ba_space
>>
>> (good support on the whole, but some categories not or half-heartedly
>> supported by Firefox and IE - seems like just a question
>> of adding them to a list somewhere)
>> [...]
>> Hope that helps,
>
> Very nice summary, yes. :)
>
> One of the main problems is actually the handling of various punctuation
> like slashes. A lot of these breaks need some amount of prioritization in
> order to work correctly. See, for example, this bug:
>
>    https://bugzilla.mozilla.org/show_bug.cgi?id=389710
>
> We do normatively require the behavior defined for the following categories:
>    BK, CR, LF, CM, NL, SG, WJ, ZW, GL, CJ
>
> I think I'd be OK to include the restrictions for opening and closing
> punctuation... however, since there are very real problems with simply
> adopting the UAX14 pairs table, I don't want to normatively require
> its implementation.
>
> As Koji says, adopting UAX14 wholesale would require a very detailed
> review of UAX14, its compatibility with dumb line-breaking algorithms
> like a pairs table without prioritization, and Web-compatibility. And
> that is not a task we'd like to tackle right now.

But what we're asking for is that the spec recommend that UAX14 be used
as the default, i.e. in the absence of any other considerations - we're
not asking that browsers conform to it rigidly.  (We are also
suggesting, remember, that there be a clear indication that tailoring is
needed for certain characters in certain scripts.)

Falling back to UAX14 as a default would at least (a) prompt
implementers to consider improving conformance to UAX14 for cases that
are currently just ignored and not actually controversial, (b) provide
some kind of consistency and predictability (i.e. interop) going forward
for characters that are not (yet) problematic, and possibly (c) prompt
people to request special behaviour for particular characters in
particular scripts - at least they'd be starting from a common base.
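
To make the "common base plus tailoring" idea concrete, here is a very
rough sketch in Python. It is purely illustrative: the table is a tiny
hand-picked stand-in for the UAX14 class data, unlisted characters are
simply treated as AL, and the names DEFAULT_CLASSES, break_class and
example_tailoring are hypothetical, not taken from the spec or from any
browser.

```python
# Illustrative only: a tiny stand-in for the UAX#14 line-break class data.
# DEFAULT_CLASSES and break_class are hypothetical names for this sketch.
DEFAULT_CLASSES = {
    "\u0020": "SP",  # SPACE
    "\u00A0": "GL",  # NO-BREAK SPACE
    "\u2010": "BA",  # HYPHEN
    "\u200B": "ZW",  # ZERO WIDTH SPACE
    "\u2060": "WJ",  # WORD JOINER
}


def break_class(ch, tailoring=None):
    """Return the line-break class for ch.

    A tailored value wins if one is supplied; otherwise the common
    default is used.
    """
    if tailoring and ch in tailoring:
        return tailoring[ch]
    # Simplification for this sketch: anything not listed is treated as AL.
    return DEFAULT_CLASSES.get(ch, "AL")


# A made-up tailoring that treats HYPHEN as non-breaking (GL). The
# specific override is invented purely for illustration.
example_tailoring = {"\u2010": "GL"}

print(break_class("\u2010"))                     # default class
print(break_class("\u2010", example_tailoring))  # tailored class
```

The point is only that every character gets a predictable answer from
the shared default, and a script-specific request changes individual
entries rather than replacing the whole table.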

For example, as I point out in my summary [1], there is currently a
lack of interop for a subset of the 140-odd characters at [2] and [3]
alone. That subset is unlikely to be controversial, but the lack of
interop is presumably affecting support for content in a number of
languages.  It would be good to ensure that Firefox and IE do the same
as Chrome, Safari and Opera for these cases.

At least while we wait for people to tell us that something different is
needed for a given character, we would see consistent handling of that
character.

ri



[1] 
http://lists.w3.org/Archives/Public/www-international/2014JulSep/0073.html

[2] 
http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-ba

[3] 
http://www.w3.org/International/tests/repository/css3-text/line-break-baspglwj/results-gl-wj#gl

Received on Wednesday, 8 October 2014 19:07:56 UTC