Re: [css-text] I18N-ISSUE-316: Line breaking defaults

On 10/22/2014 3:12 PM, fantasai wrote:
> On 10/22/2014 12:51 PM, Richard Ishida wrote:
>> The i18n WG discussed this at
>> http://www.w3.org/2014/10/16-i18n-minutes.html#item06 and concluded
>> that there should be normative wording to say that UAX14 SHOULD be
>> followed except in those specific cases where issues arise (we don't
>> think there are many besides the kana characters, and mostly it's a
>> question of encouraging those who haven't implemented UAX14 for a
>> given set of characters to catch up with those browsers that do). See
>> the test results below for details.
>
> It's not an issue of kana characters, those are actually
> normatively covered in the section on 'line-break'.
>
> It's also not an issue of the non-tailorable sets you are
> citing, since those are already normatively required also.
>
> If you're asking about the BA category, in order to safely
> make a normative requirement, I need it split into two sets:

BA Category 1
> - characters after which a break is always permissible
>     and recommended, such as the visible word separators

BA Category 2
> - characters after which a break is sometimes a good
>     idea but not always, such as hyphens and slashes

Are there any other members of Category 2?

Is the issue "generic" to all kinds of hyphens and slashes,
or is it "specific" to special strings like dates, path names
or identifiers?

If it's the latter, then the proper approach would be to
decide which of the following you want:

a) a normative default for breaking path names and similar
     identifiers that differs from the normative default line
     breaking of ordinary text (but then you'd have to specify
     that in detail), or
b) a normative reservation of the ability for UAs to do "better"
     for those kinds of strings (leaving it up to the UAs to
     recognize and handle them).
>
> I will not issue a normative recommendation to honor BA
> behavior of the second category. This will result in bad
> line-breaking when implementations try to comply without
> performing a thoughtful survey of each individual case
> and what contextual information the line break may need
> to consider. Please note that this is not a theoretical
> concern: we have already run into this exact problem.
>

I suspect that the issue is more about substrings that represent
some special context than about the generic occurrence of
these characters in running text.

Because of that, I suggest the proper approach would be to
specify that UAs should be allowed (encouraged?) to recognize
patterns like date strings and to apply specific line breaking
logic to them (treat them as embedded objects with their
own rules, in other words).
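
To make that concrete, here is a rough sketch (in Python, purely
for illustration) of what "treat them as embedded objects" could
look like. Everything in it is an assumption of mine: the date
pattern, the small set of break-after characters, and the function
name are hypothetical, and the generic rule is a drastic
simplification, not UAX#14 itself.

```python
import re

# Hypothetical pattern for numeric dates such as 2014-10-22 or 10/22/2014.
# A real UA would need a far more careful recognizer.
DATE_RE = re.compile(r"\b\d{1,4}[-/]\d{1,2}[-/]\d{1,4}\b")

# Illustrative subset of characters after which the generic default
# would offer a break opportunity (not the full UAX#14 data).
BREAK_AFTER = {"-", "/", "\u2010"}


def break_opportunities(text):
    """Return indices i where a break is allowed between text[i-1] and text[i]."""
    # Positions strictly inside a recognized date are protected, so the
    # date behaves like a single unbreakable embedded object.
    protected = set()
    for match in DATE_RE.finditer(text):
        protected.update(range(match.start() + 1, match.end()))

    result = []
    for i in range(1, len(text)):
        if i in protected:
            continue
        if text[i] == " ":
            continue  # never offer a break directly before a space
        prev = text[i - 1]
        if prev == " " or prev in BREAK_AFTER:
            result.append(i)
    return result


print(break_opportunities("due 2014-10-22 or and/or"))
print(break_opportunities("well-known"))
```

The point is only that the generic default stays untouched for
running text ("well-known", "and/or"), while the recognized
substring gets its own rule.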

If I understood the discussion to this point correctly, the use
of UAX#14 was intended as a common default, not as a
limit to what UAs could do to provide more sophisticated
line breaking.

A./


> ~fantasai
>
>

Received on Wednesday, 22 October 2014 22:50:05 UTC