[csswg-drafts] [css-text-4] Don't provide a language parameter for word-boundary-detection (#7193)

r12a has just created a new issue for https://github.com/w3c/csswg-drafts:

== [css-text-4] Don't provide a language parameter for word-boundary-detection ==
2.2.1. Detecting Word Boundaries: the [word-boundary-detection](https://www.w3.org/TR/css-text-4/#propdef-word-boundary-detection) property

>auto([<lang>](https://www.w3.org/TR/css-text-4/#typedef-word-boundary-detection-lang))
>    This value directs the user agent to perform language-specific content analysis to determine where to insert [virtual word boundaries](https://www.w3.org/TR/css-text-4/#virtual-word-boundary).
>
> [`<lang>`](https://www.w3.org/TR/css-text-4/#typedef-word-boundary-detection-lang) must be a valid CSS [`<ident>`](https://www.w3.org/TR/css-values-4/#typedef-ident) or [`<string>`](https://www.w3.org/TR/css3-values/#string-value). It represents an IETF BCP 47 language range (see [[BCP47]](https://www.w3.org/TR/css-text-4/#biblio-bcp47)). If the UA does not support word-boundary detection for all languages represented by the specified range, that specified value is invalid (and will cause the declaration to be ignored).

Fantasai provided some additional background on this feature, which explains why a range of language tags is used:

> I think the idea of this parameter was that it enables word-boundary-detection for the specified language(s) only; other languages are not affected. What language the element is actually in is what determines the language used for the word-boundary-detection. For example, if I have a trilingual document in English, Japanese, and French, if I set word-boundary-detection: auto(ja) then it will enable detection for Japanese paragraphs only. If there are no Japanese paragraphs, it won't have any effect.
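
As a rough illustration of the behaviour described in that explanation, here is a minimal sketch using the `auto(<lang>)` syntax from the spec text quoted above. It assumes the trilingual document marks up its content with `lang="en"`, `lang="ja"` and `lang="fr"`; the `body` selector is only an example.

```css
/* Applied to the whole document, but (per the explanation above)
   only content whose language matches the range "ja" is affected. */
body {
  word-boundary-detection: auto(ja);
}

/* Expected effect, per the explanation above:
   - paragraphs with lang="ja": virtual word boundaries are detected
   - paragraphs with lang="en" or lang="fr": no change
   - per the quoted spec text, a UA without Japanese support treats
     the value as invalid and ignores the declaration */
```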

The i18n WG thinks that the choice of whether or not to apply the word boundary detection algorithm should be set by applying the word-boundary-detection styling to the relevant content. The language information used should be that provided by the `lang` attribute, and not supplied as a parameter with this property value.
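
A sketch of what this could look like for authors follows. Note that the parameterless `auto` value is hypothetical: it is not in the current draft and is used here only to illustrate the proposal. The `:lang()` selector is the existing Selectors pseudo-class, and the choice of Japanese and Thai is just an example.

```css
/* HYPOTHETICAL value: `auto` with no language parameter is an assumption
   made for illustration; the current draft defines auto(<lang>). */

/* The author chooses which content gets boundary detection by
   selecting it; the UA takes the language from the lang attribute. */
:lang(ja),
:lang(th) {
  word-boundary-detection: auto;
}
```

Under this approach the author does not need to know which language ranges the user agent supports in order to write a valid declaration.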

We don't think content authors can know which languages a given user agent supports, so it doesn't seem useful to make them specify the language in the property value. We also think that the approach currently described in the spec requires content authors to understand language tagging at a level that is too high (see the examples about Cantonese).

We think that there should also be a simple recommendation that user agents SHOULD NOT apply a boundary detection algorithm to text in a language for which the algorithm is not defined (modulo decisions with respect to dialect support).

Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/7193 using your GitHub account


Received on Friday, 1 April 2022 14:50:19 UTC