Re: [css3-text] script-specific functionality

On 04/06/2011 01:30 PM, Håkon Wium Lie wrote:
> In today's telcon I took an action point to suggest wording for an
> issue to be added to css3-text:
>
>    http://dev.w3.org/csswg/css3-text/
>    http://www.w3.org/TR/css3-text/
>
> I suggest we add this note:
>
>    This draft describes features that are specific to certain scripts.
>    There is an ongoing discussion about where these features belong: in
>    existing CSS properties, in new CSS properties, or perhaps in other
>    specifications.

Added.

> To explain why this is necessary, and to continue the ongoing
> discussion, I will make some remarks about the two new keyword values
> on 'text-transform': "fullwidth" and "fullsize-kana". They expose
> several issues:
>
>    - 'text-transform: fullsize-kana' only applies to one script (Kana),
>      and (seemingly only) to a certain context (ruby). The scope of
>      this value is vastly different from 'uppercase' and 'lowercase'.

I wouldn't say vastly, but yes it is different in scope.

   kana < bicameral < all scripts

There is a similar jump in magnitude between each pair.

>    - it's unclear what UAs that do not support the script in question
>      should do when encountering these values. While it may be
>      self-evident that UAs ignore 'fullsize-kana' when they don't
>      support/display kana, it's not so clear what should happen when
>      "text-transform: fullwidth" is set. Does 'fullwidth' refer to the
>      U+FF00-FFEF block? Should a UA look for fullwidth characters in
>      the font? Or try to synthesize? How wide is a 'full width' when
>      synthesizing?
>
>        http://en.wikipedia.org/wiki/Halfwidth_and_Fullwidth_Forms#In_Unicode

The behavior here is very clear. The process is a Unicode code point
transformation as dictated by the definitions in UAX11 and UAX44.
Once that transformation takes place, the rendering behavior is
*exactly* the same as if those characters were in the original
document source.
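
To make the code point transformation concrete, here is a minimal sketch in Python. It assumes the mapping can be recovered by inverting the <wide> compatibility decompositions that Python's unicodedata module exposes, and that characters without a fullwidth form are left unchanged. The names build_fullwidth_map and to_fullwidth are made up for illustration; halfwidth (<narrow>) forms are not handled here.

```python
import unicodedata


def build_fullwidth_map():
    """Invert the <wide> compatibility decompositions in the Unicode data."""
    mapping = {}
    # Halfwidth and Fullwidth Forms block, U+FF00..U+FFEF.
    for cp in range(0xFF00, 0xFFF0):
        decomp = unicodedata.decomposition(chr(cp))
        if decomp.startswith("<wide> "):
            parts = decomp.split()[1:]
            if len(parts) == 1:
                # Key: the ordinary code point; value: its fullwidth form.
                mapping[int(parts[0], 16)] = cp
    # U+3000 IDEOGRAPHIC SPACE lies outside the block but decomposes
    # to <wide> 0020, so it is added by hand.
    mapping[0x20] = 0x3000
    return mapping


FULLWIDTH_MAP = build_fullwidth_map()


def to_fullwidth(text):
    """Replace each character that has a fullwidth form (assumption:
    characters without one pass through untouched)."""
    return text.translate(FULLWIDTH_MAP)
```

Once to_fullwidth has run, nothing downstream needs to know a transform took place: the result is an ordinary string of U+FFxx code points, which is the point made above about rendering.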

>    - other languages may have other notions of text-transform. For
>      example, Wikipedia notes that:
>
>        Also similar to case is recent usage in Georgian, where some
>        authors use isolated letters from the Asomtavruli alphabet
>        within a text otherwise written in Mkhedruli in a fashion that
>        is reminiscent of modern usage of letter case in the Latin,
>        Greek, and Cyrillic alphabets.
>
>        http://en.wikipedia.org/wiki/Letter_case
>
>      Should we add the "mkhedruli-to-asomtavruli" value? Perhaps. Or
>      not. Do we discriminate against authors in Georgia if we don't?
>      Perhaps. Or not. We used to have these discussions for
>      'list-style-type': what level of use warrants a new keyword value?
>      Then we discovered that we could add a mechanism so that authors
>      could define their own mappings:
>
>        http://www.w3.org/TR/2007/WD-css3-gcpm-20070504/#named2
>        http://dev.w3.org/csswg/css3-lists/#counter-style

Yes, but even though we defined a mapping mechanism, we still have
predefined keywords for the more common (or perhaps, "less obscure")
cases. If text transformation becomes a list-style-type case study,
where we have many scripts requesting transform maps, then we can
look into creating a generic system and redefine the existing
keywords in terms of that system. But I don't think we're there yet
(personally I doubt we'll get there), and if/when we get there, we
can underlay such a system, just as we did with list-style-type.

>    - the 'fullwidth' value doesn't feel like a text transformation to
>      me. It's more like a 'font-family: monospace' no? Or something
>      else.

It's not really like "font-family: monospace". It looks a bit like that, but
   a) there is no change in font (ideally; this depends on glyph availability)
   b) there is a change in line breaking behavior associated with the
      code point transformation
   c) there is a change in text orientation behavior associated with
      the code point transformation

(Basically, the fullwidth characters, in addition to taking up the space of
an ideographic character, also behave like ideographs rather than Latin
characters in terms of their layout behavior.)
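
One way to see that the code points themselves carry different properties is to compare their East_Asian_Width class (UAX11). This is only a rough sketch: Python's standard library exposes East_Asian_Width but not the line breaking or vertical orientation classes, so the width class stands in here for the broader change in layout behavior. The helper name width_class is made up.

```python
import unicodedata


def width_class(ch):
    """Return the East_Asian_Width class of a single character."""
    return unicodedata.east_asian_width(ch)


# LATIN CAPITAL LETTER A is Narrow ("Na"), its fullwidth counterpart
# is Fullwidth ("F"), and a CJK ideograph is Wide ("W").
latin_a = width_class("A")
fullwidth_a = width_class("\uFF21")
ideograph = width_class("\u6F22")
```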

>    - adding new values to existing properties decreases
>      interoperability; we can't avoid it, but other options should be
>      sought

I don't think this is necessarily true. Yes, you can construct cases in
which interoperability is compromised, but in reality, I can't see
this being a real problem for the set of values here. Also, adding new
values makes it *very easy* to provide fallback behavior:

   text-transform: uppercase;
   text-transform: uppercase fullwidth;
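
The fallback works because a UA drops any declaration whose value it does not recognize, so the last declaration it does understand wins. A minimal Python sketch of that rule follows; the function name and the two keyword sets are assumptions made up for illustration, not taken from any implementation.

```python
def winning_text_transform(declarations, supported_keywords):
    """Return the last declaration whose keywords are all supported.

    Models the rule that a declaration with an unrecognized value is
    ignored as a whole; 'none' is the initial value.
    """
    value = "none"
    for decl in declarations:
        if all(keyword in supported_keywords for keyword in decl.split()):
            value = decl
    return value


# Hypothetical keyword sets for an older and a newer UA.
OLD_UA = {"none", "capitalize", "uppercase", "lowercase"}
NEW_UA = OLD_UA | {"fullwidth", "fullsize-kana"}

declarations = ["uppercase", "uppercase fullwidth"]
old_result = winning_text_transform(declarations, OLD_UA)
new_result = winning_text_transform(declarations, NEW_UA)
```

The older UA ends up with "uppercase" and the newer one with "uppercase fullwidth", without the author writing anything beyond the two declarations.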

> Therefore I suggest we:
>
>    - add the proposed note/issue to the draft

Done.

>    - consider adding a generic mechanism for glyph transforms. The tr///
>      operator in Perl could give us inspiration without taking us all
>      the way to regular expressions. For example, we could have:
>
>         text-transform-range: "'" "’", "a-z" "\FF41-\FF5A"

text-replace take II? :-)
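
For reference, the tr///-style range mapping in the quoted proposal could be sketched roughly as below. 'text-transform-range' is only a suggestion in this thread, so the parsing rules here (a three-character "x-y" string is a range, paired ranges must have equal length) are assumptions, and the function names are made up.

```python
def expand_range(spec):
    """Expand 'a-z' into a list of code points; other strings map per character."""
    if len(spec) == 3 and spec[1] == "-":
        return list(range(ord(spec[0]), ord(spec[2]) + 1))
    return [ord(ch) for ch in spec]


def build_range_map(pairs):
    """Build a translation table from (source, target) range pairs."""
    table = {}
    for source, target in pairs:
        src, dst = expand_range(source), expand_range(target)
        if len(src) != len(dst):
            raise ValueError("range lengths differ: %r vs %r" % (source, target))
        table.update(zip(src, dst))
    return table


# The two pairs from the quoted example: a typographic apostrophe, and
# lowercase ASCII letters to their fullwidth forms.
TABLE = build_range_map([("'", "\u2019"), ("a-z", "\uFF41-\uFF5A")])
transformed = "it's".translate(TABLE)
```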

>    - consider if the glyph substitutions in question can be described
>      in other languages. For example, I believe that Opentype Features
>      can be used to describe the mappings in question, no?

No. See above.

>    - consider moving "fullsize-kana" to Ruby-centric specification.

I don't think this is a good idea. Presumably we would also move
font-variant: ruby to the Ruby spec. But the Ruby spec is really about
layout mechanisms. Ruby-associated styling that falls under different
scopes should be kept with those scopes. We can have cross-linking and
examples, of course. But I think that splitting up the values of a
property according to their use cases rather than according to their
behavior is not a good way to organize a technology spec.

>    - consider the relationship between "fullwidth" and the generic font
>      families

I don't think "fullwidth" relates to generic font families at all. It
may relate to some OpenType features, however, and this could be
something to investigate.

~fantasai

Received on Wednesday, 6 April 2011 21:26:45 UTC