RE: [css3-text] comments on text-transform

From: Koji Ishii <kojiishi@gluesoft.co.jp>
Date: Wed, 2 Nov 2011 17:50:33 -0400
To: "www-style@w3.org" <www-style@w3.org>
Message-ID: <A592E245B36A8949BDB0A302B375FB4E0CF961FFEB@MAILR001.mail.lan>
I'm sorry that I could not jump into the discussion during the F2F as it went too fast for me. Please allow me to continue by e-mail.

Basically, I like the idea of having a generalized transforming engine, but to do so, we have to resolve several questions that are hard to answer.

Should it transform by code point, legacy grapheme cluster, or extended grapheme cluster? After the F2F there was discussion of using legacy grapheme clusters, but I'm not sure that is the correct answer for Thai or Devanagari.

Should it compare in normalized form or not? Vietnamese authors might say yes, because the forms they produce vary with the input method or OS they use. Japanese authors might say no, because normalizing may result in unexpected matches.
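
For illustration only, here is a minimal Python sketch of why the normalization question matters. The rule table and the `apply_rule` helper are hypothetical and are not proposed syntax. The half-width katakana case is just one assumed example of a match an author might not expect; it is not necessarily the case meant above.

```python
import unicodedata

# Hypothetical author-defined rule: replace "ê" (U+00EA) with "e".
RULE = {ord("\u00ea"): "e"}

precomposed = "\u00ea"   # ê as a single code point
decomposed = "e\u0302"   # e followed by COMBINING CIRCUMFLEX ACCENT


def apply_rule(text, normalize=False):
    """Apply RULE per code point, optionally normalizing to NFC first."""
    if normalize:
        text = unicodedata.normalize("NFC", text)
    return text.translate(RULE)


# Without normalization, only the precomposed spelling matches the rule.
assert apply_rule(precomposed) == "e"
assert apply_rule(decomposed) == decomposed

# With NFC normalization, both spellings match.
assert apply_rule(decomposed, normalize=True) == "e"

# Compatibility normalization (NFKC) also merges characters that an author
# may consider distinct, e.g. half-width and full-width katakana KA.
assert unicodedata.normalize("NFKC", "\uff76") == "\u30ab"
```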

In which order should it apply when multiple values are specified? Before "uppercase" or after?
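
A small Python sketch of the ordering problem, assuming a hypothetical author rule that maps "ß" to "sz". Python's `str.upper()` stands in for "uppercase" here; it is not the CSS algorithm, only a convenient full case mapping.

```python
# Hypothetical author-defined rule: map "ß" (U+00DF) to "sz".
RULE = {ord("\u00df"): "sz"}


def custom_then_upper(text):
    """Apply the author rule first, then uppercase."""
    return text.translate(RULE).upper()


def upper_then_custom(text):
    """Uppercase first, then apply the author rule."""
    return text.upper().translate(RULE)


word = "Stra\u00dfe"  # "Straße"

# The rule sees "ß" and rewrites it before uppercasing.
assert custom_then_upper(word) == "STRASZE"

# Uppercasing already turned "ß" into "SS", so the rule never matches.
assert upper_then_custom(word) == "STRASSE"
```

The two orders give different results for the same input, so the order would have to be defined.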

Should it support "[a-z]" notation as the "tr" command does? Wouldn't regex give even more flexibility?
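
As a rough sketch of what tr-style ranges would involve, the Python below expands "a-z" style ranges by code point and builds a one-to-one mapping. The `expand` and `tr` helpers are hypothetical names, and the range syntax is a simplification of what "tr" actually accepts (no escapes or character classes).

```python
def expand(spec):
    """Expand tr-style ranges such as "a-z" into a list of characters."""
    chars = []
    i = 0
    while i < len(spec):
        # A range is "X-Y": a start character, a hyphen, an end character.
        if i + 2 < len(spec) and spec[i + 1] == "-":
            start, end = ord(spec[i]), ord(spec[i + 2])
            chars.extend(chr(cp) for cp in range(start, end + 1))
            i += 3
        else:
            chars.append(spec[i])
            i += 1
    return chars


def tr(source, target, text):
    """Map each character of the source set to the same position in target."""
    src, dst = expand(source), expand(target)
    if len(src) != len(dst):
        raise ValueError("source and target sets must have the same length")
    return text.translate(dict(zip(map(ord, src), dst)))


assert tr("a-c", "x-z", "abcd") == "xyzd"
```

Ranges here follow code point order, which is one more decision a generalized feature would need to make.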

All these questions are not easy to answer without use cases in mind, and therefore I'm afraid that the feature may bloat if we don't have use cases we want to solve.

Håkon mentioned CSS Lists as an example, but I'd like to take it as an example from the opposite side. CSS Lists gathered several values for i18n requirements, and when we had enough values on the table, we started thinking about a generalized solution that could solve all those use cases. I like that approach because at that point, we might be able to answer the questions. We can also make sure that the generalized solution doesn't miss any important use cases.

If there's a transformation rule that can remove accents, not perfectly but well enough to solve a good number of cases, I'm glad to add the value to the spec, though probably marked at-risk. It may not be implemented in Level 3, but having it in the WD will surely help us think about a generalized solution in the future.


Regards,
Koji

-----Original Message-----
From: www-style-request@w3.org [mailto:www-style-request@w3.org] On Behalf Of Koji Ishii
Sent: Monday, October 10, 2011 2:36 PM
To: www-style@w3.org
Subject: RE: [css3-text] comments on text-transform

I agree that it's quite limited compared to, say, text-transform: uppercase, but is far more widely usable than, say, hiragana to katakana.

For hiragana-to-katakana, I asked several Japanese authors about this idea before, but none of them wanted it. It was not comprehensive research, but I think using the hiragana/katakana distinction for stylistic reasons is rare, and even when used, it's a bit different from other styles such as bold, italics, underlines, or emphasis marks. I guess the closest thing in English is double-quoting a word for emphasis.

U+2014/U+2015 could be a good use case, but you can't switch styles based on fonts, so the feature isn't useful by itself.

I could be wrong because I don't have a good idea of other examples, but I'm looking at this from the following perspective:
1. Are the requirements real, from real authors using the language in their real life?
2. How much are implementers interested in implementing them?
3. Is it worth delaying CSS Text Level 3, or is it okay to postpone to Level 4?

"Full-size-kana" satisfies both 1 and 2, and it will not delay the spec as it was already defined.

Do other examples satisfy any of these? I don't have answers, so I believe postponing until we know better is the right thing to do. Do you, or does anyone on this ML, know?


Regards,
Koji

-----Original Message-----
From: www-style-request@w3.org [mailto:www-style-request@w3.org] On Behalf Of Florian Rivoal
Sent: Friday, October 07, 2011 4:20 PM
To: www-style@w3.org
Subject: Re: [css3-text] comments on text-transform

On Wed, 05 Oct 2011 02:11:50 +0900, Koji Ishii <kojiishi@gluesoft.co.jp> wrote:

> Assuming you guys know that the use case for the value is Ruby; I 
> don't think people who want to use ruby on the web are "very few".
>
> The next question is how many percent of ruby usage would use this 
> feature. I don't have the data, but if we learn from publishing, most 
> published materials apply this rule to Ruby today. So, I think "very 
> few" is too strong. The basic idea of the feature is small Kana in 
> Ruby are too hard to read. It's even harder to read small letters on 
> screen than when printed.

It all depends on the perspective, but I would say the need for full-size-kana is indeed quite limited. First it is specific to the Japanese language. Within Japanese, it is limited to documents using ruby. Then, it is a stylistic choice: some authors will actually prefer the small size kana.

It doesn't mean there is no real use case, but it is a specialized one.

It is the most prominent specialized use case we currently have. I think it would be a good opportunity to use it to drive a generic solution that can benefit others.

> Regarding the "full-size-kana" or general @-rules; I'm more than happy 
> to agree with @-rules if there were several use cases just like CSS3 
> Lists @counter-style does. I actually like the idea very much, and I 
> heard some people like it too. But as far as I know, "full-size-kana"
> is the only use case for now. As long as there's only one use case, I 
> think developing @-rules is too much feature.
>
> If we find more use cases in future, we could migrate "full-size-kana"  
> to @-rules as CSS3 Lists did from CSS2.

A few other specialized use cases have been mentioned. Of course, they are all relatively minor, but that's the point: since they are unlikely to be addressed individually, aiming for a generic solution that solves them all in one go sounds better to me.

A few examples, either already mentioned or reasonable to expect:

* Removing accents. This cannot be solved properly in the generic case, so the WG has (rightfully) rejected a hard-coded text transform to do that. But in more limited contexts, it would often be perfectly doable for authors to define a custom rule that removes accents the way they want it.

* Håkon has mentioned mkhedruli to asomtavruli http://en.wikipedia.org/wiki/Letter_case#Other_forms_of_case


* hiragana to katakana: the mapping is easy to determine, and this could be desirable once in a while for stylistic reasons

* mapping U+2014 to U+2015 (or the other way around), as mentioned in the writing modes discussion.

* ſ (long s) to s
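
To give a feel for how small some of these author-defined rules could be, here is a hedged Python sketch of four of the examples above. All names are hypothetical, and the accent stripper is deliberately naive: it drops every combining mark after canonical decomposition, which is exactly the kind of rule that only works in limited contexts.

```python
import unicodedata


def strip_accents(text):
    """Naively drop combining marks after NFD decomposition (lossy)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


# Hiragana U+3041..U+3096 sit exactly 0x60 below their katakana counterparts.
HIRAGANA_TO_KATAKANA = {cp: chr(cp + 0x60) for cp in range(0x3041, 0x3097)}

# EM DASH (U+2014) to HORIZONTAL BAR (U+2015).
EM_DASH_TO_HORIZONTAL_BAR = {0x2014: "\u2015"}

# LATIN SMALL LETTER LONG S (U+017F) to "s".
LONG_S_TO_S = {0x017F: "s"}

assert strip_accents("H\u00e5kon") == "Hakon"
assert "\u3072\u3089\u304c\u306a".translate(HIRAGANA_TO_KATAKANA) == "\u30d2\u30e9\u30ac\u30ca"
assert "\u2014".translate(EM_DASH_TO_HORIZONTAL_BAR) == "\u2015"
assert "Congre\u017fs".translate(LONG_S_TO_S) == "Congress"
```

The naive stripper also shows the generic-case problem: it leaves "ø" untouched because that letter has no decomposition, while it turns the voiced kana "が" into "か" because the voicing mark is a combining character.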

Regards,

  - Florian

Received on Wednesday, 2 November 2011 21:50:49 GMT