Re: [css3-text] 'text-transform' for Accents

Koji Ishii:
> First, I don't see benefits of splitting the property to multiple properties. Do you see any benefits for authors and/or implementers?

The usual benefit is separate cascading, but I’m not sure it’s relevant here. Glyph width mostly concerns sinograms and casing is most useful for “euroscript”, so they will hardly ever be used together.

> "capitalize" is "titlecase" in CSS1 terminology,

In the latest ‘capitalize’ thread I suggested making it the dumb, language-agnostic, useless version, whereas ‘titlecase’ would, perhaps in Level 4, support language-specific behavior and only fall back to ‘capitalize’ otherwise.
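
To illustrate what I mean by the dumb version, here is a minimal sketch in Python. The function name and the choice to split on plain spaces are my assumptions for illustration only, not anything taken from a spec:

```python
def capitalize(text: str) -> str:
    """Language-agnostic sketch: uppercase the first character of each
    space-separated word and leave the rest of the word untouched."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


print(capitalize("hello world"))
print(capitalize("ijsselmeer"))
```

The second call yields ‘Ijsselmeer’, whereas Dutch orthography would expect ‘IJsselmeer’. That is the kind of case a language-aware ‘titlecase’ could handle.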

> so you're proposing to add "unicase", "mixedcase", and "camelcase", right?

I got a bit carried away with those, since I actually only wanted to write about accents, but yes – weak yes, though.

> Again, I can't find any good use cases that are important enough for implementers to implement. Do you have any good use cases?

Well, ‘unicase’ will probably be implemented anyway, but as a ‘font-variant’ sub-value, because of the OpenType ‘unic’ feature. I just believe that the interdependence of ‘font-variant’ and ‘text-transform’ should be designed carefully, i.e. with authors in mind more than font technology.

The other values, ‘mixedcase’ and ‘camelcase’, have hardly any valid use cases; they would only be included for a greater degree of completeness – or for humor.

> Third about width. What text-transform does is code point transformation and it doesn't deal with font properties. "fixed" and "proportional" are the font properties, so they should be done in CSS3 Fonts. "fullwidth" is Unicode code point transformation, so we have that here.

For an author it doesn’t matter much whether different glyphs or different characters are used to achieve the same desired visual result. Therefore those properties should be unified or closely connected, with code point transformation being the fallback for unavailable glyph switching.
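
As a rough sketch of the code point fallback, assuming Python and a hypothetical helper name: the printable ASCII range U+0021–U+007E maps onto the fullwidth forms U+FF01–U+FF5E by a fixed offset. Mapping the space to U+3000 IDEOGRAPHIC SPACE is my assumption here, not something the draft says.

```python
FULLWIDTH_OFFSET = 0xFEE0  # U+FF01 minus U+0021


def to_fullwidth(text: str) -> str:
    """Replace printable ASCII with its fullwidth counterpart.

    Characters without a fullwidth form are passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp == 0x20:
            # Assumption: treat the space as U+3000 IDEOGRAPHIC SPACE.
            out.append("\u3000")
        elif 0x21 <= cp <= 0x7E:
            out.append(chr(cp + FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


print(to_fullwidth("CSS 3"))
```

A user agent could prefer a font's own width variants where they exist and apply a mapping like this only when they do not.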

> Last, accents. What are they used for? Do you have use cases? 

In some orthographies diacritic marks are mandatory on lowercase letters, but (more or less) optional on uppercase letters. As far as I know, this has mostly technical reasons:
1. (Most) accents on lowercase letters, especially roman vowels, fit easily above the base glyph and below the capital height, e.g. on typewriters and in metal type sorts.
2. Some typewriter and computer keyboard layouts, e.g. the Swiss one (French and German), feature only the lowercase accented letters; the uppercase ones are available only with dead keys or not at all.
Although no linguistic harm is done when accents remain on uppercasing, some people nowadays are accustomed to not seeing capitals with diacritics. For instance, almost all German place names starting with an umlaut are written with the uppercase base vowel followed by a lowercase ‘e’, and I hear that in French outside France accents on capitals are basically always omitted. Unicode, of course, defines the case pairs with accents.

> Also, as you can see in the current spec, text-transform relies on Unicode for the logic to transform, and I can't find what you wrote in Unicode spec. Will you please tell me if and where they're defined in Unicode?

I haven’t looked for it there yet, but I think it’s basically NFD (or NFKD?) with everything from U+0300–036F, U+1DC0–1DFF and perhaps U+20D0–20FF as well as some other script-specific combining accents removed. (Combining diacritics probably have a special tag, too, but I can never remember those.)
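
A minimal sketch of that idea in Python, using the standard `unicodedata` module. The function names are hypothetical, only the three ranges named above are covered (no script-specific marks), and recomposing with NFC at the end is my assumption rather than part of the proposal:

```python
import unicodedata

# The three combining-mark ranges mentioned above.
COMBINING_RANGES = (
    (0x0300, 0x036F),  # Combining Diacritical Marks
    (0x1DC0, 0x1DFF),  # Combining Diacritical Marks Supplement
    (0x20D0, 0x20FF),  # Combining Diacritical Marks for Symbols
)


def is_stripped_mark(ch: str) -> bool:
    """True if the code point falls into one of the ranges above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in COMBINING_RANGES)


def strip_accents(text: str) -> str:
    """Decompose with NFD, drop the combining marks, recompose with NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not is_stripped_mark(ch))
    return unicodedata.normalize("NFC", kept)


def uppercase_without_accents(text: str) -> str:
    """Uppercase first, then remove accents from the result."""
    return strip_accents(text.upper())


print("\u00e9cole".upper())
print(uppercase_without_accents("\u00e9cole"))
```

Instead of hard-coded ranges, one could probably test `unicodedata.combining(ch) != 0` or the general category `Mn`, which would also catch script-specific marks, though that may remove more than an author wants. Letters without a canonical decomposition, such as ‘ø’, pass through unchanged either way.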

> If not, I'd prefer to see it defined in Unicode first, and then CSS refers the logic. Is this reasonable?

I’m not sure the Unicode Consortium considers this to be in its scope, but if they do it (or have already done it), that’s just as fine with me.

Received on Thursday, 17 March 2011 16:17:49 UTC