Re: [csswg-drafts] [css-text] Add new CSS text-transform values for math (#3745) from Florian Rivoal via GitHub on 2019-03-26 (public-css-archive@w3.org from March 2019)

From: Florian Rivoal via GitHub <sysbot+gh@w3.org>
Date: Tue, 26 Mar 2019 04:48:25 +0000
To: public-css-archive@w3.org
Message-ID: <issue_comment.created-476474263-1553575704-sysbot+gh@w3.org>
@joanmarie (disclaimer: you clearly know what you're talking about. I am not dismissing your expertise, but trying to understand it, as it doesn't quite mesh with my own sense of how this is supposed to work).


This is a bit counterintuitive to me, as I'd expect an upper-casing text transform to be stylistic in the same way as a font-feature or a color change would be, and therefore that it should not be preserved, to avoid having `::first-line { text-transform: uppercase }` result in the first line being read in a screaming voice or spelled out (neither of which seems to be the intended effect), or to avoid having 𝕸𝖆𝖙𝖍 being read as "mathematical bold fraktur capital M, mathematical bold fraktur small a, mathematical bold fraktur small t, mathematical bold fraktur small h" instead of as "math". This is not to say that this styling information should not be available to a user of the screen reader in some form (but then I'd expect `text-transform: uppercase` and `font-variant-caps: all-small-caps` to be announced in a similar way, and likewise `text-transform: math-bold-fraktur` and `font-family: "Fraktur Bold"`), but when reading "just the text", I'd expect the transform not to be applied. If it's semantically important, shouldn't it be in the source?
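To make the 𝕸𝖆𝖙𝖍 example concrete, here is a minimal Python sketch. It is only an illustration of the character mapping, not how any browser or screen reader implements it; the helper name `to_math_bold_fraktur` is hypothetical, and the code assumes the proposed `math-bold-fraktur` value simply maps ASCII letters onto the contiguous Unicode Mathematical Bold Fraktur block.

```python
import unicodedata

# Start of the Mathematical Bold Fraktur letters in Unicode.
BOLD_FRAKTUR_CAPITAL_A = 0x1D56C
BOLD_FRAKTUR_SMALL_A = 0x1D586


def to_math_bold_fraktur(text: str) -> str:
    """Hypothetical stand-in for `text-transform: math-bold-fraktur`."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(BOLD_FRAKTUR_CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(BOLD_FRAKTUR_SMALL_A + ord(ch) - ord("a")))
        else:
            out.append(ch)  # leave everything else untouched
    return "".join(out)


source = "Math"                          # what the author wrote
rendered = to_math_bold_fraktur(source)  # what ends up on screen

# What a naive reader of the post-transform text would have to spell out.
names = [unicodedata.name(ch) for ch in rendered]

# Compatibility normalization happens to undo this particular transform.
recovered = unicodedata.normalize("NFKC", rendered)

print(names)
print(recovered)
```

In this specific case NFKC normalization gets back to the source string, but that is a workaround on the consumer's side, and nothing comparable exists for a transform like `uppercase`, where the original casing is simply gone.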

Also, only having access to the post-transform text seems to play poorly with the i18n-oriented values:
* It breaks `text-transform: full-width`. Using it to display "IBM" as "ＩＢＭ" can be desirable in Chinese / Japanese / Korean text (particularly but not only in vertical text), but that doesn't mean that it should be read aloud as "full width Latin capital letter I, full width Latin capital letter B, full width Latin capital letter M". I still want to hear IBM. It's even worse if the word isn't an acronym to begin with. Maybe the text-to-speech engine can be smart enough to know how to read words regardless of which character variant is being used, but short of reversing the transform (but then why have it applied in the first place), I expect it will, at least occasionally, run into things it doesn't know how to say other than by enumerating the Unicode character names. That sounds bad.
* It defeats the point of `text-transform: full-size-kana`: "りょ" and "りよ" are different (the first is ryo, the second is riyo), but at the very small font sizes sometimes used in ruby, using the smaller ょ in りょ can make the text hard to see, and so authors sometimes want to display りよ instead of りょ despite the difference. The whole point of doing it using `text-transform: full-size-kana` rather than by hard-coding りよ into the DOM is so that screen readers can read it as ryo instead of riyo. For example, if the ruby for 無料 (which means free / costless) is encoded (as it should be) as むりょう, that is read as muryō, and all is understandable. If it is encoded as (or rendered into) むりよう/muriyō, this could be misinterpreted as 無利用, which means "useless". For a sighted reader, the potential confusion is probably preferable to the letter being too small to read, and they would see the original ideographs next to it, clearing things up if they can read them. But for someone going through a screen reader, it could be pretty confusing. If the screen reader reads both the original text and the ruby, you still hear "free and useless", and if it substitutes the ruby for the original text, you would just hear "useless", instead of the intended "free (free)".
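
The two cases above can be sketched in Python. This is a rough illustration under stated assumptions, not the spec's definition: the function names `full_width` and `full_size_kana` are hypothetical, the full-width mapping only covers printable ASCII plus the space, and the kana table is deliberately partial (hiragana only).

```python
# Partial small-kana to full-size-kana table (hiragana only, for illustration).
SMALL_TO_FULL_KANA = str.maketrans({
    "ぁ": "あ", "ぃ": "い", "ぅ": "う", "ぇ": "え", "ぉ": "お",
    "っ": "つ", "ゃ": "や", "ゅ": "ゆ", "ょ": "よ", "ゎ": "わ",
})


def full_width(text: str) -> str:
    """Hypothetical stand-in for `text-transform: full-width` (ASCII only)."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch == " ":
            out.append("\u3000")          # ideographic space
        elif 0x21 <= cp <= 0x7E:
            out.append(chr(cp + 0xFEE0))  # shift into the fullwidth forms block
        else:
            out.append(ch)
    return "".join(out)


def full_size_kana(text: str) -> str:
    """Hypothetical stand-in for `text-transform: full-size-kana`."""
    return text.translate(SMALL_TO_FULL_KANA)


ibm_rendered = full_width("IBM")

muryou_source = "むりょう"    # muryō, the ruby for 無料
muriyou_source = "むりよう"   # muriyō, a different reading
muryou_rendered = full_size_kana(muryou_source)

# Two different source strings collapse into the same rendered string,
# so the rendered text alone cannot tell a reader which one was meant.
collapsed = muryou_rendered == full_size_kana(muriyou_source)

print(ibm_rendered, muryou_rendered, collapsed)
```

The last comparison is the crux: once only the rendered string is exposed, the ryo/riyo distinction that was present in the source can no longer be recovered.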

All in all, I'd definitely expect the styling information to be available to the screen reader so that it can do something useful with it and inform the user of notable styling features that are deemed relevant (such as, as you mentioned, playing some tone, changing the voice pitch, displaying two dot-6 cells on a braille display, or whatever is appropriate), but I'd expect the text itself to be the untransformed text. If we don't have that, I don't really understand what the point is of having text-transform in CSS instead of making some server-side preprocessing change to the document content itself (or the same DOM change in js/react/vue/… for people who're more fashionable than me).

-- 
GitHub Notification of comment by frivoal
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/3745#issuecomment-476474263 using your GitHub account
Received on Tuesday, 26 March 2019 04:48:26 UTC