Re: [csswg-drafts] [css-fonts] Handling of Standardized Variation Sequences

In the use case of digitizing old Han manuscripts, most of the authors have long since passed away. The entity that converts the manuscript into digital text is also the stylesheet author. In this case, the text is converted into digital text with all the variation selectors needed to reproduce the forms used in the historical documents.

Take the example of The Analects 論語. Sinologists who are studying the text may prefer a very close approximation of the exact glyphs used, since these may carry important information about when that particular version was published or amended. General Chinese readers, however, would not care about the exact orthography used and would care more about the semantic content. They may prefer all variants to be canonicalized to their modern orthography.

In this case, the stylesheet author only provides a mode (say, input.normalize:checked+p) which strips the variants, while the reader is the person who activates the stripping at his/her convenience. The reader is responsible for his/her actions if there is any loss in semantic meaning. (Phrases whose exact style should never be stripped could be further marked up with HTML+CSS.)
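As a rough illustration of what "stripping the variants" would mean at the text level, here is a minimal Python sketch. It is not a CSS mechanism and not anything proposed in the spec; the helper name strip_variation_selectors is hypothetical. It assumes that stripping means removing every code point in the two variation selector ranges, VS1 to VS16 (U+FE00 to U+FE0F) and VS17 to VS256 (U+E0100 to U+E01EF), and leaving the base characters untouched.

```python
import re

# Variation selectors: VS1-VS16 (U+FE00-U+FE0F) and VS17-VS256 (U+E0100-U+E01EF).
# Assumption: "stripping the variants" means deleting these code points only.
_VARIATION_SELECTORS = re.compile("[\uFE00-\uFE0F\U000E0100-\U000E01EF]")


def strip_variation_selectors(text: str) -> str:
    """Return text with every variation selector removed (hypothetical helper)."""
    return _VARIATION_SELECTORS.sub("", text)


# U+8FBB followed by VS17, used here only as an example of a base + selector pair.
sample = "\u8FBB\U000E0100"
print(len(sample), len(strip_variation_selectors(sample)))  # prints "2 1"
```

Note that this blanket approach also removes U+FE0E and U+FE0F, which select text versus emoji presentation, so a real implementation would likely want to restrict itself to sequences on Han base characters.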

> This risk is generally low, at least for Latin scripts, which do not change meaning that much depending on case (though there are certain significant exceptions).

IRG member bodies are supposed to verify that the variants they submit for encoding via variation selectors are indeed unifiable and not non-cognate. If two variants were recognized as semantically different, they would be disunified under the non-cognate rule.

-- 
GitHub Notification of comment by hfhchan
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/1710#issuecomment-371072536 using your GitHub account

Received on Wednesday, 7 March 2018 09:10:01 UTC