- From: Florian Rivoal <florianr@opera.com>
- Date: Tue, 05 Jul 2011 16:17:30 +0900
- To: www-style@w3.org
On Fri, 01 Jul 2011 11:32:51 +0900, fantasai <fantasai.lists@inkedblade.net> wrote:

> This can be done with the HTML lang tag, which can accept script subtags
> from ISO 15924. If a document is tagged as lang="zh-Hant", we know it is
> written in traditional Chinese, and therefore will have an upright
> base orientation. Similarly if a document is tagged as lang="ja-Jpan",
> we know it is written in a combination of Han, Hiragana, and Katakana,
> and its base orientation is upright.

I agree this is a good approach.

> The question then is, what do we do if the script is not tagged (as it
> almost never will be)? Do we use a heuristic, or default to one
> orientation or another? If so, which one?

I think we need to define an algorithm for determining what the language is. We need this here, and there are a fair few places in CSS3-TEXT where it would come in handy too. It probably needs to be written in one of the two specs, and referred to from the other.

The algorithm should probably be something like:

1- if you have a lang attribute, use that
2- otherwise, if you have a Content-Language HTTP header, use that
3- otherwise, if you have a <meta http-equiv="content-language" ...>, use that
4- otherwise, if you have a charset specified in the HTTP headers and that charset is specific to a language (shift-jis, GB2312, big5, EUC-KR... the list must be explicit), you're in that language
5- same as 4, but with a meta tag rather than an HTTP header
6- otherwise, you don't know

I am not sure steps 4 and 5 are a good idea, though. A variation on 4 and 5 could be to require that the HTTP headers or meta tags specify one such encoding and that at least one non-ASCII character is used in the actual content. But that has performance implications I am not sure I like, and it might not really help anyway.

Having an exhaustive list of all languages and their respective orientation is a lot of work, and I don't think it is necessary.
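
For illustration, the six-step lookup proposed above could be sketched roughly as follows. This is only a sketch under assumptions: the function and parameter names are hypothetical, the charset table is a short illustrative sample rather than the explicit list a spec would need, and the language tags assigned to each charset (for example zh-Hans for GB2312) are my own guesses.

```python
import re
from typing import Optional

# Assumption: illustrative, non-exhaustive table of language-specific charsets.
# Keys are lowercased with "-" normalised to "_"; the tags are guesses.
CHARSET_LANGUAGE = {
    "shift_jis": "ja",
    "gb2312": "zh-Hans",
    "big5": "zh-Hant",
    "euc_kr": "ko",
}


def first_tag(value: Optional[str]) -> Optional[str]:
    """Return the first tag of a (possibly comma-separated) language value."""
    if not value:
        return None
    tag = value.split(",")[0].strip()
    return tag or None


def charset_language(content_type: Optional[str]) -> Optional[str]:
    """Return the language implied by a Content-Type charset, if any."""
    if not content_type:
        return None
    match = re.search(r"charset\s*=\s*[\"']?([\w.:-]+)", content_type, re.IGNORECASE)
    if not match:
        return None
    charset = match.group(1).lower().replace("-", "_")
    return CHARSET_LANGUAGE.get(charset)


def determine_language(
    lang_attr: Optional[str] = None,
    http_content_language: Optional[str] = None,
    meta_content_language: Optional[str] = None,
    http_content_type: Optional[str] = None,
    meta_content_type: Optional[str] = None,
) -> Optional[str]:
    """Try each source in the proposed order; None means "you don't know"."""
    return (
        first_tag(lang_attr)                    # step 1: lang attribute
        or first_tag(http_content_language)     # step 2: HTTP Content-Language
        or first_tag(meta_content_language)     # step 3: <meta http-equiv>
        or charset_language(http_content_type)  # step 4: HTTP charset
        or charset_language(meta_content_type)  # step 5: meta charset
    )                                           # step 6: falls through to None
```

Dropping the two `charset_language` lines would give the variant without steps 4 and 5.
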
Since I believe the languages that want sideways far outnumber those that want upright, we should instead list all the languages that are upright, and say that everything else is sideways.

This approach also gives us an answer to what to do when we don't know the language: sideways. I don't think it really matters which arbitrary default we pick, as long as browsers are consistent about it, but this seems like a decent choice.

- Florian
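
PS: a rough sketch of the "list the upright ones, default to sideways" rule, in case it helps the discussion. The names are hypothetical and both sets are placeholders: Hant and Jpan come from the examples quoted above, while the bare language entries are my assumption of what such a list might contain.

```python
from typing import Optional

# Placeholder lists, not a proposal for the real contents.
UPRIGHT_SCRIPTS = {"Hant", "Jpan"}   # script subtags from the quoted examples
UPRIGHT_LANGUAGES = {"zh", "ja"}     # assumption: used when no script subtag


def base_orientation(language_tag: Optional[str]) -> str:
    """Return "upright" for listed languages/scripts, "sideways" otherwise."""
    if not language_tag:
        return "sideways"  # unknown language: use the default
    subtags = language_tag.split("-")
    primary = subtags[0].lower()
    # A script subtag is four letters, e.g. "Hant"; normalise its case.
    scripts = {s.title() for s in subtags[1:] if len(s) == 4 and s.isalpha()}
    if scripts:
        return "upright" if scripts & UPRIGHT_SCRIPTS else "sideways"
    return "upright" if primary in UPRIGHT_LANGUAGES else "sideways"
```

With these placeholder lists, "zh-Hant" comes out upright, while "en", "zh-Latn", and an unknown language all come out sideways.
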
Received on Tuesday, 5 July 2011 07:17:55 UTC