- From: David Muschiol <david@david-muschiol.de>
- Date: Sat, 2 Aug 2008 16:44:55 +0200
- To: "Ian Hickson" <ian@hixie.ch>
- Cc: public-html@w3.org
On Sat, Aug 2, 2008 at 1:29 AM, Ian Hickson <ian@hixie.ch> wrote:
> On Fri, 1 Aug 2008, Simon Pieters wrote:
>> Wouldn't :lang(en-US) match <span lang="only en-US">?
>
> Yeah, but it would break use of the |= attribute selector.

Hmm, sounds like a serious problem if you ask me. Serious enough to reject the "[only] <bcp-47-code>" idea?

Anyhow, I think it is time to sum up all the proposals made and discuss their respective advantages and disadvantages. Let me try to give a concise overview, including some subjective comments:

– Chris Wendt and Chris Wilson suggested [1] a global @translate which is compatible with ITS [2]. To avoid redundancy, if we introduce @translate, we should only allow it in the HTML5 serialisation, just like @lang, and encourage XHTML5 authors to use @its:translate. Personally, I like this idea since it offers authors a very quick and simple way to mark up content that should not be translated, and we all know that simplicity is one of the key prerequisites for the success of a solution.

– Ian suggested "a new keyword for 'lang', instead, which means 'not translatable' or some such" [3]. Korel taught us [4] that ISO 639-2 already defines the "zxx" and "und" values for similar purposes [5]. However, they are far from covering all our requirements.

– Later, Ian suggested [6] a notation like lang="[only] <bcp-47-code>". Pros: It would work with existing translation tools that already implement @lang correctly, although the popular Web translators do not seem to fall into this category. Cons: As mentioned above, this notation would break the use of the |= attribute selector. Furthermore, we could indeed redefine @lang, but it would then be incompatible with @xml:lang. These difficulties have finally made me dislike this idea.

– Leif came up with the idea to register language tag extensions with IANA instead [7]. They could look like "en-q-notTranslate" or "en-q-name".
An additional benefit of this solution would be that language tag extensions could be reused for other purposes: <link rel="alternate" lang="fr-q-original" href="text.html.fr"> tells the user that the French version of the text is the original one. I like this idea since, for example, <span lang="de-q-name">Daniel Schwarz</span> shows not only _that_ this span must not be translated to "Daniel Black" in English, but also _why_ it must not. Toby elaborated on this issue [8].

– Several people raised the question of whether some elements, such as <code> and <kbd>, should not be translated by default [9]. Whether such a convention makes sense depends on how often these elements are misused on the Web, as Simon pointed out [10]. Perhaps some statistical data could help us out? If it does not break too much legacy content, I would strongly suggest defining clear rules here. This would save authors of technical documents from always having to specify <meta name="notranslate" content="code, kbd, …"> (see below) or even <code translate="no">.

– For page-wide translation rules, Simon suggested <meta name="notranslate"> [11]. In its @content, CSS selectors specify which elements are not to be translated. Not universal enough to solve all our problems elegantly, I guess, but quite a useful idea anyway. (I do not consider specifying <meta name="notranslate" content=".notranslate"> on every page elegant.)

– Dave is of the opinion that "auto-translate systems should be more careful, and only translate text that's in the overall language of the page, into the target, and not the 'call-outs' that are in a different language." [12] I am still not sure whether this is workable, and I would like to wait for feedback from other people here before finally taking a stand on this idea myself :-)

To conclude, I think that @translate, language tag extensions, <meta name="notranslate"> and defaulting <code>, <kbd> etc.
to translate="no" could all make sense in parallel. Do you think it is possible to reconcile all these solutions in HTML 5? Probably too many redundancies, hm? What do you think: which solutions should we finally pick? (You see, I am of the opinion that careful authors should definitely have the possibility to mark up content that should not be translated.)

-david

[1] <http://lists.w3.org/Archives/Public/public-html/2008Jul/0427.html>
[2] <http://www.w3.org/TR/its/>
[3] <http://lists.w3.org/Archives/Public/public-html/2008Jul/0428.html>
[4] <http://lists.w3.org/Archives/Public/public-html/2008Jul/0432.html>
[5] <http://www.w3.org/International/questions/qa-no-language#nonlinguistic>
[6] <http://lists.w3.org/Archives/Public/public-html/2008Jul/0431.html>
[7] <http://lists.w3.org/Archives/Public/public-html/2008Aug/0005.html>
[8] <http://lists.w3.org/Archives/Public/public-html/2008Aug/0011.html>
[9] <http://lists.w3.org/Archives/Public/public-html/2008Jul/0446.html>
[10] <http://lists.w3.org/Archives/Public/public-html/2008Jul/0433.html>
[11] <http://lists.w3.org/Archives/Public/public-html/2008Jul/0443.html>
[12] <http://lists.w3.org/Archives/Public/public-html/2008Aug/0003.html>
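
To make the |= objection from the summary concrete, here is a minimal Python sketch of how the [lang|="en"] attribute selector decides whether a value matches. The function name dash_match is my own, not taken from any library, and the sketch ignores case handling for simplicity. It only illustrates the matching rule (the value is exactly the prefix, or begins with the prefix followed by a hyphen); it is not how any particular browser implements it.

```python
def dash_match(value: str, prefix: str) -> bool:
    """Approximate the CSS [attr|=prefix] rule (hypothetical helper).

    Matches when the attribute value is exactly `prefix`, or starts
    with `prefix` immediately followed by a hyphen. Case handling is
    deliberately left out of this sketch.
    """
    return value == prefix or value.startswith(prefix + "-")


# A plain BCP 47 value keeps working with [lang|="en"].
print(dash_match("en-US", "en"))        # True

# The lang="[only] <bcp-47-code>" notation no longer matches,
# because the value now begins with "only ".
print(dash_match("only en-US", "en"))   # False

# A tag written in the extension style still begins with "en-".
print(dash_match("en-q-name", "en"))    # True
```

Under this rule, a value written in the language tag extension style still matches, while a value with the "only " prefix does not, which seems to be the breakage Ian referred to.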
Received on Saturday, 2 August 2008 14:58:18 UTC