- From: Leif Halvard Silli <lhs@malform.no>
- Date: Fri, 01 Aug 2008 04:17:10 +0200
- To: Karl Dubost <karl@w3.org>
- CC: Ian Hickson <ian@hixie.ch>, public-html WG <public-html@w3.org>, Chris Wilson <Chris.Wilson@microsoft.com>, simonp@opera.com, Martin Duerst <duerst@it.aoyama.ac.jp>
Karl Dubost 2008-08-01 02.42:
>
>
> On 1 August 2008 at 07:05, Ian Hickson wrote:
>> We define lang, we can easily define it as being something akin to:
>>
>> [only] <bcp-47-code>
>
> If we were doing that, we would have to be sure to not break existing
> applications. I wonder if I18N activity has a list of apps implementing
> lang.
>
> Though lang="" is not very good for this purpose, for brand names,
> people's names, trademarks, etc. Let's take Word, the name of the program.
> You don't want it to be translated, nor an English person called Schwartz,
> nor "小林" (Kobayashi) = little wood, which is a common Japanese name.
>
> My natural inclination would be to reuse the vocabulary of ITS to not
> reinvent the wheel.
Very interesting from Chris! But I side with Ian in that LANG
could be interesting to reuse. However, I also side with Karl in
that reuse in the form of *messing* with LANG is bad. I also liked
Simon Pieters' proposal about using META to "cascade" the
notranslate values to the document:
<meta name=notranslate content="code, #logo, .term, :lang(de)">
FIRST, the good news: BCP 47 perhaps has a way out so we can reuse
LANG without messing with it. BCP 47 offers the possibility of
registering language tag extensions with IANA. Such extensions are
added after the "real" language codes. Thus, simply put, if we had
registered with IANA e.g. a -q- singleton (q for quality), then
one could tag something like this (the exact values must be
registered with IANA):
<span lang="en-q-notTranslate">Word</span>
<span lang="en-q-original">Word</span>
<span lang="en-q-name">Word</span>
The current draft of RFC 4646bis says:
"Extensions [...] are intended to identify information which is
commonly used in association with languages or language tags, but
which is not part of language identification." [1]
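To make the extension idea concrete, here is a minimal Python sketch that splits such a tag into its language part and its extension subtags. It assumes the hypothetical -q- singleton from above (nothing of the sort is registered with IANA), and the function name split_extension is made up for illustration. The only rule it takes from the RFC 4646 syntax is that each extension subtag is 2 to 8 alphanumeric characters.

```python
import re

# RFC 4646 syntax: each subtag that follows an extension singleton is
# 2 to 8 alphanumeric characters long.
EXT_SUBTAG = re.compile(r"^[0-9A-Za-z]{2,8}$")


def split_extension(tag, singleton="q"):
    """Split a language tag into (language part, extension subtags).

    `singleton` defaults to the hypothetical "q" discussed in this thread.
    Returns (tag, []) when the tag carries no such extension, and raises
    ValueError when the extension is not well-formed.
    """
    subtags = tag.split("-")
    for i, subtag in enumerate(subtags[1:], start=1):
        low = subtag.lower()
        if low == "x":
            break  # everything after "x" is private use, not an extension
        if low == singleton:
            ext = []
            for s in subtags[i + 1:]:
                if len(s) == 1:
                    break  # another singleton starts a different extension
                if not EXT_SUBTAG.match(s):
                    raise ValueError("malformed extension subtag: %r" % s)
                ext.append(s.lower())
            if not ext:
                raise ValueError("singleton %r has no subtags" % singleton)
            return "-".join(subtags[:i]), ext
    return tag, []
```

With this sketch, split_extension("en-q-original") and split_extension("en-q-name") both yield "en" plus one extension subtag. A value such as "notTranslate" is twelve characters long, so the sketch rejects it; a registered value would presumably have to be a shorter form.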
SECONDLY, and back to Simon Pieters: Going the route via BCP 47 has
the advantage that we get "something" which is useful both inside
the META tags (as Simon proposed it) as well as in LANG attributes
and even in the REL attributes. Just think about the
rel="alternate" attribute. According to HTML 4 (and I hope HTML
5), with
<link lang=fr rel=alternate href=text.french.htm >,
we are pointing to a French alternate - and translation - of the
current document. However, there is nothing which tells you
whether the document you are reading or the linked document is
the original.
For this, I imagine that one could also register -q-original, so
that one could have
<link lang=fr-q-original rel=alternate href=text.french.htm >
And also, this way one could solve the problem which Chris asked
Simon about, namely, let's say you want only some designated parts
of the German parts of your text to be translated, then you could
solve that this way:
<meta name=notranslate content=":lang(de-q-original)">
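As a rough illustration of why the extension would let one single out only some of the German parts, here is a small Python sketch of the prefix matching that the CSS :lang() selector performs: a value matches when it equals the element's language or is a prefix of it followed by a hyphen, ignoring case. The function name lang_matches is made up for this example, and -q-original remains a hypothetical, unregistered extension.

```python
def lang_matches(element_lang, selector_range):
    """Return True if :lang(selector_range) would match an element whose
    language is element_lang (case-insensitive prefix match on "-")."""
    el = element_lang.lower()
    rng = selector_range.lower()
    return el == rng or el.startswith(rng + "-")


# Hypothetical languages of three elements in one document.
langs = ["de", "de-q-original", "en"]

# Elements selected by the narrow selector and by the plain one.
only_marked = [l for l in langs if lang_matches(l, "de-q-original")]
all_german = [l for l in langs if lang_matches(l, "de")]
```

Under these assumptions, :lang(de-q-original) picks out only the specially tagged German parts, while a plain :lang(de) still covers every German part, tagged or not.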
The translate="yes/no" attribute seems to me to be better used
when you need to translate from one language to only one other
language. It does not seem fitting for making machine translations
to hundreds of languages. That is, unless your main purpose is to
take care of registered trademarks etc.
FINALLY, Karl, it seems to me -- and this underlies all I said
above -- that you have found a use case for the <NAME> element! For
instance, perhaps the right thing would be to transLITERATE Word
in some languages, in some situations? Would translate="no" permit
that to happen? It seems to me more crucial to give the needed
info -- that it is a name -- so that one can judge, per
language/translation, whether translation/transliteration is needed.
[1]
http://www.inter-locale.com/ID/editor/draft-ietf-ltru-4646bis-14a.html#extensionsubt
--
leif halvard silli
Received on Friday, 1 August 2008 02:18:22 UTC