Re: [Ltru] RE: For review: Tagging text with no language

On 4/12/2007 2:25 PM, Kent Karlsson wrote:
>  
> FWIW, in CLDR 1.4 some of the translations for "und" have the word "language"
> (translated of course) in them, in accordance with John Cowan's original suggestion:
>
> da.xml:			<language type="und">Sproget kan ikke bestemmes</language>
> de.xml:			<language type="und">Sprache nicht ermittelt</language>
> it.xml:			<language type="und">lingua imprecisata</language>
> sv.xml:			<language type="und">obestämt språk</language>
>   
The sample translations show that there's general difficulty in agreeing 
on the concept. The German translation says "no language (has been) 
determined", while the Danish translation says that "no language could 
be determined". In my reading the Swedish allows both possibilities, but 
perhaps implies more strongly than the other two that assigning a 
language to the contents would be meaningful. (The Italian translation 
seems to agree most closely with the Swedish one, as far as my command 
of Italian goes.)

> (I would be to blame for the last one, but apparently I'm not the only one to (maybe)
> be misguided). Perhaps those ones should be retranslated not to refer to language,
> **if** "und" may apply also to "maybe not in any language".
>
>   
The problem is that the scheme does not explicitly account for all the 
types of edge conditions that you can get into when analyzing text for 
language up front. Instead, labels are added here and there to handle 
some of these as they become urgent enough to require attention. As a 
result, all the translators have to go by is the shorthand English 
description for the label. And that's not written with enough precision 
to overcome the limitation of not having thought through all the 
possible cases.

A./

Received on Thursday, 12 April 2007 21:53:46 UTC