
Re: [Ltru] Re: For review: Tagging text with no language

From: JFC Morfin <jefsey@jefsey.com>
Date: Sun, 20 May 2007 05:05:11 +0200
To: Najib Tounsi <ntounsi@emi.ac.ma>,Martin Duerst <duerst@it.aoyama.ac.jp>
Cc: 'LTRU Working Group' <ltru@ietf.org>,www-international@w3.org, ietf-languages@jefsey.com
Message-Id: <20070520030513.59CA517BD2@smtp7-g19.free.fr>

Dear Najib,
your point is one for the W3C. The IETF only documents the tags, not their use.
What should be done in the XML page is to first declare (name) all 
the languages that the page is going to use, and then to use those names to 
tag the text. It is then up to the W3C to decide what to do when an 
undeclared language name is used.

I note that declarations make it possible to address several embarrassing 
problems. For example, the declarations:
- "en" =  en-arab
- "ar"  =  en-arab
make it possible to use en/ar information even though everything is written in 
Arabic script.

They also make it possible to use internationalised XML models: one can declare 
several language names and use the same template with different values for those names.
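
As a rough illustration of the "same template, different values" idea, here is a minimal Python sketch. It is not an XML or W3C mechanism: the template text, the `render` helper, and the use of `string.Template` placeholders as "declared names" are all assumptions made for this example. The two declarations are the ones listed above.

```python
from string import Template

# Hypothetical template: $en and $ar are declared language *names*,
# not language tags. Their values are supplied separately.
PAGE_TEMPLATE = Template(
    '<text xml:lang="$en">The speaker said '
    '<span xml:lang="$ar">Salam Alikoum</span> and began to talk</text>'
)


def render(declarations):
    """Fill the template from the declared names.

    string.Template.substitute raises KeyError when the template uses a
    name that was never declared. That is only one possible policy.
    """
    return PAGE_TEMPLATE.substitute(declarations)


# The declarations from the message: both names map to en-arab.
declarations = {"en": "en-arab", "ar": "en-arab"}
page = render(declarations)
print(page)
```

Rejecting an undeclared name is just the default behaviour of `string.Template`; what a real consumer should do in that case is exactly the open question.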
jfc

At 12:07 19/05/2007, Najib Tounsi wrote:
>Martin Duerst wrote:
>>Hello Richard, Najib,
>>
>>At 07:25 07/05/19, Najib Tounsi wrote:
>>
>>>Hi Richard,
>>>
>>>My feedback is perhaps subjective. My feeling is that, in some 
>>>places, the text is not sufficiently clear for those who don't 
>>>speak English fluently.
>>>
>>>Anyway, here are some remarks (about 
>>>http://www.w3.org/International/questions/qa-no-language#undetermined)
>>>
>>>- You write
>>>"For example, xml:lang="" might be used if text is included into a 
>>>document from a database that doesn't provide language information..."
>>>Is it the text or the document that comes from a database? The text, of course.
>>>Should I understand it as:
>>>For example, xml:lang="" might be used if text is to be included 
>>>into a document and (the text) comes from a database that doesn't 
>>>provide language information ...?
>>>
>>
>>Very good point.
>>
>>
>>>-You write
>>>"The effect would be to cancel any language information declared 
>>>higher up the hierarchy of elements in the document."
>>>What does "cancel any language" mean?
>>>- remove the language information declared higher up the hierarchy? Wrong
>>>- override this declaration by the new one "und"? Right
>>>
>>>Finally the whole story (about the use of "und") is, if you can 
>>>"leave out the markup", go ahead. Mark up only if "you have a 
>>>particular need to indicate that the language is undefined". Right?
>>>
>>
>>I was also a bit surprised by this. It's easy to read this as
>>"language tagging, so who cares?". It looks like it's quite in
>>contrast to what we say on language tags otherwise.
>>
>In fact, what I wanted to point out is:
>
>Suppose you have a text like "The speaker said 'Salam Alikoum' and 
>began to talk".
>You know that this is English with something strange inside it. And you 
>don't have a particular need to indicate that the strange language 
>is undefined.
>
>Which of the following two options do you recommend? In which 
>circumstances?
>
>1. leave out the markup:
><text xml:lang="en"> The speaker said
>  <span>Salam Alikoum</span>
>  and began to talk
></text>
>
>2. cancel any language information declared higher up the hierarchy 
>using "und" (or xml:lang="",  depending on XML format):
><text xml:lang="en"> The speaker said
>  <span xml:lang="und">Salam Alikoum</span>
>  and began to talk
></text>
>or
><text xml:lang="en"> The speaker said
>  <span xml:lang="">Salam Alikoum</span>
>  and began to talk
></text>
>
>Now, if the English is not declared, is this the correct markup:
><text> The speaker said
>  <span>Salam Alikoum</span>
>  and began to talk
></text>
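
To make the difference between the four variants above concrete, here is a minimal Python sketch that reports which xml:lang value is in effect on the span in each case. It assumes a consumer that takes the nearest declaration on the element or one of its ancestors, which is how xml:lang scoping is normally read. The `effective_lang` helper and the variant labels are invented for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, target):
    """Return the nearest xml:lang value on target or its ancestors.

    Returns None when nothing is declared anywhere up the hierarchy.
    An empty string means the language was explicitly left unstated.
    """
    parents = {child: parent for parent in root.iter() for child in parent}
    node = target
    while node is not None:
        if XML_LANG in node.attrib:
            return node.attrib[XML_LANG]
        node = parents.get(node)
    return None


BODY = "The speaker said <span{attr}>Salam Alikoum</span> and began to talk"
variants = {
    "1. span left unmarked": '<text xml:lang="en">' + BODY.format(attr="") + "</text>",
    "2a. span marked und": '<text xml:lang="en">' + BODY.format(attr=' xml:lang="und"') + "</text>",
    "2b. span marked empty": '<text xml:lang="en">' + BODY.format(attr=' xml:lang=""') + "</text>",
    "3. nothing declared": "<text>" + BODY.format(attr="") + "</text>",
}

results = {}
for label, source in variants.items():
    root = ET.fromstring(source)
    results[label] = effective_lang(root, root.find("span"))
    print(label, "->", repr(results[label]))
```

Under this reading, the unmarked span in case 1 is treated as English, which is the practical cost of leaving out the markup. Cases 2a and 2b both stop the inheritance of "en", and in the last case no language information reaches the span at all.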
>
>
>Regards, Najib
>
>
>
>
Received on Sunday, 20 May 2007 03:05:23 GMT
