
Re: [Ltru] Re: For review: Tagging text with no language

From: CE Whitehead <cewcathar@hotmail.com>
Date: Sat, 14 Apr 2007 13:09:01 -0400
Message-ID: <BAY114-F2876EFB84ACAF98AA9F69EB35C0@phx.gbl>
To: mark.davis@icu-project.org, nobody@xyzzy.claranet.de
Cc: ltru@lists.ietf.org, www-international@w3.org


Hi, my comment is below. I understand that 'mul' is for content where you
know there is more than one language.
Otherwise, the code for content that you are sure is in some language, but
where you do not know how many languages are involved, is 'mis'.

Is that right?

--C. E. Whitehead
cewcathar@hotmail.com

Mark Davis <mark.davis@icu-project.org> wrote:
On 4/13/07, Frank Ellermann <nobody@xyzzy.claranet.de> wrote:

    Mark Davis wrote:

   > > > Text:    chat
   > > > Tag:     mul, if the protocol only permits a single tag;
   > > >          <en, fr> otherwise
   > > > Comment: mul = "Multiple languages"; maybe also others, since
   > > >          "chat" has entered the vocabulary of many languages

   > > Depends on the context.  If the context is something in the
   > > direction of your comment it's fine.  But if the context is
   > > "I don't know if that's about a chat or a cat" I'd use "und".


>'und' is possible, but I think that mul conveys more information.

'mul' only works, I think, if you are sure there is more than one language;
otherwise use 'mis' or 'und'. It seems to me, though, that 'und' is being
used rather freely.

That's what I gather.
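
To make that reading concrete, here is a minimal sketch in Python of the
choice as I understand it from this thread. The function name, the
single_tag_only flag, and the use of None for a language that is recognized
but has no subtag are my own illustrative assumptions; none of this is taken
from the RFC.

```python
from typing import List, Optional, Sequence


def pick_language_tags(languages: Sequence[Optional[str]],
                       single_tag_only: bool = True) -> List[str]:
    """Pick language tag(s) for a piece of text.

    `languages` holds one entry per language recognized in the text:
    a subtag such as "en", or None for a language that was recognized
    but has no subtag (hypothetical convention for this sketch).
    """
    # Map uncoded languages to 'mis' and drop duplicates, keeping order.
    tags = list(dict.fromkeys("mis" if lang is None else lang
                              for lang in languages))
    if not tags:
        # Nothing could be determined about the language.
        return ["und"]
    if len(tags) > 1 and single_tag_only:
        # More than one language is known to be present.
        return ["mul"]
    return tags
```

Under these assumptions, pick_language_tags(["en", "fr"]) gives ["mul"], and
the same call with single_tag_only=False gives the list of both tags. Note
that the sketch follows Frank's reading for text whose language cannot be
identified ('und'); Mark's second example would return 'mis' there instead.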



>
>> > Text:    Igonda flatunicai vbinkli?
>> > Tag:     mis
>> > Comment: some language the process recognizes, but which is not
>> >          in BCP 47
>>
>>That would be wrong for almost all languages not yet in the
>>registry belonging to another collection like "ger".
>
>
>It's not wrong. I agree that where possible, one 'should' tag as precisely
>as possible, but there is no requirement to in the RFC. Moreover, it is
>unclear to me that every human language falls under either one of the
>explicit codes or into one of the collections. That may be true, I just
>don't know - Peter might.
>
>
>> > Text:    podstatné jméno
>> > Tag:     mis
>> > Comment: something the process recognizes as having linguistic
>> >          content, and might be in BCP 47, but it doesn't know
>> >          which language it is.
>>
>>If it doesn't know that it can use "und", abusing "mis" is
>>no option.
>
>
>I don't see how it is an abuse of 'mis'. If you could outline for me the
>reasoning, based on the standards, behind saying that 'mis' is not correct,
>I'd appreciate it.
>
>>I think "art" is about artificial languages for
>>humans or in fiction, not for programming languages.
>
>
>That may well be, it is just not clear to me from the specification.
>
>>Frank
>>
>
>
>
>--
>Mark

Received on Saturday, 14 April 2007 17:09:16 GMT
