Re: [Ltru] Re: For review: Tagging text with no language

From: Mark Davis <mark.davis@icu-project.org>
Date: Fri, 13 Apr 2007 12:52:00 -0700
Message-ID: <30b660a20704131252h19b75709mea47be6cb5bccfa9@mail.gmail.com>
To: "Frank Ellermann" <nobody@xyzzy.claranet.de>
Cc: ltru@lists.ietf.org, www-international@w3.org
On 4/13/07, Frank Ellermann <nobody@xyzzy.claranet.de> wrote:
>
> Mark Davis wrote:
>
> >  Text:     chat
> >  Tag:      mul, if the protocol only permits a single tag;
> >            <en, fr> otherwise
> >  Comment:  mul = "Multiple languages"; maybe also others, since
> >            "chat" has entered the vocabulary of many languages
>
> Depends on the context.  If the context is something in the
> direction of your comment it's fine.  But if the context is
> "I don't know if that's about a chat or a cat" I'd use "und".


'und' is possible, but I think that 'mul' conveys more information.

> >  Text:     Igonda flatunicai vbinkli?
> >  Tag:      mis
> >  Comment:  some language the process recognizes, but which is
> >            not in BCP 47
>
> That would be wrong for almost all languages not yet in the
> registry belonging to another collection like "ger".


It's not wrong. I agree that one 'should' tag as precisely as possible,
but the RFC does not require it. Moreover, it is unclear to me that every
human language falls under either one of the explicit codes or one of the
collections. That may be true, I just don't know - Peter might.


> >  Text:     podstatné jméno
> >  Tag:      mis
> >  Comment:  something the process recognizes as having linguistic
> >            content, and might be in BCP 47, but it doesn't know
> >            which language it is.
>
> If it doesn't know that it can use "und", abusing "mis" is
> no option.


I don't see how it is an abuse of 'mis'. If you could outline for me the
reasoning, based on the standards, behind saying that 'mis' is not correct,
I'd appreciate it.
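
To make the rows quoted above concrete, here is a minimal Python sketch of
the tagging choice as I read it. The function name and its parameters are
hypothetical, not taken from the RFC, and the default for unidentified text
follows my 'mis' reading; pass 'und' to get Frank's.

```python
def choose_tags(identified, single_tag_only, unidentified_tag="mis"):
    """Pick language tag(s) for a piece of text (hypothetical helper).

    identified: tags the process could assign, e.g. ["en", "fr"].
    single_tag_only: True if the protocol permits only one tag.
    unidentified_tag: what to use when no tag could be assigned;
        "mis" is one reading discussed in this thread, "und" the other.
    """
    if len(identified) > 1:
        # Several languages: collapse to "mul" only when forced to one tag.
        return ["mul"] if single_tag_only else list(identified)
    if len(identified) == 1:
        return list(identified)
    # Nothing identified: the choice between "mis" and "und" is the
    # point under dispute, so it is left to the caller.
    return [unidentified_tag]
```

For "chat" this yields ["mul"] under a single-tag protocol and ["en", "fr"]
otherwise; for the other two rows it yields whichever of 'mis' or 'und' the
caller has settled on.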

> I think "art" is about artificial languages for
> humans or in fiction, not for programming languages.


That may well be; it is just not clear to me from the specification.

> Frank



-- 
Mark
Received on Friday, 13 April 2007 19:52:08 GMT
