Re: [Ltru] RE: For review: Tagging text with no language

"mis" is defined in 639-2 as "Miscellaneous languages". That does not mean
that it is limited to "languages that don't belong to any other collection".
Your interpretation also breaks stability, since I could validly tag content
today with "mis", which would become invalid under your interpretation at
some point in the future.
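
To make the stability point concrete, here is a minimal sketch of the two readings of "mis". Everything in it is an assumption made for illustration: the function names, the collection names ("collection-a", "collection-b"), and the language labels ("lang-1", "lang-2") are hypothetical and do not come from ISO 639-2 or any real registry.

```python
# Hypothetical sketch: collection names and language labels are invented
# for illustration; they are not taken from ISO 639-2.

def valid_under_residual_reading(tag, language, collections):
    """'mis' is valid only if no other collection covers the language."""
    if tag != "mis":
        return True  # other tags are out of scope for this sketch
    return not any(language in members for members in collections.values())


def valid_under_broad_reading(tag, language, collections):
    """'mis' stays valid for any language content, whatever collections exist."""
    return True


# Two snapshots of an imagined code list: a new collection is added later.
registry_today = {"collection-a": {"lang-1"}}
registry_later = {"collection-a": {"lang-1"}, "collection-b": {"lang-2"}}

# Content in "lang-2" is tagged "mis" today and re-checked after the update.
residual_today = valid_under_residual_reading("mis", "lang-2", registry_today)
residual_later = valid_under_residual_reading("mis", "lang-2", registry_later)
broad_today = valid_under_broad_reading("mis", "lang-2", registry_today)
broad_later = valid_under_broad_reading("mis", "lang-2", registry_later)

print(residual_today, residual_later, broad_today, broad_later)
```

Under the residual reading, the same tagged content changes from valid to invalid once the imagined "collection-b" appears; under the broad reading it does not.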

On http://www.loc.gov/standards/iso639-2/php/code_list.php it is not listed
with "(other)", so it is not a collection.

> I interpret "zxx" to mean "the content so tagged is not any instance of
> the kind of entities encompassed by this coding standard".

That is not borne out by the name on
http://www.loc.gov/standards/iso639-2/php/code_list.php, which says "No
linguistic content". It does not say "no linguistic content that could
otherwise be represented by a code in this standard", a very different
thing.


Mark

On 4/12/07, John Cowan <cowan@ccil.org> wrote:
>
> Mark Davis scripsit:
>
> > I think I agree with you in spirit, but not in precise details. The
> > tag "und" means "undetermined", so when I encounter it I don't know
> > whether the content contains one language, many languages, or no
> > language. The tag "zxx" would mean that there is no language content,
> > "mis" would mean that there is at least some language content, and "mul"
> > would mean that there is language content, with more than one language.
>
> I'm okay with all of this except "mis".  "mis" is a collection code,
> as I explained, and means "languages that don't belong to any other
> collection."  It is not the universal collection.
>
> --
> Mark Twain on Cecil Rhodes:                    John Cowan
> I admire him, I freely admit it,               http://www.ccil.org/~cowan
> and when his time comes I shall                cowan@ccil.org
> buy a piece of the rope for a keepsake.
>



-- 
Mark

Received on Friday, 13 April 2007 14:57:34 UTC