Re: [Ltru] RE: For review: Tagging text with no language

Mark Davis scripsit:

> Q1. I had missed the choice of "mis". I agree with that suggestion;
> we should incorporate that into 4646bis. The problem is ameliorated
> considerably once we add -3, but it doesn't disappear completely, so
> "mis" remains a good choice for dealing with that situation.

I did not mean to suggest that "mis" is suitable in cases of ignorance
about the language in use: it is not a fallback *language* code.  Rather,
it is a fallback language *collection* code, suitable for languages that
don't appear in any other ISO 639-2 collection.  By the Ethnologue's
count, there are about 130 of these.

So it would be an error to tag a language you didn't recognize as 'mis',
because it is far more likely to be one of the non-'mis' languages, for
the same reason that it would be incorrect to use 'en' or 'nds' or 'afa'.
If you want a completely vague language tag, use 'und' (setting aside
for the moment the question of whether non-linguistic content not
recognized as such can be tagged 'und').
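
To make the distinction concrete, here is a rough sketch of the
decision as I read it.  The function name and its parameters are
hypothetical, invented for illustration only; nothing like this appears
in 4646bis or in ISO 639-2, and a real tagger would consult the
registry rather than take the codes as arguments.

```python
from typing import Optional


def choose_language_tag(language_known: bool,
                        own_code: Optional[str] = None,
                        collection_code: Optional[str] = None) -> str:
    """Pick a primary language subtag (hypothetical helper).

    language_known  -- True if the tagger has identified the language
    own_code        -- the language's own code, if it has one (e.g. 'en')
    collection_code -- a collection code other than 'mis' that covers
                       the language, if any (e.g. 'afa')
    """
    if not language_known:
        # Ignorance about the language: the vague tag, never 'mis'.
        return "und"
    if own_code:
        return own_code
    if collection_code:
        return collection_code
    # Identified language with no code of its own and no other
    # collection to fall into: this is the case 'mis' is meant for.
    return "mis"
```

So choose_language_tag(False) gives 'und', and 'mis' comes out only
when the language has actually been identified and nothing more
specific applies.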

> Q2. The issue *does* remain, since we talk about "und" vs the absence
> of a language tag, which "" represents.

I still don't see that there's anything more to say than we are
saying already, which is just a special case of "Tag wisely".

-- 
I now introduce Professor Smullyan,             John Cowan
who will prove to you that either               cowan@ccil.org
he doesn't exist or you don't exist,            http://www.ccil.org/~cowan
but you won't know which.                               --Melvin Fitting

Received on Thursday, 12 April 2007 18:56:36 UTC