Re: [Ltru] RE: For review: Tagging text with no language

Mark Davis scripsit:

> I think I agree with you in spirit, but not in precise details. The
> tag "und" means "undetermined", so when I encounter it I don't know
> whether the content contains one language, many languages, or no
> language. The tag "zxx" would mean that there is no language content,
> "mis" would mean that there is at least some language content, and "mul"
> would mean that there is language content, with more than one language.

I'm okay with all of this except "mis".  "mis" is a collection code,
as I explained, and means "languages that don't belong to any other
collection."  It is not the universal collection.
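
To make the distinction concrete, here is a rough sketch in Python. The
names (SPECIAL_TAGS, COLLECTION_TAGS, describe_tag) are hypothetical and
come from no standard or library; the descriptions just paraphrase the
readings discussed above. "mis" is kept out of the first table because it
is a collection code, not a general "some language content" marker.

```python
# Hypothetical helper for illustration only; not part of any standard or library.

# Readings from the quoted message that are not in dispute.
SPECIAL_TAGS = {
    "und": "undetermined: may contain one language, many languages, or none",
    "zxx": "no language content",
    "mul": "language content in more than one language",
}

# "mis" is kept apart: it is a collection code, not the universal collection.
COLLECTION_TAGS = {
    "mis": "collection code: languages that don't belong to any other collection",
}


def describe_tag(tag: str) -> str:
    """Return a short description of what a tag says about language content."""
    key = tag.lower()  # language subtags are compared case-insensitively
    if key in SPECIAL_TAGS:
        return SPECIAL_TAGS[key]
    if key in COLLECTION_TAGS:
        return COLLECTION_TAGS[key]
    return "ordinary language tag (not one of the four codes discussed here)"


if __name__ == "__main__":
    for tag in ("und", "zxx", "mis", "mul", "en"):
        print(f"{tag}: {describe_tag(tag)}")
```

Under this sketch, code that only wants to know "is there any language
content at all?" can check for "zxx" and should not infer anything broader
from "mis".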

-- 
Mark Twain on Cecil Rhodes:                    John Cowan
I admire him, I freely admit it,               http://www.ccil.org/~cowan
and when his time comes I shall                cowan@ccil.org
buy a piece of the rope for a keepsake.

Received on Friday, 13 April 2007 02:05:05 UTC