- From: Mark Davis <mark.davis@icu-project.org>
- Date: Thu, 12 Apr 2007 10:34:13 -0700
- To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>
- Cc: "LTRU Working Group" <ltru@ietf.org>, www-international@w3.org
Received on Thursday, 12 April 2007 17:34:20 UTC
I was answering the question "A related question is how to tag text that is
definitely in a language but I don't know what the language is. (But I might
know the script).", assuming that one knows the script.

But after seeing John's suggestion, a better choice might be "mis-Latn" (if
one knows that it is some language written in Latin script, but is not sure
which or is not able to encode it) and "mis" if one knows that it is some
language (but doesn't know the script).

Mark

On 4/12/07, Jukka K. Korpela <jkorpela@cs.tut.fi> wrote:
>
> On Thu, 12 Apr 2007, Mark Davis wrote:
>
> [ about text that is in unknown language but known script ]
>
> > For that, I'd suggest und-Latn (or whatever the script is). Since only
> > languages would have scripts, that is sufficiently determinate.
>
> I'm not so sure about it; it depends on what "script" really means. If the
> data is "JuUiYTlajUJO", which is not in any language as far as I know,
> can't we still say that it is in the Latin script?
>
> --
> Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
>
>
> _______________________________________________
> Ltru mailing list
> Ltru@ietf.org
> https://www1.ietf.org/mailman/listinfo/ltru
>

--
Mark
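
For illustration only, the choice discussed above can be written as a small Python sketch. The function name tag_for_unidentified_language and its parameters are hypothetical, not part of any library or of the language-tag specification; the sketch simply encodes the suggestion in this message ("mis" or "mis-Latn"), with the earlier "und" prefix available as an option. It checks only the shape of the script subtag (four letters), not whether the code is actually registered.

```python
import re
from typing import Optional

# Script subtags are four letters, conventionally written in title case
# (for example "Latn"). Only the shape is checked here, not the registry.
_SCRIPT_RE = re.compile(r"^[A-Za-z]{4}$")


def tag_for_unidentified_language(script: Optional[str] = None,
                                  primary: str = "mis") -> str:
    """Build a tag for text whose language is not identified.

    Hypothetical helper: `primary` is "mis" (the suggestion in this
    message) or "und" (the earlier suggestion in the thread).
    """
    if primary not in ("mis", "und"):
        raise ValueError("primary must be 'mis' or 'und'")
    if script is None:
        # Script unknown as well: use the bare primary subtag.
        return primary
    if not _SCRIPT_RE.match(script):
        raise ValueError("script subtag must be four ASCII letters")
    # Normalize to the conventional title case, e.g. "latn" -> "Latn".
    return f"{primary}-{script.capitalize()}"


print(tag_for_unidentified_language("Latn"))
print(tag_for_unidentified_language())
print(tag_for_unidentified_language("latn", primary="und"))
```

Whether a string such as "JuUiYTlajUJO" should get any of these tags is the open question in the quoted reply; the sketch takes no position on it.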