Re: Handling Aboriginal Languages from Jonathan Pool on 2009-01-14 (www-international@w3.org from January to March 2009)

From: Jonathan Pool <pool@utilika.org>
Date: Wed, 14 Jan 2009 09:49:46 -0800 (PST)
To: "Brian Cassidy" <brian.cassidy@gmail.com>
Cc: www-international@w3.org
Message-ID: <58132.192.168.1.5.1231955386.squirrel@utilika.org>

> Now, traditionally I just do everything in utf-8 and send that across
> the wire. However, with this language, are there even unicode codepoints for
> it?

Absolutely. The only non-ASCII characters seem to be L WITH STROKE and A, E,
I, O, and U WITH OGONEK, and they are all in Unicode.
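
If you want to check this yourself, here is a minimal sketch in Python (my choice of language for illustration, not something from your setup). It looks the characters up by their Unicode names, lists their code points, and round-trips the language name through UTF-8. Only the lowercase forms are shown, and the variable names are made up for the example.

```python
import unicodedata

# Unicode names of the lowercase non-ASCII letters mentioned above.
NAMES = [
    "LATIN SMALL LETTER L WITH STROKE",
    "LATIN SMALL LETTER A WITH OGONEK",
    "LATIN SMALL LETTER E WITH OGONEK",
    "LATIN SMALL LETTER I WITH OGONEK",
    "LATIN SMALL LETTER O WITH OGONEK",
    "LATIN SMALL LETTER U WITH OGONEK",
]

# lookup() raises KeyError if a name is not in Unicode, so getting
# through this line already shows that every character exists.
chars = [unicodedata.lookup(name) for name in NAMES]
codepoints = [f"U+{ord(c):04X}" for c in chars]

for name, cp, c in zip(NAMES, codepoints, chars):
    # Print the code point, the name, and the UTF-8 bytes in hex.
    print(cp, name, c.encode("utf-8").hex())

# The language name, written with escapes so this file stays pure ASCII.
word = "T\u0142\u012Fch\u01EB"
encoded = word.encode("utf-8")

# UTF-8 is lossless: decoding gives back the same string.
assert encoded.decode("utf-8") == word
```

Each of these letters takes two bytes in UTF-8, so nothing special is needed on the wire beyond declaring the encoding correctly.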

> If so, how would i do the data entry?

If you're doing it frequently, the best solution is to create a keyboard layout
that lets you type the characters you want with the keypresses you want.

For occasional entry, you can either use the character-selection tool in your
operating system or copy the characters from anywhere, such as the Wikipedia
article on the language (http://en.wikipedia.org/wiki/Dogrib_language), and
paste them in.

> There are fonts available for the
> language so i could "cheat" and go that route as well.

No, that's the 20th-century solution, and it's headed for oblivion. If you
could help the proprietors of the http://www.tlicho.ca Website move from that
method to UTF-8, you'd be doing them a great service.
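
Moving existing font-encoded text to UTF-8 usually comes down to a character-for-character remapping. Here is a rough sketch in Python of how that could look. The mapping table is purely hypothetical: I don't know which ASCII slots the existing fonts reuse, so the three entries below are placeholders to be replaced with the real assignments from the font's character chart. The function name is also invented for the example.

```python
# HYPOTHETICAL mapping: the keys are placeholder ASCII slots, not the
# actual assignments of any real font. Fill in the real ones from the
# legacy font's documentation or character chart.
LEGACY_TO_UNICODE = {
    "#": "\u0142",  # assumed slot for LATIN SMALL LETTER L WITH STROKE
    "@": "\u012F",  # assumed slot for LATIN SMALL LETTER I WITH OGONEK
    "%": "\u01EB",  # assumed slot for LATIN SMALL LETTER O WITH OGONEK
}


def legacy_to_unicode(text: str) -> str:
    """Replace each legacy-font stand-in with the real Unicode letter."""
    table = str.maketrans(LEGACY_TO_UNICODE)
    return text.translate(table)


# Text as it might be stored under the placeholder mapping above.
converted = legacy_to_unicode("T#@ch%")

# Once the text is real Unicode, it can be stored and served as UTF-8.
utf8_bytes = converted.encode("utf-8")
```

A real conversion would also have to deal with any stand-in characters that occur legitimately in the text (a literal "#", say), which a plain table like this cannot distinguish.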

By the way, my organization (http://utilika.org) is looking for one or more
machine-readable dictionaries between Tłįchǫ and any other languages, to add
translation data to our panlingual lexical translation resource. If you know
anybody who has such a dictionary and is willing to let us use it, we'd be
very grateful.

Received on Wednesday, 14 January 2009 17:51:03 UTC