RE: Handling Aboriginal Languages

From: Phillips, Addison <addison@amazon.com>
Date: Wed, 14 Jan 2009 07:47:37 -0800
To: Brian Cassidy <brian.cassidy@gmail.com>, "www-international@w3.org" <www-international@w3.org>
Message-ID: <4D25F22093241741BC1D0EEBC2DBB1DA017D23B03C@EX-SEA5-D.ant.amazon.com>
Hello Brian,

I am not an expert on Canadian Aboriginal languages and am not familiar with the specific languages and dialects in question, so I can't give you a full answer. However, I do have some information and advice for you.

First, using a "font cheat" is a Very Bad Idea on a couple of levels. It does produce a web site, but only members of the community who have the font installed can read it, so making the site accessible demands extra technical sophistication from its readers. Search engines and other common web tools cannot read the text at all. As a result, font-based encodings may actually harm minority language communities.
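
To make the search problem concrete, here is a minimal sketch in Python. The font mapping is entirely hypothetical (no real font is being described): it assumes a custom font that draws "@" as l-with-stroke and "#" as o-with-ogonek. The stored text then contains only the ASCII stand-ins, so a search for the real characters finds nothing.

```python
# Hypothetical font hack: the glyph a made-up custom font draws
# for each stored character. Not taken from any real font.
HACK_FONT_GLYPHS = {"@": "\u0142", "#": "\u01eb"}


def rendered_with_hack_font(stored):
    """Return what a reader who has the custom font installed would see."""
    return "".join(HACK_FONT_GLYPHS.get(ch, ch) for ch in stored)


stored = "T@ich#"                        # what is actually saved and served
seen = rendered_with_hack_font(stored)   # what the font makes it look like
query = "T\u0142ich\u01eb"               # what a user or search engine looks for

print(query in stored)  # the stored text does not contain the real characters
print(query == seen)    # it only looks right through the font
```

Anyone without the font, and any tool that reads the underlying characters, sees only "T@ich#".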

Second, I note that the characters displayed on the web site in question are all Latin-script characters encoded in Unicode. So if that is a real example of the language's writing system or transcription, you should have no problem using UTF-8 as your site encoding. You may want to develop (for example) a custom keyboard map to make the language(s) easier to input.
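
One way to confirm this for any given text is to list the code point and Unicode name of each character. The sketch below uses Python's standard unicodedata module. The sample string is an assumption: it is the language name as it is commonly spelled with Latin letters and an ogonek, written with escapes, and is not copied from the site.

```python
import unicodedata

# Assumed sample spelling of the language name (not taken from the site):
# T, l with stroke, dotless i + combining ogonek, c, h, o with ogonek.
sample = "T\u0142\u0131\u0328ch\u01eb"


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


for code_point, name in describe(sample):
    print(code_point, name)

# Every character has a code point, so the text round-trips through UTF-8.
encoded = sample.encode("utf-8")
assert encoded.decode("utf-8") == sample
```

If every character prints with a proper name, the text is ordinary Unicode and plain UTF-8 is enough.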

Finally, Unicode does encode a variety of North American aboriginal scripts, including the Canadian syllabics used for various Inuktitut languages and dialects. You can find them in the code charts on the Unicode.org website:

   http://www.unicode.org/charts/
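
As a rough illustration of checking text against one of those charts, the Python sketch below tests whether characters fall in the Unified Canadian Aboriginal Syllabics block, assumed here to be U+1400 through U+167F. The function name and the sample word (a syllabics spelling of "Inuktitut", written with escapes) are illustrative choices, not part of the original question.

```python
import unicodedata

# Assumed block range for Unified Canadian Aboriginal Syllabics.
SYLLABICS_FIRST = 0x1400
SYLLABICS_LAST = 0x167F


def in_syllabics_block(ch):
    """Return True if ch lies in the assumed syllabics block range."""
    return SYLLABICS_FIRST <= ord(ch) <= SYLLABICS_LAST


# Illustrative sample: "Inuktitut" written in syllabics.
word = "\u1403\u14c4\u1483\u144e\u1450\u1466"

for ch in word:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"),
          in_syllabics_block(ch))
```

Text that passes this check is already encoded and can be sent as UTF-8 like any other Unicode text.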


If, for some reason, you think the script might not be encoded, you can also check the roadmap for information on future encoding assignments:

   http://unicode.org/roadmaps/


If the characters used by a particular language are not encoded, it is still a better idea to a) make an encoding proposal to Unicode and b) use the private use area in Unicode while the encoding standardization work takes place. That said, I would be surprised to find that Unicode does not already encode what you need.
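
If private use characters do end up in the content as a stopgap, it helps to be able to find them again once real code points are assigned. A minimal sketch in Python, relying on the general category "Co" that Unicode gives to private use code points; the function name and sample text are illustrative only.

```python
import unicodedata


def find_private_use(text):
    """Return (index, code point) for each private use character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]


# Illustrative sample: ordinary letters with one private use character.
sample = "ab\ue000cd"
print(find_private_use(sample))
```

A periodic scan like this gives a list of the places that need updating after standardization.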

I hope that helps you get started. Anyone else on-list know this particular language and have specific advice for Brian?

Addison

Addison Phillips
Globalization Architect -- Lab126
Chair -- W3C Internationalization WG

Internationalization is not a feature.
It is an architecture.


> -----Original Message-----
> From: www-international-request@w3.org [mailto:www-international-request@w3.org] On Behalf Of Brian Cassidy
> Sent: Wednesday, January 14, 2009 6:48 AM
> To: www-international@w3.org
> Subject: Handling Aboriginal Languages
> 
> 
> Hello All,
> 
> As a web developer in Canada, I've had to deal with both of our
> official languages: French and English. Today I've been given a new
> challenge as one of our clients wants to develop a site in some
> Aboriginal languages (Tlicho [1] for e.g.).
> 
> Now, traditionally I just do everything in UTF-8 and send that across
> the wire. However, with this language, are there even Unicode
> codepoints for it? If so, how would I do the data entry? There are
> fonts available for the language so I could "cheat" and go that route
> as well.
> 
> Does anyone have any advice on what direction I should follow?
> 
> Thanks in advance,
> 
> -Brian Cassidy (brian.cassidy@gmail.com)
> 
> [1] http://www.tlicho.ca/


Received on Wednesday, 14 January 2009 15:48:19 GMT
