- From: Erik van der Poel <erik@netscape.com>
- Date: Fri, 06 Mar 1998 09:46:27 -0800
- To: Bill Janssen <janssen@parc.xerox.com>
- CC: www-international@w3.org
Hi Bill!

Bill Janssen wrote:
> I'd like to find an algorithm to determine the charset and language
> (in the sense of those terms defined by IETF RFC 2277,
> http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2277.txt) of a C
> string, probably using the information returned by a call to setlocale:
>
>   current_locale = setlocale (LC_ALL, NULL);
>
> Is this in any way standardized? Are there good heuristics that
> can be used?

Which platform(s)? Windows? Mac? Unix?

Where does the C string come from? String literal? Keyboard? Network?
File? Resources?

setlocale is not useful on all platforms. It is somewhat useful on Unix,
maybe also on NT. Don't know about Win95. Probably not on Mac.

There are organizations that have worked on systems that guess the
charset and/or language of a piece of text. Some of those organizations
have people on this mailing list. Maybe they will reply.

Erik
Received on Friday, 6 March 1998 12:46:57 UTC