- From: Elliotte Harold <elharo@metalab.unc.edu>
- Date: Wed, 27 Oct 2004 06:49:15 -0400
- To: Chris Lilley <chris@w3.org>
- CC: Norman Walsh <Norman.Walsh@Sun.COM>, www-tag@w3.org
On a side note, I'm not sure what programming language you prefer, but there's a very real issue in Java to be aware of when writing this finding; otherwise 90%+ of Java implementations will get the algorithm exactly backwards.

The apparently non-locale-sensitive methods, such as the no-args versions of toUpperCase and toLowerCase, behave according to the current default locale rather than according to the Unicode case folding tables or any other fixed mapping. This means a lot of Java code breaks in Turkey when it uses the apparently locale-insensitive methods. In a Turkish locale, for example, lowercase i uppercases to dotted İ (U+0130) rather than to I.

To get true locale-insensitivity when comparing syntactic strings in Java, it is necessary to specify a locale. Counter-intuitive but true. The only way to get the behavior one wants is to say toUpperCase(Locale.ENGLISH) instead of just toUpperCase().

This is why I think adopting an explicit algorithm such as "match a-z with A-Z, and don't change anything else" is more likely to be implemented correctly than a mere statement that "Languages are compared case insensitively."

--
Elliotte Rusty Harold  elharo@metalab.unc.edu
XML in a Nutshell 3rd Edition Just Published!
http://www.cafeconleche.org/books/xian3/
http://www.amazon.com/exec/obidos/ISBN=0596007647/cafeaulaitA/ref=nosim
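
To illustrate the Java pitfall described in this message, here is a minimal sketch. The class name CaseFolding and the helpers equalsAsciiIgnoreCase and foldAscii are hypothetical names chosen for illustration, not part of any finding or library. It assumes a Java version that provides Locale.forLanguageTag (Java 7 or later); on older versions a Turkish Locale would be built with the Locale constructor instead.

```java
import java.util.Locale;

public class CaseFolding {
    // Hypothetical helper: fold only A-Z to a-z and leave every other character unchanged.
    public static char foldAscii(char c) {
        return (c >= 'A' && c <= 'Z') ? (char) (c + ('a' - 'A')) : c;
    }

    // Hypothetical helper: compare two strings, treating a-z and A-Z as equal
    // and requiring an exact match for everything else. No locale is consulted.
    public static boolean equalsAsciiIgnoreCase(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            if (foldAscii(a.charAt(i)) != foldAscii(b.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");

        // Under Turkish rules, 'i' uppercases to U+0130, so the result is not "TITLE".
        System.out.println("title".toUpperCase(turkish).equals("TITLE"));        // false

        // Naming a fixed locale gives the same result on every machine.
        System.out.println("title".toUpperCase(Locale.ENGLISH).equals("TITLE")); // true

        // The explicit a-z / A-Z algorithm never depends on the default locale.
        System.out.println(equalsAsciiIgnoreCase("en-US", "EN-us"));             // true
    }
}
```

The no-args toUpperCase() behaves like the first call whenever the JVM's default locale happens to be Turkish, which is why the explicit ASCII comparison is the safer thing to specify.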
Received on Wednesday, 27 October 2004 10:49:18 UTC