Re: XML Chunk Equality

From: Elliotte Harold <elharo@metalab.unc.edu>
Date: Tue, 26 Oct 2004 17:36:28 -0400
Message-ID: <417EC35C.7070807@metalab.unc.edu>
To: Chris Lilley <chris@w3.org>
CC: Norman Walsh <Norman.Walsh@Sun.COM>, www-tag@w3.org

Chris Lilley wrote:

> ERH> What's probably intended here is that languages are compared case 
> ERH> insensitively within the ASCII range using English case mappings.
> No; what is intended here is that *language tags* are compared case
> insensitively. xml:lang="en" and xml:lang="EN" denote the same language.
> Since the intent has clearly been misunderstood, the finding should be
> clarified to say 'language tags are ...'

I'm sorry. This is relevant. First of all, language tags should but do 
not have to be ISO 639 language tags. Although some early parsers were 
confused about this, xml:lang="Français" is well-formed.

Secondly, even if we stick to ASCII this is an issue. Consider 
xml:lang="it". This is the same as xml:lang="IT" when compared in an 
English locale but not when compared in a Turkish locale. In Java, 
"it".toUpperCase().equals("IT") is *false* when the default locale is 
Turkish, because the Turkish uppercase of "i" is the dotted capital "İ".
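
To make the Turkish case concrete, here is a minimal Java sketch. It passes explicit locales to String.toUpperCase(Locale), which is where the locale-sensitive case mapping lives; String.equalsIgnoreCase itself is documented as not taking the locale into account. The class name LanguageTagCompare and the helpers asciiLowerCase and sameTag are hypothetical names chosen for illustration. They are one possible way to compare tags within the ASCII range only, not something prescribed by the finding.

```java
import java.util.Locale;

public class LanguageTagCompare {

    // Hypothetical helper: fold only A-Z to a-z and leave every other
    // character untouched, so the result never depends on the default locale.
    public static String asciiLowerCase(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(c >= 'A' && c <= 'Z' ? (char) (c + 32) : c);
        }
        return sb.toString();
    }

    // Hypothetical helper: compare two language tags case-insensitively
    // within the ASCII range only.
    public static boolean sameTag(String a, String b) {
        return asciiLowerCase(a).equals(asciiLowerCase(b));
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr-TR");

        // English case mapping: "it" uppercases to "IT".
        System.out.println("it".toUpperCase(Locale.ENGLISH).equals("IT")); // true

        // Turkish case mapping: "i" uppercases to U+0130 (dotted capital I).
        System.out.println("it".toUpperCase(turkish).equals("IT"));        // false

        // ASCII-only folding gives the same answer in every locale.
        System.out.println(sameTag("it", "IT"));                           // true
    }
}
```

Folding only A-Z keeps the comparison independent of whatever default locale the program happens to run under, at the cost of treating non-ASCII values such as "Français" as case-sensitive.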

Elliotte Rusty Harold  elharo@metalab.unc.edu
XML in a Nutshell 3rd Edition Just Published!
Received on Tuesday, 26 October 2004 21:36:31 UTC
