[Bug 3245] Equality of strings

http://www.w3.org/Bugs/Public/show_bug.cgi?id=3245

------- Comment #8 from addison@yahoo-inc.com  2007-10-10 02:52 -------
Hi Dave,

The issue here is specifically the one sentence in section 3.3.1.1 of:

 http://www.w3.org/TR/xmlschema11-2/#string

There we find this little five-word sentence:

 "Equality for string is identity"

The problem is that we read this to mean that two strings are equal only if
they consist of exactly the same sequence of characters. The I18N WG has long
held that string identity, in a Unicode context, needs to consider
normalization. Otherwise, text in languages that typically use combining
sequences will produce false negatives for string equality: canonically
equivalent strings that differ only in composed versus decomposed form will
compare as unequal.
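
To make the false negative concrete, here is a minimal sketch in Python. It
is only an illustration, not anything taken from XML Schema or from an
implementation of it: the helper names identical() and nfc_equal() are
hypothetical, and NFC is assumed as the normalization form purely for the
sake of the example.

```python
import unicodedata

# Two canonically equivalent spellings of "é".
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (two code points)


def identical(a: str, b: str) -> bool:
    """Hypothetical helper: equality as identity of the code point sequence."""
    return a == b


def nfc_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare after normalizing both strings to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(identical(composed, decomposed))   # False: the sequences differ
print(nfc_equal(composed, decomposed))   # True: equal once normalized
```

A reader would consider both strings the same text, yet the identity
comparison reports them as different.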

While addressing this would require only a few additional words in XML Schema
and would introduce few, if any, additional requirements for implementations,
it *does* affect the many technologies that depend on XML Schema.

If string identity means the same character sequence, XML Schema really should
say so and point to:

 http://www.w3.org/TR/charmod-norm/#sec-IdentityMatching

Received on Wednesday, 10 October 2007 02:52:34 UTC