- From: Jonathan Robie <Jonathan.Robie@SoftwareAG-USA.com>
- Date: Fri, 02 Jun 2000 11:54:39 -0500
- To: www-xml-schema-comments@w3.org
- Cc: Paul Cotton <pcotton@microsoft.com>
The Schema WG has asked me to respond to Paul Cotton's discussion of collations, found in http://lists.w3.org/Archives/Public/www-xml-schema-comments/2000JanMar/0201.html. This response represents my opinion, and will be taken as further feedback on the topic by the Working Group.

>4. Section 3.2.1 string
>This section states "The ordered property of string is the Unicode
>character number sequence." I wonder why the definition of the string
>datatype does not permit a user to define the "collation" to be used?
>"Unicode character number sequence" is only one "collation" and is not very
>useful. In addition the specification does not explain why this
>"collation" is needed.

A collation sequence defines how strings are compared to establish their order. Since we allow minOccurs and maxOccurs to be defined on strings, and minimum and maximum cannot be defined until we have some way to determine whether the value of one string is less than the value of another string, I believe that collation sequences are needed for our own purposes if we are to compare strings in foreign languages appropriately.

>XML Query will need to support different collations for the string data
>type. It would be preferable if the collation was defined as part of the
><data type> not as part of the query <predicate>s. I would recommend you
>consider a solution such as one adopted by SQL to permit the type definer
>to simply name the collation to be used. No exact definition of the action
>collation needs to be provide since there are several other sources for
>this information.

An advantage of this is that strings can be sorted or compared appropriately without forcing the person who composes the query to state the collation sequence explicitly, which simplifies writing queries significantly. I think there would probably still be cases in which a query must specify the collation explicitly, e.g. if strings with two different collations are compared.
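
To make the difference concrete, here is a minimal sketch in Python of a string type that names its collation, so that ordering and minimum/maximum checks need no collation in the query itself. Everything in it is hypothetical and for illustration only: the `COLLATIONS` registry, the `StringType` class, and the "folded" collation (which ignores case and accents) are assumptions, not anything defined by the Schema draft or by SQL.

```python
import unicodedata

def codepoint_key(s):
    # "Unicode character number sequence": order by code point values.
    return [ord(ch) for ch in s]

def folded_key(s):
    # Toy collation (an assumption): ignore case and combining accents,
    # then fall back to code point order to break ties.
    decomposed = unicodedata.normalize("NFD", s)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), codepoint_key(s))

# Hypothetical registry: a type definition refers to a collation by name.
COLLATIONS = {"codepoint": codepoint_key, "folded": folded_key}

class StringType:
    """Hypothetical string datatype that carries a named collation."""

    def __init__(self, collation, minimum=None, maximum=None):
        self.key = COLLATIONS[collation]
        self.minimum = minimum
        self.maximum = maximum

    def accepts(self, value):
        # Bounds are checked with the type's own collation.
        k = self.key(value)
        if self.minimum is not None and k < self.key(self.minimum):
            return False
        if self.maximum is not None and k > self.key(self.maximum):
            return False
        return True

words = ["zebra", "Apple", "\u00e9clair", "banana"]
by_codepoint = sorted(words, key=COLLATIONS["codepoint"])
by_folded = sorted(words, key=COLLATIONS["folded"])
```

In code point order the accented word sorts after "zebra", while the folded collation places it between "banana" and "zebra". Likewise, a type bounded by "a" and "m" accepts that word under the folded collation but rejects it under code point order, which is the kind of divergence that makes the choice of collation matter for minimum and maximum.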

Are there important issues that I'm neglecting here?

Jonathan
Received on Friday, 2 June 2000 11:52:40 UTC