W3C home > Mailing lists > Public > public-qt-comments@w3.org > June 2004

RE: PLease define 'collation'

From: Michael Kay <mhk@mhk.me.uk>
Date: Thu, 10 Jun 2004 09:18:31 +0100
To: "'Michael Brundage'" <xquery@comcast.net>, "'Igor Hersht'" <igorh@ca.ibm.com>
Cc: "'XQuery Public Comments'" <public-qt-comments@w3.org>, <ashokmalhotra@alum.mit.edu>, <Stephen.Buxton@oracle.com>
Message-Id: <20040610081908.32886A175F@frink.w3.org>

> > I was hoping that by saying it is a mapping to a sequence of
> > integers then we can also imply some properties of functions like
> > contains(), for example that contains(A,B) is true if A=B 
> is true, and that
> > startswith(A, B) implies A <= B.
> The mathematician in me is required to reply with a proof 
> that contains()
> can never satisfy such a property.  The problem is that equality is
> reflexive (symmetric), while string containment is not.

I should have been more formal. By "if" I didn't mean "if and only if". The
property I was after was

compare(A, B, C) = 0 implies contains(A, B, C)

It does seem to me that a collation without this property is likely to be

Michael Kay

> Assume contains(A, B) is true if and only if collation(A) = 
> collation(B) is
> true.  Then consider any two strings A and B such that 
> contains(A, B) is
> true but contains(B, A) is not (for example, "a" and "aa").  By the
> hypothesis, contains(A, B) implies collation(A)=collation(B), 
> but then by
> the collation(B) = collation(A) so by hypothesis contains(B, 
> A) is true, a
> contradiction.
> Therefore there cannot exist a collation for which 
> contains(A, B) is true if
> and only if collation(A) = collation(B).  [Note that this proof holds
> regardless of what value space the collation maps into.]
> Cheers,
> Michael Brundage
> xquery@comcast.net
> Author, XQuery: The XML Query Language (Addison-Wesley, 2004)
> Co-author, Professional XML Databases (Wrox Press, 2000)
Received on Thursday, 10 June 2004 04:19:08 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 7 January 2015 15:45:20 UTC