- From: Jim Melton <jim.melton@acm.org>
- Date: Thu, 10 Jun 2004 11:15:12 -0600
- To: Igor Hersht <igorh@ca.ibm.com>
- Cc: Jim Melton <jim.melton@acm.org>, ashokmalhotra@alum.mit.edu, "Michael Kay" <mhk@mhk.me.uk>, public-qt-comments@w3.org, Stephen.Buxton@oracle.com
Igor,
I'm not completely convinced that the collation itself defines how matching
is done. I think that the rules for matching are heavily influenced (even
determined) by the semantics of the collation, but I think that the act of
matching involves string comparisons, which are covered by my proposed
definition of "collation".
In fact, I recently analyzed that very situation and believe that my
analysis supports my statement above. The secret seems to be in carefully
reading the definitions for comparisons and matching in Unicode Technical
Report #10, especially the definition for 'minimal matching', coupled with
an understanding of 'ignorable collation elements'.
Once those are understood, I agree with you that the definitions of
fn:contains, fn:starts-with, fn:ends-with, fn:substring-before and
fn:substring-after are pretty obvious.
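As a rough illustration of the point above (a toy sketch, not the algorithm
in Unicode Technical Report #10 and not any proposed spec text): assume a
hypothetical collation in which "-" is the only ignorable character. The
names IGNORABLE, sort_key, collation_equal, minimal_matches and contains are
all invented for this example. Matching is then built from nothing more than
key comparisons, and the "minimal" rule keeps ignorables out of the match
boundaries.

```python
# Toy collation: "-" is assumed to be an ignorable collation element.
# This is an illustrative assumption, not the UTR #10 default table.
IGNORABLE = {"-"}


def sort_key(s):
    """Collation key under the toy collation: drop ignorable characters."""
    return [c for c in s if c not in IGNORABLE]


def collation_equal(a, b):
    """Two strings compare equal when their collation keys are equal."""
    return sort_key(a) == sort_key(b)


def minimal_matches(text, pattern):
    """Return (start, end) spans of text whose key equals the pattern's key.

    A span never starts or ends on an ignorable character, which is the
    sense of "minimal" used in this sketch.
    """
    target = sort_key(pattern)
    spans = []
    if not target:
        return spans
    for start in range(len(text)):
        if text[start] in IGNORABLE:
            continue  # a minimal match does not begin with an ignorable
        key = []
        for end in range(start, len(text)):
            c = text[end]
            if c in IGNORABLE:
                continue  # ignorables inside the span do not affect the key
            key.append(c)
            if key == target:
                spans.append((start, end + 1))  # ends on a non-ignorable
                break
            if len(key) >= len(target):
                break  # key already differs from the target
    return spans


def contains(text, pattern):
    """Sketch of a collation-aware contains built on minimal matching."""
    return bool(minimal_matches(text, pattern))
```

Under this toy collation "a" and "a-" compare equal, yet matching "a"
against "a-" yields only the span covering "a", so the comparison rules plus
the minimal-match rule remove the ambiguity.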
Hope this helps,
Jim
At 10:07 AM 6/10/2004 Thursday, Igor Hersht wrote:
> >The most pithy definition of "collation" that I can devise would read
> >something like this:
> >collation: A specification of the manner in which character strings
> >are compared and, by extension, ordered.
>
>This is very similar to what I was trying to say. I think that there is
>one point missing here: string matching. The string "a" could be equal to
>both "a" and "a-".
>Having comparison alone would not give us unambiguous matching.
>
>I would say
>collation: A specification of the manner in which character strings are
>compared (and, by extension, ordered) and matched.
>
>(Maybe it could be expressed in better English.)
>
>Defining the collation functions is a separate issue.
>I think that the definitions of the functions
> fn:contains, fn:starts-with, fn:ends-with,
>fn:substring-before and fn:substring-after are quite obvious
>in terms of string matching.
========================================================================
Jim Melton --- Editor of ISO/IEC 9075-* (SQL) Phone: +1.801.942.0144
Oracle Corporation Oracle Email: jim dot melton at oracle dot com
1930 Viscounti Drive Standards email: jim dot melton at acm dot org
Sandy, UT 84093-1063 Personal email: jim at melton dot name
USA Fax : +1.801.942.3345
========================================================================
= Facts are facts. However, any opinions expressed are the opinions =
= only of myself and may or may not reflect the opinions of anybody =
= else with whom I may or may not have discussed the issues at hand. =
========================================================================
Received on Thursday, 10 June 2004 13:19:47 UTC