- From: Arnt Gulbrandsen <arnt@gulbrandsen.priv.no>
- Date: Thu, 22 Sep 2005 17:34:32 +0200
- To: Mark Davis <mark.davis@icu-project.org>
- Cc: Philip Guenther <guenther+collation@sendmail.com>, Martin Duerst <duerst@it.aoyama.ac.jp>, public-ietf-collation@w3.org
Mark Davis writes:
> The goal and work so far is good. I'll need to read the document over
> more carefully, but one quick point. The specification should make
> very sure that some formal properties are observed.

Yes, but which? I didn't add any on my watch, because I felt uncomfortable establishing new requirements on running code without understanding all of that code.

To illustrate, your suggested list contains one item which I know is problematic:

> Matching MUST be defined such that if there is a match, the substring
> meets the equality criteria. Note: there are some real gotchas in
> matching, see http://www.unicode.org/reports/tr10/#Searching

If you ask an IMAP server to search for messages «FROM "<mark.davis@"», a message which contains «From: mark.davis@icu-project.org (Mark Davis)» may very well match.

I'll add them, though. Some I'll add in the main specification, some I'll add in one or more defined collators, and maybe I'll flag some as «discussion needed».

There is one thing I'm considering relaxing: when sorting, a collator need not leave malformed items in any particular order. That is, when sorting ten items, two of which are malformed, both malformed items must be at the end, but not in any particular order.

I haven't quite made up my mind on that. Perhaps a stable sort should leave malformed items in their original order, and unstable sorts can do what they want.

Arnt
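Illustration of the stable-sort option discussed above, as a minimal Python sketch and not part of any specification. The names `is_well_formed` and `sort_malformed_last` are hypothetical, "malformed" is assumed here to mean "not valid UTF-8", and case-folding stands in for a real collator.

```python
from typing import Iterable, List


def is_well_formed(item: bytes) -> bool:
    """Hypothetical check: treat an item as well-formed if it is valid UTF-8."""
    try:
        item.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def sort_malformed_last(items: Iterable[bytes]) -> List[bytes]:
    """Sort well-formed items; append malformed items in their original order."""
    items = list(items)
    valid = [i for i in items if is_well_formed(i)]
    malformed = [i for i in items if not is_well_formed(i)]
    # Assumption: case-folded comparison stands in for the collator.
    # sorted() is stable, so equal keys keep their input order.
    valid_sorted = sorted(valid, key=lambda b: b.decode("utf-8").casefold())
    # Malformed items go at the end, untouched and in input order.
    return valid_sorted + malformed


if __name__ == "__main__":
    sample = [b"pear", b"\xff\xfe", b"Apple", b"\xc3", b"banana"]
    print(sort_malformed_last(sample))
```

An unstable variant would only have to guarantee that the malformed items come last, not that they keep their relative order.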
Received on Thursday, 22 September 2005 15:39:36 UTC