As a default, I'd suggest the Unicode algorithm documented in http://unicode.org/reports/tr10/#Searching. Commercial systems (such as what we do at Google) will go beyond this to include more sophisticated processing such as synonyms, language-sensitive deaccenting, stemming, etc., but that is beyond what can be required in a general specification.

Mark

— Il meglio è l’inimico del bene —

On Wed, Jun 2, 2010 at 08:14, Robin Berjon <robin@robineko.com> wrote:

> Hi Addison,
>
> On Jun 2, 2010, at 16:52 , Phillips, Addison wrote:
> > Hi, I've added this to our agenda to discuss.
>
> Excellent, thanks a lot!
>
> > Full text search is a somewhat complex topic and varies by language.
>
> I like your use of the word "somewhat".
>
> > Looking at the Contacts API draft quickly I notice many interesting internationalization issues that may not be fully addressed (handling of personal names; handling of postal addresses; enumerated types which need to consider the needs of other cultures; etc.)
>
> Yes, we're aware of these issues. The schema that describes people is caught between two contradictory tensions: do the right thing, and do something that can be layered atop existing implementations (some of which are dreadfully daft). Personally I'd rather we did things right even if it limits the target platforms somehow, but that's not necessarily a consensual view.
>
> That being said, we've already received feedback that's likely to cause us to rethink the current schema (yet again...). As a result I think that in the interest of not taking up your precious resources it might be best to wait. We will definitely ask for your review once we've stabilised this.
>
> Thanks a lot!
>
> --
> Robin Berjon
> robineko — hired gun, higher standards
> http://robineko.com/

Received on Wednesday, 2 June 2010 16:36:35 GMT