- From: Andrea Giuliano <a.giuliano@iccu.sbn.it>
- Date: Mon, 09 Sep 2002 11:14:46 +0200
- To: ZIG list <www-zig@w3.org>
Ray Denenberg wrote:

>wald@library.ho.lucent.com wrote:
>
>> Realized better what is bothering me about this proposal.
>> The proposal says:
>>   1. A single asterisk (*) is used to mask zero or more characters.
>>   4. A single vertical bar (|) is used to mask zero or more words.
>>
>> Question is how is a*b different from a|b
>
>Suppose you search on "search" and you want to retrieve "amalgamated search"
>but not "amalgamated research". Then "*search" won't help but "|search"
>will.
>
>--Ray

As far as I know, there is no standard set of regular expressions in which some symbol is used the way "|" is used in the proposal. If one exists, please pardon me for this message and let me know about it. Otherwise, read the rest of the message.

In many UNIX commands, as well as in many high-level programming languages, you can use "\<" to match left word boundaries (and "\>" to match right word boundaries). The example above should work with "\<search": this RE would match "amalgamated search" but not "amalgamated research". The same result should be given by "\Wsearch" ("\W" matches any character which can't be part of a word).

There could well be situations in which "matching zero or more words" is not equivalent to "matching word boundaries", but if they are very few or don't exist at all, it would be nice to change the proposal so that it speaks of word boundaries instead of whole words, because standard solutions for handling word boundaries are already available (I'm almost sure that Perl and Java support this kind of RE, for example).

Another note: how many DBMSs support word matching or word-boundary matching? As far as I know, standard SQL does not, for example. The proposal should take into account whether the 105 attribute can actually be implemented with small effort.

Best regards.

--
Andrea Giuliano, Ph. D.
Virtual System Administrator
ICCU - Istituto Centrale per il Catalogo Unico
Viale Castro Pretorio 105, Rome - ITALY
Tel. +39064989509, Fax +39064059302
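For readers who want to try the word-boundary idea from the message above, here is a minimal sketch in Python. It rests on some assumptions: Python's re module does not support the "\<" notation, so the sketch uses "\b" (a boundary at either end of a word) as the closest equivalent; the names LEFT_BOUNDARY, NON_WORD_PREFIX and matches are hypothetical and chosen only for illustration. It is not part of the proposal, only a way to check the "search" example.

```python
import re

# Assumption: "\b" stands in for "\<", which Python's re does not provide.
LEFT_BOUNDARY = re.compile(r"\bsearch")

# "\W" consumes one non-word character, so something must precede "search".
NON_WORD_PREFIX = re.compile(r"\Wsearch")


def matches(pattern, text):
    """Return True if the compiled pattern occurs anywhere in text."""
    return pattern.search(text) is not None


if __name__ == "__main__":
    for text in ("amalgamated search", "amalgamated research", "search"):
        print(
            f"{text!r}: "
            f"boundary={matches(LEFT_BOUNDARY, text)}, "
            f"non_word_prefix={matches(NON_WORD_PREFIX, text)}"
        )
```

Both patterns accept "amalgamated search" and reject "amalgamated research". They differ when "search" is the very first thing in the field: the boundary form still matches, while the "\W" form does not, because there is no preceding character for "\W" to consume.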
Received on Monday, 9 September 2002 05:13:00 UTC