RE: SRU/CQL requirements

Hi Ivan;



> Is my understanding correct that CQL is some sort of a framework?
> Framework in the sense that we have to define which indexes are usable in
> our environment and how they translate onto the model?



While I would not characterize CQL as a "framework" (it has substantial teeth), it is true that you define indexes for your particular environment. These sets of indexes (and other objects) are called "context sets". A list of the context sets I know about is at http://www.loc.gov/standards/sru/cql/contextSets/listOfContextSets.html






> Another question/comment is on the '=' operator like 'body = cat' in your
> example. Is this defined to be an exact match, or a submatch, or something
> similar?



There are both "=" and "==". Here is what the spec says about them:



"
=

This is the default relation, and the server can choose any appropriate relation or means of comparing the query term with the terms from the data being searched.

==

This relation is used for exact equality matching. The term in the data is exactly equal to the term in the search. A relation modifier may be included to specify how whitespace (trailing, preceding, or embedded) is to be treated (for example, the CQL relation modifier ‘honorWhitespace’).
"
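
To make the difference concrete, here is a rough Python sketch of my own (not from the spec). The function name matches_relation is hypothetical, and the behaviour chosen for "=" (case-insensitive whole-word matching) is only one assumption about what a server might pick, since the spec leaves that choice to the server. Only "==" is pinned down as exact equality.

```python
def matches_relation(relation: str, field_value: str, term: str) -> bool:
    """Compare a field value with a query term under '=' or '=='.

    Hypothetical helper for illustration only.
    """
    if relation == "==":
        # Exact equality: the data term must equal the search term.
        return field_value == term
    if relation == "=":
        # Server-defined relation. ASSUMPTION: this imaginary server
        # treats '=' as a case-insensitive whole-word match.
        return term.lower() in field_value.lower().split()
    raise ValueError(f"unsupported relation: {relation!r}")


print(matches_relation("=", "The cat sat", "cat"))
print(matches_relation("==", "The cat sat", "cat"))
```

Under these assumptions 'body = cat' would match a body of "The cat sat", while 'body == cat' would match only a body that is exactly "cat".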







> .... I would expect that some sort of a (possibly simplified) regular
> expression match may be more useful for our purposes. I do not know
> whether CQL has this facility.



Yes, CQL has several built-in masking characters for pattern matching, including:



"

*     A single asterisk (*) is used to mask zero or more characters.



?        A single question mark (?) is used to mask a single character, thus N consecutive question-marks means mask N characters.



^        Carat/hat (^) is used as an anchor character for terms that are word lists, that is, where the relation is 'all' or 'any', or 'adj'. It may not be used to anchor a string, that is, when the relation is '==' (string matches are, by default, anchored). It may occur at the beginning or end of a word (with no intervening space) to mean right or left anchored. "^" has no special meaning when it occurs within a word (not at the beginning or end) or string but must be escaped nevertheless.



\        Backslash (\) is used to escape '*', '?', quote (") and '^' , as well as itself. Backslash not followed immediately by one of these characters is an error.



"
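
If it helps to compare this with regular expressions, the masking rules above could be mapped onto one roughly as follows. This is a minimal Python sketch of my own, not anything defined by the spec: the function name cql_term_to_regex is hypothetical, it handles a single string term only (the '==' case, where matches are anchored), and it simply rejects an unescaped "^" rather than implementing word-list anchoring.

```python
import re

# Characters that a backslash is allowed to escape, per the rules above.
ESCAPABLE = '*?"^\\'


def cql_term_to_regex(term: str) -> "re.Pattern":
    """Translate a CQL string term with masking characters into a regex.

    Hypothetical helper; covers '*', '?' and backslash escapes only.
    """
    parts = []
    i = 0
    while i < len(term):
        ch = term[i]
        if ch == "\\":
            # A backslash not followed by an escapable character is an error.
            if i + 1 >= len(term) or term[i + 1] not in ESCAPABLE:
                raise ValueError("backslash must be followed by one of * ? \" ^ \\")
            parts.append(re.escape(term[i + 1]))
            i += 2
            continue
        if ch == "*":
            parts.append(".*")   # zero or more characters
        elif ch == "?":
            parts.append(".")    # exactly one character
        elif ch == "^":
            # SIMPLIFICATION: word-list anchoring is out of scope here.
            raise ValueError("unescaped '^' is not handled in this sketch")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


pattern = cql_term_to_regex("c?t*")
print(bool(pattern.fullmatch("cats")))
print(bool(pattern.fullmatch("ct")))
```

Using fullmatch reflects the default anchoring of string matches; a server would of course be free to implement masking without going through regular expressions at all.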



Ray

Received on Friday, 1 May 2015 13:58:39 UTC