SRU/CQL requirements

This is in response to the action item assigned to me to "Provide summary of sru level 1".

To clarify: it is actually CQL that needs to be summarized. (SRU is a protocol and CQL is its companion query language.)

First, some terminology and brief background.

Consider the query:

                motivation = tagging  AND  annotatedAt   >  20150401

(which says "find annotations with Motivation oa:tagging created later than April 1, 2015")

Each of:

*         motivation = tagging

*         annotatedAt > 20150401

is, in CQL terminology, a search clause.

A search clause consists of an index, relation, and search term ("term" for short).
Thus:

*         'motivation' and 'annotatedAt' are indexes.

*         '=' and '>' are relations.

*         'tagging' and '20150401' are search terms.
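
To make the index/relation/term breakdown concrete, here is a minimal Python sketch that splits a full search clause into its three parts. It is an illustration only, not a CQL parser: the names SearchClause and parse_full_clause are hypothetical, and it recognizes only a handful of symbolic relations (no quoted terms, named relations, or modifiers).

```python
import re
from typing import NamedTuple


class SearchClause(NamedTuple):
    """Hypothetical container for the three parts of a search clause."""
    index: str
    relation: str
    term: str


# Longer relations come first so that '>=' is not read as '>'.
_RELATIONS = ("==", ">=", "<=", "<>", "=", ">", "<")
_CLAUSE_RE = re.compile(
    r"^\s*(?P<index>[\w.]+)\s*(?P<relation>"
    + "|".join(re.escape(r) for r in _RELATIONS)
    + r")\s*(?P<term>.+?)\s*$"
)


def parse_full_clause(text: str) -> SearchClause:
    """Split 'index relation term' into its parts (simplified sketch)."""
    match = _CLAUSE_RE.match(text)
    if match is None:
        raise ValueError(f"not a full search clause: {text!r}")
    return SearchClause(match["index"], match["relation"], match["term"])


print(parse_full_clause("motivation = tagging"))
print(parse_full_clause("annotatedAt   >  20150401"))
```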

A search clause may consist of the term only. The index and relation may be omitted, in which case the system defaults for these apply (and so the search clause is still considered to be index-relation-term even if only the term is supplied). So, for example, the following is a legitimate CQL query:

                cat

which expands to

                [default index] [default relation] cat

The system default relation is usually '='; furthermore, the system may define '=' to mean whatever it wants. So the query 'title=cat' on most systems means "find documents with 'cat' in the title" as opposed to documents whose title is exactly 'cat'. There is also the relation '==', which means "exact".

An SRU annotation database could define "body" to be the default index.

The query

                body = cat

would say "find annotations with 'cat' in the body."

And that query could be abbreviated as:

                cat
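
The defaulting rule can be sketched in a few lines of Python. This is only an illustration under stated assumptions: "body" as the default index comes from the example above, '=' as the default relation is the usual case rather than a requirement, and the function name expand_clause is hypothetical.

```python
# Assumptions for illustration: a server that chose "body" as its default
# index and "=" as its default relation. Real servers pick their own defaults.
DEFAULT_INDEX = "body"
DEFAULT_RELATION = "="

# A few symbolic relations; their presence marks a clause as already "full".
RELATIONS = ("==", ">=", "<=", "<>", "=", ">", "<")


def expand_clause(text: str,
                  default_index: str = DEFAULT_INDEX,
                  default_relation: str = DEFAULT_RELATION) -> str:
    """Expand a term-only clause to index-relation-term form.

    Simplification: a quoted term that itself contains a relation symbol
    would be mistaken for a full clause; a real parser must handle quoting.
    """
    text = text.strip()
    if any(rel in text for rel in RELATIONS):
        return text  # index and relation were supplied explicitly
    return f"{default_index} {default_relation} {text}"


print(expand_clause("cat"))
print(expand_clause("body = cat"))
```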


So, let's get down to requirements:

Formally speaking, conformance to CQL level 1 requires support for at least one of the following:

(a)    queries where search terms are combined with booleans, e.g. "cat AND dog"

(b)    full search clauses, i.e. where index and relation are explicitly supplied rather than defaulted, e.g. "motivation = tagging"

This means that if you support Boolean operators then you don't have to support full search clauses; and conversely, you don't have to support Booleans but if you don't then you have to support full search clauses.

Note that I said "formally" above, because I think this is overspecified (blame me for that). I suggest for our purposes that we require support for:

*         Boolean operators (AND, OR, NOT)

*         Full search clauses

*         Term-only search clauses
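
As a rough illustration of what those three features look like in a query string, the Python sketch below splits a query on Boolean operators and labels each clause as full or term-only. The function name describe_query is hypothetical, and the sketch deliberately ignores parentheses, quoted terms, and named relations, so it is not a substitute for a real CQL parser.

```python
import re

# Boolean operators must be surrounded by whitespace; case is ignored here.
BOOLEAN_RE = re.compile(r"\s+(AND|OR|NOT)\s+", re.IGNORECASE)
# A few symbolic relations; finding one marks a clause as "full".
RELATION_RE = re.compile(r"==|>=|<=|<>|=|>|<")


def describe_query(query: str) -> dict:
    """Split a flat query into clauses and Boolean operators (sketch only)."""
    # With a capture group, re.split keeps the operators in the result:
    # [clause, operator, clause, operator, clause, ...]
    parts = BOOLEAN_RE.split(query.strip())
    clauses = [c.strip() for c in parts[0::2]]
    booleans = [b.upper() for b in parts[1::2]]
    kinds = ["full" if RELATION_RE.search(c) else "term-only" for c in clauses]
    return {"clauses": clauses, "booleans": booleans, "kinds": kinds}


print(describe_query("motivation = tagging  AND  annotatedAt   >  20150401"))
print(describe_query("cat AND dog"))
print(describe_query("cat"))
```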

Let's take this a step at a time. If, after looking at this, we want to pursue it further, I will provide more detailed background, including additional background on the SRU protocol, and begin to draft a proposed profile of SRU/CQL for annotations.

Ray

Received on Thursday, 30 April 2015 13:49:50 UTC