Re: Paging, filtering, and sorting from Robert Sanderson on 2015-04-15 (public-annotation@w3.org from April 2015)

From: Robert Sanderson <azaroth42@gmail.com>
Date: Wed, 15 Apr 2015 13:28:00 -0700
To: Frederick Hirsch <w3c@fjhirsch.com>
Cc: "Denenberg, Ray" <rden@loc.gov>, Web Annotation <public-annotation@w3.org>
Message-ID: <CABevsUHrACSjmmwjDwXrTc8LcmSN+aC61QNnUFU0whhkoWzqUg@mail.gmail.com>
(As I also worked on SRU and CQL back in the day...)

On Wed, Apr 15, 2015 at 1:18 PM, Frederick Hirsch <w3c@fjhirsch.com> wrote:

> Search/Filter the annotations stored on my web site (example.com) for the
> target domain boston.com (or *.boston.com) posted on the date 1 April
> 2015 sorted by most recent first and limited to the first 200?
>


At some SRU endpoint for example.com, the parameters would be:

query: oa.target = *boston.com and oa.annotatedAt = 2015-04-01 sortBy
oa.annotatedAt/descending
   (assuming you meant only on that date, rather than that date or after,
which would be oa.annotatedAt >= 2015-04-01)
startRecord: 1
maximumRecords: 200

The server may choose to give you fewer than 200, but it can't give you more
than that.
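
To make the request concrete, here is a rough Python sketch of how those
three parameters could be turned into a URL. The endpoint path
(http://example.com/annotations/sru) and the helper name build_sru_url are
my assumptions for illustration only; nothing has been decided about where
such an endpoint would live. Depending on the SRU version, a real server
may also want parameters such as operation and version, which are left out
here.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs

# Hypothetical endpoint; the thread does not fix an actual path.
ENDPOINT = "http://example.com/annotations/sru"

# The CQL query from above, joined onto one line.
QUERY = ("oa.target = *boston.com and oa.annotatedAt = 2015-04-01 "
         "sortBy oa.annotatedAt/descending")


def build_sru_url(endpoint, cql_query, start_record=1, maximum_records=200):
    """Return a request URL carrying the three parameters discussed above."""
    params = {
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    # quote_via=quote percent-encodes spaces as %20 (not "+") and, because
    # urlencode passes safe="" by default, also encodes "/" and "=".
    return endpoint + "?" + urlencode(params, quote_via=quote)


url = build_sru_url(ENDPOINT, QUERY)
print(url)

# Decoding the query string again recovers the original CQL unchanged.
decoded = parse_qs(urlsplit(url).query)
print(decoded["query"][0])
```

Asking for the next batch would then be a matter of calling the same helper
with start_record=201.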

Mike Taylor wrote up a very approachable introduction to the query language:
    http://zing.z3950.org/cql/intro.html


Rob


>
> My naive approach might be to simply store annotations with ids I create
> and perhaps index by target domain without other fields (e.g. think of a
> table with id, domain as text string, and text holding arbitrary JSON of
> the annotation). This means I would have a server that could return an
> annotation by id, or by domain, or iterate, but other choices might be more
> difficult in terms of parsing JSON etc.
>
> I might think I have the following URLs:
>
> http://example.com/annotations/ ; (container)
>
> http://example.com/annotations/ids/ ;  e.g. GET
> http://example.com/annotations/ids/3 to get annotation #3
>
> http://example.com/annotations/targets/ ;  e.g. GET
> http://example.com/annotations/targets/boston.com to get all annotations
> for the boston.com domain (exact match)
>
> I think you are suggesting that all logic is in the query string, so to
> get all matches containing boston.com, it might be
>
> or GET http://example.com/annotations?target=boston.com&match=contains
>
> where 'contains' is a string that would have to be well defined.
>
> I'm probably missing something related to the resources but am thinking I
> might be interested in all targets as well...
>
> regards, Frederick
>
> Frederick Hirsch
>
> www.fjhirsch.com
> @fjhirsch
>
> > On Apr 15, 2015, at 2:02 PM, Denenberg, Ray <rden@loc.gov> wrote:
> >
> > At this morning’s call  we discussed paging, filtering, and sorting of
> annotations.
> >
> > A container may have a large number of annotations, and a client may
> want to specify that it wants only 100, then another 100 on the next
> request, and so on.   That would be straight paging, as the annotations are
> going to be supplied in random order.
> >
> > But the client may be  interested only in annotations with (for example)
> a specific Motivation, or meeting some other criteria.  Then that’s going
> to require pre-filtering, and it still may require paging in addition
> because the set of annotations meeting the criteria might still be large.
>  So this brings into the conversation the concept of a result set (where
> for “straight” paging, the result set is the entire set of annotations).
> >
> > Further, the client may want the results supplied in some specified
> order, for example, most recent first.  That brings into play sorting the
> result set.
> >
> > If we are going to come up with a querying mechanism  it would make
> sense to build into it  support for result sets and sorting.  Alternatively
> we could use an existing search protocol that already supports all of this.
> >
> > So I’d like to offer for consideration developing a profile of the SRU
> protocol  http://www.loc.gov/standards/sru/. I suggest that you NOT
> bother reading the spec and instead let me try to describe how simple it
> really can be if profiled for our  purposes.   (As to the status of this
> protocol, it is an OASIS standard, and is being fast-tracked in ISO.)
> >
> > Here is a rough outline of the suggested approach:
> > _________________________________________________
> >
> > I have a resource:
> > http://example.com/rays-resources/resource1
> >
> > I create an annotation container for it:
> > http://example.com/rays-resources/resource1/annotations
> >
> > I create an SRU endpoint for it:
> > http://example.com/rays-resources/resource1/annotations/sru
> >
> > this URL …..
> >
> > http://example.com/rays-resources/resource1/annotations/sru?
> > query=”oa.motivation=reviewing sortBy oa.date/descending”
> &startRecord=1&maximumRecords=100
> >
> > (might have to percent encode “/” and space)
> >
> > …….  Says:
> > Search  http://example.com/rays-resources/resource1/annotations/
> > ·         For annotations whose Motivation is “reviewing”
> > ·         Sort the results by date, most recent first
> > ·         Return 100 annotations, beginning with the first
> >
> > Within the response, there will be a resultSetId.  Let’s say it’s
> “resultsXYZ”
> >
> >    The following URL gets the next 100 annotations:
> >
> > http://example.com/rays-resources/resource1/annotations/sru?
> > query=resultSetId=resultsXYZ&startRecord=101&maximumRecords=100
> >
> >
> >
> > Ok there’s handwaving here,  it needs elaboration, but it is nearly as
> simple as this.  Don’t be scared by the complexity of the specification, it
> can be profiled into a specification nearly as simple as I have described.
> >
> >
> > Ray
>
>
>


-- 
Rob Sanderson
Information Standards Advocate
Digital Library Systems and Services
Stanford, CA 94305
Received on Wednesday, 15 April 2015 20:28:28 UTC