Re: ISSUE-45: Introduce hydra:filter (subPropertyOf hydra:search) from Gregg Kellogg on 2014-04-23 (public-hydra@w3.org from April 2014)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Wed, 23 Apr 2014 10:34:46 -0700
To: Markus Lanthaler <markus.lanthaler@gmx.net>
Cc: public-hydra@w3.org, Thomas Hoppe <thomas.hoppe@n-fuse.de>
Message-Id: <6644A7C0-0746-4DDE-88FF-E910BCDE9F70@greggkellogg.net>
On Apr 23, 2014, at 7:32 AM, Markus Lanthaler <markus.lanthaler@gmx.net> wrote:

> Collapsing Gregg's and Thomas' threads
> 
> On Sunday, April 20, 2014 1:56 AM, Gregg Kellogg wrote:
>>> The semantics would be that a template (pseudo-code)
>>> 
>>>   hydra:filter [
>>>     hydra:template "/collection?name={schema:name}"
>>>   ]
>>> 
>>> that is expanded to
>>> 
>>>   /collection?name=Markus
>>> 
>>> would return a collection in which all members correspond to the following
>>> graph pattern
>>> 
>>>  ?member schema:name "Markus"
>>> 
>>> 
>>> PROPOSAL: Add a hydra:filter property with the semantics outlined above.
>>> 
>>> Question: If we decide to support "propertyPath" in "PropertyConstraint",
>>> should we also support property paths in IRI template mappings? This would
>>> certainly make filter much more powerful.
>>> 
>>> https://github.com/HydraCG/Specifications/issues/45
>> 
>> Yes, this is something like I've been looking for; I'm not sure if aggregations
>> are appropriate here, or would be left to search (probably the latter).
> 
> What exactly do you mean by aggregations... do you mean things like
> calculating sums of subsets of the members of a collection? If so, that's
> not addressed at all. The only "sum" you would potentially get is the number
> of members of the filtered collection.

Certainly counting should come for free with a filter, which presumably returns a new Collection from which you can get totalItems. I was thinking of a hydra:search, which might return something different than a collection of resources all the same type as the class on which the search is performed (or derived from that class). I was thinking of aggregation in the form of returning simple scalar values such as MIN/MAX/SUM/COUNT, but also possibly GROUP results, in which each result might be described with a class defined in hydra:returns. This may be beyond Hydra's specific use case, but you've said that the semantics of hydra:search are intentionally vague. The problem with this, however, is that it then becomes useless for describing a contract between the client and the server, so we fall back on traditional API documentation to understand the behavior of search.

A specific use case I had in mind was to be able to return information about a collection made up of heterogeneous members. You might then want to do the equivalent of a SPARQL aggregating query. I discussed this more fully at the end of my email on querying collections [1], which didn't spark any discussion.

>> You'll need to explain more about URI template variable binding; it seems that
>> schema:name here is somehow used to find "Markus", perhaps in the subject of the
>> collection, and that the "name" query element is interpreted by the service to be
>> schema:name for members of the collection. It might be somewhat confusing if it has two
>> senses that aren't directly related.
> 
> OK, I thought the pseudo code above is enough. Anyway, here is how it works:
> 
>  1) You have a collection with several members.
>  2) You define an IRI template and associate it via hydra:filter to the collection
>  3) Each variable in that IRI template is bound to a property (path)
>  4) Expanding the template with concrete values results in a query of the form
>          ?member ?property "value supplied by client"
>  5) Dereferencing the expanded IRI template returns a collection whose members
>     match the query criteria
> 
> Is it clearer now? Otherwise I'll post a concrete example.

So, in your example, hydra:template "/collection?name={schema:name}", the {schema:name} portion of the template is interpreted both by the service and the client? When the client makes the template concrete by replacing {schema:name} with Markus, the server knows to bind the "name" query parameter to the schema:name property path and reverse this logic.

It's not explicit in this snippet, but I presume there is a mapping that binds "name" to schema:name, in which case wouldn't the template be "/collection{?name}" (per Example 15 from the spec)? The server understands "name" to be bound to "schema:name", because of the mapping, and RFC 6570 describes how to construct a query presuming that "name" is bound to a concrete value; if it were bound to "Markus", this would create "/collection?name=Markus". Presumably, unbound variables are just eliminated from the URI Template.
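
To make the round trip concrete, here is a minimal sketch of how I read it. It assumes a template of the form "/collection{?name}" and a mapping from the variable "name" to schema:name. The function names and the MAPPINGS table are hypothetical, and only the form-style query operator of RFC 6570 is handled, not the full template syntax.

```python
from urllib.parse import parse_qsl, quote, urlsplit

# Hypothetical mapping table: template variable -> property (compact IRI).
MAPPINGS = {"name": "schema:name"}


def expand_form_query(base, variables, values):
    """Client side: expand base + '{?v1,v2,...}' in the form-style query manner.

    Variables that have no value are left out of the result entirely.
    """
    pairs = [
        f"{var}={quote(str(values[var]), safe='')}"
        for var in variables
        if values.get(var) is not None
    ]
    return base + ("?" + "&".join(pairs) if pairs else "")


def query_to_pattern(iri, mappings):
    """Server side: reverse the mapping, giving (property, value) pairs
    that stand for patterns of the form  ?member <property> "value"."""
    query = urlsplit(iri).query
    return [(mappings[key], value) for key, value in parse_qsl(query) if key in mappings]


iri = expand_form_query("/collection", ["name"], {"name": "Markus"})
pattern = query_to_pattern(iri, MAPPINGS)
```

With "name" bound to "Markus" the client ends up with "/collection?name=Markus", and the server recovers the pair (schema:name, "Markus") from the same mapping. With no binding, the query part disappears.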

>> From a client's perspective, I would think that these variables would be used to drive HTML
>> form fields, or similar. Is the purpose of specifying the predicate, then, to get the range and
>> description to build such a form entry? Explaining the motivation behind doing this would
>> be useful.
> 
> Yes, you could use it to render a form or you could present all the members
> of a collection in the form of a spreadsheet:
> 
> <memberUrl>  | property1 |  property2 | ... | propertyN
> 
> You could then filter that table by the properties that appear in the
> hydra:filter IRI template.

Ok.

> Thomas also has a good point:
> 
>> On Sunday, April 20, 2014 1:40 PM, Thomas Hoppe wrote:
>> I would appreciate the support of filtering as I have mentioned on
>> other posts but the proposed approach as far as I have understood it
>> has the major disadvantage that I would need to define a filter for
>> each property of collection members on which I want to offer
>> filtering. This can become quite lengthy.
> 
> Yeah, that's true. You would need to specify them explicitly.

We should say something about the role of sub-classing with Hydra operations and constraints. If I define a constraint on schema:Event that defines a hydra:TemplatedLink, can we infer that this is also a constraint on something like schema:SportsEvent? Looking at it the other way, a schema:SportsEvent is also a schema:Event through RDFS inference, so operations and constraints defined on schema:Event would presumably also be appropriate for such a resource.
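
As a rough illustration of the inference I have in mind, and not something Hydra currently specifies: a client could walk rdfs:subClassOf upward and collect whatever templated links are declared along the way. The class hierarchy, the link name "filter-by-startDate", and both function names below are made up for the example.

```python
# Hypothetical data: rdfs:subClassOf edges and templated links declared per class.
SUBCLASS_OF = {
    "schema:SportsEvent": ["schema:Event"],
    "schema:Event": ["schema:Thing"],
}
DECLARED_LINKS = {"schema:Event": ["filter-by-startDate"]}


def superclasses(cls, subclass_of):
    """Return cls followed by every class reachable via rdfs:subClassOf."""
    seen = []
    stack = [cls]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles in the hierarchy
        seen.append(current)
        stack.extend(subclass_of.get(current, []))
    return seen


def applicable_links(cls, subclass_of, declared):
    """Links declared on cls itself or inherited from any of its superclasses."""
    return [
        link
        for ancestor in superclasses(cls, subclass_of)
        for link in declared.get(ancestor, [])
    ]


links = applicable_links("schema:SportsEvent", SUBCLASS_OF, DECLARED_LINKS)
```

Under this reading a link declared on schema:Event shows up for schema:SportsEvent, but not for schema:Thing.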

>> I opt for a more generic approach which allows the client to pick
>> arbitrary properties and filter for them -- something like this:
>> 
>> hydra:filter: {
>>  @type: "IriTemplate",
>>  template: "?f={property}:{value}",
>>  mappings: [
>>    {
>>      @type: "IriTemplateMapping",
>>      variable: "property",
>>      property: "rdf:Property",
>>      required: true
>>    },
>>    {
>>      @type: "IriTemplateMapping",
>>      variable: "value",
>>      required: true
>>    }
>>  ]
>> }
> 
> Using rdf:Property this way is ambiguous as you wouldn't know whether the
> server just supports filtering for rdf:Property or all properties.
> 
> 
>> This would also allow for templates like this:
>> 
>>  template: "?{property}={value}"
> 
> The other problem with this approach is that {property} would have to be
> expanded to a full URL as otherwise. So you would end up with very long and
> ugly URLs.

So?

>> Which would allow to describe the diversity of current filtering
>> notations found in APIs.
> 
> I don't know of many APIs that allow completely arbitrary filtering. Most of
> them are quite restricted... which makes sense because filtering might be a
> quite costly operation especially if there are lots of properties. If you
> really want to allow completely arbitrary filtering, it might actually make
> more sense to just send a SPARQL query or something similar. I'm not sure.
> Thoughts?

I think basic filtering using property paths is a pretty important use case. We might constrain the length of these paths, as not every implementation will be done using a SPARQL back end. But, for my part, I'm fine with limiting filters to property paths defined as specific mappings within a TemplatedLink.
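
For a back end without SPARQL, a bounded property-path filter could be as simple as the following sketch. It assumes members are plain nested dictionaries (roughly, expanded JSON-LD nodes); MAX_PATH_LENGTH, values_at, and filter_members are illustrative names, and the limit of 2 is arbitrary.

```python
MAX_PATH_LENGTH = 2  # hypothetical limit a server might advertise


def values_at(node, path):
    """Follow a property path through nested dicts and return the values found."""
    nodes = [node]
    for prop in path:
        next_nodes = []
        for current in nodes:
            if not isinstance(current, dict):
                continue  # literals have no outgoing properties
            value = current.get(prop)
            if value is None:
                continue
            next_nodes.extend(value if isinstance(value, list) else [value])
        nodes = next_nodes
    return nodes


def filter_members(members, path, value, max_length=MAX_PATH_LENGTH):
    """Keep members for which  ?member <path> "value"  holds."""
    if not 1 <= len(path) <= max_length:
        raise ValueError(f"property path must have 1 to {max_length} steps")
    return [member for member in members if value in values_at(member, path)]


members = [
    {"@id": "/a", "schema:name": "Markus"},
    {"@id": "/b", "schema:name": "Gregg", "schema:knows": {"schema:name": "Markus"}},
]
direct = filter_members(members, ["schema:name"], "Markus")
via_path = filter_members(members, ["schema:knows", "schema:name"], "Markus")
```

Here the one-step path selects "/a" and the two-step path selects "/b"; anything longer than the limit is rejected before any member is examined.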

Gregg

[1] http://lists.w3.org/Archives/Public/public-hydra/2014Apr/0028.html

> --
> Markus Lanthaler
> @markuslanthaler
>
Received on Wednesday, 23 April 2014 17:35:17 UTC