Re: Proposed EXPath module: resource collections from Hans-Juergen Rennau on 2015-02-20 (public-expath@w3.org from February 2015)

From: Hans-Juergen Rennau <hrennau@yahoo.de>
Date: Fri, 20 Feb 2015 06:42:58 +0000 (UTC)
To: Michael Sokolov <msokolov@gmail.com>, Michael Kay <mike@saxonica.com>
Cc: "jonathan.robie@emc.com" <jonathan.robie@emc.com>, "ndw@nwalsh.com" <ndw@nwalsh.com>, "christian.gruen@gmail.com" <christian.gruen@gmail.com>, "public-expath@w3.org" <public-expath@w3.org>
Message-ID: <1816035135.3966686.1424414578869.JavaMail.yahoo@mail.yahoo.com>

Michael (S.), I am not sure. The problem I have is that there is a one-to-one relationship between resource descriptors and resources, and how would this map to an index? But I am sure that in general access to the resource descriptor itself should *also* be enabled, so my view is: the main purpose of the descriptors is resource retrieval; a secondary purpose is to supply interesting information about resources. And therefore I think we need separate functions returning the resources and the descriptors, respectively.
Hans-Juergen
 

     Michael Sokolov <msokolov@gmail.com> wrote at 0:09 on Friday, 20 February 2015:
   

 I'm curious if you see this resource mechanism as something that would 
be able to support representing persistent indexes in some way. If so it 
is likely to lead to other use cases than merely retrieving documents 
since indexes become valuable in their own right; enumerating all the 
possible values of an index-resource with restriction and counts of 
matching documents for each value is a pretty typical example, but there 
would be others.

-Mike

On 2/19/2015 5:58 PM, Michael Kay wrote:
>> One question is - what is the chief value of rc:resource-collection (present or future version) - delivery of resource descriptors, or delivery of resources referenced by the resource descriptor?
> The history here is that we had collection() which had two limitations: (a) it can only return nodes (in practice that usually means XML documents) as distinct from other kinds of resource such as text and binary files and JSON files and (b) it cannot return metadata about the resources which can be used to achieve selective retrieval of the resources based on their metadata.
>
> We introduced uri-collection() first in XSLT 3.0 and then in XPath 3.1 to try and solve these problems, for example you can get a set of URIs, filter it to select those ending in ".txt", and then use unparsed-text() to retrieve those resources. But this isn't general enough; there's still no metadata apart from the URI itself, and no way of discovering what kind of resource it is and how it should be parsed.
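>
> A minimal sketch of that pattern in XQuery 3.1, assuming a hypothetical collection URI (file:///data/docs is only a placeholder, and how it resolves is implementation-defined):
>
>    for $u in uri-collection("file:///data/docs")
>    where ends-with($u, ".txt")
>    return unparsed-text($u)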
>
> So the value, compared to what's in the standard, is to provide more metadata (resource descriptors if you like), but in most cases this is only useful as a means to an end, where the end is successful retrieval of the resource itself.
>
>> I vote for the second, as I regard the resource descriptors not so much as an augmentation of the resource, but as a means to find the resource (like an index is a means, although occasionally you want to see it itself), and therefore I favor a variant delivering the resources themselves, rather than maps. Doesn't the following signature make things simpler for both the user and the implementer:
>>    collection($uri, $filter) : resource-type*
> Yes, the challenge is how to provide the filter.
>> You wrote: "If the content of the resource is just the value of one of the properties, I think some implementations may have difficulty fetching the data only when it is actually needed."
>> Oh! I thought it a fairly general and safe assumption that any resource should be constructible from one string - either a reference (URI or proprietary) or the serialized resource. But you think this should not be assumed. Could you give me a counterexample?
> I think you may have misunderstood me. If the resource descriptor is a map $R, then the question is whether to make the resource content available as a map property $R?content, or via a function $R?fetch(). In some sense these are equivalent. However, I've been assuming that we would usually want to get the content only after looking at the metadata, and if both are properties of the same map, then presenting both as "data properties" would require a specialist map implementation, whereas if the content is made available by a function, a conventional implementation with no special tricks would achieve the desired effect that the content is retrieved only on demand.
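>
> A rough sketch of the function-valued variant in XQuery 3.1; local:descriptor, the "uri" key and the file URI are hypothetical names used for illustration only, not part of any proposal:
>
>    declare function local:descriptor($uri as xs:string) as map(*) {
>      map {
>        "uri": $uri,                                  (: metadata, available immediately :)
>        "fetch": function() { unparsed-text($uri) }   (: content, read only when called :)
>      }
>    };
>
>    let $R := local:descriptor("file:///data/docs/notes.txt")
>    return if (ends-with($R?uri, ".txt")) then $R?fetch() else ()
>
> Here the metadata test on $R?uri runs without touching the file; the content is read only if $R?fetch() is actually called.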
>> You wrote: "I think we should steer completely clear of models that use nodes. Nodes in XDM carry far too much baggage - names, namespaces, 13 axes, identity semantics, base URIs etc."
>> I am at a loss, as I believe that node trees are of overwhelming importance when working with XQuery, no? I want two separate functions for retrieving XML and non-XML resources (fn:doc vs. fn:unparsed-text), and I want two separate functions for retrieving filtered node collections and filtered resources in general. I am not sure if you disagree, or perhaps meant something else by saying "steering clear of models that use nodes"?
>>
> Nodes (XML documents) are important as resources that we want to retrieve, but I don't think they are a good way to model the metadata of collections. And I don't think we should have parallel sets of methods for handling XML versus other resources (except to the extent that we have to live with what we've got); we're trying here to generalize the model so it's not limited to handling a finite set of media types built in to the language.
>
> Michael Kay
> Saxonica
>
>
>

Received on Friday, 20 February 2015 06:43:28 UTC