Re: Proposed EXPath module: resource collections

I was thinking of the list of properties along the lines of "if the information is available, this is how it should be represented". I wasn't suggesting that last-modified, for example, should be available for every resource, only that we should standardize the property name for situations where it is offered. We might also standardize the property sets for some particular kinds of concrete collection, e.g. collections that map to directories in filestore.
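
As a rough illustration of the directory case, here is a sketch in Python rather than XPath. The function names resource_collection and select are invented for the example, and the choice of which properties a directory can supply is an assumption, not part of the proposal. Each resource is a map that carries a property only when the information is available: "extension" and "media-type" are left out when they cannot be determined, and "created", "owner" and "schema" are never set because a plain directory does not reliably offer them.

```python
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def resource_collection(directory):
    """Return one property map (dict) per entry in a directory.

    A key is present only when the information is available.
    """
    resources = []
    for entry in sorted(Path(directory).iterdir(), key=lambda p: p.name):
        props = {
            "resource-uri": entry.resolve().as_uri(),
            "name": entry.name,
            "is-collection": entry.is_dir(),
            "last-modified": datetime.fromtimestamp(
                entry.stat().st_mtime, tz=timezone.utc
            ),
        }
        if entry.is_dir():
            # For a nested collection, fetch() returns its sequence of maps.
            props["fetch"] = lambda e=entry: resource_collection(e)
        else:
            # Assumption: fetch() returns raw bytes; a real mapping would
            # parse according to the media type.
            props["fetch"] = lambda e=entry: e.read_bytes()
            if entry.suffix:
                props["extension"] = entry.suffix.lstrip(".")
            media_type, _ = mimetypes.guess_type(entry.name)
            if media_type is not None:
                props["media-type"] = media_type
        resources.append(props)
    return resources


def select(resources, media_type, modified_after):
    """Keep resources of the given media type modified after a dateTime.

    A resource lacking either property is simply not selected.
    """
    selected = []
    for r in resources:
        modified = r.get("last-modified")
        if r.get("media-type") == media_type and modified is not None:
            if modified > modified_after:
                selected.append(r)
    return selected
```

A call such as select(resource_collection(path), "application/json", cutoff), followed by calling each result's "fetch", corresponds roughly to the filter expression in the original proposal quoted below.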

Michael Kay
Saxonica
mike@saxonica.com
+44 (0) 118 946 5893




On 17 Feb 2015, at 17:31, Robie, Jonathan <jonathan.robie@emc.com> wrote:

> I am also interested, and agree with reducing the required properties to a minimum.
> 
> If schema is provided, how would schema versioning be handled?  I think we would need to answer that question or omit that property, and omitting it seems easier.
> 
> Jonathan
> ________________________________________
> From: Adam Retter [adam@exist-db.org]
> Sent: Tuesday, February 17, 2015 12:25 PM
> To: Michael Kay
> Cc: EXPath CG; hgrennau@yahoo.de; Robie, Jonathan; Norman Walsh
> Subject: Re: Proposed EXPath module: resource collections
> 
> Hi Mike,
> 
> Of course I'm interested in this. My feeling is that we should either
> reduce the number of properties to a bare minimum, or reduce the number
> of mandatory properties that must be supported.
> 
> 1) I guess you know by now that I don't like media-type ;-)
> 2) It seems to me that local-name and file-extension can be taken from
> the resource-uri easily enough.
> 3) In addition, I can already think of systems that do not
> support/provide for created, last-modified and owner.
> 4) schema seems an odd choice to me; are we not duplicating
> the purpose of a catalog here?
> 
> On 17 February 2015 at 00:01, Michael Kay <mike@saxonica.com> wrote:
>> 
>> Three inputs in the last week all point in the same direction:
>> 
>> Hans-Juergen Rennau gave a paper in XML Prague ("Node search preceding node
>> construction") about how to define collections of resources with properties
>> allowing them to be filtered and selected before they are actually parsed
>> for querying.
>> 
>> The XQuery WG discussed how to model heterogeneous collections of resources
>> including for example XML documents, JSON documents, and binary documents,
>> and how to extend or supplement the collection() function to process such
>> sets of resources.
>> 
>> XProc 2.0 has a (currently very basic) model in which document nodes have
>> properties that are external to the document content (document URI, last
>> modified date, etc) and should be made available to XProc applications.
>> 
>> This set me thinking that it would not be very difficult to do something
>> very useful in EXPath in this area. I'm thinking of something less elaborate
>> than Hans-Juergen's model, but general enough to achieve similar levels of
>> capability by layering things on top.
>> 
>> As a basic model, the idea is that we have an object called a "resource
>> collection" identified by a URI. A resource collection is a set of
>> resources, each modelled as a map containing key-value pairs representing
>> properties of the resources in the collection. The keys that are present in
>> the map may vary from one kind of resource to another, but we will define
>> some commonly used property names for use when information is available. For
>> example:
>> 
>> resource-uri - a context-free URI identifying the resource
>> name - local name of the resource within the collection
>> media-type - the MIME type of the resource
>> extension - a part of the name of the resource conventionally used to
>> identify its type
>> created - dateTime of the original resource creation
>> last-modified - dateTime of last modification of the resource
>> fetch - a zero-arity function that can be called to deliver an XDM item
>> representing the content of the resource in a way appropriate to its media
>> type
>> is-collection - a boolean indicating whether this resource is itself a
>> collection, in which case the fetch() function returns the sequence of maps
>> representing that collection
>> schema - the URI of a schema against which the resource is intended to be
>> valid
>> owner - identifier of a person or other entity owning the resource
>> 
>> A function resource-collection($uri) returns the sequence of maps
>> representing a collection.
>> 
>> We can then use XPath 3.1 facilities to filter this sequence of maps. For
>> example
>> 
>> rc:resource-collection('coll-uri')[?media-type = 'application/json' and
>> ?last-modified gt xs:dateTime('2012-01-01T01:01:01')]?fetch()
>> 
>> selects the JSON resources modified since a certain date, and parses them
>> using a JSON parser to deliver a sequence typically of maps or arrays
>> (depending on the JSON content). (The parsing step of course is unnecessary
>> if the collection holds the JSON resources in pre-parsed form).
>> 
>> We could consider defining a mapping from this abstract concept of a
>> resource collection to certain concrete kinds of collection, e.g. a
>> directory of unparsed files in filestore, or a WebDAV collection.
>> 
>> We could consider impure functions to add, remove, or replace resources
>> within a collection.
>> 
>> We could consider a variant of the fetch() function that takes a map giving
>> parsing options, e.g. whether to validate, what to do on error, etc.
>> 
>> Any interest?
>> 
>> 
>> Michael Kay
>> Saxonica
>> mike@saxonica.com
>> +44 (0) 118 946 5893
>> 
>> 
>> 
>> 
> 
> 
> 
> --
> Adam Retter
> 
> eXist Developer
> { United Kingdom }
> adam@exist-db.org
> irc://irc.freenode.net/existdb

Received on Tuesday, 17 February 2015 18:16:17 UTC