- From: John Lumley <john@saxonica.com>
- Date: Tue, 04 Feb 2014 10:52:03 +0000
- To: EXPath ML <public-expath@w3.org>
- CC: John Lumley <john@saxonica.com>
- Message-ID: <52F0C653.4090904@saxonica.com>
Gentlefolk,

The Archive Module (http://expath.org/spec/archive) has been somewhat on the back burner since the September draft, while the Binary and File modules were brought close to or at 1.0 status, but it's about time it got a little more attention.

Some discussion took place in November and December (with an unpublished Editor's draft), suggesting the following easy changes:

1. Adding functions arch:to-files() and arch:from-files() to give single-call transfers between archive and file directory trees. [These are effectively the examples in the Sept. draft, with some corrections and made more complete. It has been suggested that a 'target-path' argument should be added to arch:to-files().]

2. Improving the actions dealing with empty 'directory' entries.

3. Adding some further convenience functions for content, such as arch:text(), which is effectively bin:encode-string(), or arch:xml(), which is a compound bin:encode-string(fn:serialize()). [Similar for arch:html() with different serialization controls.] Both of these would require a dependency on the Binary module, unless someone wishes to write their own implementation.

The first two changes have already been made in the Editor's draft. Views on the third are welcome. Florent suggested examples:

    arch:create(
      ('mimetype', 'META-INF/container.xml', ...),
      (arch:text($mimetype), arch:xml($container), ...))

The ideal solution, from an API and usability point of view, would be something like:

    arch:create((
      arch:text-entry('mimetype', $mimetype),
      arch:xml-entry('META-INF/container.xml', $container),
      ...))

and this last example becomes "easy" to represent if arch:create() accepts maps. Whilst the first form for arch:create() works easily with the current 'non-map' archive mechanism of two parallel sequences, the second does not.
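As a rough illustration of the parallel-sequence convention (a sketch only, not taken from any draft): names and contents are matched purely by position, so a single function call cannot contribute both halves of an entry. The helper local:pair-up is hypothetical, it uses no Archive functions, and plain strings stand in for the xs:base64Binary content that arch:text() or arch:xml() would produce. The sample values are illustrative.

```xquery
xquery version "3.0";

(: Hypothetical helper, not part of the Archive module: matches each entry
   name with the content at the same position in the second sequence, as the
   two-sequence form of arch:create() is described as doing. Strings stand in
   for binary content. :)
declare function local:pair-up(
  $names    as xs:string*,
  $contents as xs:string*
) as xs:string* {
  for $name at $i in $names
  return $name || ' <= ' || $contents[$i]
};

(: Illustrative values only. :)
local:pair-up(
  ('mimetype', 'META-INF/container.xml'),
  ('application/epub+zip', '<container/>')
)
```

A name with no content at the same position is paired with nothing here, which is the kind of mismatch that keeping name and content together in one entry would rule out.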
The second form could conceivably work with a single-argument form of arch:create($in as item()*), taking the members of $in by pairs, assumed to be (xs:string, xs:base64Binary)*, and with each arch:xxx-entry() function producing such a pair. Not the most elegant, but certainly coherent. If you really wanted, the first argument could be treated polymorphically: if you want 'per entry' control on properties, an element structure could be used and implementations make type-directed choices. But I'd rather not open that can of worms.

Excluding the use of maps, other (minor) issues that are outstanding include:

1. Whether options for an archive should be read or set through attributes on an element, or child elements, viz <arch:options format="zip"> or <arch:options><arch:format>zip</ .. Currently <arch:entry> uses attributes. We should perhaps try to be consistent and establish a coherent policy. Personally I much prefer attribute mechanisms (they're textually denser!), though structured options, such as character maps (!), will require elements, and the options for fn:serialize() are now in the form of elements, e.g. <output:serialization-parameters><output:method value="html">, though they still use a @value attribute!

But the largest issue is how to address the use of maps, which is by far the most coherent mechanism, enabling content and all properties to be kept together.

1. One objection might have been that the order of entries in a map is undefined (i.e. the return from map:keys()). This can be accommodated by consistent use of a 'position' property attached to each entry: written when reading from an archive, and used for order (if present) when creating an archive.

2. How does the 'element'-based mechanism (used for describing options and properties) co-exist with a map-based one? My initial suggestion is that they don't: the map-based Archive is in a totally different namespace and uses (almost) ONLY maps.
Apart from some probable sharing of implementation code, and being joined at the specification, the two have totally separate function catalogs, examples and test suites. [For example arch:to-files() and archM:to-files() would have different internal operational definitions (actually in terms of their use of entries()), though their external function would be identical.]

3. Maps require some support for XPath 3.0: at the absolute minimum the functions map:entry(), map:new(), map:keys() and map:get() from XSLT 3.0, in the absence of the map{} syntax of XPath 3.0. How many implementations will have this?

4. I've used the prefix archM: in the spec for all the map-based stuff, but it's a little awkward in reading. Of course you're free to use (almost) any prefix you wish in code, and could re-use arch:, but in practice we tend to stick to the spec's conventional prefix to aid understanding. Any suggestions for a better differentiation between arch: and archM:?

My suggestion is that we aim in version 1.0 to support both element-structure and map-based libraries, but make it clear that additional functionality (1.1 etc.) will be focussed almost exclusively on using maps.

Reactions are more than welcome.

--
John Lumley MA PhD CEng FIEE
john@saxonica.com
on behalf of Saxonica Ltd
Received on Tuesday, 4 February 2014 10:52:23 UTC