- From: John Lumley <john@saxonica.com>
- Date: Wed, 13 Nov 2013 12:12:54 +0000
- To: Christian Grün <christian.gruen@gmail.com>
- CC: EXPath ML <public-expath@w3.org>
- Message-ID: <52836CC6.20701@saxonica.com>
On 21/10/2013 12:54, Christian Grün wrote:
> Hi John,
>
> thanks for editing the Archive Module.
>
> • As indicated before, I think that a convenience function for
> extracting ZIP archives to disk would be beneficial. I see three
> reasons for this: Extracting files is one of the most frequent
> operations done with archives. Next, a pure XPath or XQuery solution
> is, by nature, pretty cumbersome and not intuitive. Last but not
> least, we should provide a solution for extracting very large files,
> as it’s pretty tricky to rewrite the example in 3.3, and related ones,
> for streaming IO. However, I agree that is easy to find SoC arguments
> against providing such a function.

I've added a generalization of the example in 3.3 as a function that can either be provided as an XSLT package (which assumes the use of EXPath File), or provided as a built-in function with the same name, for which an entry appears in the function catalog:

    <xsl:function name="arch:extract-to-files">
      <xsl:param name="archive" as="xs:base64Binary"/>
      <xsl:variable name="entries" select="arch:entries($archive)"/>
      <xsl:variable name="dirs" select="$entries[ends-with(.,'/')]"/>
      <xsl:variable name="required.dirs"
        select="distinct-values(for $r in ($entries except $dirs)
                return replace($r,'/[^/]+$','/'))[ends-with(.,'/')]"/>
      <xsl:sequence select="for $d in distinct-values(($required.dirs,$dirs))
                            return file:create-dir(replace($d,'/$',''))"/>
      <xsl:sequence select="for $f in ($entries except $dirs)
                            return file:write-binary($f,arch:extract-binary($archive,$f))"/>
    </xsl:function>

Whether the extension can work in streaming is then implementation-dependent. Perhaps for the sake of completeness we might define the inverse, which descends the file trees to garner data?

    arch:archive-from-files($files as xs:string*, $options) as xs:base64Binary

>
> • Another open issue is the handling of directories in archives. I
> don’t have an easy answer for that, but we still need to find some way
> to create empty directories via the Archive Module.

Since entries are named with the solidus as the separator, any entry whose name ends with '/' could be taken to be a directory: any (usually empty) content, whether in the parallel content argument or in the appropriate 'content' map entry, is ignored, and an empty directory is added to the archive?

>
> • Some additional errors could be added for handling unsupported
> archive formats or algorithms, or entry descriptors (unless we want to
> generally use FORG0006 for unknown function arguments)

More on this as stuff develops. The big issue for me at present is how we distinguish between map and element-based forms.

>
> Hope this helps,
> Christian
>

--
John Lumley MA PhD CEng FIEE
john@saxonica.com
on behalf of Saxonica Ltd
Received on Wednesday, 13 November 2013 12:13:17 UTC
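
The two ideas discussed in this message (extracting every entry to disk, and treating any entry name that ends with '/' as a directory so that empty directories survive) can be tried out without an EXPath implementation. The following is only a rough sketch in Python using the standard `zipfile` module, not the Archive Module's behaviour. The names `archive_from_files`, `extract_to_files` and `_check_name` are hypothetical, chosen to mirror the functions proposed above. The rejection of absolute or `..` entry names is an added assumption, not something the thread specifies.

```python
import io
import zipfile
from pathlib import Path, PurePosixPath


def _check_name(name):
    """Reject entry names that would escape the target directory (assumed policy)."""
    parts = PurePosixPath(name).parts
    if name.startswith("/") or ".." in parts:
        raise ValueError("unsafe entry name: " + name)


def archive_from_files(root):
    """Walk `root` and return a ZIP archive as bytes.

    Directories are written as entries whose names end with '/',
    with empty content, so empty directories are preserved.
    """
    root = Path(root)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path in sorted(root.rglob("*")):
            name = path.relative_to(root).as_posix()
            if path.is_dir():
                # Trailing solidus marks a directory entry.
                zf.writestr(name + "/", b"")
            else:
                zf.writestr(name, path.read_bytes())
    return buf.getvalue()


def extract_to_files(archive, target):
    """Write every entry of `archive` below `target` and return the entry names."""
    target = Path(target)
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        names = zf.namelist()
        for name in names:
            _check_name(name)
        # Explicit directory entries first, so empty directories are created.
        for d in (n for n in names if n.endswith("/")):
            (target / d).mkdir(parents=True, exist_ok=True)
        # Then files, creating any parent directories that had no entry.
        for f in (n for n in names if not n.endswith("/")):
            out = target / f
            out.parent.mkdir(parents=True, exist_ok=True)
            out.write_bytes(zf.read(f))
    return names
```

As in the XSLT version, directories are derived both from explicit '/' entries and from the parent paths of file entries, since an archive need not contain an entry for every directory.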