- From: Vasil Rangelov <boen.robot@gmail.com>
- Date: Sun, 22 Jul 2007 17:00:38 +0300
- To: <public-xml-processing-model-comments@w3.org>
Hello.

More often than not, we all work with a large set of XML files and aggregate them into one which contains all or some of the data from all or some of those files. Well... at least I do :D

XProc provides a p:for-each step, which is great for invoking a pipeline on a set of documents, and other steps provide great bulk manipulation facilities as well. One thing that is missing, though, is how exactly those documents are found. The pipeline author could create some sort of XML sitemap to pass on to XProc for further manipulation, but whenever a file is removed, added or moved, this sitemap must also be updated.

To solve this kind of issue, I suggest a new (atomic?) step, possibly called p:list (or p:index?), declared like so:

    <p:declare-step type="p:list">
      <p:output port="result"/>
      <p:option name="base" required="yes"/>
      <p:option name="deep" value="yes"/>
    </p:declare-step>

When invoked, this step will list all files in the folder specified by the "base" option and in all of its subfolders. If "base" points to a file, then the folder containing that file is used instead. Setting the "deep" option to "no" will limit the indexing to the files and folders directly inside the "base" folder.

The result of this step would be a c:folder (?) element containing zero or more c:file elements or other c:folder elements. The c:folder and c:file elements would have a "name" attribute which must contain the name of the file/folder. The base folder (the root element of the result document) should not have such an attribute (or if it does, its value should be "/").

Implementations may find it useful to add additional information about files/folders in other attributes, for example "read-only" with a value of "yes" or "no", saying whether a file is read-only. Whether such information is generated could be controlled with additional options. I'll leave it to the WG to decide those kinds of details.
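The behaviour described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not how an XProc implementation must work: the namespace URI bound to the c: prefix is a placeholder (the message does not give one), the function names list_folder and _fill are hypothetical, and listing folders before files in case-insensitive name order is merely one plausible ordering.

```python
import os
import xml.etree.ElementTree as ET

# Placeholder namespace for the c: prefix (assumption; not given in the message).
C_NS = "http://example.org/ns/c"
ET.register_namespace("c", C_NS)


def _fill(parent, path, deep):
    """Append c:folder / c:file children of `path` to `parent`."""
    with os.scandir(path) as it:
        # Assumed ordering: folders first, then files, by case-insensitive name.
        entries = sorted(
            it,
            key=lambda e: (not e.is_dir(follow_symlinks=False), e.name.lower()),
        )
    for entry in entries:
        if entry.is_dir(follow_symlinks=False):
            child = ET.SubElement(parent, f"{{{C_NS}}}folder", {"name": entry.name})
            if deep:
                # deep="yes": descend into subfolders.
                _fill(child, entry.path, deep)
        else:
            # Everything that is not a real directory is reported as a file.
            ET.SubElement(parent, f"{{{C_NS}}}file", {"name": entry.name})


def list_folder(base, deep=True):
    """Return a c:folder element describing the contents of `base`."""
    if os.path.isfile(base):
        # If a file is provided, the folder of that file is used instead.
        base = os.path.dirname(os.path.abspath(base))
    root = ET.Element(f"{{{C_NS}}}folder")  # the root carries no "name" attribute
    _fill(root, base, deep)
    return root
```

Calling ET.tostring(list_folder("foo"), encoding="unicode") would then serialize the tree for a folder named "foo", and passing deep=False corresponds to deep="no".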
Example:

    <p:list base="foo"/>

may produce

    <c:folder>
      <c:folder name="bar"/>
      <c:folder name="foobar">
        <c:file name="bar.xml"/>
      </c:folder>
      <c:file name="foo.xml"/>
    </c:folder>

and

    <p:list base="file:///C|/" deep="no"/>

may produce

    <c:folder>
      <c:folder name="Documents and Settings"/>
      <c:folder name="Program Files"/>
      <c:folder name="Temp"/>
      <c:folder name="Windows"/>
      <c:file name="AUTOEXEC.BAT"/>
      <c:file name="boot.ini"/>
      <c:file name="CONFIG.SYS"/>
      <c:file name="IO.SYS"/>
      <c:file name="MSDOS.SYS"/>
      <c:file name="ntldr"/>
      <c:file name="pagefile.sys"/>
    </c:folder>

The following pipeline demonstrates a sample use case. It indexes all files on the server and passes the result as input to XSLT, which could then use this information to perform all sorts of further manipulation:

    <p:pipeline name="pipeline" xmlns:p="http://www.w3.org/2007/03/xproc">
      <p:input port="stylesheet" primary="yes"/>
      <p:output port="result" primary="yes"/>
      <p:xinclude/>
      <p:list base="/" name="contents"/>
      <p:xslt>
        <p:input port="document">
          <p:pipe step="contents" port="result"/>
        </p:input>
        <p:input port="stylesheet">
          <p:pipe step="pipeline" port="stylesheet"/>
        </p:input>
      </p:xslt>
    </p:pipeline>

Regards,
Vasil Rangelov
Received on Sunday, 22 July 2007 14:01:06 UTC