- From: Norman Walsh <ndw@nwalsh.com>
- Date: Wed, 01 Aug 2007 11:51:53 -0400
- To: public-xml-processing-model-wg@w3.org
- Message-ID: <87ps27gsme.fsf@nwalsh.com>
Revised slightly: Vasil Rangelov proposes[1] an atomic step to read a directory listing and return it as a document. Jeni and I chatted about it a bit and it seems like a good idea. Here's my (slightly revised) proposal:

  <p:declare-step type="p:directory-list">
    <p:output port="result"/>
    <p:option name="path" value="."/>
    <p:option name="recursive" value="no"/>
    <p:option name="filter"/>
  </p:declare-step>

The p:directory-list step reads all of the files in the specified directory and returns a c:directory element:

  <c:directory path="abs-path-specified">
    <c:directory path="abs-path-specified/dirname"/>
    <c:file path="abs-path-specified/filename"/>
    ...
  </c:directory>

If the "recursive" option is "yes", then you get the whole, recursive listing:

  <c:directory path="abs-path-specified">
    <c:directory path="abs-path-specified/dirname">
      <c:file path="abs-path-specified/dirname/othername"/>
      ...
    </c:directory>
    <c:file path="abs-path-specified/filename"/>
    ...
  </c:directory>

The significant change here is that the path names are returned as fully qualified paths. The path originally specified is made absolute before returning it.

The "filter" option specifies a command-line style pattern. So

  <p:directory-list path="." recursive="yes" filter="*.xml"/>

returns only the files that match "*.xml" in the current directory or any directory under the current directory.

There are a few different ways that we could go on the whole recursive/filter business. I suggest that filters only apply to the names of files, not directories.

The order of c:file and c:directory elements within a directory is implementation defined.

The current working directory is implementation defined.

I don't know exactly what to point to for the syntax for filters. We could use regexp, but that seems like overkill (and filenames often contain periods so it's tedious for users).
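To make the point about periods concrete, here is a small Python sketch. It is only an illustration: Python's re and fnmatch modules stand in for "regexp" and "command-line style pattern" respectively, the file names are made up, and nothing here is part of the proposal itself.

```python
import fnmatch
import re

# Hypothetical file names, chosen to show the difference.
names = ["build.xml", "buildxxml", "notes.txt", "a.b.xml"]

# Regular expression with the period escaped: matches only names ending in ".xml".
regex_ok = [n for n in names if re.fullmatch(r".*\.xml", n)]

# Regular expression with the escape forgotten: the bare "." matches any
# character, so "buildxxml" slips through as well.
regex_sloppy = [n for n in names if re.fullmatch(r".*.xml", n)]

# Shell-style pattern: the period is an ordinary literal character.
glob_ok = [n for n in names if fnmatch.fnmatchcase(n, "*.xml")]

print(regex_ok)      # ['build.xml', 'a.b.xml']
print(regex_sloppy)  # ['build.xml', 'buildxxml', 'a.b.xml']
print(glob_ok)       # ['build.xml', 'a.b.xml']
```

With the shell-style form the user writes "*.xml" and gets what they expect; with a regular expression they have to remember the backslash every time.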
I cribbed the following text from the csh manpage (and massaged it to fit this context):

  The filter is regarded as a pattern and treats the characters '*', '?', and '[' specially. If a filter is specified, only files which have names that match the filter pattern are returned. For the purpose of determining whether a filename matches or not, only the filename part (and not any of the path components of its absolute name) is considered.

  In matching filenames, the character '*' matches any string of characters, including the null string. The character '?' matches any single character. The sequence [...] matches any one of the characters enclosed. Within [...], a pair of characters separated by '-' matches any character lexically between the two in Unicode codepoint order (inclusive). All other characters match exactly the same character.

Implementations can throw a dynamic error if the requested path is not available to the user running the pipeline. The set of paths that are available is implementation-defined. In environments where security is paramount, there may be no accessible paths.

I propose that this be a required step.

Be seeing you,
norm

[1] http://lists.w3.org/Archives/Public/public-xml-processing-model-comments/2007Jul/0002.html

--
Norman Walsh <ndw@nwalsh.com> | Everything should be made as simple as
http://nwalsh.com/            | possible, but no simpler.
Received on Wednesday, 1 August 2007 15:52:07 UTC