- From: Alex Milowski <alex@milowski.com>
- Date: Thu, 20 Feb 2014 11:10:25 +0000
- To: XProc Dev <xproc-dev@w3.org>
On Tue, Feb 18, 2014 at 11:46 AM, James Fuller <jim@webcomposite.com> wrote:
>
> Documents flowing through a pipeline is a fundamental concept in xproc
> eg. data flowing through pipe whose connections to steps are defined
> by bindings. This is classic data flow language, though the decision
> in v1 was to only allow XML documents flow through.
>
> In XProc vnext we are considering allowing item()* with non xml
> documents flowing through pipes, which would address your requirement
> (I think)
>

The requirement, as stated [1], is:

"Experience has shown that real-world pipelines often involve non-XML documents. Several workarounds have been invented for special cases. The limitation that V1.0 can only pass XML between steps makes some pipelines difficult, if not impossible, to write. Providing the ability to allow non-XML documents to flow between steps opens up the possibility of writing simple pipelines to work with images, JSON, Turtle, EPUB, etc."

It isn't saying "item()*" because those things do not currently have representations in the XDM. While we could extend the XDM to handle these items, I think that would be difficult to justify and certainly a large task to get right. I'm inclined to just say "they are documents with metadata" and let implementors do the right thing.

This allows for use cases like processing large video files with XProc, where the binary is a reference to a stream handle that steps process using streaming. This allows the application of language technology to the audio tracks for automatic captioning or other video annotations.

You can't do that with an XDM value right now, as the model typically requires the whole value to be represented. There are certainly ways to specify such things without requiring complete instantiation, but that will require a significant amount of time to get right. The result would be that either we'd get it wrong or XProc V2 would take far too long. We are also not the WG in charge of the XDM.

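
As a rough illustration of the "documents with metadata" idea, a non-XML document could be modeled as a media type, some metadata, and a handle that opens a byte stream only when a step asks for it. The Python sketch below is an illustration under stated assumptions, not anything taken from the XProc specification or an existing implementation: `Document`, `open_stream`, `chunks`, and the `count_bytes` step are all hypothetical names.

```python
import io
from dataclasses import dataclass, field
from typing import BinaryIO, Callable, Dict, Iterator


@dataclass
class Document:
    """Hypothetical non-XML document: metadata plus a lazily opened byte stream."""
    content_type: str
    # Assumption: the content is a handle, called only when a step needs the bytes.
    open_stream: Callable[[], BinaryIO]
    metadata: Dict[str, str] = field(default_factory=dict)

    def chunks(self, size: int = 64 * 1024) -> Iterator[bytes]:
        """Yield the content in chunks of at most `size` bytes."""
        with self.open_stream() as stream:
            while True:
                block = stream.read(size)
                if not block:
                    break
                yield block


def count_bytes(doc: Document) -> Document:
    """Hypothetical step: streams the input and records its length as metadata."""
    total = sum(len(block) for block in doc.chunks())
    return Document(
        content_type=doc.content_type,
        open_stream=doc.open_stream,  # the handle is passed on, not the bytes
        metadata={**doc.metadata, "byte-length": str(total)},
    )


if __name__ == "__main__":
    # An in-memory buffer stands in for a large video file.
    video = Document("video/mp4", lambda: io.BytesIO(b"\x00" * 200_000))
    result = count_bytes(video)
    print(result.content_type, result.metadata)
```

In this sketch the step never holds more than one chunk in memory, and what flows to the next step is the handle plus updated metadata. How large a binary can be handled then depends only on what sits behind `open_stream`.
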
As such, I'd rather that non-XML documents become a bit more abstract to allow implementors to do whatever is necessary to meet the requirements we set forth in the specification. Whether an implementation can handle large binaries will just be a feature or quality of implementation question.

[1] http://www.w3.org/TR/xproc-v2-req/#non-xml-documents

--
--Alex Milowski
"The excellence of grammar as a guide is proportional to the paucity of the
inflexions, i.e. to the degree of analysis effected by the language
considered."

Bertrand Russell in a footnote of Principles of Mathematics
Received on Thursday, 20 February 2014 11:11:02 UTC