
Re: Draft of Binary module

From: Michael Kay <mike@saxonica.com>
Date: Wed, 13 Mar 2013 13:13:32 +0000
Message-ID: <51407B7C.8050607@saxonica.com>
To: public-expath@w3.org

On 13/03/2013 12:48, Adam Retter wrote:
> Wow, that's quite comprehensive :-)
> I still need to digest it fully, but I have a few initial questions -
>
> 1) Why the use of xs:hexBinary when most other EXPath function
> libraries (and in fact most 3rd party XQuery functions) I have seen
> use xs:base64Binary? Converting from one to the other is something
> that you *really* don't want to have to do, especially for large files!
I think converting between hexBinary and base64Binary should be pretty 
much a no-op for most processors: the internal representation of the 
value is likely to be an immutable byte array, and conversion just means 
creating a new wrapper around the byte array. But it's a user 
inconvenience. Actually for input parameters, I don't see why we 
shouldn't accept either form.
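
To make the "new wrapper around the byte array" point concrete, here is a minimal Python sketch. The class and function names (BinaryValue, HexBinary, Base64Binary, cast, byte_length) are hypothetical and are not taken from any processor or from the draft module; the sketch only assumes that both types hold the same immutable byte sequence and differ in their lexical form.

```python
import base64
import binascii


class BinaryValue:
    """Hypothetical common base: an immutable byte sequence."""

    def __init__(self, data: bytes):
        self._data = data  # bytes objects are immutable, so sharing is safe

    @property
    def data(self) -> bytes:
        return self._data


class HexBinary(BinaryValue):
    def lexical(self) -> str:
        # Canonical hexBinary lexical form uses upper-case hex digits.
        return binascii.hexlify(self._data).decode("ascii").upper()


class Base64Binary(BinaryValue):
    def lexical(self) -> str:
        return base64.b64encode(self._data).decode("ascii")


def cast(value: BinaryValue, target: type) -> BinaryValue:
    # Conversion wraps the same bytes object; nothing is copied or re-encoded.
    return target(value.data)


def byte_length(value: BinaryValue) -> int:
    # An input parameter typed as the common base accepts either form.
    return len(value.data)


hex_value = HexBinary(bytes.fromhex("48656C6C6F"))
b64_value = cast(hex_value, Base64Binary)
```

In this sketch the cost of the cast does not depend on the size of the value, because encoding happens only when a lexical form is requested.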
>
> I'm just reading through the rest now, my main concern is that these
> operations can be done efficiently. I have been re-working the
> implementation of the common Java code for the EXPath http module to
> support streaming of large binary values and large string values. We
> have customers that want to work with binary and text documents that
> are several gigabytes each from XQuery.
>
>
Interesting question. I don't know how efficient direct access to binary 
files is; if it's OK, then one could easily have an internal 
implementation of a base64Binary value that's mapped directly to a file 
rather than to memory, and perform all the operations directly on the 
file. But if efficiency means maintaining a current position in the file 
and reading what's at the current position, then that complicates the 
interface considerably. It could be done using higher-order functions, 
but would be a bit mind-blowing. Although we've got functions with 
side-effects in the File module, they are external side-effects, and I'd 
be reluctant to design anything with internal side-effects, e.g. on the 
current position of a file handle.
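
As a rough illustration of the file-mapped idea, the Python sketch below backs a binary value with a memory-mapped file and reads parts of it by explicit offset, so no current position is kept anywhere. FileBackedBinary, part and read_from are hypothetical names, not functions from the draft module, and the sketch assumes random access to the file is acceptably cheap, which is exactly the open question above.

```python
import mmap
import os
import tempfile


class FileBackedBinary:
    """Hypothetical binary value mapped onto a file rather than memory."""

    def __init__(self, path: str):
        self._file = open(path, "rb")
        # Length 0 maps the whole file; the file must not be empty.
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self) -> int:
        return len(self._map)

    def part(self, offset: int, size: int) -> bytes:
        # The caller supplies the offset, so the value itself stays stateless.
        return self._map[offset:offset + size]

    def close(self) -> None:
        self._map.close()
        self._file.close()


def read_from(value: FileBackedBinary, offset: int, size: int):
    # The position is an ordinary argument and result, not a side effect.
    chunk = value.part(offset, size)
    return chunk, offset + len(chunk)


# Build a small sample file so the sketch runs on its own.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(bytes(range(256)))
tmp.close()

value = FileBackedBinary(tmp.name)
total = len(value)
first, position = read_from(value, 0, 4)
second, position = read_from(value, position, 4)
value.close()
os.remove(tmp.name)
```

Passing the offset in and returning the next offset keeps each call free of internal side effects, at the price of threading the position through the caller's code.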

Michael Kay
Saxonica
Received on Wednesday, 13 March 2013 13:13:55 GMT
