Bin Module does not work well with Streams

Hi there,

I am currently implementing the bin module in eXist-db. However, there
are a few things in the spec which do not play nicely when working with
streams.

In eXist, an xs:base64Binary or xs:hexBinary is represented internally
by a stream. We do this because binary values can be very large, for
example when working with digital video or similar. As such, it is
undesirable to have to load all of the binary data into memory to be
able to work with it.

My main issue is with the definitions of when bin:index-out-of-range
should be thrown.

If we consider just one definition of bin:index-out-of-range, the
function bin:decode-string states:

[bin:index-out-of-range] is raised if $offset is negative or $offset +
$size is larger than the size of the binary data of $in.

The problem with this is that we cannot perform the second check
($offset + $size <= bin:length($in)) up-front without reading the
entire data stream of $in. Reading the entire data stream of $in is
undesirable, as our streams also have efficient random positioning
features, which otherwise allow us to efficiently read just a region
of the stream.
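
To illustrate the cost, here is a rough sketch in Python (not eXist's
actual Java code). The function name strict_read_region is hypothetical,
and the sketch assumes a stream that cannot report its own length, so
the only way to validate $offset + $size is to consume every byte:

```python
import io


def strict_read_region(stream, offset, size, chunk_size=4096):
    """Hypothetical helper: enforce the spec's full range check.

    Returns (region, bytes_consumed) so the cost of the check is visible.
    """
    if offset < 0:
        raise IndexError("bin:index-out-of-range: negative offset")

    total = 0            # bytes consumed so far (becomes the stream length)
    region = bytearray()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Keep only the part of this chunk that overlaps [offset, offset + size).
        start = max(offset - total, 0)
        end = min(offset + size - total, len(chunk))
        if start < end:
            region += chunk[start:end]
        total += len(chunk)

    # The length is only known here, after the whole stream has been read.
    if offset + size > total:
        raise IndexError("bin:index-out-of-range: region exceeds data")
    return bytes(region), total


if __name__ == "__main__":
    data = io.BytesIO(bytes(range(10)) * 1000)
    region, consumed = strict_read_region(data, 2, 3)
    print(region, consumed)
```

Even though only three bytes are wanted, the whole stream is consumed
before the check can succeed or fail.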

May I suggest that this constraint would be better relaxed, so that
the definition for that function would be like:

[bin:index-out-of-range] is raised if $offset is negative.

If $offset + $size is greater than the size of $in, I think it is fine
to just return data of length bin:length($in) - $offset.
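
As a rough sketch of the relaxed behaviour, again in Python rather than
eXist's Java, and with a hypothetical function name: only the negative
offset is rejected, and the read is simply truncated at the end of the
data. What should happen when $offset itself lies past the end is not
covered by the wording above; returning empty data in that case is an
assumption of this sketch.

```python
import io


def relaxed_read_region(stream, offset, size):
    """Hypothetical helper: reject only a negative offset, truncate otherwise."""
    if offset < 0:
        raise IndexError("bin:index-out-of-range: negative offset")

    # Jump straight to the region; nothing before it is read.
    stream.seek(offset)

    # For an in-memory io.BytesIO, read(size) returns at most `size` bytes
    # and fewer if the data ends first. Other stream types may need a loop
    # to collect short reads.
    return stream.read(size)


if __name__ == "__main__":
    data = bytes(range(10))
    # Asking for 5 bytes at offset 8 yields the 2 bytes that exist.
    print(relaxed_read_region(io.BytesIO(data), 8, 5))
```

The stream is positioned once and read once, so the total length of $in
never needs to be known.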

How does that sound?




-- 
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk

Received on Wednesday, 7 June 2017 12:05:51 UTC