
Re: Bin Module does not work well with Streams

From: Michael Kay <mike@saxonica.com>
Date: Wed, 7 Jun 2017 15:39:53 +0100
Cc: EXPath ML <public-expath@w3.org>
Message-Id: <E4DA9916-9F02-4CB9-BEBE-18598416F068@saxonica.com>
To: Adam Retter <adam.retter@googlemail.com>
array:subarray#3 has the same problem.

I would have thought bin:part#3 is usually going to be used to read a chunk of, say, 4 or 8 bytes, in which case you want to know if it's reading off the end. I guess there's a scenario where you're reading TLV (tag-length-value) data and L is long. You still want an error if it takes you off the end. I don't think anyone's going to complain much if the error is deferred, but if they wanted to just read to the end of the stream, they would have used bin:part#2.
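
A minimal sketch of the deferred check, in Python rather than XQuery and not taken from the spec or from either implementation. The names part and IndexOutOfRange are hypothetical stand-ins for bin:part and bin:index-out-of-range, and the stream is assumed to be seekable. The region is read directly, and the error is raised only once the stream runs dry before $size bytes have arrived, so the total length is never computed.

```python
import io


class IndexOutOfRange(Exception):
    """Hypothetical stand-in for the bin:index-out-of-range error."""


def part(stream, offset, size=None):
    """Read a region of a seekable binary stream without measuring it first."""
    if offset < 0:
        raise IndexOutOfRange("offset is negative")
    if size is not None and size < 0:
        # Assumption: a plain ValueError stands in for the spec's own error.
        raise ValueError("size is negative")

    stream.seek(offset)

    if size is None:
        # Two-argument form: return whatever remains after offset.
        # (This sketch does not detect an offset beyond the end.)
        return stream.read()

    chunks = []
    remaining = size
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            # Deferred check: the end was reached before size bytes were read.
            raise IndexOutOfRange("offset + size runs past the end of the data")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


data = io.BytesIO(bytes(range(10)))
header = part(data, 2, 4)   # bytes 2..5
tail = part(data, 6)        # bytes 6..9, read to the end
```

With this shape, a short read such as part(data, 8, 4) still fails, but only after the two available bytes have been consumed.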

Michael Kay
Saxonica


> On 7 Jun 2017, at 13:05, Adam Retter <adam.retter@googlemail.com> wrote:
> 
> Hi there,
> 
> I am at present implementing the bin module in eXist-db. However there
> are a few things in the spec which do not play nice when working with
> streams.
> 
> In eXist an xs:base64Binary or xs:hexBinary is represented internally
> by a stream. We do this because binary values can be very large, for
> example when working with digital video or similar, so it is
> undesirable to have to load all the binary data into memory to be able
> to work with it.
> 
> My main issue is with the definitions of when bin:index-out-of-range
> should be thrown.
> 
> If we consider just one definition of bin:index-out-of-range, the
> function bin:decode-string states:
> 
> [bin:index-out-of-range] is raised if $offset is negative or $offset +
> $size is larger than the size of the binary data of $in.
> 
> The problem with this is that we cannot perform the second check
> ($offset + $size <= bin:length($in)) up-front without reading the
> entire data stream of $in. Reading the entire data stream of $in is
> undesirable, as our streams also have efficient random positioning
> features, which otherwise allow us to efficiently just read a region
> of the stream.
> 
> May I suggest that this constraint would be better relaxed, so that
> the definition for that function would be like:
> 
> [bin:index-out-of-range] is raised if $offset is negative.
> 
> If $offset + $size is greater than the size of $in, I think it is fine
> to just return data of length bin:length($in) - $offset.
> 
> How does that sound?
> 
> 
> 
> 
> -- 
> Adam Retter
> 
> skype: adam.retter
> tweet: adamretter
> http://www.adamretter.org.uk
> 
Received on Wednesday, 7 June 2017 14:40:23 UTC
