- From: Adam Retter <adam.retter@googlemail.com>
- Date: Wed, 7 Jun 2017 11:09:09 -0400
- To: Michael Kay <mike@saxonica.com>
- Cc: EXPath ML <public-expath@w3.org>
Okay, so if I understand, you are saying: don't do the check up front, do the check afterwards, and report if less than size was read?

I can see the argument that there is value there for the user; however, it is very hard for us to implement because of the streaming nature. If we consider bin:part#3, it takes an xs:base64Binary and returns an xs:base64Binary. Internally for us it takes a stream and returns a stream. We also don't actually do anything with the stream until it is realised, which makes tracking the error very hard in the face of nested functions on xs:base64Binary.

I will give some thought to how we can catch the underlying IOException and relate it to the correct expression; it's tricky because effectively the `stream` escapes the scope of the enclosing expression.

On 7 June 2017 at 10:39, Michael Kay <mike@saxonica.com> wrote:
> array:subarray#3 has the same problem.
>
> I would have thought bin:part#3 is usually going to be used to read a chunk of say 4 or 8 bytes, in which case you want to know if it's reading off the end. I guess there's a scenario where you're reading TLV data and L is long. You still want an error if it takes you off the end. I don't think anyone's going to complain much if the error is deferred, but if they wanted to just read to the end of the stream, they would have used bin:part#2.
>
> Michael Kay
> Saxonica
>
>
>> On 7 Jun 2017, at 13:05, Adam Retter <adam.retter@googlemail.com> wrote:
>>
>> Hi there,
>>
>> I am at present implementing the bin module in eXist-db. However there
>> are a few things in the spec which do not play nice when working with
>> streams.
>>
>> In eXist a xs:base64Binary or xs:hexBinary is represented internally
>> by a stream. We do this because binary values can be very large, for
>> example when working with digital video or similar, as such it is
>> undesirable to have to load all the binary data into memory to be able
>> to work with it.
>>
>> My main issue is with the definitions of when bin:index-out-of-range
>> should be thrown.
>>
>> If we consider just one definition of bin:index-out-of-range, the
>> function bin:decode-string states:
>>
>> [bin:index-out-of-range] is raised if $offset is negative or $offset +
>> $size is larger than the size of the binary data of $in.
>>
>> The problem with this is that we cannot perform the second check
>> ($offset + $size < bin:length($in)) up-front without reading the
>> entire data stream of $in. Reading the entire datastream of $in is
>> undesirable, as our streams also have efficient random positioning
>> features, which otherwise allow us to efficiently just read a region
>> of the stream.
>>
>> May I suggest that this constraint would be better relaxed, so that
>> the definition for that function would be like:
>>
>> [bin:index-out-of-range] is raised if $offset is negative.
>>
>> If $offset + $size is greater than the size of $in, I think it is fine
>> to just return data of length bin:length($in) - $offset.
>>
>> How does that sound?
>>
>>
>>
>>
>> --
>> Adam Retter
>>
>> skype: adam.retter
>> tweet: adamretter
>> http://www.adamretter.org.uk
>>
>

--
Adam Retter

skype: adam.retter
tweet: adamretter
http://www.adamretter.org.uk
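
The deferred check discussed in this thread can be illustrated with a minimal sketch. This is not eXist-db code: `LazyPart`, `IndexOutOfRange`, and the `origin` label are hypothetical names invented here, and Python's `io.BytesIO` stands in for a seekable binary stream. The idea is that the negative-offset check needs no I/O and can run up front, while the "read off the end" check only happens when the value is realised, so the slice carries a label for the expression that created it and the late error can be related back to that expression.

```python
import io


class IndexOutOfRange(Exception):
    """Stand-in for bin:index-out-of-range in this sketch (hypothetical)."""


class LazyPart:
    """A deferred slice of a seekable stream or of another LazyPart."""

    def __init__(self, source, offset, size, origin):
        if offset < 0:
            # Needs no I/O, so it can be reported immediately.
            raise IndexOutOfRange(f"{origin}: negative offset {offset}")
        self.source = source
        self.offset = offset
        self.size = size
        self.origin = origin  # label of the expression that built this slice

    def realise(self):
        """Read the region now; a short read is reported against origin."""
        if isinstance(self.source, LazyPart):
            # Simplification: materialise the inner slice, then cut it.
            inner = self.source.realise()
            data = inner[self.offset:self.offset + self.size]
        else:
            self.source.seek(self.offset)
            data = self.source.read(self.size)
        if len(data) < self.size:
            raise IndexOutOfRange(
                f"{self.origin}: wanted {self.size} bytes at offset "
                f"{self.offset}, got {len(data)}"
            )
        return data


stream = io.BytesIO(bytes(range(16)))
inner = LazyPart(stream, 4, 8, "bin:part($in, 4, 8)")
outer = LazyPart(inner, 6, 4, "bin:part(bin:part($in, 4, 8), 6, 4)")
# Nothing has been read yet; the error only surfaces in outer.realise().
```

In this sketch the nested case fails only at realisation time, and the message names the outer expression rather than the point where the bytes were finally consumed. A real streaming implementation would avoid materialising the inner slice.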
Received on Wednesday, 7 June 2017 15:09:44 UTC