Re: Final draft of Proposed Binary Module

On 16/11/2013 13:48, Christian Grün wrote:
> Dear Jirka, dear John,
>
> I finally implemented the Binary Module, and I stumbled upon some
> minor issues (all the other parts of the spec. look completely fine!):
>
> – bin:pad-left, bin:pad-right: It’s not clear to me what “blank
> octets” are. Do you refer to the zero value (00)?
Yes - it should be explicitly stated as '00'
> – bin:pad-left, bin:pad-right: The XPath expressions in the notes must
> be swapped.
The perils of cut-and-paste! Will fix. Actually if we'd used these as 
specific tests in the QT3 testsuite (which I didn't), we'd have spotted 
it. I wonder whether semi-automatic links between the examples in the 
spec (and these are actually defined in the function catalog) might be 
helpful.
> – bin:decode-string: typo "passed though" -> "passed through".
Thanks
> – bin:encode-string: [bin:unknown-encoding] needs to be added.
Again - I think that the test suite does the correct thing.
> [bin:decoding-error] needs to be replaced with a new error code
> [bin:encoding-error] (or a common code like [bin:conversion-error]
> needs to be added).
There is a missing error code declaration around here. Will fix.
> – bin:shift: are values larger than 8 allowed for $by? In other words,
> may a shift extend over more than one byte? If yes, queries such as
> the following one would be allowed, too:
>
> – bin:shift(bin:hex("000001"), 17) → bin:hex("020000")
Yes - they are - and the test-suite tests such cases (have a look). If 
we want to be pedantic, should we limit $by to be less than a long, or 
support BigInteger ;-) ? I think 2^63 bit shifts might be quite 
sufficient, at least until Exabyte main memories appear.
>
> – [offset-beyond-end] The descriptions of this error are a bit
> inconsistent, as they sometimes refers to the last array position
> ($offset + $size - 1, see e.g. bin:unpack-unsigned-integer) and
> sometimes to the length of the array ($offset + $size, see e.g.
> bin:part). Maybe it’s better to always refer to the array length, and
> remove the occurrences of "-1" and change "+7" to "+8" and "+3" to
> "+4".
I'll look through these.
> Last but not least, three suggestions:
>
> – bin:decode-string, bin:encode-string: I would love to see a
> 1-argument-signature added with UTF-8 as default encoding, as this is
> the quasi-default encoding for XML data. We are providing similar
> signatures in the File and Archive Module.
Very simple to do and anything that makes the most common case easier 
should be encouraged. As you say - the other similar modules use the 
same convention.
>
> – bin:find: we could accept empty search data as well, and discard the
> corresponding error, this is quite common for many other related
> functions (like fn:contains), and completely valid from a logical
> perspective (empty search data returns $offset as result). While empty
> search data may seem useless, it can be the result of other
> operations, and code may get unnecessarily bloated if array length
> checks needed to be added in order to avoid errors.
Mmmm- I need to think about this. Of course the 'find-all' example now 
would go into an infinite loop with a zero-search (but this can be 
guarded of course). I was probably coloured by the 
'empty-string-matching' regex exclusion, wich i think is due to be dropped.
>
> – [offset-beyond-end]: the error code could be renamed to a more
> general designation such as "out-of-bounds", as in many cases it does
> not only refer to $offset, but involves another value as well (such as
> $size, or a constant).
Agreed - though I thought it might be helpful to distinguish between 
'negative' and 'positive' excursions beyond the data.


-- 
*John Lumley* MA PhD CEng FIEE
john@saxonica.com <mailto:john@saxonica.com>
on behalf of Saxonica Ltd

Received on Monday, 18 November 2013 13:39:06 UTC