Re: binary query results format, draft

From: Jeen Broekstra <jeen@aduna.biz>
Date: Mon, 31 Oct 2005 09:34:15 +0100
Message-ID: <4365D707.1000009@aduna.biz>
To: Steve Harris <S.W.Harris@ecs.soton.ac.uk>
CC: RDF Data Access Working Group <public-rdf-dawg@w3.org>

Steve Harris wrote:
> On Thu, Oct 27, 2005 at 04:13:16PM +0200, Jeen Broekstra wrote:
[snip]
>> - Bytes 0-3 contain the ASCII codes for the string "BRTR", which stands for
>>   Binary RDF Table Result.
>> - Bytes 4-7 specify the format version (a 32-bit signed integer).
>> - Bytes 8-11 specify the number of columns of the query result that will
>>   follow (a 32-bit signed integer).
> 
> 
> How (are) the columns named, or are you expected to inspect the query?
> FWIW, I'd prefer explicit column names in the results.

Ah, I didn't make that sufficiently clear, sorry. The column names are 
encoded as UTF-8 strings and follow the header directly, so the result 
does carry explicit column names.
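
A minimal sketch of reading the header and the column names as laid out above. It rests on assumptions the draft does not spell out: integers are big-endian (the java.io.DataInput convention), and each column name uses the same 2-byte-length-prefixed modified UTF-8 described for strings. The class name, method names, and the version value 1 are hypothetical, chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class BrtrHeaderSketch {

    /** Reads the magic number, version and column count, then the column names. */
    static List<String> readColumnNames(DataInputStream in) throws IOException {
        byte[] magic = new byte[4];
        in.readFully(magic);                       // bytes 0-3: "BRTR"
        if (!"BRTR".equals(new String(magic, StandardCharsets.US_ASCII))) {
            throw new IOException("not a BRTR stream");
        }
        int version = in.readInt();                // bytes 4-7 (assumed big-endian)
        int columns = in.readInt();                // bytes 8-11
        if (version < 0 || columns < 0) {
            throw new IOException("negative version or column count");
        }
        List<String> names = new ArrayList<>();
        for (int i = 0; i < columns; i++) {
            names.add(in.readUTF());               // 2-byte length + modified UTF-8
        }
        return names;
    }

    /** Builds a tiny sample stream: version 1 (made up), columns "x" and "y". */
    static byte[] sample() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.write("BRTR".getBytes(StandardCharsets.US_ASCII));
            out.writeInt(1);
            out.writeInt(2);
            out.writeUTF("x");
            out.writeUTF("y");
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(sample()));
        System.out.println(readColumnNames(in));
    }
}
```

With this layout a client can build its result table from the stream alone, without inspecting the query text.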

>>UTF-8 String encoding
>>=====================
>>
>>(Note: In the current Sesame implementation, a modified UTF-8 encoding
>>scheme is used, which is the default UTF-8 encoding scheme in Java (see
>>http://java.sun.com/j2se/1.5.0/docs/api/java/io/DataInput.html#modified-utf-8
>>for details). Obviously this can be generalized/changed to be more
>>standards-compliant. I am documenting the current Sesame/Java scheme here
>>for now though).
>>
>>Each value encoded as a UTF-8 string is preceded by a 2-byte prefix
>>indicating the byte-length of the encoded string. The length is stored as
>>a 16-bit unsigned short integer.
> 
> 
> Why not NUL terminated? Does Sesame allow literals with NULs in? Doesn't
> unmangling unaligned integers get expensive? If literals with NULs in are
> important the length marker should be longer.

We used the 2-byte length marker scheme simply because that is what is 
implemented in java.io.DataOutput/DataInput. But NUL-termination 
sounds good to me.

As for literals with NULs in, even if we use NUL-termination that 
shouldn't be a problem, since in Java's modified UTF-8 an embedded NUL 
(U+0000) is encoded as the two bytes 0xC0 0x80 rather than as a single 
zero byte, IIRC.
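
A small sketch that shows the difference, using only java.io.DataOutputStream.writeUTF and String.getBytes. The class and helper names are hypothetical. It prints the bytes writeUTF produces for a three-character literal containing a NUL, followed by the standard UTF-8 bytes for the same literal.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class ModifiedUtf8Sketch {

    /** Encodes s the way DataOutput.writeUTF does: 2-byte length, then modified UTF-8. */
    static byte[] writeUtf(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(s);
        }
        return buf.toByteArray();
    }

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) throws IOException {
        String literal = "a\0b";   // 'a', U+0000, 'b'
        // Modified UTF-8: length prefix 00 04, then 61 C0 80 62
        System.out.println(hex(writeUtf(literal)));
        // Standard UTF-8: 61 00 62 (contains a real zero byte)
        System.out.println(hex(literal.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The payload after the length prefix contains no zero byte, so a NUL terminator would be unambiguous there. With standard UTF-8 the embedded NUL becomes a literal zero byte, so NUL-termination is only safe as long as the modified encoding (or some other escaping) is kept.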

Jeen
-- 
Jeen Broekstra          Aduna BV
Knowledge Engineer      Julianaplein 14b, 3817 CS Amersfoort
http://aduna.biz        The Netherlands
tel. +31 33 46599877
Received on Monday, 31 October 2005 08:33:12 GMT
