Re: Header Serialization Discussion

On Sun, 14 Apr 2013 00:01:24 +0200, James M Snell <jasnell@gmail.com>  
wrote:

>
> As far as Value serialization is concerned... As I have discussed
> previously, I have been looking at support for four distinct value
> types, each represented by their own two-bit identifier. As I will
> explain further in a second, these types are represented using the two
> most significant bits in a byte:
>
>   00 = Text
>   01 = Number
>   10 = Date Time
>   11 = Binary
>
> Text can be either UTF-8 or ISO-8859-1, indicated by a single bit flag
> following the type code. All text strings are prefixed by their length,
> given as an unsigned variable length integer.
> Numbers are encoded as unsigned variable length integers.
> Date Times are encoded as Numbers representing the number of seconds
> that have passed since the epoch (GMT).
> Binary data is encoded as raw binary octets prefixed by an unsigned
> variable length integer specifying the number of bytes.
>
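
To make the quoted proposal concrete, here is a rough Python sketch of how
I read it. Several details are my assumptions, not part of the proposal:
the variable length integer is taken to be LEB128-style (7 data bits per
byte, high bit set when more bytes follow), the encoding flag is placed in
the bit right after the two type bits with 1 meaning UTF-8, the remaining
bits of the first byte are left as zero, and the names encode_uvarint and
encode_value are made up for illustration.

```python
from datetime import datetime, timezone

# Two-bit type codes from the proposal (most significant bits of the first byte).
TEXT, NUMBER, DATETIME, BINARY = 0b00, 0b01, 0b10, 0b11


def encode_uvarint(n: int) -> bytes:
    """Unsigned varint. ASSUMPTION: LEB128-style, least significant group first."""
    if n < 0:
        raise ValueError("only unsigned values can be encoded")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_value(value) -> bytes:
    """Encode one value. ASSUMPTION: the low bits of the first byte are unused."""
    if isinstance(value, str):
        try:
            data, utf8 = value.encode("iso-8859-1"), 0
        except UnicodeEncodeError:
            data, utf8 = value.encode("utf-8"), 1
        head = (TEXT << 6) | (utf8 << 5)  # ASSUMPTION: flag value 1 means UTF-8
        return bytes([head]) + encode_uvarint(len(data)) + data
    if isinstance(value, datetime):
        # Seconds since the epoch; pass a timezone-aware datetime.
        return bytes([DATETIME << 6]) + encode_uvarint(int(value.timestamp()))
    if isinstance(value, int):
        return bytes([NUMBER << 6]) + encode_uvarint(value)
    if isinstance(value, (bytes, bytearray)):
        return bytes([BINARY << 6]) + encode_uvarint(len(value)) + bytes(value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")
```

Under these assumptions encode_value(300) gives the three bytes 40 AC 02
(hex), and a Date Time differs from a Number only in the type bits.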

I'll share what Opera Mini does for its request protocol. It's a
delta-encoded name-value list with statically assigned property
identifiers and very minimal typing and length encoding. Since we control
both client and server, many properties use their own value encodings, but
the encoding never changes for a given property (e.g. user strings are
always UTF-8).

From our documentation:

     Each property is encoded as eight flag bits followed by the id# of
     the property (16 bits) followed by the payload, the format of which
     depends on the flags.

      6bit         1bit    1bit     16 bit
     +---------+------+------+---------------+----------------- - -
     | unused  | INT  | LONG |  PROPERTY ID# | PAYLOAD
     +---------+------+------+---------------+----------------- - -

      if INT is 1, the payload is an integer, otherwise it's a string
      if LONG is 1, the payload length, or the integer, depending on INT,
      is 4 bytes long instead of 1 byte.

     -----+------+-----------------------------------------------------
      INT | LONG |  PAYLOAD
     -----+------+-----------------------------------------------------
       0  |  0   |  1 byte LEN (max 254), followed by LEN bytes data
       1  |  0   |  1 byte INT (value between 0 and 254)
       0  |  1   |  4 bytes LEN (max 0xfffffffe) followed by LEN bytes
       1  |  1   |  4 bytes INT (value between 0 and 0xffffffff)
     -----+------+------------------------------------------------------

     Only properties that have changed value need to be transmitted to
     the server, with a few exceptions.
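
As an illustration only (this is not our production code), the layout
above could be sketched in Python like this. The documentation excerpt
does not state byte order or bit positions, so the sketch assumes
big-endian fields and that INT and LONG are the two least significant
bits of the flag byte, with LONG lowest. The function names are
hypothetical, and the "few exceptions" to delta transmission are not
modelled.

```python
import struct

# ASSUMPTION: flag bit positions, read off the diagram with the MSB on the left.
FLAG_INT = 0x02
FLAG_LONG = 0x01


def encode_property(prop_id: int, value) -> bytes:
    """Encode one property: flag byte, 16-bit id, then the payload."""
    if isinstance(value, int):
        if not 0 <= value <= 0xFFFFFFFF:
            raise ValueError("integer payload out of range")
        if value <= 254:
            # INT=1, LONG=0: one-byte integer
            return struct.pack(">BHB", FLAG_INT, prop_id, value)
        # INT=1, LONG=1: four-byte integer
        return struct.pack(">BHI", FLAG_INT | FLAG_LONG, prop_id, value)

    # Strings are sent as UTF-8; raw bytes are passed through unchanged.
    data = value.encode("utf-8") if isinstance(value, str) else bytes(value)
    if len(data) <= 254:
        # INT=0, LONG=0: one-byte length
        return struct.pack(">BHB", 0, prop_id, len(data)) + data
    if len(data) > 0xFFFFFFFE:
        raise ValueError("string payload too long")
    # INT=0, LONG=1: four-byte length
    return struct.pack(">BHI", FLAG_LONG, prop_id, len(data)) + data


def encode_delta(previous: dict, current: dict) -> bytes:
    """Emit only the properties whose value differs from the previous request."""
    out = bytearray()
    for prop_id in sorted(current):
        if prop_id not in previous or previous[prop_id] != current[prop_id]:
            out += encode_property(prop_id, current[prop_id])
    return bytes(out)
```

With these assumptions a small integer property costs four bytes on the
wire, and an unchanged property costs nothing.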

Compared with your proposal, I would not create a specific Date Time type,
but just keep it as an integer representation of POSIX time. I would not
have different text encodings, but stick to UTF-8 only. I would also think
seriously about whether both Binary and Text are needed.

/Martin Nilsson

Received on Wednesday, 17 April 2013 01:07:40 UTC