Re: Fwd: LOPT: serialization algorithm suggestions from Constantine Plotnikov on 2000-04-17 (xml-dist-app@w3.org from April 2000)

From: Constantine Plotnikov <cap@mail.novosoft.ru>
Date: Mon, 17 Apr 2000 16:30:47 +0700
To: xml-dist-app@w3.org
CC: MOREAU Jean-Jacques <moreau@crf.canon.fr>, Dan Connolly <connolly@w3.org>, Henrik Frystyk Nielsen <frystyk@microsoft.com>, David Burdett <david.burdett@commerceone.com>, Ken MacLeod <ken@bitsko.slc.ut.us>
Message-ID: <38FAD9C7.9E0C708E@mail.novosoft.ru>

Eric Prud'hommeaux wrote:
> 
> On Fri, Apr 14, 2000 at 04:17:04PM +0700, Constantine Plotnikov wrote:
> > Hi!
> >
> > 1. Could you please reconsider the serialization algorithm?
> > As far as I understand from your site, SOAP is the starting
> > point for LOPT development.
> >
> > We had some problems implementing the SOAP protocol in Java.
> > The algorithm requires two passes for serialization and
> > deserialization.
> >
> > The basic idea I suggest is the same as in the Java and XMI 1.1
> > serialization algorithms.
> >
> > When an object is serialized, it is assigned an id and written
> > as:
> > <Type id="id0" >
> >   // contents
> > </Type>
> >
> > Later (in the body of the element or after it), when a reference
> > is encountered, an empty element with href is used.
> >
> > <List t="i0">
> >   <Type id="i1" >
> >     <value>
> >       <Element id="i2">
> >         <parent>
> >           <Type href="i1"/>
> >         </parent>
> >       </Element>
> >     </value>
> >   </Type>
> >   <Type id="id0"/>
> > </List>
> >
> > This allows single-pass serialization/deserialization and references
> > to the parent. I do not suggest using exactly this representation
> > for the protocol. For example, an XMI 1.1-like optimization for the
> > representation of values may be used. I just want to make
> > (de)serialization simple and single pass.
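
To make the id/href idea above concrete, here is a minimal sketch in C.
It is only an illustration under my own assumptions: the Node and Writer
types, the fixed-size "seen" table, and the function names are all
hypothetical and are not part of SOAP, XMI, or any proposal on this
list. The first time an object is reached it gets an id and is written
in full; any later encounter (including a back-pointer to a parent) is
written as an empty element with href.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define MAX_SEEN 64

/* Hypothetical object: one child ("value") and a back-pointer ("parent"). */
typedef struct Node {
    const char *tag;
    struct Node *child;
    struct Node *parent;
} Node;

/* Hypothetical writer: the index in seen[] is the id assigned to an object. */
typedef struct {
    const Node *seen[MAX_SEEN];
    int count;
    char out[1024];
    size_t len;
} Writer;

/* Append formatted text; text that does not fit is dropped. */
static void append(Writer *w, const char *fmt, ...) {
    va_list ap;
    size_t room;
    int n;
    assert(w->len < sizeof w->out);
    room = sizeof w->out - w->len;
    va_start(ap, fmt);
    n = vsnprintf(w->out + w->len, room, fmt, ap);
    va_end(ap);
    if (n > 0 && (size_t)n < room)
        w->len += (size_t)n;
    else
        w->out[w->len] = '\0';
}

/* Linear search stands in for a real hashtable keyed by address. */
static int lookup(const Writer *w, const Node *n) {
    int i;
    for (i = 0; i < w->count; i++)
        if (w->seen[i] == n)
            return i;
    return -1;
}

/* Single pass: write the object once, write href on every later visit. */
static void write_node(Writer *w, const Node *n) {
    int id;
    if (n == NULL)
        return;
    id = lookup(w, n);
    if (id >= 0) {
        append(w, "<%s href=\"i%d\"/>", n->tag, id);
        return;
    }
    if (w->count == MAX_SEEN)
        return; /* table full; this sketch simply stops */
    id = w->count;
    w->seen[w->count++] = n;
    append(w, "<%s id=\"i%d\">", n->tag, id);
    if (n->parent != NULL) {
        append(w, "<parent>");
        write_node(w, n->parent);
        append(w, "</parent>");
    }
    if (n->child != NULL) {
        append(w, "<value>");
        write_node(w, n->child);
        append(w, "</value>");
    }
    append(w, "</%s>", n->tag);
}
```

Because the id is recorded before the contents are written, a cycle
through the parent pointer ends at the href instead of recursing
forever. A reader can work the same way: remember each id as it is
parsed and resolve href against that table, so no second pass is needed.
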
> 
> I'm very interested in building an object model that supports graphs
> without reading a supporting schema. Let's take a rigorous example in
> C. (I don't use Java because it glosses over the distinction between
> pointers and nested data.)
> 
> typedef struct {
>   int i;
>   char c;
> } t_Foo;
> 
> typedef struct {
>   char * str;
>   t_Foo * fooPointer;
> } t_Bar;
> 
> typedef struct {
>   char * str;
>   t_Foo nestedFoo;
> } t_Baz;
> 
> t_Baz myBaz = {"there", {9, 'c'}};
> t_Bar myBar = {"hi", &myBaz.nestedFoo};
> 
> If we use something like a hashtable to track which objects we've
> serialized, and we are called on to serialize(&myBar, &myBaz), we can
> write the myBar structure (ignoring SOAP serialization for now):
> 
> <t_Bar LOTP:name="myBar">
>   <XMLSchema:string LOTP:name="str">hi</XMLSchema:string>
>   <t_Foo LOTP:type="pointer" LOTP:objectID="t_Foo_0">
>     <XMLSchema:int LOTP:name="i">9</XMLSchema:int>
>     <XMLSchema:char LOTP:name="c">c</XMLSchema:char>
>   </t_Foo>
> </t_Bar>
> 
> <t_Baz LOTP:name="myBaz">
>   <XMLSchema:string LOTP:name="str">there</XMLSchema:string>
>   <t_Foo LOTP:objectID="t_Foo_0"/>
> </t_Baz>
> 
> I used objectID to make sure it was clear I was identifying the object
> being generated, not the XML element where it happened to be
> serialized. This XML element may have another name to make it
> available to XSLT or something like that.
> 
> I'll flesh this example out with actual XML schema conformance and
> other tasty tidbits.
> 
> This example would have been more convenient if we were supposed to
> serialize(&myBaz, &myBar), as the t_Foo is actually nested in t_Baz,
> but that wouldn't be as rigorous.
>
serialize(&myBaz, &myBar) without a schema would be very difficult.
C does not have reflective facilities like Java. How will serialize()
learn what it is serializing and what its structure is?
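
To illustrate the point, here is a rough sketch of what a C program
would have to supply by hand in place of reflection. Everything except
t_Foo itself is an assumption of mine: the FieldDesc table, the
FieldKind enum, and serialize_fields() are hypothetical names, and the
output format is simplified and is not the LOTP markup from the example
above. The table is in effect a small schema written in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    int i;
    char c;
} t_Foo;

/* Hypothetical hand-written "schema": C cannot discover this at run time. */
typedef enum { F_INT, F_CHAR } FieldKind;

typedef struct {
    const char *name;
    FieldKind kind;
    size_t offset;
} FieldDesc;

static const FieldDesc t_Foo_fields[] = {
    {"i", F_INT, offsetof(t_Foo, i)},
    {"c", F_CHAR, offsetof(t_Foo, c)},
};

/* Write obj as <type><field>value</field>...</type> using the descriptors.
   Returns the number of characters written, or -1 if out is too small. */
static int serialize_fields(char *out, size_t cap, const char *type,
                            const void *obj, const FieldDesc *fields,
                            size_t nfields) {
    size_t len = 0;
    size_t k;
    int n;

    assert(out != NULL && cap > 0);
    n = snprintf(out, cap, "<%s>", type);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    len = (size_t)n;

    for (k = 0; k < nfields; k++) {
        const char *p = (const char *)obj + fields[k].offset;
        if (fields[k].kind == F_INT) {
            int v;
            memcpy(&v, p, sizeof v); /* read the int without aliasing issues */
            n = snprintf(out + len, cap - len, "<%s>%d</%s>",
                         fields[k].name, v, fields[k].name);
        } else {
            n = snprintf(out + len, cap - len, "<%s>%c</%s>",
                         fields[k].name, *p, fields[k].name);
        }
        if (n < 0 || (size_t)n >= cap - len)
            return -1;
        len += (size_t)n;
    }

    n = snprintf(out + len, cap - len, "</%s>", type);
    if (n < 0 || (size_t)n >= cap - len)
        return -1;
    return (int)(len + (size_t)n);
}
```

Without a table like t_Foo_fields, a bare pointer passed to serialize()
carries no information about the type or layout of what it points to.
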

What you are suggesting is possibly some sort of security hole. 
The detail of embedding is pretty low level, and RPC protocols usually 
do not consider such details. It looks like in your example you are 
talking not about objects, their references, and their representation 
in the protocol, but about memory, pointers, and their representation 
in the protocol. I do not think it will be an easy task to prove the 
security properties of anything that works with pointers.

I think that this feature would not be needed for Java, Scheme, 
Smalltalk, or Prolog, to name a few that I know well. My knowledge of 
Perl and ASP basic is more limited, and I would ask others to 
comment on them. I would like to see a good practical example where 
this feature would bring significant benefits.

I can say nothing about the RPC protocol that was named "RPC" (it had 
64bit ports that should be reserved, and the ports are published in 
RFCs) because I do not have experience with it. It was accessible 
mainly from C, so it might have some hacks to address the issue. 
If someone has worked with it, please tell us about it.

But in all other RPC systems I have seen, objects were of two kinds: 
values and references to remote objects. The requests were isolated, 
and it was not possible to reference non-remote objects that are 
outside of the request. Maybe the C CORBA interface would be a good
place to study these issues further for C (I have worked with CORBA
from C++ and Java).

Constantine

BTW there were a lot of people in the CC. I think that some 
of them are subscribed to xml-dist-app@w3.org. I copied the CC 
exactly for now because I do not know why it was done, but do 
they need to receive it twice?

Received on Monday, 17 April 2000 05:30:32 UTC