RE: Streamability II from Seaborne, Andy on 2004-06-22 (public-rdf-dawg@w3.org from April to June 2004)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Tue, 22 Jun 2004 14:40:06 +0100
To: Janne Saarela <janne.saarela@profium.com>, Tom Adams <tom@tucanatech.com>
Cc: public-rdf-dawg@w3.org
Message-ID: <E864E95CB35C1C46B72FEA0626A2E80803615B67@0-mail-br1.hpl.hp.com>

Tom - you didn't explicitly say that the pages were held on disk at any time
- are they?  Just sometimes?

Janne - While the server may be storing state in the server-side ResultSet
that is paged, this is just a matter of how Tucana have chosen to implement
their system.  There does not appear to be anything in the application
contract that makes server state necessary.  The client-side ResultSet class
goes over the results one at a time - the fact that there is an
implementation that has the client pulling pages on demand is hidden.  The
key is that the interface to the application is result-by-result.
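
To illustrate the point above, here is a minimal Python sketch (not Tucana's
code) of a client-side result set whose interface is result-by-result while
pages are pulled on demand underneath.  The names PagedResultSet and
make_fetch_page, and the idea of requesting a page by number, are assumptions
made for illustration.  In this sketch the stub server derives each page from
the page number alone, so it keeps no per-client cursor.

```python
from typing import Callable, Dict, Iterator, List

Row = Dict[str, int]


def make_fetch_page(rows: List[Row], page_size: int) -> Callable[[int], List[Row]]:
    """Hypothetical stand-in for the network call to the server."""
    def fetch_page(page_no: int) -> List[Row]:
        start = page_no * page_size
        return rows[start:start + page_size]  # empty list once past the end
    return fetch_page


class PagedResultSet:
    """Application sees one result at a time; paging is an internal detail."""

    def __init__(self, fetch_page: Callable[[int], List[Row]]) -> None:
        self._fetch_page = fetch_page
        self.pages_fetched = 0  # exposed only so the paging can be observed

    def __iter__(self) -> Iterator[Row]:
        page_no = 0
        while True:
            page = self._fetch_page(page_no)
            self.pages_fetched += 1
            if not page:
                return  # an empty page marks the end of the results
            yield from page
            page_no += 1


rows = [{"s": i} for i in range(5)]
results = PagedResultSet(make_fetch_page(rows, page_size=2))
for row in results:
    print(row)
```

The loop at the end never mentions pages, which is the sense in which the
paging is hidden from the application.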

The same thing could be achieved with a protocol that just pumps results as
fast as it can and lets the network block if the client can't keep up.  In
this case, no server state is involved (outside the lifetime of the query) in
the query processor.  The server might still choose to record results to disk
sometimes (e.g. for consistency in the presence of updates).
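
As a rough illustration of that pumping style, the Python sketch below uses a
bounded in-memory queue as a stand-in for the network connection.  It is an
analogue only, not a protocol proposal: pump, stream and the window parameter
are hypothetical names.  The producer simply blocks in put() when the consumer
falls behind, and nothing is kept once the query has finished.

```python
import queue
import threading
from typing import Iterable, Iterator

_DONE = object()  # sentinel marking the end of the result stream


def pump(results: Iterable, channel: queue.Queue) -> None:
    """Producer: push results as fast as possible; put() blocks when full."""
    for result in results:
        channel.put(result)
    channel.put(_DONE)


def stream(results: Iterable, window: int = 2) -> Iterator:
    """Consumer view: one result at a time, with at most `window` buffered."""
    channel: queue.Queue = queue.Queue(maxsize=window)
    producer = threading.Thread(target=pump, args=(results, channel), daemon=True)
    producer.start()
    while True:
        item = channel.get()
        if item is _DONE:
            break
        yield item
    producer.join()


for result in stream(range(5), window=2):
    print(result)
```

If the consumer stops reading, the producer thread stays blocked, which is the
in-process counterpart of a connection that is opened and then left unread.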

If it's a big set of results and the client can't keep up with the server,
then something somewhere has to block.  It may be in the query result
protocol; it may be in the TCP stack.  For a web-facing system, I prefer to
keep it in the TCP stack until we have a need to do otherwise.  There is a
Denial-Of-Service attack point in creating TCP connections for operations
and then doing nothing with them, but this is a well-known problem.  DAWG
isn't making it any different - it's the same as asking for large images and
not reading them.

	Andy

-------- Original Message --------
> From: Janne Saarela <>
> Date: 22 June 2004 10:01
> 
> Hi Tom
> 
> > As promised, for those concerned about implementability, here is an
> > outline from one of our engineers of our approach to streaming
> > results across the wire: 
> > 
> > "When a remote query is performed, the ResultSet is wrapped in a
> > paging class that can return a page of results at a time. On the
> > client side is a class that meets the ResultSet interface and calls
> > across the network to this paging class. When this client-side
> > ResultSet class is returned from a query it contains its first page.
> > Iterating over the client-side ResultSet just iterates over the
> > results in the internal page. Once a page is finished, the next page
> > is automatically requested from the paging class on the server and
> > iterating can continue." 
> 
> Isn't this effectively an approach that stores the state for each
> individual client on the server side? I would agree with Andy's
> comment in
> 
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0606.html
> 
> where the requirement to store state for each client is
> considered an unnecessary burden.
> 
> Have I understood correctly your concept of a 'paging class'?
> 
> Janne
> --
> Janne Saarela <janne.saarela at profium.com>
> Profium, Lars Sonckin kaari 12, 02600 Espoo, Finland
> Internet: http://www.profium.com

Received on Tuesday, 22 June 2004 09:56:01 UTC