- From: Tom Adams <tom@tucanatech.com>
- Date: Tue, 27 Jul 2004 10:21:33 -0400
- To: public-rdf-dawg@w3.org
- Cc: Andy Seaborne <Andy.Seaborne@hp.com>
Hi Andy,

> Chris asks for LIMIT and OFFSET in order to do client-side control of
> the flow of results.
>
> "3.10 Result Limits" is approved.
> "3.12 Streaming Results" is approved.
>
> We also noted the relationship to sorting matters. But this isn't
> LIMIT and OFFSET, where the client asks for just a slice of the
> results, and then comes back for another slice later. The slices asked
> for need not be in order, so result set stability across calls might be
> expected (transactions?).

I don't think transactions are needed, but some kind of session-based state keeping would be required.

> It may be that in Chris's use case the client will ask for chunks in
> order, in which case streaming using a suitable XML encoding (that is,
> the whole document does not need to be stored before further
> processing) and LIMIT may be sufficient, because the client can
> influence the results sufficiently, but it isn't what he is asking
> for.
>
> Illustration: Google lists the first 10 results, then you can jump
> around the "result set" using the page links at the bottom.

I think that this may be what he's looking for.

> Example: One style of facetted browser shows the first N results when
> the user has a lot of items in a category. The client UI never
> retrieves the whole result set, so just LIMIT is a win.
>
> The limitations on JDBC drivers noted in the F2F minutes apply in the
> default configuration. Streaming results has consequences: for
> MySQL it means holding locks for the length of time that the results
> are active, with possibly adverse effects on overall system
> performance.

I'll defer to Simon on how Kowari handles this internally; perhaps that can shed some light on the discussion, though perhaps he's already covered it anecdotally.

> I would like to understand Chris's use case better. The use case has
> the client and server quite tightly designed together and possibly
> deployed. It does not sound like a general browser-ish UI applied to
> some unknown RDF store. It may be that LIMIT+Streaming is sufficient
> (not ideal, but tolerable)? Alternatively, it may be that we need
> different levels in the protocol, with a simple, general web-wide
> one-query, one-response mode and then a more complex one for closer
> associations of client and server.

I think Chris is after a combination of LIMIT and OFFSET. I know that he's discussed this issue in the past on the Kowari list, and has just posted a contribution (KModel), so I imagine this is what he's using. But yes, we need to find out more about what he is doing. I'll add a request for more information to my email.

> We should also note charter item "2.3 Cursors and proofs" (I don't
> understand why cursors and proofs are lumped together).

You're on the ball as ever :)

Cheers,
Tom

> Tom Adams wrote:
>
>> Below is an outline of my proposed reply to Chris Wilper on his use
>> case for requirement 3.10, posted to public-rdf-dawg-comments@w3.org.
>>
>> ----
>>
>> Hi Chris,
>>
>> Thanks for your posting to the DAWG comments list. The DAWG is always
>> happy to receive comments and use cases on its proposed requirements.
>> The requirement you noted was moved from PENDING to APPROVED at the
>> DAWG face-to-face on the 15th of July. You can view the details at:
>>
>> http://www.w3.org/2001/sw/DataAccess/ftf2#req
>>
>> Keep the comments coming!
>>
>> Cheers,
>> Tom
>>
>> On 06/07/2004, at 1:25 PM, Chris Wilper wrote:
>>
>>> Hi,
>>>
>>> Looking at the Requirements/Use Cases document, I noticed that 3.10
>>> and 3.10a had "Pending" status. We[1] plan on using an RDF
>>> triplestore to back a large metadata repository, exposed to other
>>> systems via the OAI-PMH[2]. Without being too domain- and
>>> protocol-specific here, I'll describe our case:
>>>
>>> We have a large collection of metadata in a triplestore that we want
>>> to make available to people through a set of queries. Someone
>>> typically asks, "Give me the metadata that has changed since last
>>> week and is in XYZ collection", or simply, "Give me all the
>>> metadata".
>>>
>>> It is a requirement for us that the responses can come in chunks:
>>> XML is sent over the wire, and rather than require all of our
>>> clients (and our server) to be able to handle arbitrarily large
>>> chunks of XML in one stream, our server can be configured to give
>>> only, say, 1,000 responses, along with a "resumption token" that
>>> can be used for subsequent requests.
>>>
>>> Without the ability to specify LIMITs/OFFSETs in the triplestore
>>> query, we would need to stream everything to disk and manage much
>>> more state within our application.
>>>
>>> [1] http://www.fedora.info/ and http://www.nsdl.org/
>>> [2] OAI-PMH is a protocol for exposing XML metadata in a repository.
>>> See http://www.openarchives.org/OAI/openarchivesprotocol.html
>>>
>>> ___________________________________________
>>> Chris Wilper
>>> Cornell Digital Library Research Group
>>> http://www.cs.cornell.edu/~cwilper/

--
Tom Adams                 | Tucana Technologies, Inc.
Support Engineer          | Office: +1 703 871 5312
tom@tucanatech.com        | Cell:   +1 571 594 0847
http://www.tucanatech.com | Fax:    +1 877 290 6687
------------------------------------------------------
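[Editorial note: the thread turns on what LIMIT/OFFSET-style slicing buys a client, and on Andy's caveat about result-set stability across calls. The Python sketch below illustrates that paging pattern only; the query text, function names, and the in-memory "store" are stand-ins invented for illustration, not the DAWG protocol or any real triplestore API.]

```python
# Minimal sketch of client-side paging via LIMIT/OFFSET, as discussed in
# the thread. All names here are hypothetical stand-ins.

def paged_query(base_query, page_size, offset):
    """Build a query string asking for one slice of the result set."""
    return f"{base_query} LIMIT {page_size} OFFSET {offset}"

def fetch_all(execute, base_query, page_size=1000):
    """Fetch the whole result set one page at a time.

    `execute` stands in for whatever sends the query to the store and
    returns a list of rows; a short page signals the end of the results.
    Note Andy's caveat: unless the store guarantees result-set stability
    across calls (session state or a transaction), rows can shift
    between pages if the underlying data changes mid-iteration.
    """
    offset = 0
    while True:
        page = execute(paged_query(base_query, page_size, offset))
        yield from page
        if len(page) < page_size:
            break
        offset += page_size

# Toy in-memory "store" so the slicing behaviour can be demonstrated.
data = [f"row-{i}" for i in range(25)]

def toy_execute(query):
    # Parse back the LIMIT/OFFSET appended above; a real store would
    # apply these server-side.
    parts = query.split()
    limit = int(parts[parts.index("LIMIT") + 1])
    offset = int(parts[parts.index("OFFSET") + 1])
    return data[offset:offset + limit]

rows = list(fetch_all(toy_execute, "SELECT ?s WHERE { ?s ?p ?o }",
                      page_size=10))
```

Chris's resumption-token scheme is the same loop from the client's point of view, except the server hands back an opaque token instead of the client computing the next offset, which is exactly the extra server-side state he is trying to avoid managing himself.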
Received on Tuesday, 27 July 2004 10:21:55 UTC