- From: Sandro Hawke <sandro@w3.org>
- Date: Mon, 17 Feb 2014 09:58:09 -0500
- To: John Arwe <johnarwe@us.ibm.com>, Linked Data Platform WG <public-ldp-wg@w3.org>
- Message-ID: <53022381.4060206@w3.org>
On 02/17/2014 08:33 AM, John Arwe wrote:
> > .... Lossy paging would result in postings not
> > being shown to some people in some circumstance, which is likely to
> > be unacceptable.
>
> This makes it sound as if the real chafing point is the inability for
> the client to detect when it never sees something (when it needs to
> "start over" if it cares about completeness), which is different than
> having a problem with lossy paging per se. In our current
> implementations (other email), we also ended up giving clients a
> signal by which they could Know that they missed something and hence
> need to start over if they care about completeness; [1] is the spec
> many of them are following.
>
> [1] http://open-services.net/wiki/core/TrackedResourceSet-2.0/
I'm not seeing an easy way to do that with that spec - it looks like it
does a lot more than is needed here.
>
> > .... As with static paging, the server can, at any time, give
> > up on a particular paging function and answer 410 GONE for those
> > URLs. ...
>
> This is an interesting variation. Many client apps are written to
> treat 4xx codes as errors. "page gone" is something of an "expected
> error" - more like a 5xx in some ways. It's not like there's anything
> wrong with the client's code to cause the 410 (but that would be true
> of 410 in general, aside from cases where the same code already
> deleted the request-URI for which the 410 is sent).
>
Yeah, I figure one can handle 410 intelligently when getting an SPR.
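A rough sketch of what "handling 410 intelligently" could look like on the client side (Python, with the fetch callback and its return shape assumed for illustration - not anything the spec defines): treat 410 on a page URL as "the paging function was retired, start over", not as a fatal error.

```python
class PageGone(Exception):
    """Raised when a page URL answers 410 GONE (assumed signaling)."""

def fetch_all_pages(fetch, first_page_url):
    """Collect every page body, restarting from page one on 410 GONE.

    fetch(url) is assumed to return (body, next_page_url_or_None),
    raising PageGone if the server has retired that page URL.
    """
    while True:
        pages, url = [], first_page_url
        try:
            while url:
                body, url = fetch(url)
                pages.append(body)
            return pages  # walked the whole sequence without a 410
        except PageGone:
            continue  # paging function changed under us; restart from page one
```

The point is just that a client caring about completeness restarts rather than reports an error, which is why 410 here feels more like an "expected error" than a typical 4xx.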
> Nit: "Stable" seems a bit strong. This is more a bounded-loss case,
> isn't it?
>
Well, what's stable is the assignment of triples into pages.
"Fixed-Boundary Paging".
> > ..., but each triple which could
> > theoretically ever be in the graph is assigned to a particular page.
>
> Does this imply that you need a closed model in order to implement it?
> Otherwise the number of triples which could theoretically ever be in
> the graph is infinite, so you fall somewhere in the space between
> needing infinite pages, having some pages that will be too large to
> transfer (defeating the purpose), and having an infinite number of
> mapping functions. It's sounding like some of the exchanges the WG
> has had on 'reasoning' ... theoretically NP, but in practice not so bad.
>
No, I think it's easy.
If the server is application-specific, it can just use days as the
buckets for events, for instance, or alphabet ranges.
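For instance, the day-bucket case could be as simple as this (a hypothetical sketch; the URL shape and the idea that each event carries a date are assumptions, not anything from LDP):

```python
from datetime import date

def page_for_event(event_date: date) -> str:
    """Assign an event triple to a page purely by its date.

    Every triple dated 2014-02-17 lands on the same page, forever,
    so the assignment of triples to pages is stable by construction.
    """
    return "/events/page/" + event_date.isoformat()
```

Any triple that could theoretically ever exist has a well-defined page, even though the set of pages is unbounded; only pages that actually contain triples need to be served.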
> I'm wondering if generic graph stores would have any problem with it,
> since they definitionally have open models and hence know basically
> nothing about the triples that might theoretically exist over time in
> a resource.
Yeah, for paging generic LDP-RRs I see two easy approaches, which kind
of correspond to the data structures one would probably use to store the
graph:
- if the triples are really in no special order, use a hash function on
the text of each triple. Pick a hash function that gives you a number
of buckets suitable for the number of triples you have and the page size
you want. Change the paging function when the number of triples
changes a lot.
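A minimal sketch of that first approach (Python; the use of SHA-1 over the N-Triples text is my choice for illustration - any stable hash works, but it must not be a per-process randomized hash, or the assignment changes across server restarts):

```python
import hashlib

def bucket_for_triple(ntriple_line: str, num_buckets: int) -> int:
    """Map a triple (as its N-Triples text) to a stable bucket/page number."""
    digest = hashlib.sha1(ntriple_line.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets

def choose_num_buckets(num_triples: int, page_size: int) -> int:
    """Pick a bucket count so pages average roughly page_size triples."""
    return max(1, -(-num_triples // page_size))  # ceiling division
```

Changing `num_buckets` is exactly "changing the paging function": every triple may move to a new page, so the old page URLs should start answering 410.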
- if the triples are conceptually sorted, then figure out and store
reasonable boundary values, as one would do if using a b-tree or
balanced binary tree for maintaining the sorting. Change the paging
function when adding or deleting a b-tree-node which has been given to a
client. If no client has seen the page corresponding to a node, then
it's okay to split or delete it.
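And a sketch of the second approach (Python; the idea of storing the page boundaries as a sorted list of inclusive upper-bound keys is my framing of "reasonable boundary values", as a b-tree's separator keys would be):

```python
import bisect

def page_for_key(sort_key: str, boundaries: list) -> int:
    """Assign a triple's sort key to a page via stored boundary values.

    boundaries holds the inclusive upper bound of pages 0..n-1, in order;
    any key above the last boundary falls on the final page n.
    """
    return bisect.bisect_left(boundaries, sort_key)
```

Splitting or merging a node whose page a client has already fetched changes these boundaries, i.e. retires the paging function; a node no client has seen can be split or deleted freely.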
-- Sandro
>
>
> Best Regards, John
>
>
Received on Monday, 17 February 2014 14:58:16 UTC