- From: Wilde, Erik <Erik.Wilde@emc.com>
- Date: Mon, 11 Nov 2013 21:32:10 -0500
- To: "Eric Prud'hommeaux" <eric@w3.org>
- CC: Alexandre Bertails <bertails@w3.org>, Linked Data Platform WG <public-ldp-wg@w3.org>
hello eric.

On 2013-11-11, 15:25, "Eric Prud'hommeaux" <eric@w3.org> wrote:
> Yup. C's code works both on S1 and S2. It just works better on S2. A
> non-streaming client works identically well with S1 and S2.

after thinking about this a little more, i am wondering how relevant the optimization is to begin with. do we have any data telling us that this might actually be a problem?

for example, while the inherently ordered XML of feeds would easily allow streaming parsing, i am not aware of any implementation that actually does that (using SAX). instead, what usually happens is that implementations use DOM, which first reads the whole resource, builds the internal XML tree, and then the code starts working with that complete tree. in DOM/XML, the very fuzzy rule of thumb is that a DOM tree needs 10x as much memory as the source file. i would assume that for RDF there's a similar rough guesstimate relating serializations to in-memory models?

the thing is that neither feeds nor LDP are made for sharing/exchanging massive amounts of data. they are loosely coupled protocols to allow easy resource access. given today's machines, it may be safe to assume that 100MB of runtime memory consumption is tolerable. in XML-land, that would translate to a resource size of 10MB. i haven't seen many feeds exceeding that size: you can control size by paging, and you can also control it by not randomly embedding everything in a feed (for example, podcast feeds are really small, because the large video files are linked and not embedded).

just wondering: do we have any guesstimates of RDF memory requirements, and do we really plan for scenarios where LDP resources exceed the resulting maximum resource sizes we might want to see?

thanks and cheers,

dret.
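(editor's illustration, not from the thread: to make the "rough guesstimate" question concrete, here is a minimal sketch of how one might measure the serialization-to-memory ratio for RDF, analogous to the DOM "10x" rule of thumb. it assumes Python with the rdflib library and a hypothetical local file `resource.ttl`, and uses the standard tracemalloc module to approximate peak allocation while parsing.)

```python
# rough back-of-the-envelope sketch: compare an RDF file's size on
# disk to the peak memory allocated while parsing it into an
# in-memory graph. assumes rdflib is installed; "resource.ttl" is a
# hypothetical placeholder file.
import os
import tracemalloc

from rdflib import Graph


def memory_ratio(path: str, fmt: str = "turtle"):
    """Parse an RDF file and return (file size, peak bytes, ratio).

    tracemalloc only traces Python-level allocations, so this is a
    crude approximation, not a precise measurement.
    """
    file_size = os.path.getsize(path)
    tracemalloc.start()
    g = Graph()
    g.parse(path, format=fmt)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return file_size, peak, peak / file_size


if __name__ == "__main__":
    size, peak, ratio = memory_ratio("resource.ttl")
    print(f"{size} bytes on disk, {peak} bytes peak in memory (~{ratio:.1f}x)")
```

(under the email's working numbers, a ratio of 10x would put the practical ceiling for a single LDP resource at roughly 10MB of serialized RDF per 100MB of tolerable runtime memory.)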
Received on Tuesday, 12 November 2013 02:32:53 UTC