Re: optimizing container pages serialization to enable streaming

* Wilde, Erik <Erik.Wilde@emc.com> [2013-11-11 17:18-0500]
> hello eric.
> 
> On 2013-11-11, 14:01 , "Eric Prud'hommeaux" <eric@w3.org> wrote:
> >The odds that someone who cared about optimizing would use a generic
> >serializer are vanishingly small. I'd say that's not a significant
> >price. The payload is still text/turtle; no text/turtle parser would
> >be able to detect any way that it differs from any rearrangement of
> >the same triples.
> 
> sure. whatever somebody does within the permissible bound of a media type
> is fair game. but that optimization is not all that interesting when
> nobody can depend on it and/or ask for it, right?

No one is depending on it. They are simply able to take advantage of
it if it's there.


> >Server S1 serializes some LDPC using a generic serializer. Client C
> >parses the data, recording each arc in an RDF graph, until it sees the
> >membership predicate. At that point, it scans back through the graph
> >so far, looking for the membership predicate and acting on each member
> >now that it can find them. It continues to parse the network input,
> >now in a streaming mode and able to dispatch each member as it arrives
> >from the network. Client C has broken no LDP, HTTP, RDF or Turtle
> >rules; it has only optimized for the part of the document that follows
> >the membership predicate.
> 
> i don't think i ever claimed that.
> 
> >Server S2 uses a custom serializer which starts out by emitting the
> >membership predicate. Client C can consume S2's data much more
> >efficiently because it can start out in streaming mode. S2 hasn't
> >broken any rules, but it is nonetheless a much more efficient server
> >for large collections.
> 
> how does s2 signal that c can safely go into streaming mode? or let's put

When the client sees the required membership triples, it knows how to
interpret the graph to date and the rest of the incoming network
stream. 
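
To make that concrete, here is a rough sketch of client C's logic. It
is only an illustration: the function name `consume`, the `dispatch`
callback, the use of plain string tuples for triples, and the
`ldp:membershipPredicate` IRI are all assumptions for the example, not
anything a spec defines. A real client would hang this off a Turtle
parser's triple callback.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings

# Assumed IRI for the arc that names the container's membership predicate.
LDP_MEMBERSHIP_PREDICATE = "http://www.w3.org/ns/ldp#membershipPredicate"


def consume(triples: Iterable[Triple], container: str,
            dispatch: Callable[[str], None]) -> int:
    """Dispatch each member of `container`, streaming as soon as possible.

    Returns how many triples had to be buffered before streaming began.
    """
    buffered: List[Triple] = []
    member_pred: Optional[str] = None
    for s, p, o in triples:
        if member_pred is not None:
            # Streaming mode: act on each member as it arrives, keep nothing.
            # (This sketch ignores non-member arcs once streaming.)
            if s == container and p == member_pred:
                dispatch(o)
            continue
        buffered.append((s, p, o))
        if s == container and p == LDP_MEMBERSHIP_PREDICATE:
            member_pred = o
            # Scan back through the graph so far for members already seen.
            for bs, bp, bo in buffered:
                if bs == container and bp == member_pred:
                    dispatch(bo)
    return len(buffered)
```

The same code handles either triple order. When the membership arc
comes first, the buffer never grows past that one triple.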


> it like this: if, as a side-effect of defining LDP, there now is some
> generic and context-independent way of how text/turtle can be used in a
> way that reliably optimizes exchange of (some kind of) ordered RDF-based
> data model, then that's nice and should be documented somewhere and then
> referenced in the LDP implementation guide.

What I outlined was already true of LDP; as soon as one saw the
membership triples, one could dispatch the graph and switch to
streaming mode. There was always going to be an incentive to serialize
the membership triples first in case there was a streaming client.
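
On the server side, acting on that incentive could be as small as a
reordering step in front of the serializer. This is a sketch under
assumptions: `membership_first` is a made-up helper, triples are plain
string tuples, and the `ldp:membershipPredicate` IRI is assumed. The
output is the same set of triples, so no text/turtle parser could tell
the difference.

```python
from typing import Iterable, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object) as plain strings

# Assumed IRI for the arc that names the container's membership predicate.
LDP_MEMBERSHIP_PREDICATE = "http://www.w3.org/ns/ldp#membershipPredicate"


def membership_first(triples: Iterable[Triple], container: str) -> List[Triple]:
    """Return the same triples with the container's membership arc(s) first."""
    head: List[Triple] = []
    tail: List[Triple] = []
    for triple in triples:
        s, p, _ = triple
        if s == container and p == LDP_MEMBERSHIP_PREDICATE:
            head.append(triple)
        else:
            tail.append(triple)
    # Relative order within each group is preserved.
    return head + tail
```

Note that this version holds the whole graph in memory before emitting
anything; a server that already knows its membership predicate could
simply write that arc out before iterating over the members.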

Henry's proposal simply provides a single arc to let the client know
that it has seen all of the membership triples, even if they are
default values. The cleverness lies in the fact that because it's
Turtle (or any other RDF dialect), there is a way to make a blank node
that can't be referenced later in that document or any other document.


> >I proposed that we say there is exactly one ldp:membershipRules arc
> >from the container to the node with all the membership predicate et
> >al. That doesn't break any LDP, HTTP, RDF or Turtle rules. Perhaps
> >this will meet with less resistance if we simply don't mention that
> >serializing that arc at the top of the document will enable more
> >efficient streaming parsers. We can let people figure it out for
> >themselves.
> 
> again, that seems like implementation guidance. in spec speak, i guess all
> you could do is say:
> 
> - servers MAY choose to serialize (some) responses this way: ...
> 
> - clients MUST NOT rely on servers serializing in the way described above.

Yup. C's code works both on S1 and S2. It just works better on S2. A
non-streaming client works identically well with S1 and S2.


> that's a relatively odd thing to add to a spec, at least in my opinion,
> because it creates zero constraints for anybody. which is good because
> that's how this discussion started and it seems we're in agreement about
> that. so documenting this might be useful, but it might make the spec more
> concise to leave it out, and instead move it to the implementation guide.
> 
> cheers,
> 
> dret.
> 

-- 
-ericP

office: +1.617.599.3509
mobile: +33.6.80.80.35.59

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

There are subtle nuances encoded in font variation and clever layout
which can only be seen by printing this message on high-clay paper.

Received on Monday, 11 November 2013 23:25:35 UTC