RE: Bandwidth efficiency from Howard Katz on 2004-06-02 (public-rdf-dawg@w3.org from April to June 2004)

From: Howard Katz <howardk@fatdog.com>
Date: Wed, 2 Jun 2004 05:58:28 -0700
To: "Steve Harris" <S.W.Harris@ecs.soton.ac.uk>, "DAWG public list" <public-rdf-dawg@w3.org>
Message-ID: <IKEOLCDFPBBPPAHGNKKOCEOKELAA.howardk@fatdog.com>

> [mailto:public-rdf-dawg-request@w3.org]On Behalf Of Steve Harris

> On Tue, Jun 01, 2004 at 05:00:38 -0700, Howard Katz wrote:
> > This slight digression is inspired by the topic at hand. What
> about the idea
> > of prepending some sort of preamble to the query result to provide meta
> > information on the result set, as well as the general
> environment and (for
> > example) user-selectable settings of interest? In this case
> it's primarily
> > to reduce bandwidth (stealing the prefix-namespace mechanism
> generally used
> > on triples input to make the output both more compact and
> human-readable),
> > but I'm also throwing in a few other hypothetical preamble
> "infoitems" that
> > connect into several of our other uc requirement items as well:
> >
> >       dawg-ql:resultPreamble
> >       {
> >             dawg-ql:prefix  "ex"
> >             dawg-ql:namespace  "http://example.com/foo#"
> >             dawg-ql:resultFormat  dawg-ql:compactTriples
> >             dawg-ql:maxChunkSize  2048
> >             dawg-ql:numTriples  3
> >             dawg-ql:numInferredNodes  0
> >       }
> >       ex:alice ex:worksFor ex:deptA
> >       ex:bob ex:worksFor ex:deptA
> >       ex:deptA ex:hasName "DepartmentA"
> >
> > The particular preamble format here, whether by accident or
> design, looks a
> > lot like RDF but doesn't necessarily have to. I'm just trying
> out a concept.
>
> Also a result format in N3 or RDF/XML could do this using thier prefix
> mechanisms. There is a downside however, if the server is expected to
> strip out the namespaces for the URIs then it must store the result set
> internally and post process it to ensure that its got all the namespaces
> before it starts to send the results (as the namespaces aredeclared at the
> top). This increses latency.
>
> The situation where the namespaces are declared after the results is not
> much better as the client cant interpret them untill its parsed all the
> namespace decls anyway.
>
> - Steve

Right. So what you're saying is that the tradeoff is basically between :

(1) taking processing time to compute the namespace of every uri in the
result set before they depart the scene, or
(2) leaving it to the client to do the inverse operation following arrival,

I wouldn't have thought that (1) was all that expensive offhand -- I need to
have some hands-on play with this one.

There's one other alternative of course, but only if the store was created
to your own specification (which makes me realize I only have a vague
understanding of the various possible scenarios) :

(3) store the namespace (or at least some sort of integer key representation
of it) along with the uri for each node in the repository as it's added.
Then there's no latency or bandwith issues on the way out. However there
might be other internal processing issues that would arise in the area of
self-consistency and the like, and of course increased data size.

Our painful motto: No free lunch in computerland,
Howard

Received on Wednesday, 2 June 2004 08:58:06 UTC