Re: Updated SPARQL Query Results XML Format draft from Steve Harris on 2005-07-14 (public-rdf-dawg@w3.org from July to September 2005)

From: Steve Harris <S.W.Harris@ecs.soton.ac.uk>
Date: Thu, 14 Jul 2005 11:08:43 +0100
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <20050714100843.GC27994@login.ecs.soton.ac.uk>

On Thu, Jul 14, 2005 at 10:46:46 +0100, Dave Beckett wrote:
> On Wed, 2005-07-13 at 15:07 +0100, Steve Harris wrote:
> > On Wed, Jul 13, 2005 at 02:59:45 +0100, Dave Beckett wrote:
> > > However, I've also noticed a couple of items in Red Ink that still need
> > > thinking about:
> > > 
> > > 1. How/if to record duplicates in results. (Section 2.3.3)
> > > 
> > > When ORDER BY is given, the result format may record index="1",
> > > index="2" on the <result> element.  (Side issue - "may" or "should" do
> > > this?)
> > 
> > I don't really see the point of this, but how does it interact with OFFSET?
> > Shouldn't the count start from OFFSET + 1?
> 
> If we keep with this design, I guess so.
>  
> > > However when there are duplicates should it generate indexes 1, 2, 2, 3
> > > where items #2 and #3 are duplicates?  (A query with ORDER BY but no
> > > SELECT DISTINCT).
> > 
> > Strong "no" from me. Any numbering should be monotonic.
> 
> Monotonic means order-preserving, right?  So 1, 2, 2, 3 does preserve the
> order - if items #2 and #3 are duplicate results.  Otherwise order
> information is lost.

Yes, sorry, wrong word. I mean incrementing and consecutive. Or something.
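
For illustration, a minimal Python sketch of the two numbering schemes
discussed above. The function names are hypothetical, and starting the
count at OFFSET + 1 is only the suggestion made in this thread, not
something the draft specifies.

```python
def consecutive_indexes(rows, offset=0):
    """Incrementing, consecutive numbering: every row gets its own index.

    Assumption: counting starts at offset + 1.
    """
    return [offset + i for i in range(1, len(rows) + 1)]


def duplicate_sharing_indexes(rows, offset=0):
    """Alternative numbering where adjacent duplicate rows share an index."""
    indexes = []
    current = offset
    previous = object()  # sentinel that compares unequal to every row
    for row in rows:
        if row != previous:
            current += 1
        indexes.append(current)
        previous = row
    return indexes


# Ordered results without DISTINCT: items #2 and #3 are duplicates.
rows = [("a",), ("b",), ("b",), ("c",)]

print(consecutive_indexes(rows))             # [1, 2, 3, 4]
print(duplicate_sharing_indexes(rows))       # [1, 2, 2, 3]
print(consecutive_indexes(rows, offset=10))  # [11, 12, 13, 14]
```

The first scheme carries no information beyond document position plus
OFFSET; the second additionally records which adjacent rows are equal.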
 
> The index="number" item was added because we added ORDER BY and before
> we finished deciding what it would do.

As XML is inherently ordered it just seems like a waste of bytes to me.
I still care about bandwidth efficiency for mobile applications and so on.
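
A rough sketch of that point in Python: a client can recover each
result's position from document order alone, with no index attribute.
The XML here is a simplified, namespace-free stand-in for the results
format, assumed only for this example.

```python
import xml.etree.ElementTree as ET

# Simplified stand-in document (assumption: no namespace, one binding per result).
doc = """<results>
  <result><binding name="x"><literal>a</literal></binding></result>
  <result><binding name="x"><literal>b</literal></binding></result>
  <result><binding name="x"><literal>b</literal></binding></result>
</results>"""

root = ET.fromstring(doc)

# findall() returns child elements in document order, so enumerate()
# yields the same positions an explicit index attribute would carry.
positions = [
    (i, result.find("binding/literal").text)
    for i, result in enumerate(root.findall("result"), start=1)
]

print(positions)  # [(1, 'a'), (2, 'b'), (3, 'b')]
```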
 
> Maybe you just need to know that the results are ordered - i.e. an
> isOrdered boolean flag.   Is isDistinct also needed?  Those seem to be
> the two crucial flags that tell you the four forms of variable bindings
> results you can get:
>   1. a bag (the default)
>   2. an ordered sequence (ORDER BY)
>   3. an ordered sequence with no duplicates (ORDER BY + DISTINCT)
>   4. a set (DISTINCT)

Maybe. I'm not clear on any situation where the client would not know and
would care.
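
For reference, the four forms listed above reduce to a lookup on the two
proposed flags. This Python sketch uses hypothetical names; isOrdered and
isDistinct are only the flags floated in this thread, not part of the
draft.

```python
def result_form(is_ordered: bool, is_distinct: bool) -> str:
    """Map the two proposed flags to the form of the variable bindings results."""
    forms = {
        (False, False): "bag",                                # the default
        (True, False): "ordered sequence",                    # ORDER BY
        (True, True): "ordered sequence with no duplicates",  # ORDER BY + DISTINCT
        (False, True): "set",                                 # DISTINCT
    }
    return forms[(is_ordered, is_distinct)]


print(result_form(False, False))  # bag
print(result_form(True, True))    # ordered sequence with no duplicates
```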
 
>   Referring to 10.1 Solution Sequences and Result Forms
>   http://www.w3.org/2001/sw/DataAccess/rq23/#solutionsResults
> 
> unless the LIMIT and OFFSET indexes are important.

They may be, but again, I would have thought the client would either be
aware of whether it had used LIMIT and/or OFFSET, or be agnostic.

- Steve

Received on Thursday, 14 July 2005 10:09:09 UTC