W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > July to September 2006

Re: ACTION item(s): definition(s) and an algorithm for postprocessing minimization

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 24 Aug 2006 09:52:49 -0700
Message-Id: <p06230977c11388ad0a03@[192.168.1.6]>
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>

>On Aug 21, 2006, at 10:53 AM, Seaborne, Andy wrote:
>
>>Bijan Parsia wrote:
>>>I believe that the qualms Andy had about what happens in 
>>>intermediate  steps was do the different interpretations of what 
>>>the minimized  final answer set would look like. That is, his 
>>>understanding was  close to 1 (like pat) and mine were entirely 
>>>different. However,  algorithms for computing 2 and 3 are easy 
>>>enough, so I'm doing the  action item anyway.
>>
>>My reservations are:
>>0/ BNodes over the graph matter - not answer sets
>>1/ Lack of streaming
>>2/ Additional processing cost [*]
>>3/ Non-standard processing (i.e. can't be, say SQL DISTINCT).
>
>Thanks for the clarifications.
>
>>[*] Even assuming you want to do it at all, graphs can be leaned 
>>once, then queried many times.  Answer sets have to be leaned on 
>>every query result.
>
>Very true. Though if we deal with updates as many real engines do, 
>the gap narrows a bit. Also, for many applications (cmd line based 
>query where you load the file each time) this is not true. Of 
>course, it is most true where you need the most speed, i.e., servers 
>with big triples store and lots of queries. OTOH, it's unclear how 
>painful typical queries will be. A lot depends on the size of the 
>result set and the amount of redundancy.
>
>>>So, first qualification: These algorithms are only minimizing with 
>>>respect to BNodes. You have to plug in your own account of 
>>>literals.  I use "row" for "answer" avoid confusion:
>>>DEFINITION 1: Answer graph template
>>>	Let A be an answer set and Avar be the set of column headings of A.
>>>	The answer graph template of A is the set of triple patterns, 
>>>such  that:
>>>		{tp | _:row ('http://var.org/#" ++ var) var. & var \in Avar}
>>                                                         ^^^
>>value?
>
>I just needed a random URI prefix. Substitute any you like.

We did once consider a proposal to have a special 'anonymous' prefix, 
basically an IRI space solely for skolem names, as part of the spec. 
That idea kind of died, I think (?) because it was felt to be 
artificial and not in the IRI spirit of rock-solid eternal universal 
identifiers that retain their meaning throughout the known universe, 
but it might be worth reconsidering it. There are emails about this 
in the archive somewhere but I confess I can't actually find them now.

>>(Just to show I'm reading this all!)
>
>Heh.
>
>Oh, if we had (do we have?) CONSTRUCT DISTINCT, I would personally 
>expect a lean graph as output.

I agree, if we have this, but I hope we don't :-)

Pat
-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Thursday, 24 August 2006 16:53:15 GMT

This archive was generated by hypermail 2.3.1 : Tuesday, 26 March 2013 16:15:27 GMT