- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Thu, 12 Apr 2012 13:22:56 +0100
- To: public-rdf-wg@w3.org
On 11/04/12 18:40, William Waites wrote:
> On Wed, 11 Apr 2012 10:37:22 -0400, Sandro Hawke <sandro@w3.org> said:
>
>     sandro> Put differently, as a test case:
>     sandro>
>     sandro>    Trig Document 1 (D1): <u> { <a> <b> 1 }
>     sandro>
>     sandro>    Trig Document 2 (D2): <u> { <a> <b> 2 }
>     sandro>
>     sandro> What is the merge/union of D1 and D2?
>     sandro>
>     sandro> It's not defined, when asked like this. We use
>     sandro> something Trig-Like but different:
>     sandro>
>     sandro>    D1A <u> {+ <a> <b> 1 }
>     sandro>    D2A <u> {+ <a> <b> 2 }
>     sandro>
>     sandro> in which case the merge is:
>     sandro>
>     sandro>    D3A <u> {+ <a> <b> 1,2 }
>     sandro>
>     sandro> ==or==
>     sandro>
>     sandro>    D1B <u> {= <a> <b> 1 }
>     sandro>    D2B <u> {= <a> <b> 2 }
>     sandro>
>     sandro> in which case there is no merge; they are inconsistent.
>
> Reading some of the background discussion, talking about crawler dumps
> and such, it seems to me there is quite a bit more information we
> might want to carry around in the "header" of a trig document.
>
> For example, if D1 was downloaded at time t1 and D2 at t2, one could
> reasonably conclude that even with the + notation it is inappropriate
> to merge them, D2 having superseded D1.
>
> Or perhaps D1 comes from a reliable source and D2 comes from someone
> whose data I'll use if I don't have anything better but otherwise I
> wouldn't trust. So when combining the information I'll throw out the
> second version. But perhaps I would nevertheless keep it around and do
> a straight additive merge if I know the cardinality of <b> to be
> greater than 1.
>
> My point is that combining data from different sources, or the same
> source at different times, is likely to need to take into account more
> than just the +/= hints. Some of this information can be in-band
> (e.g. time, source) and some must necessarily be out of band (e.g. how
> much I trust that source).
>
> Cheers,
> -w

I wholeheartedly agree.

 Andy
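For illustration only, the +/= merge behaviour in Sandro's test case can be sketched roughly as below. This is not code from the thread: the function name `merge_named_graph`, the exception `InconsistentGraphs`, and the tuple representation of triples are all hypothetical, and the mixed case (one `+` graph, one `=` graph) is left undefined because the quoted message does not cover it. As William points out, a real merge would likely also need download time, source, and trust, none of which this sketch models.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical representation: a triple is (subject, predicate, object).
Triple = Tuple[str, str, object]


class InconsistentGraphs(Exception):
    """Raised when two '=' (complete) versions of a graph disagree."""


def merge_named_graph(
    mode_a: str,
    triples_a: Iterable[Triple],
    mode_b: str,
    triples_b: Iterable[Triple],
) -> Tuple[str, FrozenSet[Triple]]:
    """Merge two versions of the same named graph.

    '+' means the listed triples are only part of the graph, so two
    '+' versions merge by set union.
    '=' means the listed triples are the whole graph, so two '='
    versions must be identical or they are inconsistent.
    """
    a, b = frozenset(triples_a), frozenset(triples_b)
    if mode_a == "+" and mode_b == "+":
        return "+", a | b
    if mode_a == "=" and mode_b == "=":
        if a != b:
            raise InconsistentGraphs("'=' graphs differ; no merge exists")
        return "=", a
    # Assumption: mixed or unknown modes are not defined by the quoted message.
    raise ValueError(f"merge not defined for modes {mode_a!r} and {mode_b!r}")


if __name__ == "__main__":
    d1 = [("<a>", "<b>", 1)]
    d2 = [("<a>", "<b>", 2)]
    # D1A + D2A: additive merge yields both values, as in D3A.
    print(merge_named_graph("+", d1, "+", d2))
    # D1B + D2B: the two complete graphs disagree.
    try:
        merge_named_graph("=", d1, "=", d2)
    except InconsistentGraphs as err:
        print("inconsistent:", err)
```

The sketch deliberately decides on the +/= hints alone, which is exactly the limitation the message above describes: superseding by timestamp or discarding an untrusted source would have to be layered on top.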
Received on Thursday, 12 April 2012 12:23:35 UTC