- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Fri, 13 Aug 2004 20:22:05 -0400
- To: Jos De_Roo <jos.deroo@agfa.com>
- Cc: public-rdf-dawg@w3.org
- Message-ID: <20040814002205.GB24809@w3.org>
On Fri, Aug 13, 2004 at 08:17:06PM -0400, Eric Prud'hommeaux wrote:
> Below, Jos supplied two examples of provenance using cwm's
> log:includes. cwm directives (queries, rules) are expressed in terms
> of triples similar to RDF triples. cwm has an additional node type, a
> formula, which allows one to make assertions about a group of
> statements. The current RDF specifications provide a mechanism for
> making assertions about individual statements, reification [1], but it
> doesn't seem to be used, at least in query.
>
> There are some issues with using reification to make assertions about
> statements (for example, the Superman problem [2], or writing
> unordered collections of reified statements). In spite of these
> problems, I bet the major reason it isn't used is that it is a pain to
> write out. cwm and Euler don't have that problem because formulas are
> much easier to write than collections of reified statements.
>
> The RDF app most familiar to me is Annotea. It uses source provenance
> for data management and sanity checking. For instance, if someone
> wants to delete the statements from some (virtual) document, they
> delete all the statements with that "Attribution". Another query
> looks for resources where the same individual has said they
> have type Annotation and annotate a particular resource.
>
> FOAF is another application that pays attention to who said what [3].
> Again it needs to know the origin of each statement. Because these
> queries are easy to express in cwm, it has some relevant test cases.

Oops, I intended to poll others and ask what apps they were supporting that involved provenance or worked around it in clever ways. Also, how widely should I ask this question? rdfig?

> Expressed in fairy tale format, consider the following query case:
>
> Joe is using a DAWG-QL application to write his checks. He does this
> by merging documents from his credit card bank, his calendar, and some
> collaborative scheduling pages maintained by his coworkers. The credit
> card bank is the only one allowed to provide the amount and recipient
> of the checks. The other documents provide other ledger information,
> some of which is on the checks in the memo field.
>
> I have attached an IRC log between AndyS, DaveB and myself discussing
> whether and how provenance should be queried or constrained in BRQL.
>
> On Mon, Jul 26, 2004 at 12:25:26AM +0200, Jos De_Roo wrote:
> >
> > For an explanation of log:semantics, log:includes and log:notIncludes
> > I would like to point to http://www.w3.org/2000/10/swap/doc/Reach
> >
> > Now let's assume that
> >
> > <a.n3> a q:Source.
> > <b.n3> a q:Source.
> >
> > and a.n3 is
> >
> > :foo :a "a".
> > :foo :b "b".
> >
> > and b.n3 is
> >
> > :bar :a "a".
> >
> > Then the query
> >
> > [] q:select { (?O ?SRC) };
> > q:where {?SRC a q:Source. ?SRC.log:semantics log:includes {?S ?P ?O}}.
> >
> > results in
> >
> > ("a" <file:/temp/a.n3>) .
> > ("b" <file:/temp/a.n3>) .
> > ("a" <file:/temp/b.n3>) .
> >
> > as a matter of test case.
> >
> >
> > Another test case is that the query
> >
> > @prefix log: <http://www.w3.org/2000/10/swap/log#>.
> > @prefix q: <http://www.w3.org/2004/ql#>.
> > @prefix x: <http://example.com/exon/#>.
> > [] q:select { (?E) };
> > q:where { <http://www.w3.org/2000/10/swap/test/EricNeumann/exdata.n3>
> > log:semantics ?F.
> > ?F log:includes { ?T1 a x:Transcript; x:hasExon ?E. ?T2 a
> > x:Transcript }.
> > ?F log:notIncludes { ?T2 x:hasExon ?E }}.
> >
> > results in
> >
> > (<http://www.w3.org/2000/10/swap/test/EricNeumann/exdata.n3#ATP1B4_e3>) .
> > (<http://www.w3.org/2000/10/swap/test/EricNeumann/exdata.n3#ATP1B4_e2>) .
> >
> >
> > --
> > Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/
>
> [1] http://www.w3.org/TR/rdf-syntax-grammar/#section-Reification
> [2] http://www.w3.org/2001/12/attributions/#superman
> [3] http://www-106.ibm.com/developerworks/xml/library/x-foaf2.html#N10163
> --
> -eric
>
> office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
> Shonan Fujisawa Campus, Keio University,
> 5322 Endo, Fujisawa, Kanagawa 252-8520
> JAPAN
> +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
> cell: +1.857.222.5741 (does not work in Asia)
>
> (eric@w3.org)
> Feel free to forward this message to any list for any purpose other than
> email address distribution.

Content-Description: irc://irc.w3.org/%23dawg 2004-08-13 -- provenance in BRQL

> 2004-08-13T16:21:22Z <AndyS> SOURCE is a good one because it isn't clear what "it" is yet and the F2F time is good for that
> 2004-08-13T16:21:45Z <ericP> roger
> 2004-08-13T16:21:53Z <AndyS> email for a bit (explore) ; meet to get a sense of people's views ; then can "do"
> 2004-08-13T16:22:04Z <AndyS> So some text to light blue touch paper (!)
> 2004-08-13T16:22:11Z <ericP> FOAF and annotea seem like the most documented users of source attribution
> 2004-08-13T16:22:23Z <AndyS> I was idling thinking of trickinesses -
> 2004-08-13T16:23:17Z <ericP> ...?
> 2004-08-13T16:23:35Z <AndyS> what about { :x :y :z . :x :y :z SRC(?s) . } How many matches to first (assume several :x :y :z)
> 2004-08-13T16:23:53Z <AndyS> Its the mixing of triple/3 and triple/4 that is interesting
> 2004-08-13T16:24:52Z <ericP> doc1 says :foo is a CashiersCheck
> 2004-08-13T16:25:02Z <ericP> anyone says CashiersCheck is a Check
> 2004-08-13T16:25:21Z <ericP> (though your ex. had the same statement twice, intended?)
> 2004-08-13T16:25:23Z <AndyS> (aside : Sidean aren't W3 members - may be they woudl be interested)
> 2004-08-13T16:26:00Z <ericP> (good point, i'll query Bob)
> 2004-08-13T16:26:02Z <AndyS> Better ex: { :x :y ?z. :x :y ?z SRC(?s) . }
> 2004-08-13T16:26:45Z * ericP thinks about it from a relational perspective...
> 2004-08-13T16:27:25Z <AndyS> Several matches to second part : so reverse the order and seem to get more lines in result set
> (duplicates) where there is no ?s in SELECT
> 2004-08-13T16:28:22Z <ericP> assuming ?s unbound:
> 2004-08-13T16:28:22Z <AndyS> Haven't worked it in detail but we are outside RDF and need to pick something that is just query - not
> general provenance which others are interested in a solution for
> 2004-08-13T16:29:03Z <ericP> relvar has a set of ?z bindings. for each, cross with the subset of those bindings where ?z is known
> 2004-08-13T16:29:11Z <AndyS> Thinks :: SELECT ?z { :x :y ?z. } vs SELECT ?z { :x :y ?z SRC(?s). }
> 2004-08-13T16:29:24Z <ericP> what does { :x :y ?z. :x :y ?z } do?
> 2004-08-13T16:29:36Z <AndyS> Why should they be the same? Why different?
> 2004-08-13T16:29:55Z <AndyS> Your ex: its all the ?z, once.
> 2004-08-13T16:30:48Z <ericP> yeah, seems to be all ?z crossed with all ?z restricted to the set where the ?z=?z
> 2004-08-13T16:30:55Z <AndyS> Syntax aside : I like differentiating the 4th element but just quads is OK.
> 2004-08-13T16:31:10Z <ericP> i prefer diferentiation
> 2004-08-13T16:31:21Z <AndyS> dy/dx
> 2004-08-13T16:32:00Z <ericP> algae used to be quads, but i discovered that sometimes i wanted to talk about other properties of the
> triple, for instance, datatype
> 2004-08-13T16:32:24Z <AndyS> ?datatype : why not "1"^^?x
> 2004-08-13T16:32:39Z <ericP> we use datatype in the atom serialization, but i bet we'll think of *something* else later on
> 2004-08-13T16:32:46Z <AndyS> its a property of the slot, not the triple isn't it?
> 2004-08-13T16:32:47Z <ericP> also, i think it helps cognition
> 2004-08-13T16:33:14Z <ericP> yeah, maybe it wasn't dt.
> 2004-08-13T16:33:20Z * ericP checks...
> 2004-08-13T16:33:27Z <AndyS> May well - is this area very open or are there a few well known approaches?
> 2004-08-13T16:33:58Z <ericP> the source stuff? opne, i think.
> 2004-08-13T16:34:02Z <AndyS> If opne; then email start else text in doc of "normal" approach
> 2004-08-13T16:34:05Z <ericP> don't know about XUL
> 2004-08-13T16:34:49Z <AndyS> 3Store has quads as does Redland and RDFStore (? the last one)
> 2004-08-13T16:35:17Z <AndyS> I think DaveB assignes no particular meaning to 4th slot
> 2004-08-13T16:35:27Z <DaveB> yeah
> 2004-08-13T16:35:37Z <AndyS> Not idle then!
> 2004-08-13T16:35:43Z <DaveB> 3store has 1 meaning, source or something
> 2004-08-13T16:35:50Z <DaveB> busy merging redland win32 patches
> 2004-08-13T16:35:53Z <ericP> daveb, used to be triples in containing sets, no?
> 2004-08-13T16:35:59Z <ericP> (now quads)
> 2004-08-13T16:36:11Z <AndyS> Redland isn't set based is it?
> 2004-08-13T16:36:15Z <DaveB> it is now
> 2004-08-13T16:36:19Z <DaveB> it is sets now
> 2004-08-13T16:36:30Z <DaveB> but with contexts on, you can have dup triples
> 2004-08-13T16:36:34Z <AndyS> sets of triples or sets of quads?
> 2004-08-13T16:36:45Z <DaveB> sets of triples or bags of quads
> 2004-08-13T16:36:47Z <AndyS> Ah - need to turns quads on?
> 2004-08-13T16:36:55Z <DaveB> well, not quite quads
> 2004-08-13T16:37:51Z <DaveB> I wrote about it most recently in http://www.w3.org/2001/sw/Europe/reports/large_scale_demo/
> 2004-08-13T16:39:20Z <DaveB> pub, see ya
> 2004-08-13T16:39:28Z -!- DaveB [dajobe@137.222.34.57] has quit [Quit: Client exiting]
> 2004-08-13T16:39:32Z <ericP> would you impelement CONSTRUCT * WHERE ( ?p rdf:type foaf:Person . ?p foaf:knows ?known )....
> 2004-08-13T16:39:36Z <ericP> rats, missed him
> 2004-08-13T16:40:38Z <ericP> i don't know whether to look for all the statements in the same set, differentiated by the last col,
> or whether i'd have to iterate over the known contexts
> 2004-08-13T16:40:40Z <AndyS> Whats the issue with the Q
> 2004-08-13T16:40:46Z <ericP> former seems more efficient
> 2004-08-13T16:41:08Z <AndyS> There is a protocol matter
> 2004-08-13T16:41:16Z <AndyS> Well - encoding matter really.
> 2004-08-13T16:41:52Z <AndyS> No syntax for quads : we coudl restrict SRC usage to result sets and in query
> 2004-08-13T16:41:58Z <ericP> re the Q, i was going to further constrain on triple to be stated by a known party (after i knew how
> the general query was executed
> 2004-08-13T16:43:15Z <ericP> "CONSTRUCT *" wasn't meant to complicate, just a boring head to the query
> 2004-08-13T16:43:27Z <ericP> "SELECT *" would have been better
> 2004-08-13T16:43:46Z <AndyS> :-)
> 2004-08-13T16:44:15Z <ericP> what do you think of { ?p rdf:type ?q (SRC(?s) ) }
> 2004-08-13T16:44:21Z <ericP> ie, move it to the constraints?
> 2004-08-13T16:45:00Z <ericP> then you can have syntaxes like { ?p rdf:type ?q (SRC() = <http://trusted.example/foo>) }
> 2004-08-13T16:45:07Z <ericP> for when you want to constrain
> 2004-08-13T16:45:32Z <ericP> maybe former could be { ?p rdf:type ?q (?s = SRC() ) }
> 2004-08-13T16:45:53Z <AndyS> Don't see point of outer () - its no different to wanting { ?x ?y (?<34) }
> 2004-08-13T16:46:18Z <AndyS> Could have SRC(?s) as a binding operation like any other slot
> 2004-08-13T16:46:24Z <ericP> i was tyring to parallel that syntax for the SRC constraints
> 2004-08-13T16:47:10Z <AndyS> SRC applies to triples - can we have SRC applied to graphs? graph patterns?
> 2004-08-13T16:47:16Z <AndyS> Syntax - err ----
> 2004-08-13T16:47:50Z * ericP digs up an algae test for expressivity comparison...
> 2004-08-13T16:47:51Z <AndyS> SRC(?s,{pattern}) so SRC(?s, { :x :y ?z. } )
> 2004-08-13T16:47:54Z <ericP>
> http://dev.w3.org/cvsweb/perl/modules/W3C/Rdf/test/Ephemoral0-alg.sh?rev=HEAD&content-type=text/x-cvsweb-markup
> 2004-08-13T16:48:01Z <ericP> look for ATTRIB
> 2004-08-13T16:48:07Z <ericP> (how i spell SRC)
> 2004-08-13T16:48:59Z <ericP> SRC(?s,{pattern}) is appealing...
> 2004-08-13T16:50:05Z <AndyS> What other systems to be considered?
> 2004-08-13T16:50:17Z <ericP> algae syntax: ask ?db ( ?ps ?ps ?o {?ps != t:zzz}{%ATTRIB == t:attrib1}. ...)
> 2004-08-13T16:50:44Z <ericP> doen that way to make it so SRC constraints are handled the same way as any other constraints
> 2004-08-13T16:51:00Z <ericP> i wonder what XUL uses
> 2004-08-13T16:51:51Z <ericP> crap, SRQL spec disappeared from <http://www.openrdf.org/publications/SeRQL%20user%20manual.pdf>
> 2004-08-13T16:52:01Z <AndyS> In forming a consensus, who/what should be factored in?
> 2004-08-13T16:52:34Z <AndyS> http://www.openrdf.org/doc/users/userguide.html
> 2004-08-13T16:53:19Z <ericP> a general, elegant, beautiful solution without the slightest deference to the more modest needs of the
> users?
> 2004-08-13T16:53:29Z <ericP> (factored in)
> 2004-08-13T16:53:53Z <ericP> what do people do now? what will they do in 1 year?
> 2004-08-13T16:54:07Z <AndyS> I see no contexts or quads
> 2004-08-13T16:54:23Z <ericP> beyond that is probably more work to speculate on than we would save in re-deployment
> 2004-08-13T16:54:37Z <AndyS> The objective is a consensus in the WG - that may be different, may be the same
> 2004-08-13T16:55:05Z <ericP> (btw, elegence proposal was in jest)
> 2004-08-13T16:55:31Z <ericP> FOAF unifiers (i forget the real name) use a bit of this
> 2004-08-13T16:55:39Z <ericP> edd wrote about it...
> 2004-08-13T16:56:13Z <ericP> http://www-106.ibm.com/developerworks/xml/library/x-foaf2.html
> 2004-08-13T16:56:21Z <AndyS> Only got a few mins more ...
> 2004-08-13T16:56:29Z <ericP> Annotea uses it
> 2004-08-13T16:56:42Z <ericP> Ontaria too
> 2004-08-13T16:56:50Z <AndyS> that seems to get well beyond "data access"
> 2004-08-13T16:56:59Z <ericP> what other "open world" data query systems are out there?
> 2004-08-13T16:57:37Z <ericP> well beyond, yeah, wondering if we can support the part of the job that isolates the set of statements
> from a particular document
> 2004-08-13T16:57:44Z <AndyS> This is tricky - is the WG the right set of people? Seems to be prov and privacy and trust and ... so
> much wider set of interested parties
> 2004-08-13T16:57:59Z <AndyS> Its in danger of being full workshop material!
> 2004-08-13T16:58:17Z <AndyS> So who are the WG interested parties?
> 2004-08-13T16:58:21Z <AndyS> (not me!)
> 2004-08-13T16:59:42Z <ericP> i'm sure we can make this as complicated as we want. someone will always be able to dream up an app
> that justifies new functionality.
> 2004-08-13T17:00:02Z <ericP> do you think isolating the set of statements is the sweet point?
> 2004-08-13T17:00:32Z <ericP> if so, can we construct a forum where, when we ask that question, we hear "yes, that's perfect" ?
> 2004-08-13T17:00:45Z <ericP> if so, let's ask that forum
> 2004-08-13T17:00:47Z <AndyS> Not sure - but I only have "think" experience. Do you think the WG will cluster around it?
> 2004-08-13T17:01:05Z <AndyS> (this is certainly meeting my idea of a thing to build towards for F2F)
> 2004-08-13T17:02:00Z <ericP> oops, let the smoke out of my crystal ball -- have to go on intuition
> 2004-08-13T17:02:16Z <ericP> may as well sound them out, i guess.
> 2004-08-13T17:05:10Z <AndyS> Anything else to cover now? (its 18:00 here - and I have to pack!)
> 2004-08-13T17:05:12Z -!- ericP2 [matthieu@128.30.52.30] has joined #dawg
> 2004-08-13T17:05:17Z <ericP2> hi andy, sorry
> 2004-08-13T17:05:40Z <ericP2> i made a ref to letting the smoke out and my laptop died
> 2004-08-13T17:05:55Z <AndyS> Smoke out the laptop - serious
> 2004-08-13T17:06:06Z <AndyS> Matthieu Fuzellier ?
> 2004-08-13T17:06:24Z <ericP2> yes, i'm using his irc client
> 2004-08-13T17:06:29Z <ericP2> and his chair
> 2004-08-13T17:06:42Z <ericP2> he's the new webmaster
> 2004-08-13T17:06:49Z <ericP2> also had a convient irc window
> 2004-08-13T17:07:00Z <ericP2> so i'll be a little while gettting things working again
> 2004-08-13T17:07:05Z <ericP2> shall we call it a day?
> 2004-08-13T17:07:14Z <ericP2> or do you want to wait for me to recover?
> 2004-08-13T17:13:50Z <AndyS> I have to help carry a rat cage - back in 5
> 2004-08-13T17:13:50Z <ericP2> ok
> 2004-08-13T17:13:50Z -!- ericP2 is now known as matthieu
> 2004-08-13T17:13:50Z -!- matthieu [matthieu@128.30.52.30] has left #dawg [Leaving]
> 2004-08-13T17:13:51Z -!- ericP [ericP@128.30.52.30] has joined #dawg
> 2004-08-13T17:13:51Z [Users #dawg]
> 2004-08-13T17:13:51Z [ AndyS] [ ericP]
> 2004-08-13T17:13:51Z -!- Irssi: #dawg: Total of 2 nicks [0 ops, 0 halfops, 0 voices, 2 normal]
> 2004-08-13T17:14:10Z -!- Channel #dawg created Fri Aug 13 04:20:29 2004
> 2004-08-13T17:15:00Z -!- Irssi: Join to #dawg was synced in 69 secs
> 2004-08-13T17:15:15Z <AndyS> I'm back
> 2004-08-13T17:15:26Z <AndyS> Got about 15
> 2004-08-13T17:18:28Z <AndyS> Your descriptions suggest that SRC isn't just a matter of recording the de facto status quo.
> 2004-08-13T17:18:33Z <ericP> ok. let me tell folks i'm supposed to heat with
> 2004-08-13T17:18:38Z <AndyS> Is that a fair comment?
> 2004-08-13T17:18:57Z <AndyS> (its cold in Boston!)
> 2004-08-13T17:19:11Z <ericP> i'd say, it's not status quo in the QLs, but it is in the apps.
> 2004-08-13T17:20:03Z <AndyS> So issue is how it appears ?
> 2004-08-13T17:20:31Z <ericP> yeah, the step of formlizing it for a QL isn't well understood, i think
> 2004-08-13T17:20:34Z <AndyS> And link to reification - as that is in (minorly) the RDF-core recs
> 2004-08-13T17:20:40Z <ericP> i feel confident, but that's 'cause i have my pet
> 2004-08-13T17:23:11Z <ericP> core says "here's how you reifiy" but specifically says that the product of reification entails
> nothing (beyond the simple graph), ie, no way to de-reify
> 2004-08-13T17:24:08Z <ericP> also, folks have lots of issues with reification and the superman prob
> 2004-08-13T17:24:43Z <ericP> i think the owness should be on the person making the owl:sameAs statements, not on the person saying
> "there is this statement..."
> 2004-08-13T17:25:08Z <AndyS> Exactly - there is a prov soln in rec - no one likes it but it is there. Are we effectively ignoring
> it? By passing it? Must be clear before Last call.
> 2004-08-13T17:25:43Z <ericP> http://www.w3.org/2001/12/attributions/#superman
> 2004-08-13T17:26:25Z <ericP> i don't think we have to bypass it, we can use the QL to imply the query over reified data
> 2004-08-13T17:26:42Z <ericP> but there are so many outstanding issues, i don't think we can
> 2004-08-13T17:26:55Z <ericP> so i *do* think we hav eto bypass it, i guess
> 2004-08-13T17:27:33Z <AndyS> so {:x :y :z :s} is shorhand for a stating?
> 2004-08-13T17:28:26Z <AndyS> This is another BobG interest area BTW
> 2004-08-13T17:28:45Z <ericP> "skating"? is that a reified graph?
> 2004-08-13T17:29:54Z <AndyS> No - is a quds actually the reification {:x :y :z :s} == { :s rdf:type Statement ; rdf:subject :x ;
> rdf:predicate :y ; rdf:object :z }
> 2004-08-13T17:30:23Z <AndyS> Or {:x :y :z :s} == { _:b rdf:type Statement ; rdf:subject :x ; rdf:predicate :y ; rdf:object :z } &&
> _:b seenIn :s
> 2004-08-13T17:31:07Z <AndyS> If bypass, then I think we aren't in QL land but in RDF2 land
> 2004-08-13T17:31:09Z <ericP> is there a standardized (or close) seenIn ?
> 2004-08-13T17:31:22Z <ericP> (log:include)
> 2004-08-13T17:31:27Z <AndyS> Not that I know of.
> 2004-08-13T17:31:43Z <ericP> hmm, tricky
> 2004-08-13T17:31:45Z <AndyS> log:include takes formula for RHS ?
> 2004-08-13T17:32:36Z <ericP> i always see the value being a statement
> 2004-08-13T17:32:53Z <ericP> but it is in {}s so it coudl be a bunch fo statements
> 2004-08-13T17:34:07Z <AndyS> In particular it is not a named group
> 2004-08-13T17:34:12Z <AndyS> c.f. TriX
> 2004-08-13T17:35:04Z <AndyS> So - it seems that there is discussion in Wg to be had. Whether to start with doc text or with email
> or something else is up to you (:-)
> 2004-08-13T17:35:36Z <ericP> arguemtns to either approach
> 2004-08-13T17:35:57Z <AndyS> Its style and preference as much as anything
> 2004-08-13T17:36:46Z <ericP> i'd like to ask some group "who uses provenance? do you have QL support?"
> 2004-08-13T17:36:49Z <ericP> what group would i ask?
> 2004-08-13T17:37:35Z <AndyS> WG at least - maybe others - but it is about WG consensus
> 2004-08-13T17:37:45Z <ericP> also, if we don't draw an arc between a statement and the document it came it (just describe
> provenance in normaitve text), i think we can duck RDF2
> 2004-08-13T17:37:50Z <ericP> true
> 2004-08-13T17:39:09Z <AndyS> Must go - have fun in FL - and hope you get there!
> 2004-08-13T17:40:24Z <ericP> cheers

--
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
Shonan Fujisawa Campus, Keio University,
5322 Endo, Fujisawa, Kanagawa 252-8520
JAPAN
+1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell: +1.857.222.5741 (does not work in Asia)

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Saturday, 14 August 2004 00:22:05 UTC
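
Editorial sketch: the IRC exchange turns on whether SELECT ?z { :x :y ?z. } and SELECT ?z { :x :y ?z SRC(?s). } should return the same rows. The Python below is only one possible reading, not cwm, algae, or BRQL semantics. It models statements as a bag of quads (DaveB's "sets of triples or bags of quads") and shows where the duplicates come from. The Quad type, the function names, and the doc1/doc2 data are hypothetical; the a.n3/b.n3 data follows Jos's first test case with shortened source names.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A triple plus the source document it was read from (hypothetical model)."""
    s: str
    p: str
    o: str
    src: str


# Data from Jos's first test case; source names shortened for illustration.
QUADS = [
    Quad(":foo", ":a", "a", "a.n3"),
    Quad(":foo", ":b", "b", "a.n3"),
    Quad(":bar", ":a", "a", "b.n3"),
]

# Hypothetical data: the same statement asserted by two documents.
DUPLICATED = [
    Quad(":x", ":y", "1", "doc1"),
    Quad(":x", ":y", "1", "doc2"),
]


def select_object_and_source(quads):
    """Rough analogue of q:select { (?O ?SRC) } with {?S ?P ?O} matched per source."""
    return [(q.o, q.src) for q in quads]


def select_z_merged(quads, s, p):
    """SELECT ?z { s p ?z }: match against the merged graph, a set of triples."""
    merged = {(q.s, q.p, q.o) for q in quads}
    return sorted(o for (ts, tp, o) in merged if (ts, tp) == (s, p))


def select_z_with_src(quads, s, p):
    """SELECT ?z { s p ?z SRC(?s) }: one row per (triple, source) pair, ?s not projected."""
    return sorted(q.o for q in quads if (q.s, q.p) == (s, p))


print(select_object_and_source(QUADS))
print(select_z_merged(DUPLICATED, ":x", ":y"))    # one row: ['1']
print(select_z_with_src(DUPLICATED, ":x", ":y"))  # two rows: ['1', '1']
```

Under this reading, the quad form yields one row per asserting document even when ?s is not selected, which is the duplicate-row effect AndyS describes. Whether a query language should behave this way is the open question in the log, not something the sketch settles.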