- From: Héctor Pérez-Urbina <hector@clarkparsia.com>
- Date: Tue, 15 Jan 2013 10:16:59 -0500
- To: James Cheney <jcheney@inf.ed.ac.uk>
- Cc: public-prov-comments@w3.org, Evren Sirin <evren@clarkparsia.com>
- Message-ID: <CAKSqmQ6kqO9aynxZVVjWLDHtO9ApXYGHGy460R8GoYe79Q4gjA@mail.gmail.com>
Hi James, Please find some comments regarding point 1 below. My colleague Evren will address point 2. On Mon, Jan 14, 2013 at 5:36 PM, James Cheney <jcheney@inf.ed.ac.uk> wrote: <snip> Two substantive points: > > 1. There could potentially be other occurrences of y in other statements > in I. These also need to be replaced with S(y), which is why we specify > that S needs to be applied to the whole instance. (There are more > efficient ways of obtaining the same behavior, avoiding visiting the whole > instance, but the behavior must be the same as the specification.) For > example, suppose the instance as a whole were: > > A0. wasDerivedFrom(d; e,e0,a,y,u). > A1. wasGeneratedBy(gen1; e,a, t1, []). > A2. wasGeneratedBy(y; e,a, t1, [attr1=value1]). > > Then if you just apply S to A1 and A2, you get > > A0. wasDerivedFrom(d; e,e0,a,y,u). > A1'. wasGeneratedBy(gen1; e,a, t1, []). > A2'. wasGeneratedBy(gen1; e,a, t1, [attr1=value1]). > > and the connection between the generation involved in derivation step A0 > and that described by A2 is lost. (In this case, it so happens that other > constraints may force us to re-discover that y = gen1, but this is just > luck. If we replaced the generations by uses this would not be the case.) > > Generally, once we discover that a variable is equal to some other term, > we should substitute the variable everywhere, as is done in the "TGD/EGD > chase" in relational databases, which partly inspires what we are doing. > You seem to assume that variable 'y' stands for the same term in every independent statement. This will be extremely confusing for people with a logic programming background. There is the fundamental assumption that clauses (i.e., facts, statements) do not share variables. In order to address this case, one could use functional terms: A0. wasDerivedFrom(d; e,e0,a,*f(d)*,u). A1. wasGeneratedBy(gen1; e,a, t1, []). A2. wasGeneratedBy(*f(d)*; e,a, t1, [attr1=value1]). But now, things get trickier as gen1 won't unify with f(d). One could use some kind of binary 'canReplace' predicate and have uniqueness constraints infer things like canReplace(gen1, f(d)). Based on this predicate, one could define equivalence classes (sets of terms that represent the same constant), come up with some replacing mechanism (to replace f(d) with gen1 everywhere), and impose the constraint that we can have at most one constant in every equivalence class (i.e., canReplace(gen1, gen2) should lead to invalidity as the arguments do not unify). -- Best, Héctor
Received on Tuesday, 15 January 2013 15:19:10 UTC