Re: PROV comments from Clark&Parsia

>> We have inferred some attributes for gen1. I don't see why A3 is not to be considered a PROV statement.
>
> I agree that it is a statement, but to be pedantic, what I wrote was
> that uniqueness/key constraints do not infer *new* statements, that
> is, statements with additional information not present in the instance
> before.

In this example, we are inferring gen1 has value1 for attr1. You can
say this is information already present in the instance but one can
also say all the inferences depend on information present in the
information in some way. In the end, this is still information that
was implicit in the original instance which is why we think it is
better characterized as an inference

> Instead, such constraints propagate equality information
> through variables, or merge statements that provide complementary,
> compatible information about the same identified object/relationship.
> At an RDF/OWL level (I believe) these would all translate to
> functional or inverse functional relationships, not subclass axioms.

This is true but in OWL there is no difference between a functional
property definition and a subclass axiom [1]. In fact

FunctionalObjectProperty(p)

is equivalent to

SubClassOf(owl:Thing ObjectMaxCardinality(1 p))

And in OWL, functional properties can lead to inferences. For example,
coupled with the functional property definition above, the following
statements

ObjectPropertyAssertion(p a b)
ObjectPropertyAssertion(p a c)
ObjectPropertyAssertion(p b d)

will infer

SameIndividual(b c)
ObjectPropertyAssertion(p c d)

Note that, this example is very similar to our example with PROV statements.

> In databases, these are almost always called functional dependencies
> or key constraints.

Yes, but our point is OWL functional properties are different than
functional dependencies in databases and PROV uniqueness constraints
are more like OWL functional properties. It is indeed this
relationship between PROV uniqueness constraints and OWL functional
properties that we think the term constraint is slightly misleading.

>
> Does this help?  In any case, as noted below, we're happy to revisit
> the terminology - to me, though, "uniqueness inferences" does not
> sound right.

Having said all this, I must also say this is a rather minor point. We
don't have any objections if you would like to keep using the term
"uniqueness constraints" but then adding a paragraph clarifying this
point would be good.

Best,
Evren

[1] http://www.w3.org/TR/owl2-syntax/#Functional_Object_Properties

On Tue, Jan 15, 2013 at 10:16 AM, Héctor Pérez-Urbina
<hector@clarkparsia.com> wrote:
> Hi James,
>
> Please find some comments regarding point 1 below. My colleague Evren will
> address point 2.
>
> On Mon, Jan 14, 2013 at 5:36 PM, James Cheney <jcheney@inf.ed.ac.uk> wrote:
>
> <snip>
>
>> Two substantive points:
>>
>> 1.  There could potentially be other occurrences of y in other statements
>> in I.  These also need to be replaced with S(y), which is why we specify
>> that S needs to be applied to the whole instance.  (There are more efficient
>> ways of obtaining the same behavior, avoiding visiting the whole instance,
>> but the behavior must be the same as the specification.)  For example,
>> suppose the instance as a whole were:
>>
>> A0. wasDerivedFrom(d; e,e0,a,y,u).
>> A1. wasGeneratedBy(gen1; e,a, t1, []).
>> A2. wasGeneratedBy(y; e,a, t1, [attr1=value1]).
>>
>> Then if you just apply S to A1 and A2, you get
>>
>> A0. wasDerivedFrom(d; e,e0,a,y,u).
>> A1'. wasGeneratedBy(gen1; e,a, t1, []).
>> A2'. wasGeneratedBy(gen1; e,a, t1, [attr1=value1]).
>>
>> and the connection between the generation involved in derivation step A0
>> and that described by A2 is lost.  (In this case, it so happens that other
>> constraints may force us to re-discover that y = gen1, but this is just
>> luck.  If we replaced the generations by uses this would not be the case.)
>>
>> Generally, once we discover that a variable is equal to some other term,
>> we should substitute the variable everywhere, as is done in the "TGD/EGD
>> chase" in relational databases, which partly inspires what we are doing.
>
>
> You seem to assume that variable 'y' stands for the same term in every
> independent statement. This will be extremely confusing for people with a
> logic programming background. There is the fundamental assumption that
> clauses (i.e., facts, statements) do not share variables.
>
> In order to address this case, one could use functional terms:
>
> A0. wasDerivedFrom(d; e,e0,a,f(d),u).
> A1. wasGeneratedBy(gen1; e,a, t1, []).
> A2. wasGeneratedBy(f(d); e,a, t1, [attr1=value1]).
>
> But now, things get trickier as gen1 won't unify with f(d). One could use
> some kind of binary 'canReplace' predicate and have uniqueness constraints
> infer things like canReplace(gen1, f(d)). Based on this predicate, one could
> define equivalence classes (sets of terms that represent the same constant),
> come up with some replacing mechanism (to replace f(d) with gen1
> everywhere), and impose the constraint that we can have at most one constant
> in every equivalence class (i.e., canReplace(gen1, gen2) should lead to
> invalidity as the arguments do not unify).
>
> --
> Best,
> Héctor

Received on Tuesday, 15 January 2013 15:53:47 UTC