Re: Some thoughts about the revised provenance Model document from Satya Sahoo on 2011-10-17 (public-prov-wg@w3.org from October 2011)

From: Satya Sahoo <satya.sahoo@case.edu>
Date: Sun, 16 Oct 2011 20:00:28 -0400
To: "Myers, Jim" <MYERSJ4@rpi.edu>
Cc: Graham Klyne <GK@ninebynine.org>, Paul Groth <p.t.groth@vu.nl>, W3C provenance WG <public-prov-wg@w3.org>
Message-ID: <CAOMwk6yxhPFhKo7pHiDWKvKgHXLO3Q-npJ11yDgeg-im4757AQ@mail.gmail.com>
Hi all,
Sorry for joining this discussion so late.

Many of GK's points across multiple email threads about Entity and
complementOf seem to make sense to me - maybe that is my semantic web
technologies stack background...

Regarding the need to explicitly associate "characterizing attributes"
with an entity:
In the case of SW (RDF, RDFS, OWL etc.), if we say "provenance of x attributes
of the web page" are we referring to the association of values to the x
attributes of the web page (by person A)?

On the other hand, if we want to distinguish between two versions of the Web
page (and their x attributes) we assign different identifiers to the two
versions of the Web page wp1 and wp2 (and associate version info with both).
But, for provenance applications it is not necessary to have both wp1 and
wp2 along with their x attributes to make additional provenance assertions.
The identifiers wp1 and wp2 in any SW (RDF, OWL) setting are
"self-contained" means/handles to refer to the two versions of the Web
pages.
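
As a rough illustration of the point above (a sketch only: plain tuples
stand in for RDF triples, and the property names versionOf, version and
wasModifiedBy, the helper objects(), and the editor names are hypothetical
and not taken from any PROV draft), provenance assertions can be attached
directly to the identifiers wp1 and wp2 without carrying any x attributes
along:

```python
# Plain (subject, predicate, object) tuples stand in for RDF triples.
# All property names and values here are illustrative assumptions.
triples = {
    ("wp1", "versionOf", "webpage"),
    ("wp1", "version", "1"),
    ("wp2", "versionOf", "webpage"),
    ("wp2", "version", "2"),
    # Provenance assertions refer only to the identifiers wp1 and wp2.
    ("wp1", "wasModifiedBy", "Arthur"),
    ("wp2", "wasModifiedBy", "Brenda"),
}


def objects(subject, predicate):
    """Return the sorted objects asserted for a subject/predicate pair."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


print(objects("wp1", "wasModifiedBy"))
print(objects("wp2", "wasModifiedBy"))
```

The version information is recorded once against each identifier; the
provenance statements themselves never need to repeat it.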

Similarly, for Jim's example of "Luc", "Luc in Boston", if a provenance
application wants to have a mechanism to make assertions only for "Luc in
Boston" and distinguish it from "Luc in London" then it can have two
distinct identifiers LB and LL and make assertions about them. In fact, they
can be modeled as instances of a specialized Role class called
"PersoninGeographicalLocation" and LB and LL can be instances of the class
(from SW perspective). "LucinGeographicalLocation" is linked to "Luc" the
person by the proposed property "assumedRoleOf".
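
A minimal sketch of this modeling, with plain tuples standing in for RDF
triples. The class name PersoninGeographicalLocation and the property
assumedRoleOf come from the text above; "type", "locatedIn", "arrivedBy",
the value "plane", and the helpers roles_of() and assertions_about() are
hypothetical names made up for illustration:

```python
# Each tuple is (subject, predicate, object).
triples = [
    ("LB", "type", "PersoninGeographicalLocation"),
    ("LB", "assumedRoleOf", "Luc"),
    ("LB", "locatedIn", "Boston"),      # hypothetical property
    ("LL", "type", "PersoninGeographicalLocation"),
    ("LL", "assumedRoleOf", "Luc"),
    ("LL", "locatedIn", "London"),      # hypothetical property
    # An assertion made only about "Luc in Boston" (made-up value).
    ("LB", "arrivedBy", "plane"),
]


def roles_of(person):
    """Identifiers linked to the person through assumedRoleOf."""
    return sorted(s for s, p, o in triples
                  if p == "assumedRoleOf" and o == person)


def assertions_about(identifier):
    """All (predicate, object) pairs asserted for one identifier."""
    return sorted((p, o) for s, p, o in triples if s == identifier)


print(roles_of("Luc"))
print(assertions_about("LB"))
```

Assertions about LB do not leak to LL or to Luc; the only connection
between them is the assumedRoleOf link.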

As we discussed in PROV-ISSUE-89 [1], in the SW context and many knowledge
representation languages such as earlier Frames, semantic networks, and
Topic maps (from my current understanding), having an explicit set of
attribute-value pairs to be "carried" to distinguish entities is not
necessary.

Thanks.

Best,
Satya

[1] http://www.w3.org/2011/prov/track/issues/89

On Tue, Oct 4, 2011 at 10:40 AM, Myers, Jim <MYERSJ4@rpi.edu> wrote:

> In a practical/syntactic sense - could the use of account make it easy to
> define fixed attributes? E.g. an RDF statement that A dc:creator B in an
> account can be taken to mean that B is a fixed attribute of A, whereas the
> same statement not in an account could not be interpreted that way (it is
> just an RDF statement that has no temporal window of validity associated
> with it, A could have a different dc:creator some other time, etc.). This
> may be heretical for some reason that I don't know but I naively thought it
> might be a useful way to avoid having to define fixed attributes directly as
> an rdf/owl construct and make a fairly readable way to identify which RDF
> statements about an entity correspond to fixed attributes in the data model.
> Thoughts?
>
>  Jim
>
> > -----Original Message-----
> > From: public-prov-wg-request@w3.org [mailto:public-prov-wg-
> > request@w3.org] On Behalf Of Graham Klyne
> > Sent: Friday, September 30, 2011 5:09 AM
> > To: Paul Groth
> > Cc: 'W3C provenance WG'
> > Subject: Re: Some thoughts about the revised provenance Model document
> >
> > On 30/09/2011 08:33, Paul Groth wrote:
> > > Hi Graham,
> > >
> > > I think the purpose of characteristic attributes is to define what
> > > you're describing the provenance of.
> > >
> > > So when I hand you provenance information for example for a web page,
> > > you'll know that I'm describing the provenance of x attributes of the
> > > web page and not everything there.
> >
> > Hmmm... that's not quite how I was seeing this.
> >
> > If we talk about "the web page modified by Arthur", what does it mean to
> > describe the provenance of the attribute "modified by Arthur"?  I think
> > that we're talking about the provenance of the particular view/perspective
> > of the web page that happens to be modified by Arthur, which may be one of
> > a family of pages modified by different editors.
> >
> > ...
> >
> > I think there are two orthogonal issues here:
> >
> > (1) given some entity, to make provenance assertions about that entity.
> > To be meaningful, the entity needs to be static (or constrained) to the
> > extent that the assertions continue to be truthful in a variety of
> > contexts.  So the entity might be "the weather report for Monday" which
> > was modified by Arthur and will always remain so, even though Tuesday's
> > version of that weather report may have been modified by Brenda.
> >
> > (2) the other issue is:  given some dynamic entity, how to identify a
> > "view" or "perspective" on that entity that has some static
> > characteristics to allow meaningful expression of provenance information?
> > To my way of thinking, characterizing attributes are a perfectly
> > reasonable way to do this, but not the only way.  Another possibility
> > might be that the entity concerned has a unique name, and we can just use
> > that (e.g. the published working versions of W3C specifications, URIs of
> > document versions in a version management system).
> >
> > So while I acknowledge that the attributes might be useful for some
> > applications of provenance, I'm not convinced they are needed for all
> > applications.  I also acknowledge that the model specification allows that
> > the attributes may be not specified.  But I am slightly concerned that
> > there may be situations for identifying an entity that don't fit the
> > attribute model, which would formally restrict our ability to express
> > provenance information.
> >
> > ...
> >
> > In the big scheme of things, I think the attributes approach is quite
> > workable for any practical example I can think of.  My main objection is
> > that I think it's unnecessary for it to be baked in to the whole
> > provenance model, and as such runs the risk of making the whole framework
> > more complex than it needs to be.
> >
> > ...
> >
> > Another way to look at this might be to consider that provenance
> > assertions for a given Entity MUST continue to be true for that entity, or
> > they are not meaningful as provenance assertions.  To the extent that
> > provenance assertions actually *are* static attributes of that entity,
> > then the existence of static attributes (in the style of "characterizing
> > attributes") may be inferred.  In this respect, the static attributes are
> > a consequence rather than a defining aspect of the existence of
> > meaningful provenance information.
> >
> > #g
> > --
> >
> > > Graham Klyne wrote:
> > >> Jim,
> > >>
> > >> If I understand you correctly, the significance of attributes is for
> > >> discovery of related resources.
> > >>
> > >> My understanding is that the primary purpose of provenance is to
> > >> establish a basis for trust, a reason to believe (or not) some
> > >> information that is presented about some subject. It's not clear to me
> > >> what need there is to use attributes for resource discovery to achieve
> > >> this end. (But I may well be missing something here.)
> > >>
> > >> So, on this basis, there may be perfectly good reasons to have defined
> > >> attributes and values for discovery purposes, but I'm not seeing why
> > >> they are needed to achieve the goals of *provenance* information.
> > >>
> > >> (But it's getting late here, and maybe I'm missing some key point in
> > >> your message.)
> > >>
> > >> In summary: I think your concerns are reasonable, but what makes them
> > >> in scope specifically for *provenance* information?
> > >>
> > >> #g
> > >> --
> > >>
> > >> On 29/09/2011 18:44, Myers, Jim wrote:
> > >>> Graham,
> > >>>
> > >>> How would we use provenance to find, for example, how Luc got to
> > >>> Boston? It's clear if we have fixed attributes for name and location
> > >>> such that we could query for an entity with name Luc that has an ivpOf
> > >>> relationship with an entity in Boston and then look at the provenance
> > >>> from there. How would it work without fixed attributes in the prov
> > >>> model? I'm guessing that you're thinking that we can find those
> > >>> attributes outside the language somewhere (e.g. non-prov RDF
> > >>> statements) but what are the minimal requirements there and what
> > >>> language/models exist that meet them? Can we only model provenance of
> > >>> things for which ontologies have been developed? Presumably it has to
> > >>> be possible to associate descriptive metadata with the entities
> > >>> through some path (what relationship(s)?)? And it has to be clear
> > >>> which metadata is fixed? You mention being able to infer across ivpOf
> > >>> relationships - is there one set of inference rules for all possible
> > >>> descriptive metadata? Or do we need to be able to distinguish further
> > >>> between types of metadata?
> > >>> --> As you can probably guess from the questions above, I'm concerned
> > >>> that kicking fixed attributes out will end up being more complex and
> > >>> place a higher burden on users than keeping them in, but I may be
> > >>> misunderstanding how such an alternative would work. Part of that
> > >>> concern is that I think I hear that modeling experts in this group can
> > >>> handle defining classes for different types of entities that would
> > >>> allow discovery by attribute, but I'm concerned that being able to do
> > >>> this becomes a requirement for using provenance (versus asserting
> > >>> entities defined solely by attributes(entity, name=Luc) or perhaps in
> > >>> a mixed mode (e.g. an entity representing Luc that 'hasBaseType'
> > >>> foaf:person and one representing him in Boston that also hasBaseType
> > >>> foaf:person and location=Boston as a fixed attribute.) Again - perhaps
> > >>> I'm misunderstanding how discovery based on descriptive information
> > >>> could be done if we don't have fixed characterizing attributes in the
> > >>> prov standard....
> > >>>
> > >>> Jim
> > >>>
> > >>>> 3. Do we need to model "Characterizing attributes"?
> > >>>>
> > >>>> The notions of "characterizing attributes" have developed to derive
> > >>>> the relationship between different entities that are views of some
> > >>>> common thing in the world. I am not convinced that we need to model
> > >>>> these attributes, and I'm not sure the way they are modelled can
> > >>>> necessarily apply in all situations that applications might wish to
> > >>>> represent.
> > >>>>
> > >>>> At heart: when it comes to exchanging provenance information, why do
> > >>>> we *need* to know exactly what makes one entity a constrained view of
> > >>>> another? What breaks (at the level of exchanging provenance
> > >>>> information) if we have no access to such information? How are
> > >>>> applications that exchange provenance information about entities for
> > >>>> which they don't actually know about these attributes to know about
> > >>>> their correspondences with real-world things?
> > >>>>
> > >>>> I think the role of attributes here is mainly to *explain* some
> > >>>> aspects of the provenance model, but they do not need to be part of
> > >>>> the model.
> > >>>>
> > >>>> To my mind, a simpler approach would be to allow for assertion of an
> > >>>> IVPof type of relationship between entities, from which some useful
> > >>>> inferences about any attributes present might flow, but I don't see
> > >>>> the need for the attributes to be in any sense defining of the
> > >>>> entities.
> > >>>>
> > >>>> <aside>
> > >>>> My suggested definition of IVPof might be something like this:
> > >>>>
> > >>>> A IVPof B iff forall p : (Entity -> Bool) . p(B) => p(A)
> > >>>>
> > >>>> where A, B are Entities, and the values of p are predicates on
> > >>>> Entities.
> > >>>> </aside>
> > >>>>
> > >>>> ...
> > >>>>
> > >>>> #g
> > >>>>
> > >>>
> > >>>
> > >>
> > >
>
>
>
Received on Monday, 17 October 2011 00:01:10 UTC