RE: How to attach a prov graph to an RDF Triple? from Stian Soiland-Reyes on 2017-10-25 (public-prov-comments@w3.org from October 2017)

From: Stian Soiland-Reyes <soiland-reyes@manchester.ac.uk>
Date: Wed, 25 Oct 2017 09:40:32 +0000
To: "Svensson, Lars" <L.Svensson@dnb.de>, Martin Doerr <martin@ics.forth.gr>, "public-prov-comments@w3.org" <public-prov-comments@w3.org>
Message-ID: <D5780135E58FC940BDB87E7D499910184B4E61BB@MBXP14.ds.man.ac.uk>
Yes, in nanopublications it is just those three graphs for assertion/provenance/pubinfo (their graph URIs could be anything), using whatever namespaces you feel appropriate (I would hope some PROV :-)

In addition there is a “glue” nanopublication resource that connects the other graphs; those statements live in a fourth “head” graph (in earlier work they were kept in the default graph).



Although their examples use TriG, any of the RDF syntaxes that support named graphs work well; the nanopub servers expose them as trig, nq, trix and jsonld – see http://openphacts.cs.man.ac.uk:8080/nanopub-server/nanopubs.html



In particular JSON-LD lends itself well to nanopubs, as you can express them as a nested JSON structure with almost no [lists]: https://gist.github.com/stain/c2ad32ef4423b4814e83c2821bb03506
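
To give a rough idea of the nesting (this is my own minimal sketch, not the content of the gist), here is a nanopub built as a nested structure in Python. The ex: namespace and the resource names are made up for illustration; np: and prov: are the published vocabularies. Each part is a node object carrying both @id and @graph, so the @id names the graph, while the statements on the outer object end up in the default graph rather than in a separate head graph.

```python
import json

# "ex" is a hypothetical example namespace; np: and prov: are real vocabularies.
context = {
    "ex": "http://example.org/",
    "np": "http://www.nanopub.org/nschema#",
    "prov": "http://www.w3.org/ns/prov#",
}

nanopub = {
    "@context": context,
    "@id": "ex:nanopubEx",
    "@type": "np:Nanopublication",
    # Each value below is a named graph: @id is the graph name,
    # @graph holds the statements inside that graph.
    "np:hasAssertion": {
        "@id": "ex:assertion",
        "@graph": [
            {"@id": "ex:trastuzumab",
             "ex:is-indicated-for": {"@id": "ex:breast-cancer"}},
        ],
    },
    "np:hasProvenance": {
        "@id": "ex:provenance",
        "@graph": [
            {"@id": "ex:assertion",
             "prov:wasDerivedFrom": {"@id": "ex:experiment"}},
        ],
    },
    "np:hasPublicationInfo": {
        "@id": "ex:pubInfo",
        "@graph": [
            {"@id": "ex:nanopubEx",
             "prov:wasAttributedTo": {"@id": "ex:paul"}},
        ],
    },
}

# Serialise to a JSON-LD document that a JSON-LD processor could load.
document = json.dumps(nanopub, indent=2)
print(document)
```

A JSON-LD processor should turn that into the same kind of quads as the TriG form, but I have not run this particular sketch through one.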





BTW, the reason Nanopub splits this into provenance (how the knowledge was derived) and pubinfo (how these RDF statements were made) is basically to avoid the old httpRange-14 problem of distinguishing what a resource “is” from its representation.  In PROV we also have specializations/alternates for this purpose, where you relate prov:Entity instances at different levels.



You could in theory do the same by building your own multi-graph structure with prov:Bundle, rdfg:Graph and prov:specializationOf – but how would you recognize this pattern, and how do you identify the “unit of knowledge” that your prov:Bundle specializes?

So, as Lars points out, you would then need to define new constraints or more detailed typing of these graphs to do something useful. Well, this is basically what Nanopub does (although I’ll admit it predates PROV).
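
As a rough illustration of why the explicit typing helps, here is a small Python sketch (plain tuples, no RDF library) that recognises the nanopub pattern in a set of quads. The example.org names and the helper functions find_nanopubs and statements_in are hypothetical, made up for this mail; only the np: terms come from the nanopub schema.

```python
NP = "http://www.nanopub.org/nschema#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"  # hypothetical namespace for the example data

# Quads as (subject, predicate, object, graph) tuples.
quads = [
    (EX + "nanopubEx", RDF_TYPE, NP + "Nanopublication", EX + "head"),
    (EX + "nanopubEx", NP + "hasAssertion", EX + "assertion", EX + "head"),
    (EX + "nanopubEx", NP + "hasProvenance", EX + "provenance", EX + "head"),
    (EX + "nanopubEx", NP + "hasPublicationInfo", EX + "pubInfo", EX + "head"),
    (EX + "trastuzumab", EX + "is-indicated-for", EX + "breast-cancer",
     EX + "assertion"),
]

PARTS = {
    NP + "hasAssertion": "assertion",
    NP + "hasProvenance": "provenance",
    NP + "hasPublicationInfo": "pubinfo",
}


def find_nanopubs(quads):
    """Map each resource typed np:Nanopublication to its head and part graphs."""
    found = {}
    # First pass: the graph holding the type statement is the head graph.
    for s, p, o, g in quads:
        if p == RDF_TYPE and o == NP + "Nanopublication":
            found[s] = {"head": g}
    # Second pass: follow the np:has* links stated in that head graph.
    for s, p, o, g in quads:
        if s in found and g == found[s]["head"] and p in PARTS:
            found[s][PARTS[p]] = o
    return found


def statements_in(quads, graph):
    """Return the triples asserted in one named graph."""
    return [(s, p, o) for s, p, o, g in quads if g == graph]


nanopubs = find_nanopubs(quads)
unit = nanopubs[EX + "nanopubEx"]
print(statements_in(quads, unit["assertion"]))
```

With only prov:Bundle and prov:specializationOf there is no such agreed entry point to look for, which is the recognition problem I mean above.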





--
Stian Soiland-Reyes, eScience Lab
School of Computer Science, The University of Manchester
http://orcid.org/0000-0001-9842-9718



From: Svensson, Lars<mailto:L.Svensson@dnb.de>
Sent: 23 October 2017 10:16
To: Martin Doerr<mailto:martin@ics.forth.gr>; Stian Soiland-Reyes<mailto:soiland-reyes@manchester.ac.uk>; public-prov-comments@w3.org<mailto:public-prov-comments@w3.org>
Subject: RE: How to attach a prov graph to an RDF Triple?



Martin, all,

On Tuesday, October 17, 2017 5:42 PM, Martin Doerr [mailto:martin@ics.forth.gr] wrote:

> I think there are too many reification constructs around.

+1; for someone new to this topic, it's a pretty steep learning curve...

> The distinction of "source",
> "micropublication" "NamedGraph" shouldn't play any role. The logic of giving set of
> triples a name and class, and a reference count for the referred triples should be
> generic.

Yes, and if I understand Stian's examples correctly, nanopublications _are_ essentially named graphs in TriG syntax. Is that correct?

> It is up to implementers to optimize for small, one-triple and large graphs in
> their code. I think constructs like the "bundle" are good, but should fall under one
> generic construct. What is needed is a constraint mechanism on NamedGraphs, or
> however you may call it: As in the Bundle construct, we need to be able to define an
> entity class which instantiates as a named set of triples constrained to an RDF or OWL
> subschema.

Do you have any suggestion what that class might look like? My imagination fails me here.

> Applications go far beyond provenance, basically all epistemological
> information, such as argument and inference models, annotation models, observation
> models. In these applications, graphs mixing schema and instance data are not
> particularly relevant and could be left out in a first attempt.

In my humble opinion, we should always try to avoid mixing schema and instance data. When it comes to actually performing inference operations, we of course have to, but publishing the schema data separately ought to increase the possibilities of data re-use.

Best,

Lars

> Opinions?

Yes, but perhaps not very well phrased.

Best,

Lars

> On 10/17/2017 1:49 PM, Stian Soiland-Reyes wrote:
> Yes, thinking of named graphs as contexts rather than sources means it is natural
> to see the same triple in multiple graphs.
>
> You could have one medium-sized graph for how an RDF file was loaded, a tiny graph
> of perhaps just that triple to assess its quality/assertion etc, and larger combined
> graphs for aggregated reasoning (including the all-inclusive union graph).
>
> In PROV we generalized this concept as “PROV bundles”
> https://www.w3.org/TR/prov-dm/#component4 – which allow you to describe the
> provenance of provenance, and relate alternate histories of entities that are loosely
> the “same thing” using https://www.w3.org/TR/prov-links/
>
> You will see that this also allows you to talk about the entity in another bundle – but it
> does admittedly not allow you to talk about particular triples/attributes separately.
>
>
> The miniature graph approach is also used by Nanopublications
> http://www.nanopub.org/guidelines/  – which basically have a single statement or so in
> an “assertion” graph, and then two associated graphs with “provenance” (how was that
> assertion made) and “publication info” (citation info).  Both of these would use prov:
> statements. The nanopublication graph itself just ties these other three graphs together
> as well as declaring itself as a nanopublication.
>
> Example from
> http://www.nanopub.org/2013/WD-guidelines-20131215/#well-formed-nanopublications
>
> :nanopubEx {
>      :nanopubEx a np:Nanopublication .
>      :nanopubEx np:hasAssertion :assertion .
>      :nanopubEx np:hasProvenance :provenance .
>      :nanopubEx np:hasPublicationInfo :pubInfo .
> }
>
> :assertion {
>     :trastuzumab :is-indicated-for :breast-cancer .
>     :assertion a np:Assertion .
> }
>
> :provenance {
>     :assertion prov:generatedAtTime "2012-02-03T14:38:00Z"^^xsd:dateTime  .
>     :assertion prov:wasDerivedFrom :experiment .
>     :assertion prov:wasAttributedTo :experimentScientist .
>     :provenance a np:Provenance .
> }
>
> :pubInfo {
>     :nanopubEx prov:wasAttributedTo :paul .
>     :nanopubEx prov:generatedAtTime "2012-10-26T12:45:00Z"^^xsd:dateTime .
>     :pubInfo a np:PublicationInfo .
> }
>
>
> The nanopublication servers can propagate these - see
> https://github.com/tkuhn/nanopub-server - using trusty URIs as graph names, the
> URIs containing a hash of the content of the nanopublication so it does not matter
> where it lives. In this aspect you have captured a “knowlet” and can cite it and talk
> about it in other ways – in particular deriving one nanopublication from another in their
> publication info.
>
> This split also allows you to state separately who made the asserted claim (but perhaps
> not in RDF) and who shaped it into RDF/nanopub.
>
>
>
> --
> Stian Soiland-Reyes, eScience Lab
> School of Computer Science, The University of Manchester
> http://orcid.org/0000-0001-9842-9718
>
> From: martin
> Sent: 17 October 2017 11:21
> To: public-prov-comments@w3.org
> Subject: Re: How to attach a prov graph to an RDF Triple?
>
> We use the RDF language TriG to define named graphs. Currently, all
> important triple stores and RDF-enabled graph databases support the
> concept as "contexts" or whatever. As implementation, it works exactly
> as you expect. We have implemented in this way argumentation and
> annotation models. Even though theory is somehow lagging behind, if you
> use Named Graphs only for instance level for reification, it works well,
> and has the advantage that you keep related triples in the context of
> reference.
>
> Best,
>
> Martin
>
> On 10/12/2017 6:59 PM, Svensson, Lars wrote:
> > Hi Olaf,
> >
> > On Tuesday, October 10, 2017 9:58 PM, Olaf Hartig [mailto:olaf.hartig@liu.se] wrote:
> >> To: public-prov-comments@w3.org
> >> There is another approach that we are currently working on. It goes by the
> >> name RDF* and SPARQL*. The basic idea is to allow for nesting of triples and,
> >> similarly, nesting of triple patterns in queries. Find a short, 4-pages
> >> description of the proposal and of our initial results in the following
> >> document:
> >>
> >> http://olafhartig.de/files/Hartig_ISWC2017_RDFStarPosterPaper.pdf
> >>
> >> The document also includes pointers to more detailed documents.
> >>
> >> Let me know if you have any questions about it.
> > Thanks, I shall look closely at it and then come back to you. I'm out of office next
> > week so please bear with me that it will take a few days...
> >
> > Best,
> >
> > Lars
> >
> >
>
> --
>
> --------------------------------------------------------------
>   Dr. Martin Doerr              |  Vox:+30(2810)391625        |
>   Research Director             |  Fax:+30(2810)391638        |
>                                 |  Email: martin@ics.forth.gr |
>                                                               |
>                 Center for Cultural Informatics               |
>                 Information Systems Laboratory                |
>                  Institute of Computer Science                |
>     Foundation for Research and Technology - Hellas (FORTH)   |
>                                                               |
>                 N.Plastira 100, Vassilika Vouton,             |
>                  GR70013 Heraklion,Crete,Greece               |
>                                                               |
>               Web-site: http://www.ics.forth.gr/isl           |
> --------------------------------------------------------------
>
>
>
Received on Wednesday, 25 October 2017 09:41:08 UTC