[Prov-O] Review of examples

This email is mainly my notes as I have been checking the examples
used by PROV-O to make sure they are valid RDF and valid according to
the ontology.

I would need volunteers to complete the job (such as fixing the
examples that are broken!). Any takers? Also suggestions on other ways
to check this would be appreciated.



I've made a little shell script that checks all PROV-O examples (which
again are embedded into HTML):

http://dvcs.w3.org/hg/prov/file/fd4982cfd7f7/examples/eg-24-prov-o-html-examples/rdf/create/check-examples.bash

It uses CWM to check the syntax only, assuming N3 rather than Turtle.
The output is the filename if it is OK, or error message on stderr:


> stain@ralph-ubuntu:~/src/prov/examples/eg-24-prov-o-html-examples/rdf/create$ ./check-examples.bash
> rdf/property_qualifiedDerivation.ttl
> rdf/property_inserted.ttl
>
> (..)
>     Failed to parse file:///home/stain/src/prov/examples/eg-24-prov-o-html-examples/rdf/create/rdf/class_ContextualizedEntity.ttl
>    file:///home/stain/src/prov/examples/eg-24-prov-o-html-examples/rdf/create/rdf/class_ContextualizedEntity.ttl
> Traceback (most recent call last):
>   File "/usr/local/bin/cwm", line 750, in <module>
>     doCommand()
> rdf/property_hasAnchor.ttl
> rdf/property_key.ttl
> (..)
> stain@ralph-ubuntu:~/src/prov/examples/eg-24-prov-o-html-examples/rdf/create$
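
For anyone who does not want to open the repository, a check loop of
this kind can be sketched roughly as below. This is not the committed
check-examples.bash: the function name check_examples and the idea of
passing the parser command as arguments are made up for illustration,
and "cwm --n3" is only the invocation I would assume for a syntax-only
N3 check.

```shell
# Sketch only: check_examples is a hypothetical helper, not the committed script.
# Usage: check_examples DIR PARSER [ARGS...]
# Runs PARSER on every *.ttl file under DIR. Prints the filename on stdout
# when the parser exits 0; otherwise the parser's own stderr shows the error.
check_examples() {
  dir=$1
  shift
  failed=0
  for f in $(find "$dir" -name '*.ttl' | sort) ; do
    if "$@" "$f" > /dev/null ; then
      echo "$f"
    else
      failed=1
    fi
  done
  return $failed
}

# Assumed invocation for a syntax-only N3 check with CWM:
#   check_examples rdf cwm --n3
```

The function returns non-zero if any file failed, so it could also be
used from another script. Like the loops in this email, it does not
cope with filenames containing spaces.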


Some of these errors are because the files are in TriG format (named
graphs) rather than Turtle, and CWM does not support TriG.

The remaining syntax errors I have corrected and committed.


If we use this style for the named graphs:

:G1 = {
  :a :b :c
} .

Then it is valid N3 (but not Turtle), and can be included in the test.
I changed this temporarily (not checked in) using:

for f in $(find rdf -name '*ttl') ; do
  sed -e 's/{$/= {/' -e 's/^}/} ./' "$f" > /tmp/$$ && cp /tmp/$$ "$f"
done

.. to ensure they are valid enough - and then check with
./check-examples.bash again.
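
To see what that rewrite does without touching the working copy, the
two sed expressions can be run on a small sample. This is a sketch:
to_n3_graphs is a hypothetical name, and the three-line graph is made
up to mimic the TriG style of the named-graph examples.

```shell
# Sketch: to_n3_graphs is a hypothetical wrapper around the two sed
# expressions from the loop above; it reads stdin and writes stdout.
to_n3_graphs() {
  sed -e 's/{$/= {/' -e 's/^}/} ./'
}

# Made-up sample in the TriG style used by the named-graph examples.
printf ':G1 {\n  :a :b :c\n}\n' | to_n3_graphs
# Output:
# :G1 = {
#   :a :b :c
# } .
```

Note that the expressions only match a "{" at the very end of a line
and a "}" at the very start of one, so a graph written on a single
line, or with an indented closing brace, is left unchanged.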


BTW - I found it odd that prov:Note is asserted inside the note graph
in rdf/class_Note.ttl and rdf/class_Trace.ttl, rather than in the
graph outside that has prov:hasAnnotation - so I moved that out. (Also
I guess we don't want to reinvent an annotation ontology -
http://www.w3.org/community/openannotation/ )



Then on to checking for misspelled terms, like prov:Acvititity.

To include the named graph files in this, I temporarily flattened them
rather than use the Turtle syntax:
  hg revert -C rdf
  for f in $(find rdf -name '*ttl') ; do
    sed -e 's/{$/ a prov:Entity ./' -e 's/^}//' "$f" > /tmp/$$ && cp /tmp/$$ "$f"
  done
  hg revert -C rdf
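
The effect of the flattening can be tried on a sample in the same way.
Again a sketch only: flatten_graphs is a hypothetical name and the
input is made up.

```shell
# Sketch: flatten_graphs is a hypothetical wrapper around the two sed
# expressions from the flattening loop; it reads stdin and writes stdout.
flatten_graphs() {
  sed -e 's/{$/ a prov:Entity ./' -e 's/^}//'
}

# Made-up sample: the graph name becomes a typed resource and the
# statements inside the graph end up at the top level.
printf ':G1 {\n  :a :b :c .\n}\n' | flatten_graphs
# Output (the closing brace becomes an empty line):
# :G1  a prov:Entity .
#   :a :b :c .
```

One caveat: the statements inside the braces are copied as they are, so
the last one needs its own trailing "." for the flattened file to
parse.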




I've made a little shell script that merges all the PROV-O examples to
a single TTL file - ./merge-examples.bash

This generates merged.ttl - see
http://dvcs.w3.org/hg/prov/file/tip/examples/eg-24-prov-o-html-examples/rdf/create/merged.ttl

(Note - you do not need to do the above sed-trick if you are OK to
ignore the 4-5 TriG files from the merge - merge-examples will
silently skip files that don't parse)




I then opened merged.ttl in Protege 4.2 beta, and saved it out again in Turtle:

http://dvcs.w3.org/hg/prov/file/a068151bca91/examples/eg-24-prov-o-html-examples/rdf/create/implied.owl


As a side note - to distinguish between 'our' classes (like Painting)
and classes in other namespaces, like prov:Activity - turn on prefix
qnames by going to View -> Custom rendering -> Render by Qualified
Name.



As you see, Protege has filled in the blanks for 'new' classes and
properties, such as:

prov:inContext rdf:type owl:AnnotationProperty .
prov:contextualized rdf:type owl:AnnotationProperty .
prov:hadQuoter rdf:type owl:AnnotationProperty .
(..)
prov:Actvity rdf:type owl:Class .
prov:CompleteCollection rdf:type owl:Class .




Now, the boring job (which I have not done - any takers?) - is to go
through this list for the properties and classes (you can ignore
'Individuals') - and for any prov: keyword, look in merged.ttl to see
where it came from.

For instance, prov:hadQuoter - can be found in merged.ttl as:

@base <http://dvcs.w3.org/hg/prov/raw-file/tip/examples/eg-24-prov-o-html-examples/rdf/create/rdf/property_hadQuoter.ttl>
.
# ...
      prov:hadQuoter <http://data.semanticweb.org/person/luc-moreau>;
# ...

And so it is rdf/property_hadQuoter.ttl that needs to be fixed (or deleted).
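
This lookup could probably be scripted. A minimal sketch, assuming that
every example in merged.ttl is preceded by an "@base <...>" line as in
the excerpt above; term_sources is a hypothetical helper and not part
of the repository.

```shell
# Sketch: term_sources is a hypothetical helper.
# Usage: term_sources TERM FILE
# Prints the @base URI of every section of FILE in which TERM occurs.
term_sources() {
  awk -v term="$1" '
    /^@base[ \t]*</ {
      base = $0
      sub(/^@base[ \t]*</, "", base)   # drop the leading "@base <"
      sub(/>.*$/, "", base)            # drop the closing ">" and anything after
      next
    }
    index($0, term) && base != "" { print base }
  ' "$2" | sort -u
}

# Example use:
#   term_sources prov:hadQuoter merged.ttl
```

The match is a plain substring test, so a search for prov:Activity
would also report files that only use a longer name starting with
prov:Activity.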



The final step, in Protege 4.2, is to turn on the Reasoner and see if
our mega-merge is consistent. This turns out not to be the case - but
mainly because we have reused some terms across the examples. The steps:

* First flip in the top drop-down box to PROV-O to run the reasoner on this.
* Under Reasoner, select FaCT++
* Under Reasoner, click Start Reasoning
** This should succeed
* Now in the drop-down, go back to merged.ttl
* Under Reasoner, click Start Reasoning
** It will complain that the ontology is inconsistent
** Wait 2 seconds (not sure why)
** Click "Explain"
*** Protege will compute explanations for a while, I got at least 36
before I clicked Stop.

Unfortunately I don't seem to be able to export the explanations in text form.


Example explanation:

> 'The Painter' invalidatedAtTime "2012-09-02T01:31:00Z"
> invalidatedAtTime Range: dateTime

(here the error is probably that ^^xsd:dateTime is missing in the literal)
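
Literals of this kind can probably be found without the reasoner. A
rough sketch, where untyped_datetimes is a hypothetical helper and the
pattern is only a heuristic for "looks like a timestamp but has no
^^ datatype":

```shell
# Sketch: untyped_datetimes is a hypothetical helper.
# Usage: untyped_datetimes FILE...
# Prints (with line numbers) lines holding a quoted string that starts
# like YYYY-MM-DDT... and is not directly followed by "^^".
untyped_datetimes() {
  grep -nE '"[0-9]{4}-[0-9]{2}-[0-9]{2}T[^"]*"([^^]|$)' "$@"
}

# Example use:
#   untyped_datetimes rdf/*.ttl
```

It works line by line on the text, so it will miss literals written
with triple quotes or split over several lines.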


> filling-fuel endedAtTime "2012-04-24T18:31:00Z"^^xsd:dateTime
> Functional: endedAtTime
> filling-fuel endedAtTime "2012-04-24T18:21:00Z"^^xsd:dateTime

(Here the error is that two examples state endedAtTime for the same
instance - but have different literal values)


> draft2 qualifiedAssociation _:genid113
> Activity DisjointWith Entity
> qualifiedAssociation Domain Activity
> draft2 wasRevisionOf e1
> wasRevisionOf Domain Entity

Here the problem is that 'draft2' has been asserted with both
wasRevisionOf and qualifiedAssociation - but their domains are
disjoint; it can't be both an Activity and an Entity. This bug is in
rdf/class_Revision.ttl.


There are a few more - but there are some common patterns. I suggest:

* Rename example instances that are distinct
* Fix errors such as qualifiedAssociation above (check: Is the error
in the example or the OWL? Note that merged.ttl imports from
http://dvcs.w3.org - so if you change ProvenanceOntology.owl - do a
push before reloading in Protege)
* Re-merge and try again ..



Once the ontology is consistent - it is possible to do File -> Export
inferred axioms to ontology.

As an exercise - here are the inferred axioms from ProvenanceOntology.owl:

http://dvcs.w3.org/hg/prov/file/a068151bca91/examples/eg-24-prov-o-html-examples/rdf/create/implied-provo.owl

I've had a look at this, and most of these are obvious inferences that
reasoners need:

    <http://www.w3.org/ns/prov#agent> rdf:type owl:ObjectProperty ;
                                      owl:inverseOf [ owl:inverseOf <http://www.w3.org/ns/prov#agent>
                                                    ] .

(this could be one of the reasons why we don't have inverses for every
property!)

I looked through this list and did not find anything controversial.




-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Tuesday, 3 July 2012 11:19:28 UTC