- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Tue, 06 Mar 2012 18:09:40 +0100
- To: RDF WG <public-rdf-wg@w3.org>
Warning: this email is very long.
What follows describes how I would address the use cases (UCs) if only
the semantics of [1] were standardised. Several UCs require that the
semantics be extended (that is, that semantic constraints be added to
the notion of interpretation given in [1]). When a dataset is exchanged
between applications, it must be known which semantic extension is used.
To do that, I rely on "meta-statements" that should be put in the
dataset file (or accompany the file). There are many ways in which these
meta-statements can be provided (for instance, in a separate voiD
description).
Note that a formal semantics does not *do* anything. So UCs always
require something to be done that the formal semantics does not mandate.
A reasoner does not magically understand what people want, no matter
what statements are added to the data.
1st case: Simple report of different beliefs
----example-1----
:h2g2 { ex:truth owl:sameAs 42 .}
:devil { ex:truth owl:sameAs 666 .}
---end-of-ex-1---
Here, we have different parties asserting things that may or may not be
accepted. They express opinions, and opinions can contradict each other.
The fact that the opinions contradict each other (the graph merge is
inconsistent) does not mean that the report on the opinions is
inconsistent (the dataset is not inconsistent).
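The distinction can be illustrated with a minimal Python sketch. It
assumes a toy notion of consistency (a graph is inconsistent when one
term is owl:sameAs two distinct literals); the dictionary representation
and the function names are illustrative assumptions, not part of [1].

```python
# Toy model: a dataset maps graph names to sets of (s, p, o) triples.
# Literals are modelled as Python ints (an assumption for illustration).
dataset = {
    ":h2g2": {("ex:truth", "owl:sameAs", 42)},
    ":devil": {("ex:truth", "owl:sameAs", 666)},
}

def graph_is_consistent(triples):
    """Toy check: no subject is owl:sameAs two distinct literals."""
    values = {}
    for s, p, o in triples:
        if p == "owl:sameAs" and isinstance(o, int):
            # setdefault returns the literal already recorded for s, if any.
            if values.setdefault(s, o) != o:
                return False
    return True

def dataset_is_consistent(ds):
    """Each named graph is checked on its own (report of beliefs)."""
    return all(graph_is_consistent(g) for g in ds.values())

def merge_is_consistent(ds):
    """All graphs are merged first, then checked as a single graph."""
    merged = set().union(*ds.values())
    return graph_is_consistent(merged)

print(dataset_is_consistent(dataset))  # True
print(merge_is_consistent(dataset))    # False
```

The two checks differ only in whether the graphs are pooled before
checking, which is the point of this first case.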
2nd case: we additionally want to say something about the graph
(endorsement, etc):
This can be done by stating explicitly that the graph "labels" are meant
to denote the graphs themselves. Some syntactic sugar should be
added, something like this:
----example-2----
@graph-iris-denote-graph
:h2g2 :is :right .
:devil :is :wrong .
:h2g2 { ex:truth owl:sameAs 42 .}
:devil { ex:truth owl:sameAs 666 .}
---end-of-ex-2---
Note that here, it is necessary that the IRI ex:truth is interpreted
differently in the two graphs. The additional meta-statement
@graph-iris-denote-graph simply helps a dataset application determine
what convention is used. The formal semantics is still exactly the same,
but the statement can be used to carry out certain processing
differently than if the graph IRIs denoted, e.g., the primary topic of
the graph.
3rd case: crawlers and similar stuff:
----example-3----
@graph-iris-are-urls
<http://ex.org/doc1.rdf> { ex:truth owl:sameAs 42 .}
<http://ex.net/doc2.ttl> { ex:truth owl:sameAs 666 .}
---end-of-ex-3---
Again, there is no reason to force IRIs to denote exactly the same
thing when found in different sources, since documents online can be
wrong, contain mistakes, express beliefs, etc. Many applications will
not have the means to determine which one is correct (if only one is).
The meta-statement simply helps an application decide what to do with
this, but the formal semantics would not be affected.
As for archiving or versioning crawled data, I think each crawl should
be put in a different file, with version data kept separately, using
specific vocabularies for dataset metadata (e.g., voiD). I don't think
the dataset semantics has to address how these metadata are described.
4th case: terminological axioms denote universal truth
An application may store ontologies in the default graph and expect that
the axioms of the ontologies hold everywhere, all the time.
----example-4----
@default-graph-is-universal-truth
foaf:Person owl:disjointWith foaf:Organization .
foaf:Person rdfs:subClassOf foaf:Agent .
:g1 { ex:this a foaf:Person .}
:g2 { ex:this a foaf:Organization .}
---end-of-ex-4---
The meta-statement induces a semantic restriction here. Formally, a
dataset interpretation would satisfy this iff it "dataset-satisfies" the
dataset (as defined in [1]) *and* for all graph "names" <g>, Con(<g>)
satisfies the default graph (I use the new notation that Pat suggested).
In this case, we would infer that:
:g1 { ex:this a foaf:Agent .}
assuming RDFS or OWL semantics is used as a "local" semantics.
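A minimal Python sketch of this 4th case, assuming that only the RDFS
subclass rule (rdfs9) is used as the "local" semantics. Triples are
plain tuples, "a" is spelled out as rdf:type, and the helper names
(rdfs9_closure, entailed_in) are hypothetical.

```python
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

# Default graph: axioms that are meant to hold in every named graph.
default_graph = {
    ("foaf:Person", "owl:disjointWith", "foaf:Organization"),
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
}
named_graphs = {
    ":g1": {("ex:this", TYPE, "foaf:Person")},
    ":g2": {("ex:this", TYPE, "foaf:Organization")},
}

def rdfs9_closure(triples):
    """Apply the subclass rule (rdfs9) until nothing new is derived."""
    closed = set(triples)
    while True:
        subclass = {(s, o) for s, p, o in closed if p == SUBCLASS}
        new = {(x, TYPE, sup)
               for x, p, cls in closed if p == TYPE
               for sub, sup in subclass if sub == cls}
        if new <= closed:
            return closed
        closed |= new

def entailed_in(name):
    """Triples entailed in one named graph when the default graph
    is required to hold there as well."""
    return rdfs9_closure(named_graphs[name] | default_graph)

print(("ex:this", TYPE, "foaf:Agent") in entailed_in(":g1"))  # True
print(("ex:this", TYPE, "foaf:Agent") in entailed_in(":g2"))  # False
```

Each named graph is closed separately, so the disjointness axiom never
sees ex:this typed as both foaf:Person and foaf:Organization in the same
context.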
5th case: default graph must be the merge of "named" graphs (this has
been reported as common in real implementations)
----example-5----
@default-as-merge
:foaf { foaf:Person rdfs:subClassOf foaf:Agent .}
:g2 { ex:me a foaf:Person .}
---end-of-ex-5---
The meta-statement would induce an additional restriction here: that the
"local" interpretation of the default graph has to satisfy all RDF
graphs inside the "named" graphs. So, in this case, the dataset would
entail:
#default graph:
ex:me a foaf:Agent .
By using both:
@default-graph-is-universal-truth
@default-as-merge
one emulates the case where all graphs are merged.
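A minimal Python sketch of the @default-as-merge restriction, assuming
no blank nodes (so that the merge is a plain set union) and only the
RDFS subclass rule. The function names are hypothetical.

```python
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

named_graphs = {
    ":foaf": {("foaf:Person", SUBCLASS, "foaf:Agent")},
    ":g2": {("ex:me", TYPE, "foaf:Person")},
}

def default_as_merge(named, default=frozenset()):
    """The default graph must satisfy every named graph, so it
    contains at least the union of all of them."""
    merged = set(default)
    for triples in named.values():
        merged |= triples
    return merged

def rdfs9_closure(triples):
    """Apply the subclass rule (rdfs9) until nothing new is derived."""
    closed = set(triples)
    while True:
        subclass = {(s, o) for s, p, o in closed if p == SUBCLASS}
        new = {(x, TYPE, sup)
               for x, p, cls in closed if p == TYPE
               for sub, sup in subclass if sub == cls}
        if new <= closed:
            return closed
        closed |= new

default = rdfs9_closure(default_as_merge(named_graphs))
print(("ex:me", TYPE, "foaf:Agent") in default)  # True
```

Neither named graph entails the foaf:Agent triple on its own; it only
follows in the default graph, where both are pooled.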
6th case: Pat's case (IRIs must denote the same thing, but relationship
between things may evolve across time/context)
----example-6----
@iri-is-identity
:g2010-09-11 { ex:joe foaf:worksFor ex:ibm .}
:g2012-01-16 { ex:joe foaf:worksFor ex:cisco .}
---end-of-ex-6---
The meta-statement enforces that, for all graph IRIs <g>, <g'> and all
IRIs <u>, the interpretation of <u> in context <g> is the same as the
interpretation of <u> in context <g'>. Formally, Con(<g>)(<u>) =
Con(<g'>)(<u>). In this case, the following would be inconsistent:
----example-7----
@iri-is-identity
:g2010-09-11 { ex:james owl:sameAs ex:jim .}
:g2012-01-16 { ex:james owl:differentFrom ex:jim .}
---end-of-ex-7---
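A minimal Python sketch of why example 7 is inconsistent under
@iri-is-identity while example 6 is not. It assumes that, since
denotations are shared across contexts, owl:sameAs and owl:differentFrom
statements can be pooled over all graphs, while every other property
stays local to its graph. Only this one kind of clash is checked, and
the function name is hypothetical.

```python
example_6 = {
    ":g2010-09-11": {("ex:joe", "foaf:worksFor", "ex:ibm")},
    ":g2012-01-16": {("ex:joe", "foaf:worksFor", "ex:cisco")},
}
example_7 = {
    ":g2010-09-11": {("ex:james", "owl:sameAs", "ex:jim")},
    ":g2012-01-16": {("ex:james", "owl:differentFrom", "ex:jim")},
}

def consistent_under_iri_identity(dataset):
    """Pool owl:sameAs over all graphs (union-find), then check that
    no owl:differentFrom pair ended up in the same equivalence class."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    triples = [t for graph in dataset.values() for t in graph]
    for s, p, o in triples:
        if p == "owl:sameAs":
            parent[find(s)] = find(o)
    return all(find(s) != find(o)
               for s, p, o in triples if p == "owl:differentFrom")

print(consistent_under_iri_identity(example_6))  # True
print(consistent_under_iri_identity(example_7))  # False
```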
7th case: temporal validity changes (in intervals)
This case is trickier, but I need to introduce it, as it better explains
how we could do more complex reasoning with datasets (it then makes it
easier to explain how to address "separation of inferences" in [2]). I
would address this case by using literals in the fourth position,
rather than IRIs.
----example-8----
@temporal-reasoning
ex:chadhurley a ex:YoutubeEmployee . "[2005,2010]"^^interval
ex:YoutubeEmployee rdfs:subClassOf ex:GoogleEmployee . "[2006,2011]"^^interval
---end-of-ex-8---
Here, the additional restriction on the semantics is that each literal
in the datatype "interval" would be assigned a distinct interpretation.
Additionally, anything that is true in an interval [x,y] must be true in
all subintervals. As a consequence, in the example above, the following
quads would be inferred:
ex:chadhurley a ex:YoutubeEmployee . "[2006,2010]"^^interval
ex:YoutubeEmployee rdfs:subClassOf ex:GoogleEmployee . "[2006,2010]"^^interval
Since the fourth column is now identical for the two triples, the
semantics of [1] says that all normal RDF(S) inferences hold, therefore,
I can conclude:
ex:chadhurley a ex:GoogleEmployee . "[2006,2010]"^^interval
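The derivation above can be sketched in Python as follows. This is a
minimal sketch under stated assumptions: intervals are closed pairs of
years, the second quad is read as a subclass axiom (as in the inferred
quads), only the RDFS subclass rule is applied, and it is applied in a
single pass. The function names are hypothetical.

```python
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

quads = [
    ("ex:chadhurley", TYPE, "ex:YoutubeEmployee", (2005, 2010)),
    ("ex:YoutubeEmployee", SUBCLASS, "ex:GoogleEmployee", (2006, 2011)),
]

def intersect(a, b):
    """Largest interval included in both a and b, or None if disjoint.
    Both input quads also hold on it, being true in all subintervals."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def infer_types(quads):
    """One pass of the subclass rule, annotated with the interval on
    which both premises hold."""
    derived = set()
    for x, p1, cls, i1 in quads:
        if p1 != TYPE:
            continue
        for sub, p2, sup, i2 in quads:
            if p2 == SUBCLASS and sub == cls:
                common = intersect(i1, i2)
                if common is not None:
                    derived.add((x, TYPE, sup, common))
    return derived

print(infer_types(quads))
# {('ex:chadhurley', 'rdf:type', 'ex:GoogleEmployee', (2006, 2010))}
```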
Note that, according to XSD, datatypes provide not only a lexical space,
a value space, and an L2V mapping, but they should normally also provide
"facets", which are kinds of functions on datatypes. The "interval"
datatype could provide, as a facet, the comparison "included-in".
8th case: generalisation of 7th case
Other kinds of annotations could be used. For instance, a simple trust
measure for graphs (possibly calculated from page-rank-like algorithms).
Instead of using the "included-in" relation to define the semantic
restriction for satisfaction, the "less-than" relation would be used. It
can be generalised further, assuming an order on the values (and some
other restrictions).
One particular case in this generalisation is provenance annotations:
----example-9----
@provenance-reasoning
foaf:Person rdfs:subClassOf foaf:Agent . "foaf:"^^prov
ex:chadhurley a foaf:Person . "dbpedia:"^^prov
---end-of-ex-9---
If we write provenance as a conjunction of URLs, then one can infer:
ex:chadhurley a foaf:Agent . "foaf: \and dbpedia:"^^prov
This partly addresses Sandro's UC on "separating inferences".
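A minimal Python sketch of the provenance case, assuming that a
conjunction of sources is modelled as a frozenset of prefix strings and
that only the RDFS subclass rule is applied, in a single pass. The
function name is hypothetical.

```python
TYPE, SUBCLASS = "rdf:type", "rdfs:subClassOf"

# The fourth element is the provenance annotation: a set of sources,
# read as their conjunction.
quads = [
    ("foaf:Person", SUBCLASS, "foaf:Agent", frozenset({"foaf:"})),
    ("ex:chadhurley", TYPE, "foaf:Person", frozenset({"dbpedia:"})),
]

def infer_with_provenance(quads):
    """One pass of the subclass rule; the conclusion carries the
    conjunction (set union) of the provenance of both premises."""
    derived = set()
    for x, p1, cls, prov1 in quads:
        if p1 != TYPE:
            continue
        for sub, p2, sup, prov2 in quads:
            if p2 == SUBCLASS and sub == cls:
                derived.add((x, TYPE, sup, prov1 | prov2))
    return derived

for s, p, o, prov in infer_with_provenance(quads):
    print(s, p, o, sorted(prov))
# ex:chadhurley rdf:type foaf:Agent ['dbpedia:', 'foaf:']
```

Keeping the sources on each inferred quad is what lets an application
tell later which inputs a conclusion depends on.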
[1] RDF Datasets Proposal.
http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal
[2] Why Graphs. http://www.w3.org/2011/rdf-wg/wiki/Why_Graphs
--
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Tuesday, 6 March 2012 17:09:52 UTC