- From: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Date: Tue, 06 Mar 2012 18:09:40 +0100
- To: RDF WG <public-rdf-wg@w3.org>
Warning: this email is very long.

What follows describes how I would address the use cases (UCs) if only the semantics of [1] were standardised. Several UCs require the semantics to be extended (that is, adding semantic constraints on the notion of interpretation given in [1]). When exchanging a dataset between applications, it must be known which semantic extension is used. To do that, I rely on "meta-statements" that should be put in the dataset file (or accompany the file). There are many ways these meta-statements can be provided (for example, in a separate voiD description).

Note that a formal semantics does not *do* anything. So UCs always require something to be done that the formal semantics does not mandate. A reasoner does not magically understand what people want, no matter what statements are added to the data.

1st case: Simple report of different beliefs

----example-1----
:h2g2 { ex:truth owl:sameAs 42 .}
:devil { ex:truth owl:sameAs 666 .}
---end-of-ex-1---

Here, we have different parties asserting things that may or may not be accepted. They express opinions, and opinions can contradict each other. The fact that the opinions contradict each other (the graph merge is inconsistent) does not mean that the report on the opinions is inconsistent (the dataset is not inconsistent).

2nd case: we additionally want to say something about the graph (endorsement, etc):

This can be done by stating explicitly that the graph "labels" are meant to denote the graphs themselves. Some syntactic sugar should be added, something like this:

----example-2----
@graph-iris-denote-graph
:h2g2 :is :right .
:devil :is :wrong .
:h2g2 { ex:truth owl:sameAs 42 .}
:devil { ex:truth owl:sameAs 666 .}
---end-of-ex-2---

Note that here, it is necessary that the IRI ex:truth is interpreted differently in the two graphs. The additional meta-statement @graph-iris-denote-graph simply helps a dataset application determine what convention is used.
The formal semantics is still exactly the same, but the statement can be used to carry out certain processing differently than if the graph IRIs denoted, e.g., the primary topic of the graph.

3rd case: crawlers and similar stuff:

----example-3----
@graph-iris-are-urls
<http://ex.org/doc1.rdf> { ex:truth owl:sameAs 42 .}
<http://ex.net/doc2.ttl> { ex:truth owl:sameAs 666 .}
---end-of-ex-3---

Again, there is no reason to force IRIs to denote exactly the same thing when found in different sources, since documents online can be wrong, contain mistakes, express beliefs, etc. Many applications will not have the means to determine which one is correct (if only one is). The meta-statement simply helps an application decide what to do with this, but the formal semantics would not be affected.

As for archiving or versioning crawled data, I think each crawl should be put in a different file, and version data kept separately, using specific vocabularies for dataset metadata (e.g., voiD). I don't think the dataset semantics has to address how these metadata are described.

4th case: terminological axioms denote universal truth

An application may store ontologies in the default graph and expect that the axioms of the ontologies hold everywhere, all the time.

----example-4----
@default-graph-is-universal-truth
foaf:Person owl:disjointWith foaf:Organization .
foaf:Person rdfs:subClassOf foaf:Agent .
:g1 { ex:this a foaf:Person .}
:g2 { ex:this a foaf:Organization .}
---end-of-ex-4---

The meta-statement induces a semantic restriction here. Formally, a dataset interpretation would satisfy this iff it "dataset-satisfies" the dataset (as defined in [1]) *and* for all graph "names" <g>, Con(<g>) satisfies the default graph (I use the new notation that Pat suggested). In this case, we would infer that:

:g1 { ex:this a foaf:Agent .}

assuming RDFS or OWL semantics is used as a "local" semantics.
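
To make the 4th case concrete, here is a minimal Python sketch of what an application might do with @default-graph-is-universal-truth. It is only an illustration under strong assumptions: triples are plain tuples of prefixed names, the "local" semantics is reduced to one RDFS rule (type propagation along rdfs:subClassOf) plus a naive owl:disjointWith check, and the names local_closure and is_consistent are hypothetical helpers, not part of [1] or of any library.

```python
# Triples are (subject, predicate, object) tuples of prefixed names (an assumption).
TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DISJOINT = "owl:disjointWith"

# Example 4: axioms in the default graph, facts in the named graphs.
default_graph = {
    ("foaf:Person", DISJOINT, "foaf:Organization"),
    ("foaf:Person", SUBCLASS, "foaf:Agent"),
}
named_graphs = {
    ":g1": {("ex:this", TYPE, "foaf:Person")},
    ":g2": {("ex:this", TYPE, "foaf:Organization")},
}

def local_closure(graph, axioms):
    """Close graph + axioms under one RDFS rule: x a C, C subClassOf D => x a D."""
    triples = set(graph) | set(axioms)
    changed = True
    while changed:
        changed = False
        for (s, p, o) in list(triples):
            if p != TYPE:
                continue
            for (c, q, d) in list(triples):
                if q == SUBCLASS and c == o and (s, TYPE, d) not in triples:
                    triples.add((s, TYPE, d))
                    changed = True
    return triples

def is_consistent(triples):
    """Naive check: no resource is typed with two classes declared disjoint."""
    for (c, p, d) in triples:
        if p == DISJOINT:
            in_c = {s for (s, q, o) in triples if q == TYPE and o == c}
            in_d = {s for (s, q, o) in triples if q == TYPE and o == d}
            if in_c & in_d:
                return False
    return True

# Each named graph is evaluated together with the default graph, never with the others.
inferred = {
    name: local_closure(graph, default_graph) - default_graph
    for name, graph in named_graphs.items()
}
```

With this sketch, inferred[":g1"] contains ex:this a foaf:Agent while inferred[":g2"] does not, and each graph taken with the default graph passes is_consistent. Only the merge of :g1 and :g2 fails the disjointness check, which is why keeping the graphs separate matters here.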

5th case: default graph must be the merge of "named" graphs (this has been reported as common in real implementations)

----example-5----
@default-as-merge
:foaf { foaf:Person rdfs:subClassOf foaf:Agent .}
:g2 { ex:me a foaf:Person .}
---end-of-ex-5---

The meta-statement would induce an additional restriction here: that the "local" interpretation of the default graph has to satisfy all RDF graphs inside the "named" graphs. So, in this case, the dataset would entail:

#default graph:
ex:me a foaf:Agent .

By using both:

@default-graph-is-universal-truth
@default-as-merge

one emulates the case where all graphs are merged.

6th case: Pat's case (IRIs must denote the same thing, but relationship between things may evolve across time/context)

----example-6----
@iri-is-identity
:g2010-09-11 { ex:joe foaf:worksFor ex:ibm .}
:g2012-01-16 { ex:joe foaf:worksFor ex:cisco .}
---end-of-ex-6---

The meta-statement enforces that, for all graph IRIs <g>, <g'> and all IRIs <u>, the interpretation of <u> in context <g> is the same as the interpretation of <u> in context <g'>. Formally, Con(<g>)(<u>) = Con(<g'>)(<u>). In this case, the following would be inconsistent:

----example-7----
@iri-is-identity
:g2010-09-11 { ex:james owl:sameAs ex:jim .}
:g2012-01-16 { ex:james owl:differentFrom ex:jim .}
---end-of-ex-7---

7th case: temporal validity changes (in intervals)

This case is trickier, but I need to introduce it, as it better explains how we could do more complex reasoning with datasets (it then makes it easier to explain how to address "separation of inferences" in [2]). For this case, I would address it by using literals in the fourth position, rather than IRIs.

----example-8----
@temporal-reasoning
ex:chadhurley a ex:YoutubeEmployee . "[2005,2010]"^^interval
ex:YoutubeEmployee rdfs:subClassOf ex:GoogleEmployee . "[2006,2011]"^^interval
---end-of-ex-8---

Here, the additional restriction on the semantics is that each literal in the datatype "interval" would be assigned a distinct interpretation.
Additionally, anything that is true in an interval [x,y] must be true in all subintervals. As a consequence, in the example above, the following quads would be inferred:

ex:chadhurley a ex:YoutubeEmployee . "[2006,2010]"^^interval
ex:YoutubeEmployee rdfs:subClassOf ex:GoogleEmployee . "[2006,2010]"^^interval

Since the fourth column is now identical for the two triples, the semantics of [1] says that all normal RDF(S) inferences hold. Therefore, I can conclude:

ex:chadhurley a ex:GoogleEmployee . "[2006,2010]"^^interval

Note that, according to XSD, datatypes provide not only a lexical space, a value space and an L2V mapping, but should normally also provide "facets", which are kinds of functions on datatypes. The "interval" datatype could provide, as a facet, the comparison "included-in".

8th case: generalisation of 7th case

Other kinds of annotations could be used. For instance, a simple trust measure for graphs (possibly calculated from page-rank-like algorithms). Instead of using the "included-in" relation to define the semantic restriction for satisfaction, the "less-than" relation would be used. It can be generalised further, assuming an order on the values (and some other restrictions). One particular case in this generalisation is provenance annotations:

----example-9----
@provenance-reasoning
foaf:Person rdfs:subClassOf foaf:Agent . "foaf:"^^prov
ex:chadhurley a foaf:Person . "dbpedia:"^^prov
---end-of-ex-9---

If we write provenance as a conjunction of URLs, then one can infer:

ex:chadhurley a foaf:Agent . "foaf: \and dbpedia:"^^prov

This partly addresses Sandro's UC on "separating inferences".

[1] RDF Datasets Proposal. http://www.w3.org/2011/rdf-wg/wiki/TF-Graphs/RDF-Datasets-Proposal
[2] Why Graphs. http://www.w3.org/2011/rdf-wg/wiki/Why_Graphs

--
Antoine Zimmermann
ISCOD / LSTI - Institut Henri Fayol
École Nationale Supérieure des Mines de Saint-Étienne
158 cours Fauriel
42023 Saint-Étienne Cedex 2
France
Tél:+33(0)4 77 42 83 36
Fax:+33(0)4 77 42 66 66
http://zimmer.aprilfoolsreview.com/
Received on Tuesday, 6 March 2012 17:09:52 UTC