- From: Peter F. Patel-Schneider <pfps@research.bell-labs.com>
- Date: Fri, 02 Nov 2007 05:56:00 -0400 (EDT)
- To: public-rif-comments@w3.org
- Cc: public-owl-wg@w3.org
- Message-Id: <20071102.055600.96493878.pfps@research.bell-labs.com>
Here are comments on the RIF BLD from the OWL WG, both in-line and as an attachment. Peter F. Patel-Schneider Bell Labs Research Comments on RIF Basic Logic Dialect (undated version, downloaded 15 October 2007) from the OWL WG prepared by Peter F. Patel-Schneider These comments on the RIF BLD document from the OWL WG are in the form of a summary of some of the basic technical parts of the RIF BLD plus questions and comments on these parts. The summary itself is a comment in a sense, as it is what was gleaned from the document. If the summary contains errors it may be that there are parts of the technical content of the document that were hard to understand. * General Notes on the RIF The RIF is designed to be a method for exchanging rules or rule-like information between different systems. The RIF Basic Logic Dialect (BLD) is the logic language that is to form part of this exchange method. Part of this language is the condition sublanguage, which is to "provide a common basis for all the dialects of RIF". The syntax used in these comments is the presentation syntax, except that for reasons of space "rif:iri" is replaced by "iri" in examples. There is a fully-striped XML syntax as well. * Syntax of the RIF Condition Sublanguage The full RIF Condition Sublanguage is a polymorphic, polyadic logical language with a higher-order syntax (i.e., it allows variables to occur as predicates and functions in formulae) but with first-order semantics (i.e., quantification of predicate and function variables is not over all possible relations and mappings). There is a special non-logical symbol (=) for equality. Conditions can be like Exists ?x ( p^^iri ( a^^iri ) ( p^^iri ( ?x , p^^iri ) , p^^iri ( a^^iri ( ), ?y , ?z ) ) Or ?x ( b^^iri ) Or ( a^^iri ( 1.2^^xsd:decimal ( a^^iri ) -> ?z , ?w -> ?x ) And p^^iri ( ?x ) ## p^iri [ q^^iri -> ( p^^iri # p^^iri )] ) ) where non-logical symbols starting with ? are variables and the others are constant terms. The first and second atomic formulae above are normal syntactically-higher-order atomic formulae, the third atomic formula is a syntactically-higher-order atomic formula with named (slotted) arguments, the fourth atomic formula is a frame formula containing a subclass formula and a membership formula. Syntactic well-formedness in the RIF BLD is governed by a set of signatures which provide syntactic typing for variables and constant terms. A term (or formula) is well-formed if its components are well-formed and its predicate (or function, respectively) component can have the number and types of arguments that it has in the term (or formula). This construction methodology, however, does not cover the constructs involving #, ##, and [...] which are well-formed if their components are well-formed. The actual RIF BLD Condition Sublanguage "carefully selects signatures so that the corresponding logic will be first order" and monomorphic and mono-adic. Conditions here can be like Exists ?x ( p^^iri ( q^^iri ( ?x , a^^iri ) , r^^iri ( a^^iri, ?y , ?z ) ) Or ( 1.2^^xsd:decimal ( r^^iri -> ?z , ?w -> ?x ) And a^^iri ## b^iri [ f^^iri (a^^iri) -> ( a^^iri # a^^iri ) ] ) ) Here the first atomic formula is a normal formula, the second is a slotted formula, and the third is a frame formula containing a subclass and a membership formula. Argument (slot) names in slotted formulae cannot be function applications or variables, but slot names in frame formulae can be. There is also a distinction between normal predicate applications and built-in predicate applications, but this is only important in the rule language. ** Comments on the Syntax Why is the complex signature mechanism in this document, as the only used language doesn't really need it? Further, the signature mechanism cannot capture the syntax of the BLD condition sublanguage, so it is not even adequate for distinguishing between dialects. Why are there are three different kinds of atomic formulae (regular, slotted, and frame)? This could cause problems with OWL integration, as it is not obvious which kind of formulae should be used for integration with OWL. In particular, frame formulae might be the target for OWL Full and regular formulae the target for OWL DL. * Semantics of the RIF BLD Condition Sublanguage The semantics is only for the RIF BLD Condition Sublanguage, not the full RIF Condition Sublanguage. The semantics is a model-theoretic semantics, based on a partially-ordered set of truth values. The set of truth values for the RIF BLD Condition Sublanguage contains true and false, with false less than true. Interpretations have a domain and nine mappings, 1/ IC from constant terms to domain elements, providing meanings for constant terms used as individual symbols, 2/ IV from variables to domain elements, providing meaning for variables; 3/ IF from constant terms to functions from D* to D, providing meaning for constant terms used as function symbols; 4/ IR from constant terms to *partial* mappings from D* to truth values, providing meaning for constant terms used as predicate symbols, that maps = to the identity predicate; 5/ ISF from constant terms to (total? partial?) mappings that map a bag of pairs from D to D, providing meaning for slotted terms; 6/ ISR from constant terms to (total? partial?) mappings that map a bag of pairs from D to a truth value, providing meaning for slotted predicate terms; 7/ Islot from D to functions from DxD to truth values, providing meaning for frame (not slotted!) terms, a slotted term is essentially a conjunction of is various parts; 8/ Isub from D x D to truth values, providing meaning for subclass, a transitive relationship; 9/ Iisa from D x D to truth values, providing meaning for membership, that distributes over ##. A combination interpretation function, I, is defined to give meaning to terms. I(c) = IC(c) for constant terms used as individual symbols I(v) = IV(v) for variables I(f(t1,...,tn)) = IF(f) ( I(t1),...,I(tn) ) I(f(a1->v1,...,an->vn)) is ISF(f) applied to the n-element bag containing each of the pairs < I(ai),I(vi) > A combination interpretation, Itruth, is defined to give meaning to atomic formulae. Itruth(P(t1,...,tn)) = IR(P) ( I(t1),...,I(tn) ) Itruth(P(a1->v1,...,an->vn)) is ISP(P) applied to the n-element bag containing each of the pairs < I(ai),I(vi) > Itruth(f[p->v]) = Islot(I(p))(I(p),I(v)) Itruth(f[p1->v1 ... pn->vn]) is the conjunction of Itruth(f[pi->vi]) Itruth(t1 ## t2) = Isub(I(t1),I(t2)) Itruth(t1 # t2) = Iisa(I(t1),I(t2)) Note that as the RIF BLD Condition Sublanguage is monomorphic and mono-adic, IC will not be used for any constant term that is typed as a function symbol or a predicate symbol. As well, if a constant term is typed as an n-ary function symbol (predicate symbol) only the n-ary part of its IF (IR, respectively) mapping will be used. Note that there is no interpretation given for frame formulae that contain subclass or membership formulae. These are instead pre-expanded into conjunctions of the frame formula with the subclass or membership formula replaced by its left-hand side and the subclass or membership formula itself. Constant terms (e.g., "1.2"^^xsd:decimal) must be interpreted according to the rules for their symbol spaces (here xsd:decimal). Ill-typed constant terms (e.g., "abc"^^xsd:decimal) are not allowed. The known symbol spaces are xsd:long, xsd:string, xsd:integer, xsd:decimal, xsd:time, xsd:dateTime, rdf:XMLLiteral, rif:text, rif:iri, and rif:local. Constant terms that belong to rif:text are text strings with language tags. Constant terms that belong to rif:iri have a common denotation "regardless of the context in which that constant occurs". Constant terms that belong to rif:local in one rule set "are viewed as unrelated distinct constants" from those in another rule set. * Comments on the Semantics The mappings for predicates are partial. It seems that this means that the truth value of some formulae are thus undefined, but no account is taken of this in the later development of the semantics. Why is a new treatment of data values needed? Why does the set of known data types not include XSD data types like xsd:short? Why does there need to be a symbol space for IRI identifiers? This may cause problems with OWL integration. The treatment of slotted formulae is unusual in that the predicates have a direct map to their extension but the slot names are first mapped into the domain. This means that a=b implies that f[a->3] is equivalent to f[b->3]. * General Comments on the Condition Language The language is very complex. It appears to have been designed to mirror several other languages. In particular, the frame formulae appear to have been designed to mirror F-logic. The logic is not like RDF, as it is monomorphic and predicates are not first mapped into domain elements. The frame part of the logic is not like regular frames, as the slot names are first mapped into domain elements. * RIF Rule Language RIF rules are horn rules made up of an optional RIF BLD condition and a RIF BLD atomic formulae that cannot be a built-in predicate application. Rules are treated like implications in the semantics. A RIF rule set is a set of RIF rules. * RIF-RDF Compatability RIF is combined with RDF D-entailment where D is the set of datatype-like symbol spaces known in RIF. RDF graphs are translated into RIF-space by replacing URI references by the obvious constant term in the rif:iri symbol space. Plain literals without language tags are replaced by constant terms in the xsd:string symbol space. Plain literals with language tags are replaced by constant terms in the rif:text symbol space. Ill-typed literals are replaced by an IRI constructed from the literal. RIF-RDF combinations are given paired interpretations such that 1/ the RIF domain is a superset of the set of RDF resources; 2/ RDF properties are a superset of the elements of the RIF domain that are the slot in a true slot relationship; 3/ the RIF domain is the union of the set of RDF resources and the set of RDF properties; (Yes, this implies condition 1/.) 4/ the set of RDF literals includes all the values in the known datatypes; (Yes, this is already a condition for RDF interpretations.) 5/ the RDF (property) extension of a domain element is the set of pairs that are true for the Islot of that domain element; 6/ the extension of an RDF URI reference (U) is the same as the RIF extension of its translation into RIF ("U"^^rif:iri); 7/ well-typed RDF typed literals map the same as the same RIF typed literal; (Yes, this is true for datatyped interpretations.) 8/ ill-typed RDF typed literals map the same as their RIF translation. Condition 5 makes the connection between the RDF triple s p o . and the RIF frame atomic formula s [ p -> o ]. The rest of the semantics is obvious. Basically the RIF interpretation has to be a RIF model of the RIF rule set and the RDF interpretation has to be a simple, RDF, RDFS, or datatype model of the RDF graph. * Notes on RIF-RDF compatability Why worry about interpretations where IP is not a subset of IR? This only happens in simple entailment. As there are already datatypes in RIF why not just go to datatype-entailment? The treatment of ill-typed literals appears to allow accidental capture if the replacement IRI also occurs in the RDF graph. For example, "abc"^^xsd:decimal ex:a ex:b . RIF-RDF entails http://www.w3.org/2005/rif/rdf-ill-typed-literal/uri-encode("abc"^^xsd:decimal) ex:a ex:b . Note that rdf:type is not related to membership formulae (i#c) and rdfs:subClassOf is not related to subclass formulae (c1##c2). This does not seem to be reasonable. * RIF-OWL Compatability There was a section on RIF-OWL compatability in an earlier draft of the document but it has been removed. * Notes on RIF-OWL Compatability. There is a question as to which part of the syntax OWL should map to. There is also a question as to whether OWL syntax should map to RIF facts.
Comments on RIF Basic Logic Dialect (undated version, downloaded 15 October 2007) from the OWL WG prepared by Peter F. Patel-Schneider These comments on the RIF BLD document from the OWL WG are in the form of a summary of some of the basic technical parts of the RIF BLD plus questions and comments on these parts. The summary itself is a comment in a sense, as it is what was gleaned from the document. If the summary contains errors it may be that there are parts of the technical content of the document that were hard to understand. * General Notes on the RIF The RIF is designed to be a method for exchanging rules or rule-like information between different systems. The RIF Basic Logic Dialect (BLD) is the logic language that is to form part of this exchange method. Part of this language is the condition sublanguage, which is to "provide a common basis for all the dialects of RIF". The syntax used in these comments is the presentation syntax, except that for reasons of space "rif:iri" is replaced by "iri" in examples. There is a fully-striped XML syntax as well. * Syntax of the RIF Condition Sublanguage The full RIF Condition Sublanguage is a polymorphic, polyadic logical language with a higher-order syntax (i.e., it allows variables to occur as predicates and functions in formulae) but with first-order semantics (i.e., quantification of predicate and function variables is not over all possible relations and mappings). There is a special non-logical symbol (=) for equality. Conditions can be like Exists ?x ( p^^iri ( a^^iri ) ( p^^iri ( ?x , p^^iri ) , p^^iri ( a^^iri ( ), ?y , ?z ) ) Or ?x ( b^^iri ) Or ( a^^iri ( 1.2^^xsd:decimal ( a^^iri ) -> ?z , ?w -> ?x ) And p^^iri ( ?x ) ## p^iri [ q^^iri -> ( p^^iri # p^^iri )] ) ) where non-logical symbols starting with ? are variables and the others are constant terms. The first and second atomic formulae above are normal syntactically-higher-order atomic formulae, the third atomic formula is a syntactically-higher-order atomic formula with named (slotted) arguments, the fourth atomic formula is a frame formula containing a subclass formula and a membership formula. Syntactic well-formedness in the RIF BLD is governed by a set of signatures which provide syntactic typing for variables and constant terms. A term (or formula) is well-formed if its components are well-formed and its predicate (or function, respectively) component can have the number and types of arguments that it has in the term (or formula). This construction methodology, however, does not cover the constructs involving #, ##, and [...] which are well-formed if their components are well-formed. The actual RIF BLD Condition Sublanguage "carefully selects signatures so that the corresponding logic will be first order" and monomorphic and mono-adic. Conditions here can be like Exists ?x ( p^^iri ( q^^iri ( ?x , a^^iri ) , r^^iri ( a^^iri, ?y , ?z ) ) Or ( 1.2^^xsd:decimal ( r^^iri -> ?z , ?w -> ?x ) And a^^iri ## b^iri [ f^^iri (a^^iri) -> ( a^^iri # a^^iri ) ] ) ) Here the first atomic formula is a normal formula, the second is a slotted formula, and the third is a frame formula containing a subclass and a membership formula. Argument (slot) names in slotted formulae cannot be function applications or variables, but slot names in frame formulae can be. There is also a distinction between normal predicate applications and built-in predicate applications, but this is only important in the rule language. ** Comments on the Syntax Why is the complex signature mechanism in this document, as the only used language doesn't really need it? Further, the signature mechanism cannot capture the syntax of the BLD condition sublanguage, so it is not even adequate for distinguishing between dialects. Why are there are three different kinds of atomic formulae (regular, slotted, and frame)? This could cause problems with OWL integration, as it is not obvious which kind of formulae should be used for integration with OWL. In particular, frame formulae might be the target for OWL Full and regular formulae the target for OWL DL. * Semantics of the RIF BLD Condition Sublanguage The semantics is only for the RIF BLD Condition Sublanguage, not the full RIF Condition Sublanguage. The semantics is a model-theoretic semantics, based on a partially-ordered set of truth values. The set of truth values for the RIF BLD Condition Sublanguage contains true and false, with false less than true. Interpretations have a domain and nine mappings, 1/ IC from constant terms to domain elements, providing meanings for constant terms used as individual symbols, 2/ IV from variables to domain elements, providing meaning for variables; 3/ IF from constant terms to functions from D* to D, providing meaning for constant terms used as function symbols; 4/ IR from constant terms to *partial* mappings from D* to truth values, providing meaning for constant terms used as predicate symbols, that maps = to the identity predicate; 5/ ISF from constant terms to (total? partial?) mappings that map a bag of pairs from D to D, providing meaning for slotted terms; 6/ ISR from constant terms to (total? partial?) mappings that map a bag of pairs from D to a truth value, providing meaning for slotted predicate terms; 7/ Islot from D to functions from DxD to truth values, providing meaning for frame (not slotted!) terms, a slotted term is essentially a conjunction of is various parts; 8/ Isub from D x D to truth values, providing meaning for subclass, a transitive relationship; 9/ Iisa from D x D to truth values, providing meaning for membership, that distributes over ##. A combination interpretation function, I, is defined to give meaning to terms. I(c) = IC(c) for constant terms used as individual symbols I(v) = IV(v) for variables I(f(t1,...,tn)) = IF(f) ( I(t1),...,I(tn) ) I(f(a1->v1,...,an->vn)) is ISF(f) applied to the n-element bag containing each of the pairs < I(ai),I(vi) > A combination interpretation, Itruth, is defined to give meaning to atomic formulae. Itruth(P(t1,...,tn)) = IR(P) ( I(t1),...,I(tn) ) Itruth(P(a1->v1,...,an->vn)) is ISP(P) applied to the n-element bag containing each of the pairs < I(ai),I(vi) > Itruth(f[p->v]) = Islot(I(p))(I(p),I(v)) Itruth(f[p1->v1 ... pn->vn]) is the conjunction of Itruth(f[pi->vi]) Itruth(t1 ## t2) = Isub(I(t1),I(t2)) Itruth(t1 # t2) = Iisa(I(t1),I(t2)) Note that as the RIF BLD Condition Sublanguage is monomorphic and mono-adic, IC will not be used for any constant term that is typed as a function symbol or a predicate symbol. As well, if a constant term is typed as an n-ary function symbol (predicate symbol) only the n-ary part of its IF (IR, respectively) mapping will be used. Note that there is no interpretation given for frame formulae that contain subclass or membership formulae. These are instead pre-expanded into conjunctions of the frame formula with the subclass or membership formula replaced by its left-hand side and the subclass or membership formula itself. Constant terms (e.g., "1.2"^^xsd:decimal) must be interpreted according to the rules for their symbol spaces (here xsd:decimal). Ill-typed constant terms (e.g., "abc"^^xsd:decimal) are not allowed. The known symbol spaces are xsd:long, xsd:string, xsd:integer, xsd:decimal, xsd:time, xsd:dateTime, rdf:XMLLiteral, rif:text, rif:iri, and rif:local. Constant terms that belong to rif:text are text strings with language tags. Constant terms that belong to rif:iri have a common denotation "regardless of the context in which that constant occurs". Constant terms that belong to rif:local in one rule set "are viewed as unrelated distinct constants" from those in another rule set. * Comments on the Semantics The mappings for predicates are partial. It seems that this means that the truth value of some formulae are thus undefined, but no account is taken of this in the later development of the semantics. Why is a new treatment of data values needed? Why does the set of known data types not include XSD data types like xsd:short? Why does there need to be a symbol space for IRI identifiers? This may cause problems with OWL integration. The treatment of slotted formulae is unusual in that the predicates have a direct map to their extension but the slot names are first mapped into the domain. This means that a=b implies that f[a->3] is equivalent to f[b->3]. * General Comments on the Condition Language The language is very complex. It appears to have been designed to mirror several other languages. In particular, the frame formulae appear to have been designed to mirror F-logic. The logic is not like RDF, as it is monomorphic and predicates are not first mapped into domain elements. The frame part of the logic is not like regular frames, as the slot names are first mapped into domain elements. * RIF Rule Language RIF rules are horn rules made up of an optional RIF BLD condition and a RIF BLD atomic formulae that cannot be a built-in predicate application. Rules are treated like implications in the semantics. A RIF rule set is a set of RIF rules. * RIF-RDF Compatability RIF is combined with RDF D-entailment where D is the set of datatype-like symbol spaces known in RIF. RDF graphs are translated into RIF-space by replacing URI references by the obvious constant term in the rif:iri symbol space. Plain literals without language tags are replaced by constant terms in the xsd:string symbol space. Plain literals with language tags are replaced by constant terms in the rif:text symbol space. Ill-typed literals are replaced by an IRI constructed from the literal. RIF-RDF combinations are given paired interpretations such that 1/ the RIF domain is a superset of the set of RDF resources; 2/ RDF properties are a superset of the elements of the RIF domain that are the slot in a true slot relationship; 3/ the RIF domain is the union of the set of RDF resources and the set of RDF properties; (Yes, this implies condition 1/.) 4/ the set of RDF literals includes all the values in the known datatypes; (Yes, this is already a condition for RDF interpretations.) 5/ the RDF (property) extension of a domain element is the set of pairs that are true for the Islot of that domain element; 6/ the extension of an RDF URI reference (U) is the same as the RIF extension of its translation into RIF ("U"^^rif:iri); 7/ well-typed RDF typed literals map the same as the same RIF typed literal; (Yes, this is true for datatyped interpretations.) 8/ ill-typed RDF typed literals map the same as their RIF translation. Condition 5 makes the connection between the RDF triple s p o . and the RIF frame atomic formula s [ p -> o ]. The rest of the semantics is obvious. Basically the RIF interpretation has to be a RIF model of the RIF rule set and the RDF interpretation has to be a simple, RDF, RDFS, or datatype model of the RDF graph. * Notes on RIF-RDF compatability Why worry about interpretations where IP is not a subset of IR? This only happens in simple entailment. As there are already datatypes in RIF why not just go to datatype-entailment? The treatment of ill-typed literals appears to allow accidental capture if the replacement IRI also occurs in the RDF graph. For example, "abc"^^xsd:decimal ex:a ex:b . RIF-RDF entails http://www.w3.org/2005/rif/rdf-ill-typed-literal/uri-encode("abc"^^xsd:decimal) ex:a ex:b . Note that rdf:type is not related to membership formulae (i#c) and rdfs:subClassOf is not related to subclass formulae (c1##c2). This does not seem to be reasonable. * RIF-OWL Compatability There was a section on RIF-OWL compatability in an earlier draft of the document but it has been removed. * Notes on RIF-OWL Compatability. There is a question as to which part of the syntax OWL should map to. There is also a question as to whether OWL syntax should map to RIF facts.
Received on Friday, 2 November 2007 10:06:31 UTC