Re: [SWC] comments/review SWC from Jos de Bruijn on 2008-06-26 (public-rif-wg@w3.org from June 2008)

From: Jos de Bruijn <debruijn@inf.unibz.it>
Date: Thu, 26 Jun 2008 11:34:58 +0200
To: RIF WG <public-rif-wg@w3.org>
Message-ID: <486362C2.5090803@inf.unibz.it>

Axel,

Thanks a lot for the detailed review.  I implemented all your 
suggestions, with the exception of the ones mentioned below.


> *) "A typical scenario for the use of RIF with RDF/OWL is the
> exchange of rules that either use RDF data or an RDFS or OWL
> ontology" -> "A typical scenario for the use of RIF with RDF/OWL is
> the exchange of rules that use RDF data possibly along with some
> RDFS or OWL ontologies"

Your suggested wording seems to require the use of RDF data when using 
RDFS or OWL ontologies.  Instead, I changed the first or to "and/or":
"RDF data and/or RDFS or OWL ontologies"

> 
> *) "The notation of certain symbols particularly IRI's and plain
> literals is slighlty different from the noation in RDF/OWL. This
> diference is illustrated in Section [...]." --> "The notation of
> certain symbols particularly IRI's and plain literals is slighlty
> different from the noation in RDF/OWL. These differences is
> illustrated in Section [...]."

Done.

> 
> Do we need this at all? do we need section 2 at all? the latest PS
> reconciles most of the differences, and section 2 has some flaws
> anyway, see later comments below.

We need a section to explain the differences.  Even if the shortcut 
syntax looks the same for some cases, the syntax is still different, as 
illustrated, for example, at the end of section 3.2.1.2.

> 
> *) "The Appendix [...] describes how reasoning with combinations of
> RIF rules with RDF and a sub set of OWL DL can be reduced to
> reasoning with RIF documents."
> 
> If this is a particular fragment of OWL 2 which we can *name*, then
> this should be mentioned, or another Editor's note should be added.


We are *not* concerned with OWL 2; we are concerned with OWL.

> 
> *) "If the interchange partner B does not have an RDF/OWL aware rule
> system,but B can process RIF rules[...]" --> "If the interchange
> partner B does not have an RDF/OWL aware rule system,but B can
> process RIF BLD rules[...]"

At the end of the second paragraph in the overview I state "In the 
remainder, RIF is understood to refer to RIF BLD".  I think that is a 
sufficient explanation; I think it would be a bad idea to change all 
mentions of RIF to RIF BLD.

> 
> *) "All RIF statements are written using the RIF presentation syntax
> (RIF-BLD). Where possible, this document uses the shortcut syntax for
> IRIs and strings as defined in (RIF-DTB)." It was descided that the
> shortcut notation will be copied to BLD, so probably no need to
> cross-reference DTB here.

Wake me up when this happens...
In the meantime I included an editor's note saying that the reference 
might change.

> BTW, section 2 seems to duplicate some od
> this stuff anyway.

This text in the overview is about notational conventions used in the 
document.  Section 2 is just some explanation of the differences between 
the RDF/OWL syntax and the RIF syntax.  Where is the duplication exactly?

> 
> *) "[...] compact IRIs prefix:localname or typed literals
> "literal"^^datatype-IRI." "[...] compact IRIs prefix:localname or
> typed literals "literal"^^<datatype-IRI>."

Isn't it also allowed to write compact IRIs as datatype identifiers? 
The previous sentence already says how IRIs may be written; so it should 
be clear that they are either delimited by angle brackets or they are 
written as compact IRIs.

> 
> Section 2 : Symbols in RIF vs RDF/OWL (Informative) 
> ---------------------------------------
> 
> *) "Unicode sequences with symbol space IRIs (DTB)." --> "Unicode
> sequences  with IRIs denoting their symbol spaces (DTB)."

I am reluctant to make this change, because I use the same wording in 
several places in this paragraph.

> 
> *) possible problem: In simple RDF there is not entailment for plain
> literals to literals of type xsd:string, but in your treatment you
> seem to treat both synonymously, 

Indeed, they are simply the same.  That they have a different syntax in 
RDF is simply a bug (or a feature, depending on how you look at it).

> isn't that a problem for simple RDF
> entailment?

If you want to support RIF you have to support data types.  So if you 
want to combine your RDF data with RIF rules you have to support data 
types.  What's the problem?
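
To make the point concrete, here is a minimal sketch of treating a plain literal and an xs:string literal as the same value. It assumes plain literals without language tags; the function name `literal_value` and the tuple representation are hypothetical, chosen only for illustration.

```python
XS_STRING = "http://www.w3.org/2001/XMLSchema#string"

def literal_value(lexical, datatype=None):
    """Return a comparable value for a literal.

    Assumption: a plain literal (datatype=None, no language tag) denotes
    the same value as the xs:string literal with the same lexical form.
    """
    if datatype is None or datatype == XS_STRING:
        return ("string", lexical)
    # Other datatypes are left uninterpreted in this sketch.
    return (datatype, lexical)

# "a" and "a"^^xs:string compare equal; "a" and "b" do not.
plain_equals_typed = literal_value("a") == literal_value("a", XS_STRING)
```

Under this reading, an equality between two distinct strings cannot be satisfied, whichever of the two notations is used.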

> I used xsd:  as the prefix
> throughout DTB, I suggest to stick with this and not use xs:

I guess I did not catch that in DTB. You should actually not use the xsd 
prefix, but rather the xs prefix.
The xsd prefix is conventionally associated with the namespace 
http://www.w3.org/2001/XMLSchema-datatypes#, which is deprecated.
The xs prefix is conventionally associated with the namespace 
http://www.w3.org/2001/XMLSchema#, which is the one we use.
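
Whichever prefix is chosen, a compact IRI is only an abbreviation; what the symbol denotes depends on the namespace the prefix is bound to. A minimal sketch of the expansion, where the `PREFIXES` table and the `expand` helper are hypothetical names used for illustration:

```python
# Prefix bindings are local to a document; this table is illustrative.
PREFIXES = {
    "xs": "http://www.w3.org/2001/XMLSchema#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}

def expand(compact_iri, prefixes=PREFIXES):
    """Expand prefix:localname to a full IRI using the given bindings."""
    prefix, sep, local = compact_iri.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unbound prefix in {compact_iri!r}")
    return prefixes[prefix] + local

# If xsd is bound to the same namespace as xs, both forms expand identically.
same = expand("xsd:string", {"xsd": PREFIXES["xs"]}) == expand("xs:string")
```

So the choice between xs: and xsd: only matters to the extent that readers associate a prefix with a particular namespace.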

> 
> *) I find several things confusing in Table 1. All this should be
> clear from the explanation of shortcuts in DTB/BLD, no need to
> duplicate here.

There *is* a need for duplication here, because this section is meant to 
explain the syntax for RIF symbols to the semantic Web crowd who may not 
have read DTB or BLD.

>  I find this table more confusing than enlighning.
> E.g. -  "IRI" is used instead of "constant in the <tt>rif:iri</tt>
> symbol space -  "String" is used instead of "constant in the
> <tt>xs:string</tt> symbol space - "Symbol in  symbol space" looks
> strange. - 

> we need to mention then that our
> entailment is strictly speaking doing more than simple RDF
> entailment.

There is already an example at the end of section 3.2.1.2.  Do you think 
we need more?

> 
> Section 3: RDF Compatibility ---------------------
> 
> 
> *) "conclusions that may be drawn from the RIF rules are reflected in
> the RDF graphs" -> "conclusions that may be drawn from RIF rules are
> reflected in the RDF graphs"

In that case, I guess the article in front of RDF graphs would also have 
to be removed.
I actually think we should keep the articles, because we are referring 
to rules and graphs in a particular combination.

> 
> *) "there is  a corresponcence between RDF triples of the form s p o
> and RIF frame formulas of the form s[p->]" -> "there is  a
> corresponcence between ground (i.e., blank node free) RDF triples of
> the form s p o and RIF frame formulas of the form s[p->]"

This seems to exclude the use of blank nodes and of variables.  I don't 
think we want to do that.
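
As an illustration of the correspondence under discussion, the following sketch renders a triple s p o as a frame formula in the style of the RIF presentation syntax, s[p -> o]. The tagged-tuple term representation and the helper names `term` and `frame` are assumptions made for this example; variables are included to show that the correspondence is not limited to ground triples.

```python
def term(t):
    """Render a tagged term in presentation-syntax style.

    ('iri', x) -> <x>, ('var', x) -> ?x, ('lit', x) -> "x"
    """
    kind, value = t
    if kind == "iri":
        return f"<{value}>"
    if kind == "var":
        return f"?{value}"
    if kind == "lit":
        return f'"{value}"'
    raise ValueError(f"unknown term kind: {kind!r}")

def frame(s, p, o):
    """Render the frame formula corresponding to the triple s p o."""
    return f"{term(s)}[{term(p)} -> {term(o)}]"

example = frame(("iri", "http://example.org/john"),
                ("iri", "http://example.org/name"),
                ("var", "n"))
```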

> 
> *) in the example: why use john, jack and mary and not the good old
> alice bob and charles everybody is familiar with? ;-)

John, Jack, and Mary are much nicer people :-)

> 
> *) I find the "nameBearer" example a bit too artificial ... personal
> taste... what about "namedObject" instead?

John is not an object, but a person.  How about "named"?

> *) "The syntax of the names in these sets [...]" -> "an abstract
> syntax for the names in these sets [...]"

Isn't there just one abstract syntax?  Besides, the reader might not be 
familiar with the term "abstract syntax", so I'm rather hesitant to make 
this change.

> 
> *) My only worry about "generalized RDF graphs is: Does this imply
> that RIF compliant implementations need to support generalized
> graphs?

No.
First of all, there is no notion of compliance defined for RIF-RDF 
combinations.
Second, if one does not encounter generalized RDF graphs, one does not 
need to support them.
Third, if you already support RIF-RDF combinations with standard RDF 
graphs, then dealing with generalized RDF graphs is easy (nearly trivial).
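
To illustrate how small the difference is, here is a sketch of a check that tells standard RDF graphs from generalized ones. It assumes that a generalized graph allows IRIs, blank nodes, and literals in every position, whereas a standard graph restricts subjects to IRIs or blank nodes and predicates to IRIs; the tagged-tuple representation and the function names are hypothetical.

```python
def is_standard_triple(s, p, o):
    """True if the triple satisfies the standard RDF position restrictions."""
    return (
        s[0] in ("iri", "bnode")              # no literal subjects
        and p[0] == "iri"                     # no bnode or literal predicates
        and o[0] in ("iri", "bnode", "lit")   # objects are unrestricted
    )

def is_standard_graph(triples):
    """True if every triple in the graph is a standard RDF triple."""
    return all(is_standard_triple(*t) for t in triples)

standard = [(("iri", "john"), ("iri", "name"), ("lit", "John"))]
generalized = [(("iri", "john"), ("bnode", "b1"), ("iri", "mary"))]
```

An implementation that never encounters a graph for which `is_standard_graph` is false does not need any of the generalized behaviour.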

> As mentioned in earlier discussions I think there is reasons
> NOT to assume the generalization for bnodes in pred positions in
> future versions of RDF.

If there are no RDF graphs with blank nodes in predicate positions, you 
do not have to deal with them.
Note also that one of the editors of the RDF concepts document is 
strongly in favor of generalized RDF graphs [1].

[1] 
http://lists.w3.org/Archives/Public/public-rif-comments/2008May/0003.html 
(comment A)


> *) I consistently changed "datatype" to "data type" in DTB, we should
> b e consistent over all documents.

I tried to be consistent and chose "datatype", as in the XML schema 
datatypes specification, as well as the RDF and OWL specification documents.
It would indeed be good to be consistent in all documents and change 
"data type" to "datatype".

> *) "conforming datatype map" ... conforming to what? maybe better use
> something like "well-defined datatype map"

"Conforming" may not be the best term, but I could not think of anything 
better. I would argue that any partial mapping from IRIs to datatypes is 
a well-defined datatype map.

> 
> *) "The notion of well-typed literal loosely correspond with the
> notion of legal symbol in RIF" -> "The notion of well-typed literal
> loosely corresponds to the notion of legal symbol in RIF" - is there
> a link reference here to "legal symbol"? (can't see in my pdf print) 
> - I don't like "loosely coresponds", this doesn't say anything.

I removed the reference to "legal symbol", because you said you had no 
intention of defining it in DTB.

> 
> 
> Section 3.2 =======
> 
> *) I do not understand your rationale on capitalizing or
> non-capitalizing rdf and rdfs. In my opinion, you should use RDF and
> RDFS in capital letter anywhere. rdf/rdfs looks awkward.

I do not capitalize semantic concepts, following the RDF semantics 
specification.

> 
> Section 3.2.1 ========
> 
> *) "correspondence between RDF triples of the form s p o and RIF
> frames ... (cf. Table 1)" "correspondence between ground RDF triples
> of the form s p o and RIF frames ." - Table 1 doesn't vocer bnodes!
> 

If we want to be very precise here, we need a lot of additional text to 
cover the case of variables.  I would prefer to keep the more general 
statement that's there now.

> 
> *) possible problem/remark: OWL/RDF data or knowledge bases by no
> means restrict the usage of RIF reserved identifiers or built-ins  as
> identifiers for arbitrary resources in RDF triples or as OWL classes
> etc. Is this a possible problem which we should make a remark over?

I don't see a problem.

> e.g. that we ignore triples using the "RIF vocabulary" (to be
> defined.)

We certainly do not ignore such triples, and in some cases the use of 
RIF vocabulary has semantic consequences (e.g., the example at the end 
of section 3.2.1.2).

I do not think further remarks are necessary.

> 
> *) "frame is a mapping from Dind to functions of the form
> SetOfFiniteBags(Dind × Dind) → D,"
> 
> I am not sure wether I understand this. E.g. rdf:type, are not
> finite... so what does this setoffinitebags mean?

The bag is necessary for interpreting frames consisting of one object 
with multiple properties.  The set (of bags) may be infinite.
This comes from the BLD specification.

> 
> *) "considered datatypes" ... I am a bit worried about these pointers
> to dtb. Actually, you definition of required RIF datatype points to
> the required *symbol spaces* definition in DTB
> 
> Did you add the respective anchor <span id="def-required-datatypes"
> class="anchor"> in the DTB document? this anchor is confusing, since
> the list defines symbol spaces, not data types.

The anchor was in the wrong location.  I moved it to the definition in 
section 2.2.

> 
> Section 3.2.1.2: ==========
> 
> *) In ther first definition, I miss a bullet point for plain
> literals, 6. only seems to cover typed literals.

The interpretation of plain literals in RDF interpretations is fixed. 
So, we don't need a condition here.

> 
> 
> Section 3.2.2: =========
> 
> *) You use  "a" = "b" as an inconsistency... 

I'm not "using" this statement.  There is merely an example pointing 
something out.

> Don' we again have
> problems in simple entailment, if we treat plain literals
> synonymously with xsd:strings? In simple RDF interpretations
> "a"^^xsd:string = "b"^^xsd:string wouldn't be an inconsistency,
> right?

Again, I don't see what the problem is.


> Section 5: ======
> 
> *) "Here, ti is an IRI constant of the form <absolute-IRI>, where
> absolute-IRI is the location of an RDF graph to be imported, and pi
> is an IRI constant denoting the profile to be used."
> 
> - by "location" you mean web-accessible? 

Not necessarily.  It may also be the location on the local file system, 
for example.
You may even print a book consisting of an RDF graph, in which case you 
could use the ISBN-IRI :-)

> or named graphs a la Carrol?

I don't know what those are.  Is there a standard about them?

> maybe needs clarification 

I thought the use of IRIs to denote locations, especially locations on 
the Web, is quite obvious to a Web audience.

> - it would be worthwhile to
> forward-reference to the predifined list of profiles in this document
> in section 5.1 here. Otherwise, I am a bit lost what a profile IRI
> actuall is.

There is a mention of profile in the first paragraph.  If anything 
should be further explained, it should be done there.  Do you think any 
change is necessary?

> 
> *) "In case several graphs are imported in a document, and these
> imports specify different profile, the highest of these profiles is
> used."
> 
> - At this point the reader has no idea that there is actually an
> order among profiles, so the meaning of "highest" is unclear... again
> a remark/forward-reference would be worthwhile for clarification.

I added the text "Profiles are assumed to be ordered.".  Do you think 
this is sufficient?
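
A minimal sketch of what "the highest of these profiles is used" amounts to once an order is given. The list `PROFILE_ORDER` is a placeholder that assumes a total order over illustrative short names; the actual profiles and their order are whatever the document defines.

```python
# Placeholder order, lowest first. These names and this total order are
# assumptions for illustration, not taken from the specification text.
PROFILE_ORDER = ["Simple", "RDF", "RDFS", "D", "OWL DL", "OWL Full"]

def effective_profile(import_profiles):
    """Return the highest profile among those named by the imports.

    Raises ValueError if no profiles are given or a profile is unknown.
    """
    return max(import_profiles, key=PROFILE_ORDER.index)

chosen = effective_profile(["RDF", "Simple", "RDFS"])
```

With this placeholder order, importing one graph under RDF and another under RDFS means the combination is interpreted under RDFS.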

> *) to the best of his ability - > to the best of its ability

I'm not so sure about this; "its" seems rather impersonal.
You have some native speakers around in your office in Galway.  Could 
you ask them what they think?

> 
> *) The sentence: "Any profile that is used with RIF must specify an
> IRI that identifies it and notions of model, satisfiability, and
> entailment for combinations." seems to contradict the notion of the
> "generic" profile, probably you want to weaken or drop this
> statement. e.g. "Any non-<tt>generic</tt> profile [...]"

I do not think others should specify generic profiles.  There is already 
one generic profile and one should choose this whenever one needs a 
generic profile.

> 
> *) Final remark: I have some worries about the following: What if/How
> can someone define profile orders for new profiles, particularly, how
> does someone define profiles which are in a "<" relation with
> existing profiles defined in this document? I.e., what is rthe
> intended semantics of "<"? does it mean "monotonicity of (ground?)
> entailments" or something else?

It does not have an intended semantics.

> I think this should be defined
> somehow. Otherwise, some third party could define a complete nonsense
> order of profiles.

People can always do nonsensical things, whatever constraints you can 
think of.
I do not really see the need for defining further restrictions.

> 
> Section 7: Appendix =============
> 
> *) "RIF-RDF combinations can be embedded into RIF Documents in a
> fairly straightforward way, thereby demonstrating how a RIF-compliant
> translator without native support for RDF can process RIF-RDF
> combinations." --> "RIF-RDF combinations can be embedded into RIF
> Documents which enables RIF-compliant translators without native
> support for RDF to process RIF-RDF combinations."

I like my text better, because the appendix shows just one possible 
embedding, so it is a demonstration of how it can be done, not an 
enabling technology.

> 
> *) "The embeddings are defined using the embedding function tr," make
> tr italic.

I already use italicizing for meta-variables, so I don't think it is a 
good thing to italicize tr.

> 
> BTW: maybe you should simply move this to the overview and also
> introduce the func: and pred: prefixes there.

I deliberately did not include them in the overview, because they are 
not used anywhere else (besides the appendix) in the document.

> 
> Section 7.1 =======
> 
> *) "The embedding of RIF-RDF combinations is not defined for
> combinations that include infinite RDF graphs and for combinations
> that include RDF graphs with RDF URI references that are not absolute
> IRIs."
> 
> Why are relative IRIs a problem? can be resolved anyway, or no?

I'm not talking about relative IRIs.  I'm talking about the RDF URI 
references that are not IRIs.

Relative IRIs are merely a surface syntax issue; the definition of 
combinations is on the abstract syntax level, and there all IRIs are 
absolute.
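
A short sketch of why relative references are only a surface-syntax matter: they can be resolved against a base before the abstract syntax is built. The example uses `urljoin` and `urlsplit` from the Python standard library; the base location and the helper name `to_absolute` are hypothetical.

```python
from urllib.parse import urljoin, urlsplit

def to_absolute(base, reference):
    """Resolve a possibly relative reference against a base.

    Raises ValueError if the result still has no scheme, i.e. is not absolute.
    """
    absolute = urljoin(base, reference)
    if not urlsplit(absolute).scheme:
        raise ValueError(f"cannot make {reference!r} absolute")
    return absolute

BASE = "http://example.org/data/graph.rdf"  # hypothetical document location
resolved_fragment = to_absolute(BASE, "#john")
resolved_sibling = to_absolute(BASE, "other.rdf")
```

This step does not help with RDF URI references that are not IRIs at all, which is the case the restriction is about.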

> probably, a reference to the respective section on relative/absolute
> IRIs in BLD is in order.

I included a reference to the endnote explaining the issue.

> 
> Section 7.1.1 ========
> 
> *) I am worried about row 3 in the table... i.e. the translation of
> plain literals into xsd:string constants, as I think this can be
> problematic with respect to simple entailment.

Your concern is a semantics issue, not an issue for this appendix.

> 
> *) in the last row of the table: "Local constant s^^u' that is not
> used in C" What is u'?

s^^u' is obtained from "s"^^u.

> 
> *) "tr("s"^^u) = "s^^u'"^^rif:local" Unfortunately,  "s^^u'" is not
> possible in the lexical space of rif:local: From DTB (this was
> basically discussed at the f2f in galway): "The lexical space of
> rif:local is a subspace of the lexical space of xsd:string. Namely,
> we allow unicode strings which are also valid XML NCNames as defined
> in [XML-NS]."

As discussed (and now also changed in DTB) "The lexical space of 
rif:local is the same as the lexical space of xsd:string, i.e. all 
Unicode strings."

> 
> Section 7.1.2 ========
> 
> *) write the function "sk" in italic.

As for tr, I am reluctant to italicize functions.

> 
> *) "and variables are Skolemized, i.e., replaced with constant
> symbols" -> "and variables are Skolemized, i.e., replaced with
> "fresh" constant symbols"
> 
> This is not full Skolemization, maybe you should write rather
> "replaced with fresh constants, similar to Skolemization."

Why is it not full Skolemization?
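
For reference, a minimal sketch of the replacement being discussed: each blank node is mapped, consistently across the graph, to a constant that is not otherwise used. The `sk` naming scheme, the `("local", ...)` tag, and the tagged-tuple term representation are assumptions for this example, not the embedding as defined in the appendix.

```python
import itertools

def skolemize(triples, used_constants=()):
    """Replace blank nodes by fresh local constants, one per blank node."""
    counter = itertools.count()
    used = set(used_constants)
    mapping = {}

    def fresh():
        # Skip candidate names that already occur as constants.
        while True:
            name = f"sk{next(counter)}"
            if name not in used:
                used.add(name)
                return name

    def sk(term):
        kind, value = term
        if kind != "bnode":
            return term
        if value not in mapping:
            mapping[value] = fresh()
        return ("local", mapping[value])

    return [tuple(sk(t) for t in triple) for triple in triples]

graph = [
    (("bnode", "x"), ("iri", "p"), ("bnode", "y")),
    (("bnode", "x"), ("iri", "q"), ("lit", "a")),
]
result = skolemize(graph, used_constants={"sk0"})
```

The same blank node always receives the same constant, and distinct blank nodes receive distinct ones.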

> 
> This constant-only-Skolemization only works for a RIF representation
> of an RDF graph, 

Yes.

> 
> *) In the last row of the translation table: It seems to me that
> variable names and bnodes need to be standarized apart as a

What does "standarized apart" mean?

> preprocessing step  before in <R,S>, to make this work.

Is there an error in the embedding?  What is the error precisely?

> 
> Section 7.1.3 ========
> 
> *) "Even though the semantics of the RDF vocabulary does not need to
> be axiomatized for simple entailment, the connection between RIF
> class membership and subclass statements and the RDF type and
> subclass statements needs to be axiomatized."
> 
> hmmm, something in this sentence is confusing, not sure what...

Let me know if you find out.  In the meantime I changed the text to

"The semantics of the RDF vocabulary does not need to be axiomatized for 
simple entailment.  Nonetheless, the connection between RIF class 
membership and subclass statements and the RDF type and subclass 
statements needs axiomatization."

> mayb
> just cutting it down to: "The connection between RIF class membership
> and subclass statements and the RDF type and subclass statements
> needs to be axiomatized." would be fine.
> 
> *) "by using the embeddings of RDF graphs defined in the previous
> section." -> "by using the embeddings of RDF graphs defined above." 
> (the previous section is section 6...)
> 
> *) General comment on the embedding theorems and proofs:
> 
> It seems to me that you should rather group.
> 
> "A RIF-RDF combination <R,{S1,...,Sn}> is satisfiable iff there is a
> semantic multi-structure I that is a model of merge({R, Rsimple,
> trR(S1), ..., trR(Sn)}). "
> 
> and
> 
> "A RIF-RDF combination C=<R,{S1,...,Sn}> simple-entails a generalized
> RDF graph T if and only if merge({R, trR(S1), ..., trR(Sn)}) entails
> trQ(T)."
> 
> in one theorem, and proof it by the Lemma:
> 
> "C simple-entails an existentially closed RIF-BLD condition formula φ
> if and only if merge({R, Rsimple, trR(S1), ..., trR(Sn)}) entails φ."
> 
> That seems more logical than grouping the latter two in a theorem,
> i.e.: the important (theorem) are the former two, the latter is a
> lemma from which these two follow. As it looks now, the second
> theorem says two different things at once, which are not logically
> related. If you group it the other way around, you can just proof the
> last one, and the other two follow logically...

Entailment of condition formulas is certainly just as important as 
entailment of graphs, especially when considering RDFS/OWL ontologies as 
data model for the ruleset.
Thinking a bit about it, it also seems logical to consider both notions 
of entailment in the same theorem, as they are also defined in the same 
definition.

I did move the theorems concerning satisfiability down, below the 
theorem concerning entailment.

> 
> proof: => side:
> 
> *) In the proof, why doe we need the paragraph "Assume now that R'
> does not entail trQ(T), [...]" in the end of the => direction at all?
> It seems that the proof of the => side is complete before, and this
> paragraph could, especially if you regroup the theorem/lemma as
> suggested just be mentioned as "a consequence of the lemma".

It is necessary, because the embedding of non-ground graphs is different 
between the entailing and the entailed graph.

> 
> proof: <= side:
> 
> *) What is I$,

It is a definition, not a reference.

> I find the use of the '$' symbol a bit confusing.
> Couldn't you use I' instead of I$?

I could not get the "'" to work with the wiki formatting.

> 
> *) By the way: I find the ambiguous use of the letter 'C' for C and
> I_C possibly confusing.

C is consistently used for combinations, so there will probably not be 
that much confusion.
Do you have a better suggestion?

> 
> *) Again, the paragraph  "Assume now that C doe not entail ..." seems
> a bit redundant. Again, regrouping the theorem, and then saying that
> this is a consequence of the application of the lemma as mentioned
> above, would make this shorter.

Again, this is certainly not redundant, because the embedding of the 
graph is different.

> 
> Section 7.1.4 ========
> 

> *) Can't we use pred:isXMLLiteral instead of ex:illxml? If not,
> shouldn't we define ex:illxml as a predicate in DTB?

As discussed over the phone, the answer is no in both cases.


> 3) The lemma has a long proof, but you can drop some paragraphs.

This doesn't help me much.  Which paragraphs can be dropped and why?


best, Jos



-- 
Jos de Bruijn debruijn@inf.unibz.it
+390471016224 http://www.debruijn.net/
----------------------------------------------
Public speaking is the art of diluting a two-
minute idea with a two-hour vocabulary.
- Evan Esar
Received on Thursday, 26 June 2008 09:34:04 UTC