review of RIF Core document

Review: RIF Core Design (version: 2007-02-12T10:20 GMT)
Reviewer: Jos de Bruijn

I divided my review comments into three parts: discussion items for the 
working group, general comments about the document, and editorial comments.

First, two general comments:
- From reading the document, it looks more like an internal discussion 
document than a technical specification.  There is a lot of discussion 
about the rationale of certain choices, as well as possible future 
extensions, in between the definitions.
Additionally, many of the definitions are not crisply and concisely 
formulated, making them hard to understand.  The rationale of certain 
choices in the language design might be written down in an introduction 
or in an accompanying document, if we want to publicize them at all.  A 
discussion of possible extensions of the RIF Core could be a separate 
chapter in the specification, or a separate document.
- The connection with the Web is not really apparent in the language; in 
fact, the only real connection with the Web seems to be in the use of 
sorts related to Web standards (URI and XML Schema). See my comments 4 
and 5 below.

== Discussion Items ==

1- core syntax: allowing disjunction in the conditions: I am not sure 
whether disjunction in the conditions should be allowed in the Core.  I 
know that rules with disjunction in the bodies can be equivalently 
rewritten to Horn rules, but this rewriting might lead to an exponential 
blowup in the number of rules.  Additionally, I am also not convinced 
that every dialect which extends the Core would need such disjunctions.
2- the document seems to commit to four RIF dialects: LP, FO, PR, and 
RR.  Does the working group want to commit to developing these dialects 
at this point?
3- core syntax: I am not sure what the merit is of the notion of an 
anonymous variable in the abstract syntax.  It seems to me that this 
only complicates matters, and that anonymous variables are not really 
useful in an abstract syntax; they can be introduced in concrete syntaxes.
4- syntax: in order to reflect the fact that RIF is a Web language, we 
might want to give URIs a more prominent place in the language, rather 
than treating them as a subtype of string.  URI could be a "built-in" sort.
5- syntax: in order to reflect the connection with the Semantic Web, we 
might want to use the standard RDF abstract notation for typed literals, 
such as strings and integers.  For example, "hello world"^^xsd:string. 
In fact, this syntax could be generalized to any kind of sort, and 
not only XML Schema datatypes.
Additionally, the connection with XML Schema datatypes should be made 
more explicit in the document, IMHO.
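
To illustrate the blowup mentioned in comment 1, here is a small Python 
sketch of the naive rewriting. The rule representation (a head name plus 
a list of lists of atom names) and the function name to_horn_rules are 
made up for this illustration; they are not RIF syntax. A body that is a 
conjunction of n two-way disjunctions is expanded into one Horn rule per 
choice of disjuncts, i.e. 2^n rules.

```python
from itertools import product

def to_horn_rules(head, body):
    """Naively rewrite `head :- body` into Horn rules.

    `body` is a conjunction (outer list) of disjunctions (inner lists)
    of atom names. Each resulting rule picks exactly one disjunct from
    every disjunction, so the rule count is the product of the sizes
    of the disjunctions.
    """
    return [(head, list(choice)) for choice in product(*body)]

# h :- (a1 or b1) and (a2 or b2) and (a3 or b3)
body = [[f"a{i}", f"b{i}"] for i in range(1, 4)]
rules = to_horn_rules("h", body)

for rule_head, atoms in rules:
    print(f"{rule_head} :- {', '.join(atoms)}")

print(len(rules))  # 2**3 = 8 Horn rules for 3 binary disjunctions
```

With ten such disjunctions in one body, the same rewriting already 
produces 1024 rules.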

== General Comments ==

6- section 1: it is not entirely clear to me what is meant by a 
"fragment" of a dialect.  Additionally, the listing seems to mix up 
dialects and fragments: e.g., LP and FO are dialects, whereas IC is a 
fragment(?). In general, this section should define the notions of 
"dialect" and "fragment", and describe the mechanism(s) to identify 
them (a step in this direction was made through the discussion at the 
end of the section, but the working group needs to decide on the 
mechanism to be used, and the section needs to be updated to reflect this).
7- section 1.1: from the introduction it is not really clear which data 
types are supported; if I remember correctly, we agreed on a number of 
datatypes which should be initially supported, in a recent telephone 
conference.  Additionally, there should be a reference to XML Schema 
datatypes.
8- section 1.1: introduction: there is a reference here to features in 
possible future dialects for which the Core should cater.  This seems 
rather ad hoc: there is no analysis of features in possible future 
dialects which should be taken into account in the Core.  I would 
propose to either not include the motivation for having an unsorted 
basic language, or refer to URIs and/or RDF for this motivation.
9- section 1.1, syntax: it is not entirely clear what the status is of 
the syntax introduced here.  Is this the abstract syntax for RIF Core? 
If so, this should be stated.  If not, then an abstract syntax should be 
introduced.
10- section 1.1: the scoping rules for quantifiers in RIF should be 
written here explicitly, IMHO.
11- XML syntax: it is not entirely clear to me what the status is of the 
XML syntax mentioned in section 1.1.  In general, I think it would be 
better to describe the XML syntax in a different location, together with 
a complete mapping from the abstract syntax.
12- section 1.1, semantic structures: the first and second paragraphs are 
written in a discussion style, rather than a definition style.  This 
is, IMHO, not suitable for a technical specification.
13- section 1.1, semantic structures: the notion of "formulas" is not 
defined here.  Additionally, there is no reference to the syntax 
presented in the previous subsection.  I would propose to define the 
notions of "language" and "formula" in the section "syntax" and refer to 
those notions in this subsection.
14- section 1.1, semantic structures: I is defined as a mapping from 
formulas to truth values.  However, it is later extended to map 
constants, variables, and constructed terms to the domain.  This leads 
to some confusion, especially since every constant and every constructed 
term is also an atomic formula. Furthermore, it is unclear why the 
mapping I_TV is necessary.
15- section 1.1, multi-sorted extensions: again, the introduction is 
written in a discussion style, rather than a definition style.
16- section 1.1, multi-sorted syntax: I am missing here a concise 
BNF-like notation.
17- section 1.1, multi-sorted syntax: the datatype "float" is completely 
different from "decimal", also in Java.  If you look closely at the XML 
Schema specification, you will see that decimals correspond to real 
values, as in mathematics, whereas floats correspond to a specific 
approximation for storing such values in a limited amount of space, 
according to the ISO/IEEE (I don't remember exactly which) standard.
18- section 1.1, multi-sorted syntax: I do not really understand the 
definition of PSort; there are two definitions of the signature of this 
function, and one definition of the function itself, which recursively 
depends on itself, saying essentially: s\in PSort(t) iff s\in PSort(t), 
which is not very helpful.
19- section 1.1, semantics of multi-sorted RIF Core: this section 
mentions differences between semantic structures of multi-sorted RIF and 
(standard?) RIF, but does not define multi-sorted RIF structures as 
such, nor does it define satisfaction in such structures. 
Additionally, it is not clear what the status is of the functions ASort, 
PSort, and BSort. More specifically, it is not clear whether they are 
the same for all of RIF, or whether they differ per language (rule base).
20- section 2.1, syntax: the symbol "true" is used, but was never defined.
21- section 2.1, syntax: it is claimed that the production "CLAUSE" 
generates a universally closed rule; this is, however, not guaranteed.
22- section 2.1, intended models of rules: I think this section can 
be removed for the working draft, since it is really a discussion of 
possible future extensions.
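
The float/decimal distinction in comment 17 can be seen in a few lines 
of Python. This is only an analogy: Python's float is a binary 
floating-point number (a double on common platforms, whereas xsd:float 
is single precision), and decimal.Decimal stands in here for 
xsd:decimal. The variable names are chosen for this illustration.

```python
from decimal import Decimal

# Binary floating point: 0.1, 0.2 and 0.3 are stored as approximations,
# so the sum of the first two does not compare equal to the third.
float_sum_exact = (0.1 + 0.2 == 0.3)

# Decimal arithmetic: the same values are represented exactly.
decimal_sum_exact = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))

print(float_sum_exact)    # False
print(decimal_sum_exact)  # True
```

A language that treats the two sorts as interchangeable would have to 
decide which of these two answers is intended.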


== Editorial Comments ==

23- editorial: abstract: the second paragraph should first explain that 
in phase 1, RIF develops the positive Horn Language, before introducing 
the condition language.
24- editorial: the paragraph following the bulleted list in section 1 is 
written in the form of a discussion, rather than descriptive text; this 
should be rewritten for a working draft.
25- section 1.1, syntax, editorial: it is not clear what NAME is; this 
should be explained together with the grammar.
26- editorial: section 1.1, syntax: "are assumed to be free" => "are free"
27- editorial: section 1.1, syntax: example 1a contains a rule, whereas 
the syntax for rules has not been explained yet.
28- editorial: section 1.1, syntax: there is no explanation in the text 
about the examples.
29- editorial: section 1.1, syntax: the terms "role element" and "class 
element" are not defined, but used in the second paragraph following 
example 1b.
30- editorial: section 1.1, semantic structures: the symbols "Con,Var" 
are used with two different typefaces: times and courier; please use one 
typeface consistently throughout the section.
31- editorial: it would be helpful if you used subscripts when indexing 
symbols (e.g. t1, c1).


-- 
Jos de Bruijn,        http://www.debruijn.net/
+43 512 507 6475         jos.debruijn@deri.org
DERI                      http://www.deri.org/
----------------------------------------------
The outcome of any serious research can only
be to make two questions grow where only one
grew before.
   - Thorstein Veblen

Received on Monday, 12 February 2007 15:43:46 UTC