[Fwd: comments on current version BLD document: symbols, datatypes, semantics] from Jos de Bruijn on 2007-10-09 (public-rif-wg@w3.org from October 2007)

From: Jos de Bruijn <debruijn@inf.unibz.it>
Date: Tue, 09 Oct 2007 17:05:36 +0200
To: RIF <public-rif-wg@w3.org>
Message-ID: <470B98C0.8070403@inf.unibz.it>
I realized I should probably send these comments to the list, and not
just to Michael.....

-------- Original Message --------
Subject: comments on current version BLD document: symbols, datatypes,
semantics
Date: Tue, 09 Oct 2007 16:30:05 +0200
From: Jos de Bruijn <debruijn@inf.unibz.it>
To: Michael Kifer <kifer@cs.sunysb.edu>

Hi Michael,

Sorry for the late comments on the current version of the document.
I've been a bit ill over the last days.

I divided my comments into two parts: editorial/readability and language
definition.

best, Jos

Some editorial comments and suggestions for improving readability:
- Section "symbols and symbol spaces":
1- I'm not entirely comfortable with the structure.  This section
already mentions connectives and quantifiers (in the first paragraph),
but does not define formulas (conditions). Then, the section
"presentation syntax" defines formulas.  I would propose to make the
following changes: remove the references to the connectives and
quantifiers in the symbols section, move the definition of "well-formed
condition formulas to the next section", and rename the "presentation
syntax" section it something like "Conditions", because this is in fact
the sections in which conditions are defined (using the presentation
syntax).

2- regarding the structure of this particular section: constants are
introduced in the first paragraph, and (a couple of pages) later
re-defined as pairs of literals and symbol spaces.  I think this
duplication (off definition) is unnecessary and confusing.  I would
propose to do the following: move the definition of constant symbols as
pairs (including symbol spaces) to the front, before the definition of
terms, and create a new subsection "Constant symbols" (or similar) for
these definitions.  I would also propose to create a subsection for the
signatures and well-formed terms and formulas.
I think that the document would become a little easier to read.

3- for readability, I think that, when introducing variables, their
(presentation) syntax should also be given

4- "symbol spaces", first paragraph: "long" -> "string" (since long is
not such a common datatype; at least not as common as string)

5- "symbol spaces", fourth paragraph: it is not entirely clear what is
meant with "RIF mandates the following symbol spaces".  I would propose
to rephrase it to something like "RIF-compliant implementations must
support the following symbol spaces".  Furthermore, I would propose to
move this sentence (about supporting symbol spaces) to after the
introduction of the individual symbol spaces; I believe this will
improve readability.

6- section "symbol spaces" second paragraph, first sentence: symbol
spaces are actually not subsets of the constant symbols in RIF.  First
of all, a symbol space is not a set; second, constant symbols are pairs
(literal, symbol space IRI). I would propose to rephrase this sentence
to something like "every constant symbol in the RIF as an associate
symbol space"

7- section "symbol spaces": for readability, I would propose to move
forward the definition of the syntax of RIF constants, to before the
definition of symbol spaces; this would give the reader a better idea of
the purpose of the symbol spaces.

8- section "symbol spaces": I think "data types" (including value space
and lexical-to-value mapping) should be defined in this section, and the
distinction with symbol spaces which do not correspond to datatypes
should be made clear.  I believe that many people who read this document
will be aware of a notion of datatypes, and from the text which is
presented people might easily assume that symbol spaces and datatypes
aren't the same thing.

- section "model theory for condition language of RIF BLD":
9- I find it rather confusing to talk about value spaces and
lexical-to-value mappings of symbol spaces, where these actually only
apply to datatypes. I would propose to only talk about the effect of
datatypes.



Comments about the language definition:

- Section "symbols and symbol spaces":

1- section "signatures": there is currently only an informal description
of what partial order between signatures means.  There should be a
formal definition.

2- section "signatures and a condition language of RIF^BLD^": the
definition of equality atoms is not entirely clear: the symbol = is not
a constant symbol in RIF, according to the syntax definition in section
"presentation syntax" (it does not have the symbol space).  Furthermore,
as it is correctly mentioned that equality is not a built-in predicates,
I feel there is an impedance mismatch between this predicate symbol and
all other kinds of predicate symbols.  Finally, equality is currently
not mentioned when atomic formulas are initially defined.  Therefore, I
would propose to define equality atoms a=b directly when first defining
atomic formulas.

3- section "symbol spaces": why is the lexical space of the symbol space
non-empty?  Why not simply define it as a set?  I think it does not hurt
to allow empty lexical spaces.

4- section "symbol spaces": there should be a restriction on the kinds
of datatypes which may be used.  Namely, the lexical-to-value mapping
and the value space need to be well-defined; otherwise, the RIF
semantics is not well-defined. The text in the paragraph "symbols with
ill-formed lexical parts" will then need to be updated accordingly.

5- "symbols with undefined symbol spaces": the description in this
paragraph is incorrect, since, if the symbol space corresponds to a
known datatype, and implementation will interpret the symbol according
to this datatype.  I propose to remove this paragraph.


- section "model theory for condition language of RIF BLD":

6- I see the following problems with the mapping of constant symbols:
   a- it is hard to grasp from the definition how a single constant
symbol is interpreted; it is necessary to carefully read and try to
understand all the (too lengthy) text. The definitions can be much
crisper, as I showed in earlier proposals for this definition.
   b- it is unclear for which datatypes the mapping applies
   c- it is unclear which symbol spaces are "primitive datatypes"
   d- no distinction is being made between the identifier of a symbol
space and the symbol space itself, whereas they are different things
   e- according to the second bullet in "the effect of the symbol
spaces", the mapping IC(lit^^symsp) should be defined for every constant
symbol of the form lit^^symsp with lit in the lexical space of the
symbol space identified by symsp.  This would mean that every such
symbol is in the vocabulary of every RIF language. I think this is
highly undesirable (and should have been mentioned in the syntax
section). The mapping should only be defined for symbols in the vocabulary.
   f- the sentence immediately following the bullets is confusing.  I
would propose to remove it.
   g- the value space is required to be a subset of the domain.  This
means that every interpretation includes all value spaces of all data
types.  This is unnecessary.  I would propose to remove this condition,
because, by definition of IC, every value will to which there is a
mapping will be in D (since the range of IC is D).


-- 
Jos de Bruijn            debruijn@inf.unibz.it
+390471016224         http://www.debruijn.net/
----------------------------------------------
The third-rate mind is only happy when it is
thinking with the majority. The second-rate
mind is only happy when it is thinking with
the minority. The first-rate mind is only
happy when it is thinking.
  - AA Milne
Received on Tuesday, 9 October 2007 15:05:50 UTC