Re: CORE WD - a few remarks from Michael Kifer on 2007-03-16 (public-rif-wg@w3.org from March 2007)

From: Michael Kifer <kifer@cs.sunysb.edu>
Date: Thu, 15 Mar 2007 20:39:56 -0400
To: =?UTF-8?B?RnJhbsKNY29pcyBCcnk=?= <bry@lmu.de>
Cc: "Boley\, Harold" <Harold.Boley@nrc-cnrc.gc.ca>, public-rif-wg@w3.org
Message-ID: <15379.1174005596@cs.sunysb.edu>
> Dear Harold, dear Michael,
> 
> Having a flue, I could spend time carefully reading the CORE working 
> Draft. Impressive work! Below are a few remarks on this document in two 
> lists: principles and style. I hope, my remarks are clear enough. If 
> not, please, tel me!
> 
> Regards,
> 
> François

Francois,
Thank you for your comments. As you correctly picked up from the minutes,
most of them were already incorporated. Here are the answers to the rest of
the issues that you raised.


    cheers
	--michael  


> 1. principles
> 
> - are tuple of length 0, i.e. (), allowed or precluded? Both choices 
> make sense. Mathematical logicians tend, or better tended in the past, 
> to preclude tuples of length 0. Computational logicians often allow 
> them, as e.g. in the programming language ML. Allowing a (single) tuple 
> of length 0 make a systematic treatment of tuples easier. Both choises 
> are possible but RIF CORE should, I think, be explicit. Otherwise there 
> will be different interpretations and inconsistent implementations.

Added as a green blurb issue in the document. Right now they are different.


> - stating that the empty set as the set of sorts for a variable means 
> that this variable 'can be bound to any term' poses, I think, three 
> problems. First,  this is inconsistent with the usual assignment of set 
> of sorts to variables. Indeed, one expects here the set of all sorts.

Which is exactly what it was -- an untyped variable.
The meaning of a set of sorts was actually supposed to be
the intersection of all those sorts. But now it has been simplified.

> I appreciate that using the empty set might be convenient. But this might 
> turn out to be misleading and makes a systematic treatment (e.g. in 
> proofs) more difficult.

Why difficult? (Anyway, this is not relevant any more.)

> Why not introducing a notation for the set of 
> all sorts and use it instead of the empty set?
> Second, the usage assumed 
> is that specifying no (set of) sort(s) for a variable is like specifying 
> the empty set. It might be misleading and surely will make the parsing 
> more complicated to identify 'nothing' with the empty set.

You probably didn't read far enough or skipped something. The syntax for
sorted and unsorted variables specified later. There is nothing in it that
is hard to parse.

> Third, according to the semantics introduced in the document, variables
> are not bound to terms (as would be the case with Herbrand structures)
> but to elements of the universe or domain D.

I don't get you. The semantics is specified in a general way, not just for
Herbrand structures. The notion of "binding a variable to a term" is
**operational** -- it never appears in a general model theory.  Doesn't
even need to appear in a Herbrand semantics (which is a special case).

> The sentence 'variable can be 
> bound to term' is incorrect in this context and eould only be correct if 
> the samenctics would preclude other structures than Herbrand structures.
> 
> - the notion of 'well formed atomic formula' is introduced but that of 
> 'well formed formula' or 'well formed condition' is missing. The later 
> is needed, e.g. for properly conjuncting or disjuncting formulas and for 
> existential quantification.

Fixed, thanks.

> - somewhere it is said that quotes in scalar terms, i.e. constants, are 
> escaped using backslash. A more precise specification of the syntax of 
> scalar terms is needed, stating e.g. that / cannot be used (except for 
> escaping). Furthermore, RIF should not specify yet another such 
> spoecification but instead take over that of XML and/or RDF.

Yes, thanks. Fixed.

> - the notion of 'well sorted expression' (with expression meaning here 
> term or formula), or however this concept is to be called, should make 
> it possible not to attached sort(s) to each occurrence of a constant or 
> variable but only to some, in the very same style as type assignementsi 
> n ML do not have to be given everywhere.

This is precisely how it is done. You need to read a little further about
the shorthand notation for various common sorts.

> Otherwise, the language might be unnecessarily difficult to use.

One shouldn't get carried away with this "use" thing. We are defining an
XML based exchange language. These things are generated by machines, but
should be readable by humans.

The above provision about shorthands is not even for XML, but just for the
presentation syntax. Meaning: for examples, which we will write by hand.
(A precaution against the carpal syndrome.)

> - the notation proposed for assigning sorts to constants and variables 
> dooes not seem to easily extend to assigning several sorts to a constant 
> or variable.

Right. It wasn't intended to. In fact, the earlier definitions were a bit
of an overkill and I changed them. Now only one sort can be attached to a
constant and at most one to a variable. For the RIF Core, we don't need
sets of sorts. For the extensions, the dialects can extend the core sorts
with whatever they need and introduce a partial order over them (even a
lattice, if they wish).  As I mentioned, the intent was to treat sets of
sorts as intersections, but this was awkward and unnecessary (and is gone
now).

But for signatures the notation allows several of them to be assigned to
the same constant.

> - using equality atoms for assigning several sorts to a constant or 
> variable seems rather undesirable because it might unnecessarily 
> complicate the processing of the language: those equalities serving this 
> purpose would have to be recognized in a preprocessing state. Why not 
> simply, or more rigourously, have a specific notation for assigning 
> several sorts to a constant or a variable.

This is actually not the case. The equality stuff is just a formalization
of the idea of subsorts. For instance _long"123" = _decimal"123".
How you implement subsorts is up to you. But if the person you hire to
implement this thing will do it by explicitly representing these equalities
then you better fire the guy as soon as possible.

For more general cases when you want multiple sorts in the dialects, the
partial order (or even a lattice) can be defined over sorts. I believe this
will take care of the needs of even a very demanding user.

> Why not further make it possible to specify such an assignement in a
> foreword to a foormula (r rule) very much like in a variable declaration?
> Why not have a simple sort language making it possible to give names to
> finite sets of sorts?

First, I don't think variable declarations are appropriate in logic
languages, so it is not a good comparison.

Second, if we talk about constants, then what you are proposing is possible
in theory (and, in fact, this is how it is done in theory), but has
problems in an actual practical language. For instance, suppose you declare
the symbol  x%$@yz as something of sort foo:

sort foo x%$@yz, sdfree%$.

But these things are also strings (you might want to consider them as
such).  And strings are (presumably) disjoint of the foo-things. So, how do
I do this exactly?

sort string x%$@yz, sdfree%$.  ???

Do you see a problem? It is possible to overcome this, of course, but the
question is how much ugliness are we willing to tolerate in doing so.

The third reason is that such declarations don't quite jive with XML's
philosophy.

The fourth reason is that there can be way too many constants. You might
have a KB with millions of facts and billions of constants.  Therefore,
even the syntax proposed for signatures (in the document) is not practical
enough.  So we will need to decide on the defaults or on ways to assign
signatures to groups of symbols.

> - the paragraph beginning with "Many primitive sorts in RIF may be based 
> ..." concludes saying "these sorts will haved the form _uri"..."." This 
> means that some sorts have exactly the same syntax as a constant with a 
> sort assignement. My guess is this was not meant!

Thank you for catching the typo.

> 2. style
> 
> - the words 'constants', 'names' (like in 'predicate names') and 
> 'symbols' (like in 'function symbol') are not consistently used. Why not 
> to speak only of 'constants' with the three special cases 'term 
> constant', 'predicate constant' and 'function constant'. This would be 
> consistent and pave the way for dialect à la HiLog where there also are 
> 'predicate variables'.

It would be good to clean this up. Except that I can't make sense of the
term "term constant". I believe that by "term constant" you meant the real
"constant-constant", like `foobar', and not a term like `f(a,b)'.  So, we are
back to constants and predicate/function symbols. What we really have are
"symbols that occur as constant names", "symbols that occur as function
names", and "symbols that occur as predicate names".

> - in the definition of a structure, the notation is not fully 
> consistent. I_TV denotes the interpretation of formulas, the index TV 
> refers to truth value, i.e. the values formulas are assigned. All other 
> functions, I_C, I_R, I_F, have index referring to what is given a value. 
>   Why not rename I_TV in I_Formula (however this is abbreviated)?

I renamed it to I_Truth. This mapping is still quite different from all
others.  I think I_Formula is not descriptive enough. These mappings are
commonly called truth valuations, so I_Truth seems better.

> - is it necessary to denote a Boolean signature <s1 x ... x sk> with > 
> ... >. the < > seem later not to be used.

This was just in an explanation, but I deleted the angles.

> - well formed function term. Under this name, the document specified 
> terms that are 'well sorted'. It would probably be preferable to 
> distinguish between terms that are properly constructed, i.e. well 
> formed, regardless of sort assignments, and (properly constructed terms) 
> with well formed sort assigments.

What are those terms that are not well-constructed? The name "term" is given
in our definitions only to things that are well-constructed.
Received on Friday, 16 March 2007 00:41:43 UTC