Re: comments on current version BLD document: symbols, datatypes, semantics

Michael Kifer wrote:
>>>>>> 6- section "symbol spaces" second paragraph, first sentence: symbol
>>>>>> spaces are actually not subsets of the constant symbols in RIF.  First
>>>>>> of all, a symbol space is not a set; second, constant symbols are pairs
>>>>>> (literal, symbol space IRI). I would propose to rephrase this sentence
>>>>>> to something like "every constant symbol in the RIF as an associate
>>>>>> symbol space"
>>>>> They are named subsets. (I added "named" to their abstract definition).
>>>> they are not subsets, because symbol spaces are pairs of lexical spaces
>>>> and identifiers
>>> Symbol spaces are defined as subsets of the set Const of all constants.
>> I was thrown off by the phrase below the definition, which says "Each
>> symbol space is described through its lexical space and its
>> identifiers.", which appears to be a definition.
>> I would propose to rephrase it to something like "Each symbol space has
>> an associated lexical space and a number of identifiers."
>> Furthermore, I would propose to start the definition of symbol spaces on
>> a new line, rather than in the middle of a paragraph.
>> Finally, I would prefer it if it is written explicitly what this subset
>> of Const, which is the symbol space, actually looks like (i.e. a set of
>> pairs).  I think it is more readable if this is spelled out, rather than
>> it being implicit in the text.
> 
> OK. I made the changes that you suggested. I do not quite see how to
> implement your last suggestion without making the sentence look bad.
> If you have a concrete suggestion, please send it to me. Take a look at how
> it looks now.

A possibility would be (notice that I removed the word "just", which
seemed awkward in this sentence):

"Formally, a symbol space is a named subset of the set of all constants,
Const, consisting of pairs of strings and symbol space identifiers."

I recognize that this sentence is not really perfect, but I don't have a
better suggestion at the moment.

> 
> 
>>>>>> 2- section "signatures and a condition language of RIF^BLD^": the
>>>>>> definition of equality atoms is not entirely clear: the symbol = is not
>>>>>> a constant symbol in RIF, according to the syntax definition in section
>>>>>> "presentation syntax" (it does not have the symbol space).  Furthermore,
>>>>>> as it is correctly mentioned that equality is not a built-in predicates,
>>>>>> I feel there is an impedance mismatch between this predicate symbol and
>>>>>> all other kinds of predicate symbols.  Finally, equality is currently
>>>>>> not mentioned when atomic formulas are initially defined.  Therefore, I
>>>>>> would propose to define equality atoms a=b directly when first defining
>>>>>> atomic formulas.
>>>>> A good point! You were looking at a version before I moved = further down,
>>>>> but the point about its symbol space is well-taken. It should be either
>>>>> rif:local (my pref) or rif:iri.
>>>>>
>>>>> I experimented with requiring all symbols to have explicit symbol spaces in
>>>>> the examples, but I think they become unsightly due to that. Perhaps we
>>>>> should not require ^^rif:local explicitly. Then we could write a=b as
>>>>> before.  If people think that we should insist on explicit symbol space
>>>>> names (as it is done now, to make the syntax look more abstract and devoid
>>>>> of syntactic sugar) then I am fine with writing a =^^rif:local b or even
>>>>> equal^^rif:local(a,b).
>>>> This does not yet address my concerns about the impedance mismatch
>>>> between the equality symbol on the one hand, and predicate symbols on
>>>> the other: Predicate symbols are interpreted as sets of tuples, whereas
>>>> equality is interpreted as identity.
>>> Equality is a set of tuples -- it was just defined differently, but
>>> equivalently. I'll change its definition so that it will look like a normal
>>> predicate. But it is a good point. Other than that, would =^^rif:local
>>> satisfy your objection? (I already did it).
>> I would prefer = to be a special symbol in the language, rather than a
>> predicate symbol with a special interpretation.
>> However, I can live with the current definition.
> 
> 
> Why should it be a special symbol when it is just a predicate?

This is where our points of view differ; I do not see it as a predicate
 symbol. But, anyway, I can live with it being described as one.

>>>>>> 3- section "symbol spaces": why is the lexical space of the symbol space
>>>>>> non-empty?  Why not simply define it as a set?  I think it does not hurt
>>>>>> to allow empty lexical spaces.
>>>>> A symbol space defines a set of constants. If the lexical space is empty,
>>>>> the set of constants is empty. Does not make sense to contemplate such sets
>>>>> of symbols in the syntax, IMO.
>>>> Well, I can easily think of cases where one would want an empty symbol
>>>> space. I can, for example, easily imagine rule sets which do not use any
>>>> local identifiers, so the lexical space of the rif:local symbol space
>>>> will be empty.
>>>> furthermore, the definitions just become more complicated by requiring
>>>> the sets to be nonempty.
>>> The lexical space of rif:local is never empty. It is defined by RIF in a
>>> concrete way.
>>> Does not matter if a particular rule set or language uses that or not.
>>> There is no side effect.
>> This doesn't address my point about definitions becoming more
>> complicated.  I simply don't see why we would need such a requirement.
> 
> And I do not see why should we allow things that make no sense.  Why do we
> need to even consider allowing empty named subsets of the overall sets of
> constants?

Allowing is not the additional burden here.  Disallowing is the
additional burden. So, again, I don't see you real reason for including
this additional requirement.
In any case, I guess it is not a big issue, so I won't object to the
current text.

> 
>>>>>>    e- according to the second bullet in "the effect of the symbol
>>>>>> spaces", the mapping IC(lit^^symsp) should be defined for every constant
>>>>>> symbol of the form lit^^symsp with lit in the lexical space of the
>>>>>> symbol space identified by symsp.  This would mean that every such
>>>>>> symbol is in the vocabulary of every RIF language. I think this is
>>>>>> highly undesirable (and should have been mentioned in the syntax
>>>>>> section). The mapping should only be defined for symbols in the vocabulary.
>>>>> We are defining a logic, including some specific sets of symbols that
>>>>> belong to the language of that logic.
>>>> Yes, but it is not necessary.  Why not let the user decide exactly which
>>>> of the symbols to use?
>>> The user can use or not use whatever symbols they want. This has absolutely
>>> no effect on the user! When you define a language of a logic, you are
>>> defining a language for that logic, not for a particular set of formulas in
>>> that logic. This is a standard textbook way of doing things.
>> In most text books, papers, and standard specifications I read
>> (including your own paper [1]) this is not the way of doing things.
>> The usual definition of a logic looks something like:
>> Given some set of symbols A ( i.e. vocabulary/alphabet/signature ), the
>> language LA is the set of formulas constructed using the symbols in A
>> and the logical connectives.......
> 
> Exactly. This is precisely what we are doing in RIF right now (and
> precisely what I said above).
> I may have misunderstood what you said earlier:
> 
>      This would mean that every such symbol is in the vocabulary of every
>      RIF language. I think this is highly undesirable (and should have been
>      mentioned in the syntax section).
> 
> Indeed, every such symbol is in the vocabulary of every RIF language.
> I am yet to understand why is it "highly undesirable."
> 
> The standard way is to first define an alphabet (which includes all the
> symbols, connectives, etc.) and then define the rules for putting the
> alphabet symbols together into formulas.  This is not explicitly mentioned
> -- an omission. It is mentioned now. In fact, there was a bad typo, which
> said "The language of RIF ..." while it should have been "The alphabet of
> RIF ...".

As I understand, the standard way is to let the alphabet vary; the user
can choose an alphabet, and the logic defines which formulas can be
obtained from this alphabet and the logical connectives (i.e. the
language). in my example above, the alphabet A is chosen by the user,
and LA is the language obtained from A and the logical connectives and
syntax formation rules in logic.

Normally, the two different ways of defining an alphabet (fixed versus
varying) would not make much difference.  However, because we have data
types which require the fixed interpretation of certain constants, the
choice between the two has certain semantic consequences.

However, since I am no longer resisting against the value space being a
subset of the domain (see below), I guess it doesn't really matter in
the end.

>>>>>>    g- the value space is required to be a subset of the domain.  This
>>>>>> means that every interpretation includes all value spaces of all data
>>>>>> types.  This is unnecessary.
>>>>> So what? It makes the definition simple and uniform.
>>>> It makes every domain infinite. For most kinds of rules (especially
>>>> those without equality in the head) this is not really a problem.
>>>> However, as soon as we have full use of equality, or deal with
>>>> extensions in the direction of FOL, then one often wants to talk about
>>>> finite models.
>>> You can still talk about finite models. 
>> How?  Every model is infinite.
> 
> 
> You need to look at the relevant parts of the models, which would be finite.
> 
> If you want data types, then their symbols are part of the alphabet. If
> they are, the universe is infinite. You are trying to introduce something
> quite non-standard to bend the definitions to look in a nonstandard way
> because of your personal aesthetic views.

OK, I am convinced :-)
I agree now that value spaces should be subsets of the domain.

>>>> On a side note, some extensions (and possibly a combination with OWL DL)
>>>> will want to separate the (concrete) interpretation domain for datatypes
>>>> from the individual (abstract) interpretation domain in order to be able
>>>> to do effective reasoning.
>>>> I believe we currently do not have such a mechanism (to separate the
>>>> two), do we?
>>> I am not sure what you mean.
>> In Description Logics, a distinction is made between the abstract domain
>> and the concrete domain. The value spaces of data types are subsets of
>> the concrete domain.
>> Each predicate symbol has a signature (e.g. abstract x abstract or
>> abstract x concrete) which determines which kind of symbols can be used
>> there.
>> Data values are interpreted in the concrete domain, and variables
>> quantify either over the abstract or over the concrete domain.
>>
>> Because of this strict separation between the abstract and concrete
>> domains, reasoning with data types in Description Logics can be done
>> through issuing conjunctive queries to a datatype oracle.
> 
> So, take U minus the value spaces of data types, and this is your abstract
> domain.

The question is: how can you make sure that certain variables quantify
only over this domain?

> You are splitting hairs. Nah, you are arguing about something that
> is even less interesting than splitting hairs. :-)

I do not agree that this question is not interesting, since it impacts
OWL compatibility and possible DL-inspired dialects.
However, since it does not yet clear what the impact will be, let's
shelve this issue until we learn more (about OWL compatibility).

Best, Jos
-- 
                         debruijn@inf.unibz.it

Jos de Bruijn,        http://www.debruijn.net/
----------------------------------------------
In heaven all the interesting people are
missing.
  - Friedrich Nietzsche

Received on Tuesday, 16 October 2007 08:32:37 UTC