Re: Example of applying sorted-nucleus? was Re: [TED] CORE Pages on Positive Conditions and Horn Rules Edited: Slots & Constraints from Michael Kifer on 2006-12-12 (public-rif-wg@w3.org from December 2006)

From: Michael Kifer <kifer@cs.sunysb.edu>
Date: Tue, 12 Dec 2006 09:02:16 -0500
To: Dave Reynolds <der@hplb.hpl.hp.com>
Cc: W3C RIF WG <public-rif-wg@w3.org>
Message-ID: <14678.1165932136@cs.sunysb.edu>
> Michael Kifer wrote:
> 
> > Yes, you are right -- the current document is too terse. The intent was to
> > expand it in due time with examples. Let me elaborate.
> 
> Thank you, this was helpful.
> 
> > Mathematically, a sort is just a set and, in theory, it doesn't matter how
> > it is defined. But in a concrete KR language there must be a clear and
> > simple mechanism that tells which symbols belong to which sort.
> > 
> > This kind of thing is done in every programming language to various
> > degrees.  For instance, in C, things that are sequences of digits belong to
> > the sort of integers. Things that look like 123.456 are floats, "..." are
> > strings. Alphanumeric things are variable names, type names, and procedure
> > names. They are distinguished contextually by their declarations.
> 
> The notion of sorts of integers, strings etc and quantification over 
> sorts was at least somewhat familiar to me. It was the notion of 
> treating the variable names themselves as sorts, as opposed to 
> sorted-variables (i.e. where the variable is restricted to range over a 
> sort), that was a little surprising at first. Just ignorance on my part.

Sorry, must be some misunderstanding. We are talking about sorted
variables.  Can you please point me to where it says (or implies) that
variable names are sorts?


> > So, C and most other programming languages use a combination of syntax and
> > explicit declarations to classify their symbols into sorts.
> > 
> > Now, RIF will need to make similar kinds of decisions. Since RIF is a
> > family of languages rather than a single language, our job is harder and we
> > need to come up with a simple yet extensible framework for specifying sorts
> > to which RIF symbols belong. I am not proposing anything right now, but am
> > simply indicating what the choices might be.
> > 
> > The most flexible framework is based on declarations. That is, each symbol
> > can be explicitly or implicitly declared to belong to one of the sorts
> > (ints, dates, URIs, predicate symbols, etc.)
> > Unfortunately, this may not be a practical solution (at least, I don't know
> > of a practical one).
> 
> Fine. I couldn't see how the BNF syntax and the description of sorts 
> were supposed to tie up, now I understand that there is a known gap 
> there to be filled in sometime.

Yes, we need to settle how the sorts are defined. We had to do this anyway
for URIs and the basic data types. (These will probably be the only
initially defined sorts.)


> > There are also flexible syntactic solutions. For instance, in RDF they use
> > symbol^^type to indicate the sorts of the literals. We could extend that.
> 
> Actually that's the N3/Turtle syntax. The RDF/XML syntax uses the 
> "rdf:datatype" annotation. Both of which are, of course, just concrete 
> representations of the same abstract syntax (which could be expressed in 
> Sandro's asn06). The rdf:datatype annotation seems fairly similar to 
> Harold's rif:type annotation.

Yes, as I said, there is no problem doing it in XML. A more human-oriented
abstract syntax is a different matter.

Sorry for misattributing the ^^ syntax to RDF instead of N3.


> > Unfortunately, this syntax is UGLY. It also is not very flexible if we want
> > some sorts to be non-disjoint (e.g., one to extend the other, which is very
> > useful; such logics are called order-sorted).
> 
> Not sure I follow that. In RDF those datatypes include types which are 
> non-disjoint (xsd:decimal contains xsd:integer contains xsd:int etc). It 
> is true that a literal can only have a single explicit type and this is 
> a limitation but I don't think that limitation fundamentally stems from 
> that aspect of the syntax; non-literal individuals are not so restricted 
> or course.

The problem is this. Suppose that sort A is a subset of sort B and we have
a literal of sort A. We would write this literal as, say, foo^^A.
On the other hand, the literals in B must look like foo^^B, which is different.
Of course, we could start adjusting things and say that the literals of the
sort B must have the form  foo^^B or foo^^A or ........... (and the list
goes on, as we have to include all the subsorts of A, etc.). This is
particularly inconvenient if subsorts are going to be added by the various
dialects. 

But now what if we want to allow sorts to be nondisjoint but not subsets of
each other? Say, for some particular foo I may want to be both in A and B
(but neither A < B nor B < A)?

One way to do this is to introduce equations such as foo^^A = foo^^B.
We could even invent a simple metalanguage for specifying sorts. It could
even be rule-based! Something like

       ?X^^?S1 = ?X^^?S2 and wellformed(?X^^S2)
                           if ?S1 < ?S2 and wellformed(?X^^?S1)

(if x is of sort s1 and s1 is a subsort of s2 then x^^s2 is a legal
literal of sort s2 and it is the same as x^^s1).

I don't know if we want to go in this direction. I'd rather look around for
simpler ways before trying the above.


> > But all this has to do with the abstract syntax only. In XML things are
> > quite simple. We could use a combination of implicit sorts based on syntax
> > (suitable for primitive data types like strings, integers, time, and date)
> > and explicit declarations using the type attribute (for user-defined and
> > dialect-defined sorts).
> 
> Fair enough, though I think we will probably want explicit types for 
> time, date. That's not a problem of course since we've already agreed to 
> adopt xsd.
> 
> > Now, regarding the RDF instantiation, we could use
> > &^$#$%%rdf(prop,object,subject), where &^$#$%%rdf is a special
> > predicate symbol for RDF triples.
> > 
> > The sort of the symbol &^$#$%%rdf would have to be the Boolean sort
> > 
> >     iri x iri x (iri+literal)
> > 
> > where iri is a sort that contains all iri symbols and "literal" is a sort
> > that contains all primitive data types. iri+literal is the least upper
> > bound of the sorts iri and literal.
> 
> I realize this just an example but a direct mapping in which rdf 
> Properties can be treated as binary predicates (as in the UC8 example) 
> would be preferable, I think.

If the Core will allow variables over predicate symbols like in HiLog then
we could encode RDF directly as prop(object,subject).

> That comment aside this example does reduce my uncertainty, thanks. So 
> the components of the boolean sort are what I would informally think of 
> as the syntactic entities like 'iri' (URIRefs in RDF-speak) rather than 
> the interpretation domain elements that they correspond to ('resources' 
> in RDF-speak). Makes sense.

I am not sure I understand what you said.

It may be that the name "Boolean sort" is not good, since one might think
of a sort of {t,f} or a sort of truth values in general. What I meant was
the sort sort1 x ..., sort_k -> Bool. Perhaps "predicate sort" is a better
name. "Truth-valued arrow sort" is probably the most precise name.


	cheers
	  --michael  


> > From this follows a requirement that sorts have to be partially ordered
> > (order-sorted).  Most likely the sorts will be organized in a lattice or
> > semi-lattice with lub being the union and glb being the intersection of the
> > respective sorts.
> 
> Sure.
> 
> Dave
> 
>
Received on Tuesday, 12 December 2006 14:07:04 UTC