Re: Example of applying sorted-nucleus? was Re: [TED] CORE Pages on Positive Conditions and Horn Rules Edited: Slots & Constraints from Michael Kifer on 2006-12-12 (public-rif-wg@w3.org from December 2006)

From: Michael Kifer <kifer@cs.sunysb.edu>
Date: Tue, 12 Dec 2006 14:32:18 -0500
To: Dave Reynolds <der@hplb.hpl.hp.com>
Cc: W3C RIF WG <public-rif-wg@w3.org>
Message-ID: <5980.1165951938@cs.sunysb.edu>
> 
> Michael Kifer wrote:
> 
> >>> This kind of thing is done in every programming language to various
> >>> degrees.  For instance, in C, things that are sequences of digits belong to
> >>> the sort of integers. Things that look like 123.456 are floats, "..." are
> >>> strings. Alphanumeric things are variable names, type names, and procedure
> >>> names. They are distinguished contextually by their declarations.
> 
> >> The notion of sorts of integers, strings etc and quantification over 
> >> sorts was at least somewhat familiar to me. It was the notion of 
> >> treating the variable names themselves as sorts, as opposed to 
> >> sorted-variables (i.e. where the variable is restricted to range over a 
> >> sort), that was a little surprising at first. Just ignorance on my part.
> > 
> > Sorry, must be some misunderstanding. We are talking about sorted
> > variables.  Can you please point me to where it says (or implies) that
> > variable names are sorts?
> 
> I read the text:
>     "Things that look like 123.456 are floats..."
> as saying that there could be a sort of "floats" (no surprise!).
> 
> By analogy I took the following sentence :
>    "Alphanumeric things are variable names, type names, and procedure
>     names"
> as meaning that there could be a sort of "variables" or "variable names".
> 
> My misunderstanding again.

I see. But this was in the message, not in the document.
Anyway, I hope it is clear now.


> This was at the heart of why I was confused first time around. To me 
> "webizing" is about how you spell things like variable-names and 
> procedure-names, and not to do with the type system that constrains the 
> types of variables or the signatures of procedures.

I think webizing means the use of IRIs as global identifiers for constants,
properties, and concepts. Because the same symbol (IRI) can be used in all
these different contexts (and some other symbols, like integers may not be
allowed to be used in some contexts), it is necessary to have one domain
and also some rules to specify which combos of syntactic elements are
allowed and which are not. This is essentially the main use of multi-sorted
languages.


> >>> Unfortunately, this syntax is UGLY. It also is not very flexible if we want
> >>> some sorts to be non-disjoint (e.g., one to extend the other, which is very
> >>> useful; such logics are called order-sorted).
> >> Not sure I follow that. In RDF those datatypes include types which are 
> >> non-disjoint (xsd:decimal contains xsd:integer contains xsd:int etc). It 
> >> is true that a literal can only have a single explicit type and this is 
> >> a limitation but I don't think that limitation fundamentally stems from 
> >> that aspect of the syntax; non-literal individuals are not so restricted 
> >> or course.
> > 
> > The problem is this. Suppose that sort A is a subset of sort B and we have
> > a literal of sort A. We would write this literal as, say, foo^^A.
> > On the other hand, the literals in B must look like foo^^B, which is different.
> > Of course, we could start adjusting things and say that the literals of the
> > sort B must have the form  foo^^B or foo^^A or ........... (and the list
> > goes on, as we have to include all the subsorts of A, etc.). This is
> > particularly inconvenient if subsorts are going to be added by the various
> > dialects. 
> 
> OK.
> 
> I guess this isn't a problem in RDF because the interpretation of foo^A 
> and foo^B will be identical - the lexical-to-valuespace map will map 
> each onto the same valuespace element - and in RDF it's only the 
> interpretation we care about.

Which basically means that RDF has an implicit equality theory in which
foo^A = foo^B if A is a subtype of B.  But for an extensible (and much
richer) family of rule languages, like RIF, this is not good enough. One
needs more general means of specifying sorts and domains.


> >>> Now, regarding the RDF instantiation, we could use
> >>> &^$#$%%rdf(prop,object,subject), where &^$#$%%rdf is a special
> >>> predicate symbol for RDF triples.
> >>>
> >>> The sort of the symbol &^$#$%%rdf would have to be the Boolean sort
> >>>
> >>>     iri x iri x (iri+literal)
> >>>
> >>> where iri is a sort that contains all iri symbols and "literal" is a sort
> >>> that contains all primitive data types. iri+literal is the least upper
> >>> bound of the sorts iri and literal.
> >> I realize this just an example but a direct mapping in which rdf 
> >> Properties can be treated as binary predicates (as in the UC8 example) 
> >> would be preferable, I think.
> > 
> > If the Core will allow variables over predicate symbols like in HiLog then
> > we could encode RDF directly as prop(object,subject).
> 
> Agreed.
> 
> >> That comment aside this example does reduce my uncertainty, thanks. So 
> >> the components of the boolean sort are what I would informally think of 
> >> as the syntactic entities like 'iri' (URIRefs in RDF-speak) rather than 
> >> the interpretation domain elements that they correspond to ('resources' 
> >> in RDF-speak). Makes sense.
> > 
> > I am not sure I understand what you said.
> > 
> > It may be that the name "Boolean sort" is not good, since one might think
> > of a sort of {t,f} or a sort of truth values in general. What I meant was
> > the sort sort1 x ..., sort_k -> Bool. Perhaps "predicate sort" is a better
> > name. "Truth-valued arrow sort" is probably the most precise name.
> 
> I think I understood that and "boolean sort" is fine by me as a name.
> 
> What I was trying to do was clarify whether these sorts are sets of 
> syntactic elements or sets of domain elements (i.e. the things those 
> syntactic elements denote).

Sorts always specify things in the language, i.e., syntactic things.  They
basically tell which combinations of constants, functions, predicates are
allowed and which are not. Much like grammars.

In the semantics, sorts are interpreted as sets of elements, functions, and
relations over domains.


> I interpreted the example to mean that they are sets of syntactic 
> elements, i.e. things like IRIs. One might contrast this with an 
> owl:Class like owl:Thing - the elements of (the extension of) owl:Thing 
> are the domain elements not the IRIs that denote them. In RDF those go 
> by the somewhat overloaded term "resource".

In the language one asks questions like "is this particular constant (IRI,
for example) a member of the class owl:Thing?" The answer is always yes
because every constant is interpreted as a member of the domain of
interpretation, and owl:Thing is, by definition, interpreted by the set
that equals the entire domain.

There is always a dichotomy of syntactic things (which you use to specify
things and ask queries) and semantic things (domains, functions and
relations over them).

Sometimes you can identify the domain with the corresponding set of
syntactic things (like we do with numbers). This is also what people often
do with URIs. But if we allow equality among URIs (two URIs always point to
the same object) then the syntactic set and semantic domain are no longer
the same (for example, the semantic domain is now a set of equivalence
classes over the syntactic set).


> Though I suspect I'm just showing my ignorance again in thinking this is 
> a relevant or meaningful distinction :-(

This distinction is relevant and meaningful. In fact, it is essential. It
is important to have it in mind in order to get things logically right.


	--michael  

> 
> Dave
> 
> 
> 
> 
>
Received on Tuesday, 12 December 2006 19:33:27 UTC