Re: declaredAs from Pat Hayes on 2007-08-07 (public-owl-dev@w3.org from July to September 2007)

From: Pat Hayes <phayes@ihmc.us>
Date: Tue, 7 Aug 2007 13:48:33 -0500
To: "Michael Schneider" <m_schnei@gmx.de>
Cc: public-owl-dev@w3.org
Message-Id: <p06230902c2de607ea924@[10.100.0.67]>
>[forgot the subject]
>
>Hi, Boris!
>
>I still try to find out what declarations should be good for, and 
>what they provide.

Me too :-)

>In all your posts, including those you posted some months ago to 
>this topic here in the group, and in a paper you wrote (forgot its 
>name, but you will probably know what I mean), I think I always saw 
>these two main lines of argumentation:
>
>1) We need some means for doing "structural consistency" checking 
>for OWL files, and declarations provide such a means. Structural 
>consistency is defined in
>
>     http://webont.org/owl/1.1/owl_specification.html#8
>
>the following way:
>
>     "An ontology O is structurally consistent
>     if each entity occurring in an axiom from the axiom closure of O
>     is declared in O."

With that definition, this point is circular: SC requires 
declarations, by definition, so declarations are required for SC. 
Toss 'em both out, I say.

>
>2) When mistyping an URI in an ontology, we get a hard to find 
>semantic change. For instance, If I write
>
>     SubClassOf(Class1 Clas2)
>
>where I meant "Class2", the resulting ontology is probably 
>completely consistent (has a model), while definitely not what I 
>intended. I will later (with luck!) find strange behavior when doing 
>inference, but it will be hard to tell what the problem is. If I had 
>non-semantical declarations, I could add a declaration
>
>     Class2 declaredAs owl:Class
>
>(and, hopefully, I do not misspell it, too :)), and would then be 
>reported by a declaration aware parser that there is some class 
>"Clas2" in the ontology, for which there is no declaration.

The trouble with this point is that it works the other way when you 
make the mistake somewhere else, in particular when typing the 
declaration itself. Then the declaration is what causes all the 
trouble. Surely the real point here is that putting assertional 
declarations into an assertional language introduces lexical 
redundancy. You get to say the same thing twice, in effect, in 
different ways. That increases the chances both of introducing 
lexical errors and catching lexical errors. Hopefully the latter 
outweighs the former (because you declare it once but use it many 
times?).

>First, to the 2nd point, because I can faster come to a personal 
>conclusion here. While I definitely see that such URI misnamings (or 
>what you prefer to call them) are an ugly problem in principle, I 
>had already argued in my previous mail that I do not see these typos 
>to be a realistic problem, because when authoring an OWL ontology 
>with a modeling tool, it is not very likely to produce such 
>misnamings. But if I really manage to accidentally create two 
>classes "Class2" and "Clas2", declarations wouldn't help, because my 
>tool would either /not/ produce any such declaration at all, if it 
>does not know about declarations. Or if it automatically produces 
>declarations, it would then produce /two/ of them, for both, the 
>correct and the wrong class name! Right? So, declarations would 
>provide no help here, but instead would provide some non-justified 
>sense of safeness.

I agree. And more generally, safety against things like mistyping 
belongs in the design of a GUI, not in the exchange language itself.

>And this was only the case of a single declaration-using person 
>creating a complete ontology with a single full featured modeling 
>tool. What, if I share ontology development with other people, who 
>use other tools, which are not declaration aware? Or with people who 
>did not hear about declarations yet? Or people who do not like 
>declarations and have turned them off in their tools? Or those smart 
>colleagues who try to maintain the set of declarations /manually/, 
>which of course leads to mistakes? What is with this imported 
>OWL-1.0 ontology from the SemWeb, for which there will never be any 
>kind of declarations? Or this OWL-1.1 ontology, where its creator 
>doesn't care about declarations?
>
>So, to conclude the 2nd point: Neither see I any practical need for 
>declarations with regard to URI misnamings (such misnamings are not 
>really possible with tool support), nor would declarations help me 
>much in practice.
>
>
>Now to the first point:
>
>I strongly agree that it is a good thing to have an ontology, which 
>allows me for every name (i.e. URI) used within this ontology to 
>determine, if this name denotes either a class, or an individual, or 
>an object property, or a datatype property (I will call this the 
>"resource kind" of a named entity from now on). I really want to 
>have this property of an OWL file, because otherwise my modeling 
>tool gets into problems, if it isn't able to determine the resource 
>kind for each named entity in the ontology.

That sounds, then, as though your tool is designed around OWL-DL 1.0, 
which is the only language that makes these distinctions. They have 
no absolute or fundamental meaning, and are not really resource 
distinctions (in the sense of distinctions between kinds of resource) 
at all. The distinction between 'object' and 'datatype' properties, 
in particular, is there solely to keep the inference engines happy. 
It has no fundamental rationale: datatypes are not a distinct kind of 
resource, only a different way to refer to resources. So, suppose a 
language came along which did away with these distinctions. Would you 
adapt your modelling tool to it, or would you deplore the existence 
of such a language? Such languages have many advantages, however. At the very 
least, you should be aware that your requirement here amounts to a 
very tight restriction on the kind of language you are willing to 
consider.

>So what I like to see from a good tool will be some means to check 
>an ontology at read time, if for each name the resource type can be 
>detected. If the resource kind cannot be detected for some named 
>entity, then I expect a warning from my tool. Of course, this check 
>should be efficient enough to not bother me as a user by making me 
>wait too long.
>
>What I do not get is why I should want to use declarations for this 
>purpose. I think that OWL already has everything included to allow a 
>parser or whatever tool to do such a check. Let's see:
>
>1) Classes:
>
>AFAICS from OWL's abstract syntax, it can always be determined, from 
>each single axiom in an ontology containing a name N, if N denotes a 
>class or not. A few examples:
>
>     * ... complementOf(N)                ==> N is a class
>     * ... unionOf(... N ...)             ==> N is a class
>     * ... restriction(p allValuesFrom(N))==> N is a class
>     * ... oneOf(... N ...)               ==> N is NOT a class
>     ...
>
>Did I overlook a situation, where this is not always possible?

You are referring to OWL-DL. But in OWL-Full it is possible for 
example to write

aaa rdf:type bbb .

and know that bbb is a class but not know whether aaa is, because 
classes may have other classes as members. In general, as assertional 
languages become more relaxed and more like a full logic, it becomes 
harder to always determine what categories things must be in. (This 
is a very general problem which goes beyond SWeb languages, e.g. 
systems like Kestrel's SpecWare are built around complex Lisp-like 
functional languages whose sole function is to specify what classes 
things are required to be in, checkable at parse time.)
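
A rough sketch of the difference, under assumed encodings: the pattern table below is illustrative and far from complete, and both function names are hypothetical. Looking up a construct in a table works for the OWL-DL examples quoted above, but for the triple form the same lookup leaves the subject undetermined.

```python
# Hypothetical table: which construct fixes the kind of the name inside it.
PATTERN_KIND = {
    "complementOf": "class",
    "unionOf": "class",
    "allValuesFrom": "class",
    "oneOf": "individual",
}

def kinds_from_patterns(occurrences):
    """occurrences: (construct, name) pairs taken from abstract-syntax axioms."""
    kinds = {}
    for construct, name in occurrences:
        kind = PATTERN_KIND.get(construct)
        if kind is not None:
            kinds.setdefault(name, set()).add(kind)
    return kinds

def kinds_from_triple(subject, predicate, obj):
    """For 'aaa rdf:type bbb' only the object is known to be a class."""
    if predicate != "rdf:type":
        return {}
    # An empty set means: this triple alone does not settle the subject's kind.
    return {subject: set(), obj: {"class"}}
```

For example, kinds_from_triple("aaa", "rdf:type", "bbb") reports bbb as a class and leaves aaa with an empty set, since in OWL-Full aaa may itself be a class.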

>If not, a parser can contain a map including all possible axiom 
>types of OWL-1.1, where some pattern matching is applied.

It may depend on inferences drawn from these axioms. It can get very 
complicated and subtle, even in simple RDFS. (We tried to have 
'implicit datatyping' in RDFS, but already it wasn't possible to avoid 
things like domain and range assertions having unexpected 
consequences for class membership. Matching static patterns doesn't 
hack it.)
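
As a small illustration of why static patterns fall short, here is a sketch of the RDFS domain and range rules (rdfs2 and rdfs3 in the RDF Semantics document) run to a fixed point. The ex: names and the function name are made up for the example, and literals are ignored.

```python
def rdfs_domain_range_closure(triples):
    """Apply rdfs2 (domain) and rdfs3 (range) until no new triple appears."""
    triples = set(triples)
    while True:
        domains = {(s, o) for s, p, o in triples if p == "rdfs:domain"}
        ranges = {(s, o) for s, p, o in triples if p == "rdfs:range"}
        new = set()
        for s, p, o in triples:
            for prop, cls in domains:
                if p == prop:
                    new.add((s, "rdf:type", cls))   # rdfs2: type the subject
            for prop, cls in ranges:
                if p == prop:
                    new.add((o, "rdf:type", cls))   # rdfs3: type the object
        if new <= triples:
            return triples
        triples |= new

# Hypothetical data: no triple here mentions ex:widget together with rdf:type.
graph = {
    ("ex:hasPart", "rdfs:domain", "ex:Assembly"),
    ("ex:hasPart", "rdfs:range", "ex:Part"),
    ("ex:widget", "ex:hasPart", "ex:bolt"),
}
closure = rdfs_domain_range_closure(graph)
```

The closure contains class memberships for ex:widget and ex:bolt that appear in no single input triple, so a matcher that inspects one axiom at a time never sees them.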

>Not hard to implement for the parser's author, and probably pretty 
>efficient (I will happily wait up to two seconds on my 2GHz machine 
>:-)).
>
>2) Individuals:
>
>Again, I cannot see a case where one cannot easily decide from each 
>single axiom containing a name N, if N denotes some individual or 
>not.

Just as an aside, in OWL-Full and Common Logic and many other 
formalisms, *everything* is an individual. Individuals are not a kind 
of resource: it's just a way of saying "one of the things we are 
talking about".

>3) Properties:
>
>Ok, here it is at least not always possible to determine, from a 
>single axiom, if a property is an object or a datatype property. 
>Examples:
>
>     * ... restriction(N cardinality(1)) ...
>     * FunctionalProperty(N)
>
>But then, there is still hope that the more concrete information 
>(object or datatype property) is deducible from the rest of the 
>ontology. So we have to look at the complete ontology. Maybe, 
>somewhere else is some explicit typing axiom like 
>'ObjectProperty(N)', or some other axiom like
>
>     ... restriction(N someValuesFrom(C)) ...
>
>from which we can deduce that N is indeed an object property. So, 
>again, no need for additional (and redundant in this case) 
>declarations in this case!
>
>The only problematic situation I see is when it is /not/ possible to 
>deduce from the ontology, what kind of resource N stands for.

How about where it may be possible, but the general problem is as 
hard as the general inference problem? I think that is the chief 
motivation for having a syntactically distinct declaration syntax.

>This would, for instance, be the case, if the only axiom, where the 
>name N occurs, would be the above cardinality restriction. In this 
>case, I would like to see my parser to spit out a warning:
>
>     "Hey, while your ontology is formally correct, it is probably 
>not in the state you wish to have it. So explicitly say what N is 
>meant to be!"

I'd rather have the GUI say this, not the parser. In fact, I never 
want to see a message directly from the parser when composing 
ontology content.

>
>If the parser stops parsing in this case, or if it tries to apply 
>some heuristics, or perhaps in the concrete case it is not affected 
>at all... I can live with all of these cases, as long as I get 
>warned in some way.
>
>The only question which I have (and which you, Boris, probably are 
>able to answer much better than me) is: Is it always efficiently 
>decidable, if a Name N denotes a class, individual, object or 
>datatype property, or if its kind cannot be determined?

No.

>Only if this question is answered with "NO!", then I can see at 
>least a /theoretical/ point for declaredAs statements. But of 
>course, this question can always be answered /pragmatically/, by 
>trying to find out the kind for a name N for a second or so, and 
>then surrender by saying
>
>     "Cannot deduce the kind of N, so please tell it explicitly!".
>
>The important point, when such a situation occurs, is: I would then 
>add a typing axiom, NOT a 'declaredAs' statement, because adding a 
>declaration wouldn't fix my problem at all

As I understand the intended semantics here, these are logically 
equivalent, but labelling one of them as a declaration amounts to 
giving advice to future software that this particular assertion can 
be used in a special way during parsing. It has priority in the sense 
that if it is contradicted by something else, then the other thing is 
an error. And it is of a form which allows the parser to use it 
efficiently.
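
A sketch of that reading, with hypothetical names throughout (the function, the kind labels and the "axiom n" sources are invented for illustration): the declaration is looked up first, and any usage that disagrees with it is the thing reported as the error.

```python
def check_against_declarations(declared, usages):
    """declared: name -> kind. usages: (name, kind, source) found while parsing.

    The declaration wins: a usage that contradicts it is reported as an error.
    Names without a declaration are not judged here.
    """
    errors = []
    for name, kind, source in usages:
        expected = declared.get(name)   # constant-time lookup, no inference
        if expected is not None and expected != kind:
            errors.append(f"{source}: {name} used as {kind}, declared as {expected}")
    return errors

declared = {"N": "ObjectProperty"}
usages = [
    ("N", "ObjectProperty", "axiom 1"),
    ("N", "DatatypeProperty", "axiom 3"),
    ("M", "Class", "axiom 4"),
]
errors = check_against_declarations(declared, usages)
```

Only the usage in "axiom 3" is reported. A plain typing axiom would instead be one more assertion among equals, and the conflict would surface as an inconsistency with no indication of which side to blame.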

>: After adding such a declaration statement, my ontology is still in 
>the same under-determined state as before! Adding a declaredAs 
>statement wouldn't be more than just adding a comment to the OWL 
>file of the form: "N is an object property".

It's more than that because, unlike a comment, it can be used by the 
parser. But I agree, it ought to be useable by inference as well.

>Of course, a declaredAs statement is a very specific kind of 
>comment, which is machine processable, and which is dedicated to the 
>special purpose of telling parsers, what kind of resource a named 
>entity within an axiom has. But IMO, it simply misses the point for 
>what it is intended. If it is not possible to determine from the 
>ontology axioms themselves, what kind of information a name denotes, 
>then something is wrong with my ontology

Well, call the declarations a special kind of ontology axiom. (For 
more on this notion, in a more general framework, see
http://www.ihmc.us:16080/users/phayes/IKL/GUIDE/GUIDE.html#Sorts
)

>- not really formally wrong (an ontology is just a set of axioms, 
>and if this set is semantically consistent, then the ontology is 
>formally ok), but the modeling is probably quirky. Modeling quirks 
>can only be fixed by adding, changing or removing axioms, not by 
>adding comments!

I don't think it is sensible to think of declarations as comments. 
They are more like axioms with a special relationship to the parser. 
Just like axioms, they talk about what classes things are in.

>So, instead of inventing a new feature, why not try to find out how 
>far we come with the things we already have? I think that one of the 
>worst problems in this discussion is that the problem of "structural 
>consistency" hasn't yet been largely discussed and understood within 
>the OWL community. Wouldn't it therefore be a good idea to try out 
>to create some kind of "best practice" document, which tells people 
>what to do to create "structural consistent" ontologies, and how to 
>work with them?

Blech. This PRESUMES that such 'structural consistency' is desirable, 
which is highly controversial. I think we should be striving to 
overcome these archaic distinctions (between for example classes and 
individuals) rather than finding 'best practice' ways to set them in 
stone. The idea that all logics must be typed is horrendous; it just 
creates new, artificial, barriers to interoperability.

>Such a document would have to provide the following:
>
>     * Giving a complete list of how to deduce from the axioms in an 
>ontology, if a given name N stands for a class, individual, object 
>property, or datatype property, or if N's kind of resource cannot be 
>deduced. This would be in the line of my discussion above, but much 
>more thoroughly analyzed than my three minute check.
>
>     * Giving parser implementors useful hints, how to implement the 
>list of deduction rules above.
>
>     * Giving modeling tool implementors a set of guidelines, which 
>tell them what information they should insert into an ontology in 
>order to make it always possible to check for structural consistency.

This was tried for OWL-DL. In practice, people don't do it: there are 
a large number of ontologies out there which *would* be OWL-DL *if* 
all the type assertions had been put into them. I guess the 
declarations proposal is partly motivated by this phenomenon: their 
parsers would have warned them before publishing these "broken" 
things. However, since it's so easy to "repair" them automatically, 
this seems to me to be a non-problem. Particularly as I don't see 
them as being "broken", myself.

>     * Giving OWL authors a (small!) number of easy to follow 
>guidelines for keeping their ontologies in a state, which is 
>supposed to pass each parser, which does structural consistency 
>checking in the way outlined in the document. These guidelines would 
>only be needed, if the ontology author wants to do manual edits, or 
>if he likes to do an eye-check over an ontology to see, if it is 
>probably in a good state.
>
>Additionally, one could build a reference implementation, which 
>works in the way discussed in the document. The reference 
>implementation would not do more than just the things outlined in 
>the document (ok, one should additionally be able to decide to 
>either do single file parsing or to regard the whole import closure 
>:)). The tool implementors could use this reference implementation 
>for studying it, and as a base for their own product, and they could 
>then add all kinds of additional things, which make their product 
>more interesting than those of their competitors, like heuristics, 
>providing assistance on ontology "repair", etc. Users could use the 
>reference implementation as a simple stand-alone tool to check their 
>OWL files, if they do not have any of those much cooler products 
>from the tool implementors.

We already have an OWL GUI which does this. It parses OWL-DL without 
needing the 'missing' declarations, and (switchably) warns you if it 
finds them missing, generating a side ontology with them all in, in 
case you'd like to import it; which is switchably automatic. So it 
runs the gamut from warn/auto-repair/ignore, you choose. It would do 
pretty much the same with declarations, but it would run a bit 
faster. It works largely by pasting together various open source 
tools written by others (such as Bijan).
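
Roughly, and only as a sketch: the function below is not the actual tool, and its name, the mode strings and the triple encoding of the side ontology are assumptions made for illustration.

```python
def handle_missing_types(inferred_kinds, typed_names, mode="warn"):
    """Return (warnings, side_ontology) for names that lack a type assertion.

    inferred_kinds: name -> kind the parser worked out for itself.
    typed_names: names that already carry an explicit type assertion.
    mode: "ignore", "warn" or "repair".
    """
    if mode not in ("ignore", "warn", "repair"):
        raise ValueError(f"unknown mode: {mode}")
    if mode == "ignore":
        return [], []
    missing = [(n, k) for n, k in sorted(inferred_kinds.items()) if n not in typed_names]
    warnings = [f"missing type assertion: {n} ({k})" for n, k in missing]
    if mode == "warn":
        return warnings, []
    # "repair": collect the missing assertions in a separate, importable ontology.
    side_ontology = [(n, "rdf:type", k) for n, k in missing]
    return warnings, side_ontology

inferred = {"ex:Person": "owl:Class", "ex:knows": "owl:ObjectProperty"}
warnings, side = handle_missing_types(inferred, {"ex:Person"}, mode="repair")
```

Here the side ontology ends up with the single triple for ex:knows, and the original file is left untouched.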

>
>This approach might fail, or at least turn out to be not a good 
>idea. And /then/, this might bring back declarations into play. 
>/Then/ the problem has at least been discussed and understood by a 
>few more people, and the declaredAs fans can /then/ point to the 
>document and say: "Look, we really need declarations, because the 
>"more obvious" approach working with OWL axioms alone does not 
>work." /Then/ one can think about adding such a declaration 
>mechanism into the OWL standard. /Then/, but it is much too early to 
>do so now!
>
>[BTW: I wonder if Jena Eyeball already contains some functionality 
>in the direction proposed above? Or Pellet within its ontology 
>repair functionality? I will have a look, if I find the time.

I'm sure you can figure out how to do it :-)

Pat Hayes

>]
>
>Cheers,
>Michael
>
>--
>Michael Schneider <m_schnei@gmx.de>


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Tuesday, 7 August 2007 18:48:46 UTC