Re: declaredAs from Alan Ruttenberg on 2007-08-05 (public-owl-dev@w3.org from July to September 2007)

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Sun, 5 Aug 2007 15:43:35 -0400
To: Boris Motik <bmotik@cs.man.ac.uk>
Cc: <public-owl-dev@w3.org>
Message-Id: <5959D5AF-BFFF-4E5D-A69D-D6D9F6A56F8B@gmail.com>
All discussion below is just that - I'm trying to work out, by  
argument, whether I believe this is a useful feature. So far not...

On Aug 5, 2007, at 4:12 AM, Boris Motik wrote:

> Hello,
>
> I probably shouldn't have said "checking for typos" - this was  
> misleading.
> The point is that you somehow need to add a class, a property, etc.  
> to an
> ontology. Declarations do exactly that: they allow you to say e.g.  
> "the
> class C is a part of the ontology O1" without having to say  
> anything else
> about C.

Except that it doesn't do that. There is no ontology scope other than  
the file (at least as far as the RDF rendering goes), and the file  
basis fails when you try to aggregate (as e.g. in a triple store)  
unless you resort to named graphs (which renders them useless for  
other tasks). Moreover you've also said that C is a class.  I do  
think it would be a good thing to be able to say that an axiom is  
part of an ontology in a way that isn't dependent on documents.

> You need this kind of a feature even if you decide to edit your  
> ontologies
> using a visual editor. When you select the "create a new class"  
> from a menu
> in the editor, you'll be prompted for a class name, and then you  
> will have
> to add the class to the ontology. How do you do that? Well, you add  
> the
> declaration for the entered name. Without explicitly being able to add
> classes to an ontology, you can't implement the "create a new class"
> function in the editor. Thus, declarations are needed even if you  
> just use
> ontology editors to edit an ontology.

C subClassOf owl:Thing works just as well, AFAICT.

> You might think that all this is unnecessary, because it already  
> worked in
> OWL 1.0; after all, we do have editors for OWL 1.0 that work  
> properly. In
> OWL 1.0, all of this existed, but was never made explicit. Most APIs
> allowed you to add a class to an ontology, but it was not clear  
> what they
> are supposed to do at the RDF level. Therefore, in OWL 1.1 we  
> separated the
> conceptual level from the RDF serialization. Now at the conceptual  
> level
> (i.e., at the level of the definition of OWL 1.1 using its structural
> specification), it seems to me that it is unequivocal that you need
> declarations: the "add class to an ontology" method implemented in  
> most OWL
> 1.0 APIs did exactly that. Furthermore, you need a conceptual  
> notion of a
> declaration even if you use editors.

Saying it again doesn't make it any more true. As a counterexample, I  
edit my ontologies in a completely different syntax based on the  
functional syntax, in a text editor, and also generate them  
programmatically. I have no need for these declarations.

> Before proceeding to the issues of serializing the structural  
> specification
> into RDF, let me just say that the structural integrity check is quite
> useful. Perhaps there was no public cry for this particular
> feature, but
> this is because the users often do not know what to complain about.
> Throughout the years, I have often received a number of related  
> complaints.
> People would often send me an ontology and would say that KAON2
> has a bug
> because it did not produce some expected inference on their  
> ontology. After
> investigation, I would see that some axioms in their ontology  
> referred to
> completely wrong classes. Perhaps this was not due to typos, but it  
> was
> quite often due to URI resolution. URI resolution in RDF is quite  
> brittle
> (there are namespaces, XML base, and ontology URIs, and people get  
> confused
> by this), so people would think that their axiom refers to one URI,  
> but in
> reality it referred to a completely different URI. This is  
> particularly true
> in case of imports: many people find it quite difficult to manage  
> the URIs
> and URI resolution properly in such cases. In fact, I have even  
> seen tools
> that spit out wrong RDF (i.e., RDF where, if you applied the RDF
> specification correctly, the URIs would get resolved differently  
> from what
> the users and tool builders thought would happen).

We share the same experience. I'm just not convinced that this  
solution will help.

> If we had an explicit declaration feature, we could detect all  
> these errors
> with a press of a button.

*If* people use the *optional* mechanism. If it is optional then no  
one can count on it. You're familiar with the situation with RDF  
reification...

But there are other problems that declarations might cause that  
balance out the equations. Again, it doesn't seem to have the level of
maturity and experience that the other features exhibit. Even
using the OWLED criteria for task forces I don't see this as anything  
other than at the report level.

And our very initial experiences with declarations are bad.

> URI mismatches are not exactly typos, but are
> similar to typos: you use a class with a different URI than what is
> intended. In order to simplify the discussion, I called them typos  
> (which I
> probably shouldn't have).

I haven't seen a counter to my argument that the most common sorts of
errors, more than even these, can be detected by a series of
heuristics that report warnings, as "lint" or a Lisp compiler does.
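
As a rough illustration of the kind of lint-style heuristic meant here, the
sketch below warns about a URI that is used only once and closely resembles a
URI used more often. It is not taken from any OWL tool: the function name
lint_uris, the similarity threshold, and the example.com triples are all
assumptions made up for this example.

```python
from collections import Counter
from difflib import SequenceMatcher


def lint_uris(triples, threshold=0.9):
    """Return (suspect, likely_intended) pairs of URIs.

    Hypothetical heuristic: a URI that occurs exactly once and is nearly
    identical to a URI that occurs several times is probably a mistake.
    """
    # Count every occurrence of anything that looks like a full URI.
    counts = Counter(
        term for triple in triples for term in triple if term.startswith("http")
    )
    warnings = []
    for uri, n in counts.items():
        if n != 1:
            continue
        for other, m in counts.items():
            if other == uri or m < 2:
                continue
            if SequenceMatcher(None, uri, other).ratio() >= threshold:
                warnings.append((uri, other))
    return sorted(warnings)


# Made-up triples; the last one refers to "Persn" instead of "Person".
triples = [
    ("http://example.com/Person", "rdf:type", "owl:Class"),
    ("http://example.com/Student", "rdfs:subClassOf", "http://example.com/Person"),
    ("http://example.com/Teacher", "rdfs:subClassOf", "http://example.com/Persn"),
]
warnings = lint_uris(triples)
for suspect, intended in warnings:
    print(f"warning: {suspect} is used once; did you mean {intended}?")
```

A check like this needs no declarations at all, but it only produces
warnings, which the author is free to ignore.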

> To summarize, I believe that we need declarations, for at least the
> following two reasons:
>
> - We need a well-defined way to say that an entity is a part of an  
> ontology.

Which declarations don't do. Something that would do it would be  
something like:

"http://example.com/C" usedAsNameIn <http://example.com/ontology.owl>

> - If we know which entities are intended to belong to an ontology,  
> then we
> might as well check whether all axioms reference the correct  
> entities. In
> combination with imports, this can be a really useful feature.

No, we would only check that axioms referenced the declared entities.  
We wouldn't detect errors such as when a spelling checker misses a  
substitution of "too" for "two" because they are both spelled correctly.
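
To make the distinction concrete, here is a minimal sketch of such a check,
assuming axioms are simple tuples and declarations are a set of names. The
function undeclared_references and the ex: names are hypothetical and are not
part of any OWL API.

```python
def undeclared_references(declared, axioms):
    """Return the entities mentioned in axioms that were never declared."""
    # Each axiom is (axiom_type, entity, entity, ...); skip the type at index 0.
    return sorted(
        {entity for axiom in axioms for entity in axiom[1:] if entity not in declared}
    )


declared = {"ex:Cat", "ex:Dog", "ex:Animal"}
axioms = [
    ("SubClassOf", "ex:Cat", "ex:Animal"),
    # Misspelled name: it is not declared, so the check reports it.
    ("SubClassOf", "ex:Dgo", "ex:Animal"),
    # Wrong but declared name (ex:Dog was intended for the second argument):
    # the check has nothing to report here.
    ("DisjointClasses", "ex:Cat", "ex:Cat"),
]
missing = undeclared_references(declared, axioms)
print(missing)
```

Only the misspelled name is reported; the axiom that uses a declared but
unintended name passes silently, which is the "too" for "two" case.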

> Please note also that both of these two features are optional: you  
> don't
> have to declare anything, and you don't need to check the structural
> consistency of anything.

Which means that in practice they can't be counted on, that x% of the  
external ontologies you import won't use them, that a tool which is  
aware of them will complain about it, and that as a result a new
class of users will be confused, eventually figuring out how to turn  
off the consistency check (if they are lucky).

> Furthermore, even if the structural consistency of an ontology  
> fails, you can still use the ontology as if declarations were not  
> there. Hence, I really do not see why this would be such a hard  
> feature
> to swallow.

Because, for instance, I have to manage removing two statements
when I, e.g. edit a class out of an ontology, and this is an error  
prone process, likely, in my experience, to leave leftovers of one  
sort or another. I've already had this problem, which has been  
discussed in other fora, around Protege and Swoop adding unsanctioned
type axioms to my ontologies.

> The discussion about declarations in the structural specification  
> of OWL 1.1
> should be considered independently from the discussion how  
> declarations are
> to be encoded into RDF. We should not conflate the two, because the  
> encoding
> issue is subordinate to the issue whether we need declarations or not.

None of the responses I saw conflated the two. They were saying that  
they don't see the need for the declarations in the first place.  
OTOH, the proposed workaround that was offered did conflate the two.

> In the RDF encoding, we have the problem that (1) we need to correctly
> decode the declaredness status of some entity, and (2) we need the
> appropriate typing information to be able to construct the correct  
> axioms.
> Conflating the two seems like a really bad idea, because this means  
> that,
> whenever you see a triple of the form <C, rdf:type, owl:Class>, you  
> don't
> know whether this triple just provides you with typing information or
> whether this also declares a class. This is particularly  
> problematic in
> combination with imports.

?: If you see C, rdf:type, owl:Class then C is a Class.

> Finally, I understand that many people do not have sympathy with  
> parser
> writers. The problem is, however, that most people will in the end  
> use some
> parser, and if parser writers have difficulties in implementing the
> specifications, the users will be stuck with parser errors. Already  
> with OWL
> 1.0 it was not always the case that you could simply load a piece  
> of OWL
> generated by one tool into another tool. This was often due to the  
> fact that
> the two tools had different ideas about which typing triples have  
> to be
> present in an ontology for the ontology to be parsable.

As far as I can tell all were wrong. But this can be repaired. There  
are other, harder problems that take more work. For instance, last I  
checked Pellet didn't complain about explicit cycles of anonymous
individuals, which are disallowed. And I have yet to see a complete
and bug-free parser for the abstract/functional syntax. This
situation has nothing to do with type triples. Rather it is just that  
writing a parser takes some effort. Adding declarations will solve  
one small problem, while introducing others. I'd rather see an effort  
that makes sure that the functional syntax actually works and can  
express the complete language (e.g. includes quoting mechanisms so  
all valid strings can be expressed).

> One potential problem is caused by the fact that OWL has syntaxes  
> other than
> RDF. The DIG people want to use the XML-based syntax of OWL. As  
> seen on
> previous OWLED workshops, many people want to have syntaxes that  
> are easier
> to read, more natural-language like, etc. Now given the multitude of
> syntaxes, it does make sense to make each file parsable alone. Let  
> me define
> this more precisely:
>
>   Given a file F, you should be able to reconstruct the OWL 1.1  
> structural
>   axioms belonging to F by just looking at the contents of F, and  
> not at
>   the contents of any other files.
>
> Here is why this is (strongly) desired. Imagine you have an  
> ontology O1 in
> RDF/XML, an ontology O2 in the XML-based syntax, an
> ontology O3 in the functional-style syntax, and an ontology O4 in a
> proprietary syntax such as SOF (this format was proposed at this  
> year's
> OWLED). Imagine now that O1 imports O2, O2 imports O3, and O3  
> imports O4. If
> you can't parse each file by itself, then parsing all this (i.e.,
> reconstructing OWL 1.1 structural axioms) is a complete nightmare:  
> you need
> to orchestrate interaction between four different types of parsers.  
> This is
> going to be quite difficult to get right for anyone.

So will you require that declarations are *not* optional in this
case? This would contradict the spec: "Thus, an ontology can be used
even if it does not contain any declarations". Quite aside from the  
issue of whether declarations are a good thing, mixed messages like  
this are not.

If they are optional, then parser writers will have to deal with the  
nightmare. Saying to users "If you use declarations then my parser
will be less buggy" will not inspire confidence.

As to whether this will be a nightmare, I am not sure. This would  
seem to me to be a useful thing to write a report about. The  
plausibility argument, given so much unknown, doesn't do it for me.
Off the top of my head it would seem that serializing each to RDF and  
then parsing the whole of the RDF together would work.

It's also not clear to me whether declarations are even necessary for  
the goal you propose (if we were to accept it). Would not simply
stating that any syntax needs to render to the functional syntax also
accomplish this? Then parsing is a matter of concatenating the
functional syntax versions and parsing that.

(incidentally, not to start another war, but I find the overbearing  
strong typing of every statement in the new syntax in a similar vein  
to this declaration discussion. With all of that, which I'd also like  
to discuss at some point, it's particularly hard not to see  
declarations as even more redundant).

> Since I actually did have users wanting to use KAON2 with different
> syntaxes, and I did have to struggle with the issues of an ontology  
> in one
> syntax importing an ontology in another syntax, I thought that  
> fixing this
> problem would be for the common good.

Laudable. However, from a process point of view, it kind of comes out  
of nowhere - we had a set of goals for OWL 1.1 at the last OWLED and  
I don't remember this being on the list. Adding this mechanism, and  
other new vocabulary that isn't as strongly motivated as the other  
features that were identified, potentially adds risk to the
standardization process.

> And doing so is not difficult: we just
> need to have sufficient typing information included in each  
> ontology. This
> means that, by looking at the triples in the ontology O1 from the  
> above
> example, I should be able to find out the type of each URI  
> contained in it;
> I should not look into O2, O3, and so on to disambiguate the type.
>
> But if I add rdf:type statements to each ontology, then I can't use  
> rdf:type
> for declarations.

I also want a variety of syntaxes (or rather I want a variety of  
"little languages" or DSLs for writing pieces of OWL in). But I  
haven't come to the same conclusion you have, even though I have  
implemented some of these.  The solution I adopted was to always  
translate to the abstract syntax (something PFPS lectured me to do  
quite some time ago). See for example the usage of the syntax  
extension "reaction" in

http://svn.mumble.net:8080/svn/lsw/trunk/owl/reactions.lisp

(reaction !reaction3 (!L-valine !AlphaKetoglutarate) <=> (!2-keto-isovalerate !L-glutamate))

Which currently expands into a rather messy piece of OWL that no user  
would write, but which enables certain entailments.

> Finally, most users will never see the declarations anyway if they  
> use a
> visual editor to edit an ontology.

Although this is the dominant form of editing at the moment, I can  
hardly recommend it. We should talk about this some other time, but  
briefly I see such systems as often not very productive. We don't  
often see programmers writing programs by drawing diagrams, or  
authors writing papers by dragging and dropping words into
sentences. There's probably a reason for this. Certainly the focus on  
this aspect of the ontology building process has come at the expense  
of figuring out effective ways to add and work with more powerful  
axioms that can lend power to our ontologies.

I certainly wouldn't want to build anything in the spec that in any  
way is aimed at supporting some particular style of interaction.

> The editor would work like this:
>
> - When someone selects "create new class", then the editor would add a
> declaration axiom for the class to the ontology.
>
> - When the ontology is being saved to a file, the editor would  
> write out all
> proper typing information, and would also write out declarations  
> separately.
>
> Hence, the users would not be overloaded with this. The only  
> difference they
> would see is that they would have fewer broken ontologies.

The users are already overloaded by having to drag and drop classes  
around in order to get what they want done. We need ways to speak  
about and design ontologies that are much more powerful than this.
If we don't somehow escape this paradigm, I'm afraid that we will  
land up with lots more structurally sound, very uninteresting  
ontologies. I think you've identified a real problem - powering  
ontology writers with tools that help them as much as possible - but  
that this proposal doesn't really help very much in addressing what  
the core issues are, IMO.

Scrappily yours,

Alan
Received on Sunday, 5 August 2007 19:43:53 UTC