Re: LANG: owl:import - Two Proposals from Jeff Heflin on 2002-10-02 (www-webont-wg@w3.org from October 2002)

From: Jeff Heflin <heflin@cse.lehigh.edu>
Date: Wed, 02 Oct 2002 10:21:00 -0400
To: Jonathan Borden <jonathan@openhealth.org>
CC: WebOnt <www-webont-wg@w3.org>
Message-ID: <3D9B00CC.EAF65B5B@cse.lehigh.edu>

Hi Jonathan,

Please see my responses below.

Jonathan Borden wrote:
> 
> Jeff Heflin wrote:
> 
> > In [1], I listed what I consider to be a number of problems with
> > proposal #2. I haven't heard anyone address any of these concerns. If
> > these were addressed satisfactorily (perhaps by an alternate proposal)
> > then I would be happy to endorse that approach.
> >
> 
> I've suggested what may be an alternate approach, or simply a wrinkle on #2
> in http://lists.w3.org/Archives/Public/www-webont-wg/2002Sep/0519.html

Oh yes, sorry I missed that. I would probably consider it a wrinkle on
proposal #2. In particular, it has the same syntax, but a slightly
different semantics. However, I think all of the cons of proposal #2
still apply to it. Also, I do not think a processing model is a good way
to specify these semantics. In particular, it says in order to be OWL
compliant, you must implement things a certain way, which discourages
creative solutions that might have the same effect. I'd much rather give
semantics (i.e., what is the thing supposed to mean) and let
implementors have the freedom to decide what works best in their own
systems.

> Regarding your objections:
> 
> [[
> Cons:
> - Having valid syntax that has undefined semantics may lead to reduced
> interoperability. In particular, some users may build ontologies that
> rely on the arbitrary decisions made by their favorite tool vendors.
> ]]
> 
> I'm unsure how to interpret this statement. I would say that the OWL
> processor should include triples obtained by retrieving and parsing the
> _object_ of an owl:imports statement, into the current "graph", but that any
> URIref ought be retrieved and/or parsed only once.
>
> Is that 'semantics' undefined? (seems precise enough for me :-)

Note that the undefined semantics is not for things of the form

A imports B

but instead of the form

foo subPropertyOf imports
A foo B

We can come up with all kinds of examples along this theme that lead to
nightmares for implementation. Imagine that I import something that
then says that another property I have is actually a subproperty of
imports: then I have to import a whole new set of things, which in turn
might declare other subproperties of imports. That's why proposal 2
suggested that anything that has imports as a subject or object be
undefined.
However, if we do this, some tools might decide to do some inferences
based on it, others might do a different set of inferences and still
others might consider it invalid syntax. If people start relying on the
processing aspects of their favorite tools (which they tend to do, see
web pages and web browsers as a case in point), then we have reduced
interoperability by not saying what all valid syntactic constructs must
mean.
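
To make the cascade concrete, here is a minimal sketch in Python of the
"retrieve each URIref only once" processing described in the quoted text,
extended with one possible reading of a property that specializes imports.
Everything in it is an assumption for illustration: the `documents`
dictionary stands in for web retrieval, `SPECIALIZES` is a hypothetical
stand-in for the "foo is a kind of imports" statement, and neither proposal
defines this behavior.

```python
IMPORTS = "owl:imports"
SPECIALIZES = "specializes"  # hypothetical stand-in, not an OWL term


def import_closure(root, documents):
    """Collect triples reachable through import-like properties.

    documents maps a document URI to a list of (subject, property, object)
    triples and stands in for fetching and parsing that URI.
    """
    graph = set()
    fetched = []               # each URI is retrieved at most once
    import_props = {IMPORTS}   # grows when a property specializes imports
    pending = [root]
    while pending:
        uri = pending.pop()
        if uri in fetched:
            continue
        fetched.append(uri)
        graph.update(documents.get(uri, []))
        # A newly loaded document may turn existing properties into
        # import-like ones, so recompute that set to a fixpoint.
        changed = True
        while changed:
            changed = False
            for s, p, o in graph:
                if p == SPECIALIZES and o in import_props and s not in import_props:
                    import_props.add(s)
                    changed = True
        # Rescan the whole graph: statements loaded earlier may now count
        # as imports. The subject is ignored, as suggested in the thread.
        for s, p, o in graph:
            if p in import_props and o not in fetched:
                pending.append(o)
    return graph, fetched


documents = {
    "A": [("A", IMPORTS, "B"), ("A", "foo", "C")],
    "B": [("foo", SPECIALIZES, IMPORTS)],
    "C": [("bar", SPECIALIZES, "foo"), ("C", "bar", "D")],
}
graph, fetched = import_closure("A", documents)
```

In this toy data, A only asks for B, yet loading B makes "A foo C" an
import, and loading C makes "C bar D" one as well. A tool that ignores
the specialization statements would stop after B, which is the kind of
tool-dependent divergence at issue.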

> 
> [[
> - It is unclear what it should mean if a document C contains the
> statement A owl:imports B. Should this be another undefined construct?
> If so, how can you determine from a graph if the subject of an imports
> statement is the URI of the document from which the imports statement
> comes?
> ]]
> 
> Fair question, and one open for discussion. I'd say that regardless of the
> subject, the object be imported into the current graph/KB.

Note though that this is a question that does not come up if we do
proposal #1. We only have to answer it if we go with proposal #2.

> [[
> - The fact that an ontology's classes and properties do not occur
> between the <Ontology> tags is unintuitive
> ]]
> 
> Oh well. That's an artifact of the decision to use RDF, however we decided
> to use RDF/XML at F2F 2.

Sure, this isn't a deal-breaker, but because of it, proposal #1 has the
advantage of us not having to constantly answer the question "So tell me
again why the contents of an ontology are described outside of the
<Ontology> tag?"

> [[
> - The use of about="" to make statements about the enclosing document
> seems like a hack. In particular, it seems like we could be confusing the
> notion of a document that describes an ontology and the concept of an
> ontology itself.
> ]]
> 
> Maybe, but what is the functional significance of this, and what requires us
> to use rdf:about=""?
> 

If we are using RDF triples for everything, then triples that say
something is of type owl:Ontology and that that something imports
something else need a subject. In particular, this subject should be the
same for all such triples that concern the same ontology. The use of
about="" is a quick and dirty way of saying "use the base URL of the
document, since this should at least be different from the URIs of other
ontologies." Once again, this isn't huge; so far it hasn't seemed to
cause significant problems in practice with DAML+OIL. However, it
contributes to the difference between a language with an elegant design
and one that appears to be cobbled together, and I think elegance of
design is one of the most important factors in acceptance.
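
As a small illustration of what about="" amounts to, the sketch below
resolves an empty relative reference against a document's base URL using
Python's standard urllib.parse.urljoin. The example.org URLs and the
resolve_about helper are hypothetical; the point is only that the subject
of the owl:Ontology triples ends up being the URL of the enclosing
document, so the ontology and the document describing it share one name.

```python
from urllib.parse import urljoin


def resolve_about(base_url, about):
    """Resolve the value of an about attribute against the document base."""
    return urljoin(base_url, about)


doc_a = "http://example.org/ontologies/a.owl"       # hypothetical document
doc_b = "http://example.org/mirror/a-copy.owl"      # hypothetical copy of it

subject_a = resolve_about(doc_a, "")        # about="" inside doc_a
subject_b = resolve_about(doc_b, "")        # same markup served from doc_b
person = resolve_about(doc_a, "#Person")    # an ordinary relative reference
```

Here subject_a is simply doc_a, and the identical markup served from
doc_b yields a different subject, so the identity of the "ontology"
follows wherever the document happens to live.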

> [[
> - The approach only partially succeeds in its goals, because although it
> represents ontologies and their properties, it loses the ability to
> recognize the boundaries of an ontology (i.e., what it contains) as soon
> as two or more graphs are merged together. In particular, if this
> approach is extended for use with versioning, then we lose the ability
> to know which statements come from which version of an ontology.
> ]]
> 
> Perhaps, but is the ability to recognize the boundaries of merged ontologies
> a requirement, or objective? That's to say, is meeting the above proposed
> goals worth major changes to the OWL syntax, ones that go against RDF
> compatibility?

You are right, recognizing the boundaries of merged ontologies is not an
explicit requirement or objective. However, versioning, which I mentioned
above, is (R6). I also believe that it is important for robust models of
explicit ontology extension (R3) and commitment to ontologies (R4),
although I recognize that not everybody agrees with me on this point.
More importantly, when the Semantic Web gets to matters of
trust, it will be essential to know which statements come from which
documents. Even though our WG is not doing anything about trust, why
should we make design decisions that we know cannot be extended by later
"layers" of the Semantic Web? Let's think ahead a bit, and not make the
job of the next WGs as difficult as our job has been.
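
The loss of boundaries under merging can be shown with a minimal Python
sketch. It assumes triples are plain tuples and uses two hypothetical
version URLs; tagging each statement with its source document is just one
conceivable remedy, not something either proposal specifies.

```python
def merge_plain(*graphs):
    """Merge graphs as a set of triples; the source of each is forgotten."""
    merged = set()
    for triples in graphs:
        merged.update(triples)
    return merged


def merge_tagged(named_graphs):
    """Merge graphs while keeping the source document with each statement."""
    return {
        (s, p, o, source)
        for source, triples in named_graphs.items()
        for (s, p, o) in triples
    }


def sources_of(tagged, triple):
    """Return the documents that assert the given triple."""
    return {src for (s, p, o, src) in tagged if (s, p, o) == triple}


V1 = "http://example.org/onto/v1"  # hypothetical version URLs
V2 = "http://example.org/onto/v2"
v1_triples = [("Car", "subClassOf", "Vehicle"), ("Car", "hasPart", "Engine")]
v2_triples = [("Car", "subClassOf", "Vehicle"), ("Car", "hasPart", "Motor")]

plain = merge_plain(v1_triples, v2_triples)
tagged = merge_tagged({V1: v1_triples, V2: v2_triples})
motor_sources = sources_of(tagged, ("Car", "hasPart", "Motor"))
```

After merge_plain there is no way to tell that ("Car", "hasPart",
"Motor") came only from the second version, whereas the tagged merge can
still answer that question.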

> It seems that this issue is a general one with software modules e.g. when
> they are compiled together, it's hard to know what came from where unless
> out of band information, such as used by debuggers, is included. Yet ANSI
> C++ is ANSI C++.

But the Semantic Web is a lot more like distributed components than it
is like a compiled C++ program.

> Jonathan
Received on Wednesday, 2 October 2002 10:21:04 UTC