Re: [OPEN] and/or [PORT] : a practical question

Ian Horrocks <horrocks@cs.man.ac.uk> wrote on 03/24/2004 04:05:53 PM:

> > Well, "mismodelling their world" is not limited to classes as 
instances. I 
> > find it rather dangerous to make such statements.  People use subclass 

> > incorrectly, too, but that wasn't a reason to remove that axiom from 
OWL 
> > DL.
> 
> I would say that there is a big difference. Like any part of the
> language, subClass may occasionally be abused, but it is used very
> widely and most people seem able to use it more or less
> correctly. 

It is very widely used, yes, but people get it wrong about as much as they 
get it right. Since it's transitive, get one wrong near the "top" and your 
whole ontology will be off.

> Moreover, subClass fits into a family of logics which are
> theoretically well understood and for which there is considerable
> implementation experience.

Yes! THAT is the reason the choice was made for OWL DL, not:

> In contrast, classes as instances are relatively rarely used (most
> forms of conceptual modelling, databases etc., seem to have managed
> perfectly well without them), and hardly ever used
> "correctly". 

Well, I'm not surprised that a DL person would have this view.  However 
the statement is simply false.  Databases have had this since at least the 
mid-80s (see "Two Stage ER Diagrams" e.g.), and possibly earlier. Of 
course they don't do reasoning so they don't have the kinds of problems 
with their implementations that we do.

THe issue was also recognized quite early on in object-oriented modeling, 
as well, and both Smalltalk and CLOS had "meta object" facilities for 
handling it.

Outside the DL world, people building Knowledge-based systems use it all 
the time, and I would say that the per-capita misuse is about the same as 
it is for any other part of the language.  Clearly if you go counting 
relations, subclass wins in sheer volume, but I can't say I have ever done 
a serious ontology without encountering a need to model this correctly, 
and in fact my experience is that systems like DLs actually encourage 
people to get it wrong by forcing them to ignore the distinction.

> Moreover, the resulting logics are much less well
> understood 

That's not true at all, unless you are conflating "understood" with:

> and there is little implementation experience.

There is less, for sure, I'm not sure I would say "litte".  There has also 
been far less formal study of the complexity of reasoning algorithms for 
languages that have this expressiveness but not others.  Peter and I 
looked at this in the early 90s for neo-classic, but we never followed 
through.  The question was, like any language feature: what would you have 
to give up to have complete and decidable reasoning if you allowed 
instances to have instances.  This is an open problem.  Many people just 
throw up their hands prima facie because "2cnd order logic is incomplete 
and undecidable".  However, it's not that simple - you can subset it just 
as we have been doing for FOL to find decidable fragments.

> 
> Ian
> 
> 
> 
> 
> >  People just mismodel their worlds, I hope we can offer some advice on 

> > both how to do some of these things and how NOT to do it.
> > 
> > Anyway, your analysis exposed some important misconceptions, espcially 

> > regarding so-called "subject hierarchies" and class hierarchies.  I've 

> > written a paper or two about the problem, in this one: 
> > [http://dx.doi.org/10.1016/S0169-023X(99)90021-6] I basically show 
that 
> > subject taxonomies are actually "part"onomies, or more precisely 
spatial 
> > containment,  not subclass (in fact, etymologically, "subject" means 
to 
> > throw under, "topic" is a region, and "about" means near).  Some of 
the 
> > initial problems of representing subject taxonomies in DLs are 
discussed 
> > in a paper in the first FOIS conference, which may be hard to find.  I 

> > can't seem to find a softcopy myself.
> > 
> > > The class hierarchy in RDFS/OWL is there to describe hierarchies of 
> > classes 
> > > of resources. Just because you have a hierarchy of subject 
descriptors 
> > > doesn't make it a class hierarchy.
> > > 
> > > It seems to be confusing the human way of thinking of analogy and 
> > metaphor 
> > > (any hierarchy can act as a metaphor for any other hierarchy) with 
what 
> > is 
> > > a logical and implementation issue about how to say what we want to 
say 
> > > about our knowledge of our world in a way that machines can process 
it.
> > > 
> > > Thus if PhDThesis is an owl:Class what are the resources that we 
intend 
> > to 
> > > belong to it? Probably my PhD Thesis with title "Graph Grammars: an 
> > > approach to transfer based MT; exemplified by a Turkish-English 
system" 
> > is 
> > > one such resource, but the copy sitting on my bookshelf is probably 
not.
> > > 
> > > Then if that is the case what would we mean by dc:subject linking 
the 
> > > resource of my thesis with this class .... hmmm ... we mean my 
thisis 
> > > belongs to that class, i.e. rdf:type.
> > > So if we want to treat this subject hierachy as classes we really 
also 
> > want
> > > 
> > > dc:creator rdf:subPropertyOf rdf:type .
> > > 
> > > or perhaps
> > > 
> > > eg:creator rdf:subPropertyOf rdf:type .
> > > eg:creator rdf:subPropertyOf dc:creator .
> > > 
> > > But if we click on dc:creator we get to:
> > > http://purl.org/dc/elements/1.1/subject
> > > 
> > > <rdf:Property rdf:about="http://purl.org/dc/elements/1.1/subject">
> > > <rdfs:label xml:lang="en-US">Subject and Keywords</rdfs:label>
> > > <rdfs:comment xml:lang="en-US">The topic of the content of the 
> > > resource.</rdfs:comment>
> > > <dc:description xml:lang="en-US">
> > > Typically, a Subject will be expressed as keywords,
> > > key phrases or classification codes that describe a topic
> > > of the resource.  Recommended best practice is to select
> > > a value from a controlled vocabulary or formal
> > > classification scheme.</dc:description>
> > > <rdfs:isDefinedBy rdf:resource="http://purl.org/dc/elements/1.1/"/>
> > > <dcterms:issued>1999-07-02</dcterms:issued>
> > > <dcterms:modified>2002-10-04</dcterms:modified>
> > > <dc:type 
> > > 
> > 
rdf:resource="http://dublincore.org/usage/documents/principles/#element"/>
> > > <dcterms:hasVersion 
> > > 
rdf:resource="http://dublincore.org/usage/terms/history/#subject-004"/>
> > > </rdf:Property>
> > > 
> > > and we see that dc:subject should typically be a string from a 
> > controlled 
> > > vocabulary. Thus it seems particularly poor practice to deviate from 
the 
> > 
> > > preferred usage of dc:subject in order to (over-)simplify our model.
> > > 
> > > This points to the solution I was earlier advocating of using such 
> > strings, 
> > > using hasValue restrictions to map the strings into classes and then 

> > using 
> > > the class hierachy on those restrictions to show the hierarchical 
> > > relationships between the subject vocab terms. To do this well, we 
> > probably 
> > > want to specialise the dc:subject property with a subproperty 
> > eg:subject, 
> > > specify its range with an owl:Datarange explicitly enumerating the 
> > > controlled vocabulary, and for each term create a class using a 
hasValue 
> > 
> > > restriction.
> > > For further clarity and usablility we might want to create two 
related 
> > > properties, one indicating the (single) intended subject code, and 
the 
> > > other indicating all implicit subject codes formed from the class 
> > hierachy.
> > > The former would be a subproperty of both the latter and dc:subject; 
the 
> > 
> > > latter would be used to create the hasValue restrictions.
> > > 
> > > Hmmm ... quite a lot of work initially, but the end result is that 
the 
> > > subject indicators are marked up using text strings from an explicit 

> > > controlled vocab; we conform with the defn of dc:subject, even with 
the 
> > > advertised best practice; we fall within OWL DL with the expectation 

> > that 
> > > this will give us better reasoning performance, and we have been 
clearer 
> > 
> > > about we are trying to say. I think the complexity can be hidden 
from 
> > the 
> > > end users.
> > > 
> > > Jeremy
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > 
> > > Bernard Vatant wrote:
> > > > 
> > > > *BV
> > > > 
> > > >>>- Is it worth the trade-off to switch one's ontology (otherwise 
DL)
> > > >>>to OWL-Full, just to
> > > >>>allow its classes to be used as objects in 'dc:subject' 
predicates?
> > > > 
> > > > 
> > > > *Jim
> > > > 
> > > >>That's a weird way to ask the question.  You mean, is it worth 
doing
> > > >>the extra work to break your naturally occuring model just so that
> > > >>you can be in DL?
> > > > 
> > > > 
> > > > The way I put it might seem weird indeed, but it's the way it was 
set 
> > in the real project
> > > > context (real world is weird). We had an OWL-DL ontology, and 
wanted 
> > to keep it so, and
> > > > suddenly after six months or so some user wants to be able to use 
a 
> > class as a subject of
> > > > a document ... which is one case out of one thousand, the 999 
others 
> > using 'regular'
> > > > subjects. So using a class as subject of a document is not exactly 

> > 'naturally occuring'.
> > > > It's a borderline case - not to say a weird one :))
> > > > 
> > > > *Jim
> > > > 
> > > >>I would argue this is indeed a BP issue, but probably for WORLD 
not
> > > >>for OPEN... we need to explain why and when you would do the extra
> > > >>work (and in every case we have explored it is extra work) to make
> > > >>sure your ontology is in the DL profile of OWL.
> > > > 
> > > > 
> > > > I suggested it might be in PORT scope, because it deals with the 
> > terminology vs ontology
> > > > general issue. For me the heart of the question is to know what it 

> > means to 'use a
> > > > concept' defined in a terminology (glossary, thesaurus, subject 
> > headings, index...) as a
> > > > class (or a property) in an ontology.
> > > > 
> > > > Is 'PhD Thesis' class the same 'subject' (using TM language here, 
> > sorry) or 'resource'
> > > > than the original concept? The more I think about it, the more I 
have 
> > to deal with it, and
> > > > the more I tend to say that they are distinct animals. Jim's PhD 
> > Thesis is an instance of
> > > > the class, but not of the concept. One subject of 'Social 
Functions of 
> > PhD Thesis in
> > > > Occidental University during 20th century', is the concept of PhD 
> > Thesis, not the class.
> > > > 
> > > > So it's not just an issue of OWL-DL vs OWL-Full, it's also an 
issue of 
> > making distinct or
> > > > not those two 'things'. This is a core issue in porting thesaurus 
to 
> > the SW, related to
> > > > others of the same kind, like if concepts A and B are interpreted 
as 
> > classes, and there is
> > > > a Broader-Narrower relationship between A and B in the Thesaurus, 
has 
> > it to be interpreted
> > > > as a class-subclass relationship in the ontology etc.
> > > > 
> > > > So I think in that case a BP definition would be two-fold
> > > > 
> > > > 1. Is it generally a BP to make terminology concepts distinct from 

> > ontology classes (and
> > > > properties)?
> > > > 2. If agnostic about 1, what is the trade-off when choosing to 
make 
> > them distinct or to
> > > > merge them ?
> > > > 
> > > > FWIW, having tried both terms of the alternative in the course of 
> > time, my personal view,
> > > > for above quoted reasons, is that they shoud be kept separate, and 

> > it's worth the extra
> > > > work (even before being aware of the DL vs Full issue)
> > > > 
> > > > Are there other concrete experience on that, not only theoretical 
> > considerations? Seems
> > > > like there are not so many people exploring the 
terminology-ontology 
> > interoperability. Or
> > > > are they?
> > > > 
> > > > Bernard Vatant
> > > > Senior Consultant
> > > > Knowledge Engineering
> > > > Mondeca - www.mondeca.com
> > > > bernard.vatant@mondeca.com
> > > > 
> > > > 
> > > 
> > 
> > <br><font size=2><tt>Jeremy wrote on 03/24/2004 04:24:16 AM:<br>
> > <br>
> > &gt; <br>
> > &gt; Yes, like Bernard, I have been thinking more about this, and 
about
> > Ian's <br>
> > &gt; insistence in WebOnt that classes-and-instances was almost always
> > raised by <br>
> > &gt; people wanting to mismodel their world. (cc Ian, wondering if I 
have
> > learnt <br>
> > &gt; my lessons well!, or misrepresented him)<br>
> > </tt></font>
> > <br><font size=2><tt>Well, &quot;mismodelling their world&quot; is not
> > limited to classes as instances. I find it rather dangerous to make 
such
> > statements. &nbsp;People use subclass incorrectly, too, but that 
wasn't
> > a reason to remove that axiom from OWL DL. &nbsp;People just mismodel 
their
> > worlds, I hope we can offer some advice on both how to do some of 
these
> > things and how NOT to do it.</tt></font>
> > <br>
> > <br><font size=2><tt>Anyway, your analysis exposed some important 
misconceptions,
> > espcially regarding so-called &quot;subject hierarchies&quot; and 
class
> > hierarchies. &nbsp;I've written a paper or two about the problem, in 
this
> > one: [http://dx.doi.org/10.1016/S0169-023X(99)90021-6] I basically 
show
> > that subject taxonomies are actually &quot;part&quot;onomies, or more 
precisely
> > spatial containment, &nbsp;not subclass (in fact, etymologically, 
&quot;subject&quot;
> > means to throw under, &quot;topic&quot; is a region, and 
&quot;about&quot;
> > means near). &nbsp;Some of the initial problems of representing 
subject
> > taxonomies in DLs are discussed in a paper in the first FOIS 
conference,
> > which may be hard to find. &nbsp;I can't seem to find a softcopy 
myself.</tt></font>
> > <br>
> > <br><font size=2><tt>&gt; The class hierarchy in RDFS/OWL is there to 
describe
> > hierarchies of classes <br>
> > &gt; of resources. Just because you have a hierarchy of subject 
descriptors
> > <br>
> > &gt; doesn't make it a class hierarchy.<br>
> > &gt; <br>
> > &gt; It seems to be confusing the human way of thinking of analogy and
> > metaphor <br>
> > &gt; (any hierarchy can act as a metaphor for any other hierarchy) 
with
> > what is <br>
> > &gt; a logical and implementation issue about how to say what we want 
to
> > say <br>
> > &gt; about our knowledge of our world in a way that machines can 
process
> > it.<br>
> > &gt; <br>
> > &gt; Thus if PhDThesis is an owl:Class what are the resources that we 
intend
> > to <br>
> > &gt; belong to it? Probably my PhD Thesis with title &quot;Graph 
Grammars:
> > an <br>
> > &gt; approach to transfer based MT; exemplified by a Turkish-English 
system&quot;
> > is <br>
> > &gt; one such resource, but the copy sitting on my bookshelf is 
probably
> > not.<br>
> > &gt; <br>
> > &gt; Then if that is the case what would we mean by dc:subject linking
> > the <br>
> > &gt; resource of my thesis with this class .... hmmm ... we mean my 
thisis
> > <br>
> > &gt; belongs to that class, i.e. rdf:type.<br>
> > &gt; So if we want to treat this subject hierachy as classes we really
> > also want<br>
> > &gt; <br>
> > &gt; dc:creator rdf:subPropertyOf rdf:type .<br>
> > &gt; <br>
> > &gt; or perhaps<br>
> > &gt; <br>
> > &gt; eg:creator rdf:subPropertyOf rdf:type .<br>
> > &gt; eg:creator rdf:subPropertyOf dc:creator .<br>
> > &gt; <br>
> > &gt; But if we click on dc:creator we get to:<br>
> > &gt; http://purl.org/dc/elements/1.1/subject<br>
> > &gt; <br>
> > &gt; &lt;rdf:Property 
rdf:about=&quot;http://purl.org/dc/elements/1.1/subject&quot;&gt;<br>
> > &gt; &lt;rdfs:label xml:lang=&quot;en-US&quot;&gt;Subject and 
Keywords&lt;/rdfs:
> label&gt;<br>
> > &gt; &lt;rdfs:comment xml:lang=&quot;en-US&quot;&gt;The topic of the 
content
> > of the <br>
> > &gt; resource.&lt;/rdfs:comment&gt;<br>
> > &gt; &lt;dc:description xml:lang=&quot;en-US&quot;&gt;<br>
> > &gt; Typically, a Subject will be expressed as keywords,<br>
> > &gt; key phrases or classification codes that describe a topic<br>
> > &gt; of the resource. &nbsp;Recommended best practice is to select<br>
> > &gt; a value from a controlled vocabulary or formal<br>
> > &gt; classification scheme.&lt;/dc:description&gt;<br>
> > &gt; &lt;rdfs:isDefinedBy 
rdf:resource=&quot;http://purl.org/dc/elements/1.
> 1/&quot;/&gt;<br>
> > &gt; &lt;dcterms:issued&gt;1999-07-02&lt;/dcterms:issued&gt;<br>
> > &gt; &lt;dcterms:modified&gt;2002-10-04&lt;/dcterms:modified&gt;<br>
> > &gt; &lt;dc:type <br>
> > &gt; rdf:resource=&quot;http://dublincore.
> org/usage/documents/principles/#element&quot;/&gt;<br>
> > &gt; &lt;dcterms:hasVersion <br>
> > &gt; rdf:resource=&quot;http://dublincore.
> org/usage/terms/history/#subject-004&quot;/&gt;<br>
> > &gt; &lt;/rdf:Property&gt;<br>
> > &gt; <br>
> > &gt; and we see that dc:subject should typically be a string from a 
controlled
> > <br>
> > &gt; vocabulary. Thus it seems particularly poor practice to deviate 
from
> > the <br>
> > &gt; preferred usage of dc:subject in order to (over-)simplify our 
model.<br>
> > &gt; <br>
> > &gt; This points to the solution I was earlier advocating of using 
such
> > strings, <br>
> > &gt; using hasValue restrictions to map the strings into classes and 
then
> > using <br>
> > &gt; the class hierachy on those restrictions to show the hierarchical
> > <br>
> > &gt; relationships between the subject vocab terms. To do this well, 
we
> > probably <br>
> > &gt; want to specialise the dc:subject property with a subproperty 
eg:subject,
> > <br>
> > &gt; specify its range with an owl:Datarange explicitly enumerating 
the
> > <br>
> > &gt; controlled vocabulary, and for each term create a class using a 
hasValue
> > <br>
> > &gt; restriction.<br>
> > &gt; For further clarity and usablility we might want to create two 
related
> > <br>
> > &gt; properties, one indicating the (single) intended subject code, 
and
> > the <br>
> > &gt; other indicating all implicit subject codes formed from the class
> > hierachy.<br>
> > &gt; The former would be a subproperty of both the latter and 
dc:subject;
> > the <br>
> > &gt; latter would be used to create the hasValue restrictions.<br>
> > &gt; <br>
> > &gt; Hmmm ... quite a lot of work initially, but the end result is 
that
> > the <br>
> > &gt; subject indicators are marked up using text strings from an 
explicit
> > <br>
> > &gt; controlled vocab; we conform with the defn of dc:subject, even 
with
> > the <br>
> > &gt; advertised best practice; we fall within OWL DL with the 
expectation
> > that <br>
> > &gt; this will give us better reasoning performance, and we have been 
clearer
> > <br>
> > &gt; about we are trying to say. I think the complexity can be hidden 
from
> > the <br>
> > &gt; end users.<br>
> > &gt; <br>
> > &gt; Jeremy<br>
> > &gt; <br>
> > &gt; <br>
> > &gt; <br>
> > &gt; <br>
> > &gt; <br>
> > &gt; <br>
> > &gt; <br>
> > &gt; <br>
> > &gt; <br>
> > &gt; Bernard Vatant wrote:<br>
> > &gt; &gt; <br>
> > &gt; &gt; *BV<br>
> > &gt; &gt; <br>
> > &gt; &gt;&gt;&gt;- Is it worth the trade-off to switch one's ontology 
(otherwise
> > DL)<br>
> > &gt; &gt;&gt;&gt;to OWL-Full, just to<br>
> > &gt; &gt;&gt;&gt;allow its classes to be used as objects in 
'dc:subject'
> > predicates?<br>
> > &gt; &gt; <br>
> > &gt; &gt; <br>
> > &gt; &gt; *Jim<br>
> > &gt; &gt; <br>
> > &gt; &gt;&gt;That's a weird way to ask the question. &nbsp;You mean, 
is
> > it worth doing<br>
> > &gt; &gt;&gt;the extra work to break your naturally occuring model 
just
> > so that<br>
> > &gt; &gt;&gt;you can be in DL?<br>
> > &gt; &gt; <br>
> > &gt; &gt; <br>
> > &gt; &gt; The way I put it might seem weird indeed, but it's the way 
it
> > was set in the real project<br>
> > &gt; &gt; context (real world is weird). We had an OWL-DL ontology, 
and
> > wanted to keep it so, and<br>
> > &gt; &gt; suddenly after six months or so some user wants to be able 
to
> > use a class as a subject of<br>
> > &gt; &gt; a document ... which is one case out of one thousand, the 
999
> > others using 'regular'<br>
> > &gt; &gt; subjects. So using a class as subject of a document is not 
exactly
> > 'naturally occuring'.<br>
> > &gt; &gt; It's a borderline case - not to say a weird one :))<br>
> > &gt; &gt; <br>
> > &gt; &gt; *Jim<br>
> > &gt; &gt; <br>
> > &gt; &gt;&gt;I would argue this is indeed a BP issue, but probably for
> > WORLD not<br>
> > &gt; &gt;&gt;for OPEN... we need to explain why and when you would do 
the
> > extra<br>
> > &gt; &gt;&gt;work (and in every case we have explored it is extra 
work)
> > to make<br>
> > &gt; &gt;&gt;sure your ontology is in the DL profile of OWL.<br>
> > &gt; &gt; <br>
> > &gt; &gt; <br>
> > &gt; &gt; I suggested it might be in PORT scope, because it deals with
> > the terminology vs ontology<br>
> > &gt; &gt; general issue. For me the heart of the question is to know 
what
> > it means to 'use a<br>
> > &gt; &gt; concept' defined in a terminology (glossary, thesaurus, 
subject
> > headings, index...) as a<br>
> > &gt; &gt; class (or a property) in an ontology.<br>
> > &gt; &gt; <br>
> > &gt; &gt; Is 'PhD Thesis' class the same 'subject' (using TM language 
here,
> > sorry) or 'resource'<br>
> > &gt; &gt; than the original concept? The more I think about it, the 
more
> > I have to deal with it, and<br>
> > &gt; &gt; the more I tend to say that they are distinct animals. Jim's
> > PhD Thesis is an instance of<br>
> > &gt; &gt; the class, but not of the concept. One subject of 'Social 
Functions
> > of PhD Thesis in<br>
> > &gt; &gt; Occidental University during 20th century', is the concept 
of
> > PhD Thesis, not the class.<br>
> > &gt; &gt; <br>
> > &gt; &gt; So it's not just an issue of OWL-DL vs OWL-Full, it's also 
an
> > issue of making distinct or<br>
> > &gt; &gt; not those two 'things'. This is a core issue in porting 
thesaurus
> > to the SW, related to<br>
> > &gt; &gt; others of the same kind, like if concepts A and B are 
interpreted
> > as classes, and there is<br>
> > &gt; &gt; a Broader-Narrower relationship between A and B in the 
Thesaurus,
> > has it to be interpreted<br>
> > &gt; &gt; as a class-subclass relationship in the ontology etc.<br>
> > &gt; &gt; <br>
> > &gt; &gt; So I think in that case a BP definition would be 
two-fold<br>
> > &gt; &gt; <br>
> > &gt; &gt; 1. Is it generally a BP to make terminology concepts 
distinct
> > from ontology classes (and<br>
> > &gt; &gt; properties)?<br>
> > &gt; &gt; 2. If agnostic about 1, what is the trade-off when choosing 
to
> > make them distinct or to<br>
> > &gt; &gt; merge them ?<br>
> > &gt; &gt; <br>
> > &gt; &gt; FWIW, having tried both terms of the alternative in the 
course
> > of time, my personal view,<br>
> > &gt; &gt; for above quoted reasons, is that they shoud be kept 
separate,
> > and it's worth the extra<br>
> > &gt; &gt; work (even before being aware of the DL vs Full issue)<br>
> > &gt; &gt; <br>
> > &gt; &gt; Are there other concrete experience on that, not only 
theoretical
> > considerations? Seems<br>
> > &gt; &gt; like there are not so many people exploring the 
terminology-ontology
> > interoperability. Or<br>
> > &gt; &gt; are they?<br>
> > &gt; &gt; <br>
> > &gt; &gt; Bernard Vatant<br>
> > &gt; &gt; Senior Consultant<br>
> > &gt; &gt; Knowledge Engineering<br>
> > &gt; &gt; Mondeca - www.mondeca.com<br>
> > &gt; &gt; bernard.vatant@mondeca.com<br>
> > &gt; &gt; <br>
> > &gt; &gt; <br>
> > &gt; <br>
> > </tt></font>

Received on Thursday, 25 March 2004 08:13:06 UTC