Re: [OPEN] and/or [PORT] : a practical question from Guus Schreiber on 2004-03-26 (public-swbp-wg@w3.org from March 2004)

From: Guus Schreiber <schreiber@cs.vu.nl>
Date: Fri, 26 Mar 2004 14:54:51 +0100
To: Ian Horrocks <horrocks@cs.man.ac.uk>, SWBPD list <public-swbp-wg@w3.org>
Message-ID: <4064362B.6030200@cs.vu.nl>
Ian Horrocks wrote:

> On March 24, Christopher Welty writes:
> 
>>Jeremy wrote on 03/24/2004 04:24:16 AM:
>>
>>
>>>Yes, like Bernard, I have been thinking more about this, and about Ian's 
>>
>>>insistence in WebOnt that classes-and-instances was almost always raised 
>>
>>by 
>>
>>>people wanting to mismodel their world. (cc Ian, wondering if I have 
>>
>>learnt 
>>
>>>my lessons well!, or misrepresented him)
> 
> 
> Jeremy,
> 
> You can go to the top of the class :-)
> 
> 
> 
>>Well, "mismodelling their world" is not limited to classes as instances. I 
>>find it rather dangerous to make such statements.  People use subclass 
>>incorrectly, too, but that wasn't a reason to remove that axiom from OWL 
>>DL.
> 
> 
> I would say that there is a big difference. Like any part of the
> language, subClass may occasionally be abused, but it is used very
> widely and most people seem able to use it more or less
> correctly. Moreover, subClass fits into a family of logics which are
> theoretically well understood and for which there is considerable
> implementation experience.
> 
> In contrast, classes as instances are relatively rarely used (most
> forms of conceptual modelling, databases etc., seem to have managed
> perfectly well without them), and hardly ever used
> "correctly". 

After reading this I was about to put my chai hat off and send a flame. 
But it generated such well-founded responses that I want to thank for 
starting this dialectic debate.

A response to Bernard's question follows....

Guus

  reading the responses
> Moreover, the resulting logics are much less well
> understood and there is little implementation experience.
> 
> Ian
> 
> 
> 
> 
> 
>> People just mismodel their worlds, I hope we can offer some advice on 
>>both how to do some of these things and how NOT to do it.
>>
>>Anyway, your analysis exposed some important misconceptions, espcially 
>>regarding so-called "subject hierarchies" and class hierarchies.  I've 
>>written a paper or two about the problem, in this one: 
>>[http://dx.doi.org/10.1016/S0169-023X(99)90021-6] I basically show that 
>>subject taxonomies are actually "part"onomies, or more precisely spatial 
>>containment,  not subclass (in fact, etymologically, "subject" means to 
>>throw under, "topic" is a region, and "about" means near).  Some of the 
>>initial problems of representing subject taxonomies in DLs are discussed 
>>in a paper in the first FOIS conference, which may be hard to find.  I 
>>can't seem to find a softcopy myself.
>>
>>
>>>The class hierarchy in RDFS/OWL is there to describe hierarchies of 
>>
>>classes 
>>
>>>of resources. Just because you have a hierarchy of subject descriptors 
>>>doesn't make it a class hierarchy.
>>>
>>>It seems to be confusing the human way of thinking of analogy and 
>>
>>metaphor 
>>
>>>(any hierarchy can act as a metaphor for any other hierarchy) with what 
>>
>>is 
>>
>>>a logical and implementation issue about how to say what we want to say 
>>>about our knowledge of our world in a way that machines can process it.
>>>
>>>Thus if PhDThesis is an owl:Class what are the resources that we intend 
>>
>>to 
>>
>>>belong to it? Probably my PhD Thesis with title "Graph Grammars: an 
>>>approach to transfer based MT; exemplified by a Turkish-English system" 
>>
>>is 
>>
>>>one such resource, but the copy sitting on my bookshelf is probably not.
>>>
>>>Then if that is the case what would we mean by dc:subject linking the 
>>>resource of my thesis with this class .... hmmm ... we mean my thisis 
>>>belongs to that class, i.e. rdf:type.
>>>So if we want to treat this subject hierachy as classes we really also 
>>
>>want
>>
>>>dc:creator rdf:subPropertyOf rdf:type .
>>>
>>>or perhaps
>>>
>>>eg:creator rdf:subPropertyOf rdf:type .
>>>eg:creator rdf:subPropertyOf dc:creator .
>>>
>>>But if we click on dc:creator we get to:
>>>http://purl.org/dc/elements/1.1/subject
>>>
>>><rdf:Property rdf:about="http://purl.org/dc/elements/1.1/subject">
>>><rdfs:label xml:lang="en-US">Subject and Keywords</rdfs:label>
>>><rdfs:comment xml:lang="en-US">The topic of the content of the 
>>>resource.</rdfs:comment>
>>><dc:description xml:lang="en-US">
>>>Typically, a Subject will be expressed as keywords,
>>>key phrases or classification codes that describe a topic
>>>of the resource.  Recommended best practice is to select
>>>a value from a controlled vocabulary or formal
>>>classification scheme.</dc:description>
>>><rdfs:isDefinedBy rdf:resource="http://purl.org/dc/elements/1.1/"/>
>>><dcterms:issued>1999-07-02</dcterms:issued>
>>><dcterms:modified>2002-10-04</dcterms:modified>
>>><dc:type 
>>>
>>
>>rdf:resource="http://dublincore.org/usage/documents/principles/#element"/>
>>
>>><dcterms:hasVersion 
>>>rdf:resource="http://dublincore.org/usage/terms/history/#subject-004"/>
>>></rdf:Property>
>>>
>>>and we see that dc:subject should typically be a string from a 
>>
>>controlled 
>>
>>>vocabulary. Thus it seems particularly poor practice to deviate from the 
>>
>>>preferred usage of dc:subject in order to (over-)simplify our model.
>>>
>>>This points to the solution I was earlier advocating of using such 
>>
>>strings, 
>>
>>>using hasValue restrictions to map the strings into classes and then 
>>
>>using 
>>
>>>the class hierachy on those restrictions to show the hierarchical 
>>>relationships between the subject vocab terms. To do this well, we 
>>
>>probably 
>>
>>>want to specialise the dc:subject property with a subproperty 
>>
>>eg:subject, 
>>
>>>specify its range with an owl:Datarange explicitly enumerating the 
>>>controlled vocabulary, and for each term create a class using a hasValue 
>>
>>>restriction.
>>>For further clarity and usablility we might want to create two related 
>>>properties, one indicating the (single) intended subject code, and the 
>>>other indicating all implicit subject codes formed from the class 
>>
>>hierachy.
>>
>>>The former would be a subproperty of both the latter and dc:subject; the 
>>
>>>latter would be used to create the hasValue restrictions.
>>>
>>>Hmmm ... quite a lot of work initially, but the end result is that the 
>>>subject indicators are marked up using text strings from an explicit 
>>>controlled vocab; we conform with the defn of dc:subject, even with the 
>>>advertised best practice; we fall within OWL DL with the expectation 
>>
>>that 
>>
>>>this will give us better reasoning performance, and we have been clearer 
>>
>>>about we are trying to say. I think the complexity can be hidden from 
>>
>>the 
>>
>>>end users.
>>>
>>>Jeremy
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>Bernard Vatant wrote:
>>>
>>>>*BV
>>>>
>>>>
>>>>>>- Is it worth the trade-off to switch one's ontology (otherwise DL)
>>>>>>to OWL-Full, just to
>>>>>>allow its classes to be used as objects in 'dc:subject' predicates?
>>>>
>>>>
>>>>*Jim
>>>>
>>>>
>>>>>That's a weird way to ask the question.  You mean, is it worth doing
>>>>>the extra work to break your naturally occuring model just so that
>>>>>you can be in DL?
>>>>
>>>>
>>>>The way I put it might seem weird indeed, but it's the way it was set 
>>
>>in the real project
>>
>>>>context (real world is weird). We had an OWL-DL ontology, and wanted 
>>
>>to keep it so, and
>>
>>>>suddenly after six months or so some user wants to be able to use a 
>>
>>class as a subject of
>>
>>>>a document ... which is one case out of one thousand, the 999 others 
>>
>>using 'regular'
>>
>>>>subjects. So using a class as subject of a document is not exactly 
>>
>>'naturally occuring'.
>>
>>>>It's a borderline case - not to say a weird one :))
>>>>
>>>>*Jim
>>>>
>>>>
>>>>>I would argue this is indeed a BP issue, but probably for WORLD not
>>>>>for OPEN... we need to explain why and when you would do the extra
>>>>>work (and in every case we have explored it is extra work) to make
>>>>>sure your ontology is in the DL profile of OWL.
>>>>
>>>>
>>>>I suggested it might be in PORT scope, because it deals with the 
>>
>>terminology vs ontology
>>
>>>>general issue. For me the heart of the question is to know what it 
>>
>>means to 'use a
>>
>>>>concept' defined in a terminology (glossary, thesaurus, subject 
>>
>>headings, index...) as a
>>
>>>>class (or a property) in an ontology.
>>>>
>>>>Is 'PhD Thesis' class the same 'subject' (using TM language here, 
>>
>>sorry) or 'resource'
>>
>>>>than the original concept? The more I think about it, the more I have 
>>
>>to deal with it, and
>>
>>>>the more I tend to say that they are distinct animals. Jim's PhD 
>>
>>Thesis is an instance of
>>
>>>>the class, but not of the concept. One subject of 'Social Functions of 
>>
>>PhD Thesis in
>>
>>>>Occidental University during 20th century', is the concept of PhD 
>>
>>Thesis, not the class.
>>
>>>>So it's not just an issue of OWL-DL vs OWL-Full, it's also an issue of 
>>
>>making distinct or
>>
>>>>not those two 'things'. This is a core issue in porting thesaurus to 
>>
>>the SW, related to
>>
>>>>others of the same kind, like if concepts A and B are interpreted as 
>>
>>classes, and there is
>>
>>>>a Broader-Narrower relationship between A and B in the Thesaurus, has 
>>
>>it to be interpreted
>>
>>>>as a class-subclass relationship in the ontology etc.
>>>>
>>>>So I think in that case a BP definition would be two-fold
>>>>
>>>>1. Is it generally a BP to make terminology concepts distinct from 
>>
>>ontology classes (and
>>
>>>>properties)?
>>>>2. If agnostic about 1, what is the trade-off when choosing to make 
>>
>>them distinct or to
>>
>>>>merge them ?
>>>>
>>>>FWIW, having tried both terms of the alternative in the course of 
>>
>>time, my personal view,
>>
>>>>for above quoted reasons, is that they shoud be kept separate, and 
>>
>>it's worth the extra
>>
>>>>work (even before being aware of the DL vs Full issue)
>>>>
>>>>Are there other concrete experience on that, not only theoretical 
>>
>>considerations? Seems
>>
>>>>like there are not so many people exploring the terminology-ontology 
>>
>>interoperability. Or
>>
>>>>are they?
>>>>
>>>>Bernard Vatant
>>>>Senior Consultant
>>>>Knowledge Engineering
>>>>Mondeca - www.mondeca.com
>>>>bernard.vatant@mondeca.com
>>>>
>>>>
>>>
>><br><font size=2><tt>Jeremy wrote on 03/24/2004 04:24:16 AM:<br>
>><br>
>>&gt; <br>
>>&gt; Yes, like Bernard, I have been thinking more about this, and about
>>Ian's <br>
>>&gt; insistence in WebOnt that classes-and-instances was almost always
>>raised by <br>
>>&gt; people wanting to mismodel their world. (cc Ian, wondering if I have
>>learnt <br>
>>&gt; my lessons well!, or misrepresented him)<br>
>></tt></font>
>><br><font size=2><tt>Well, &quot;mismodelling their world&quot; is not
>>limited to classes as instances. I find it rather dangerous to make such
>>statements. &nbsp;People use subclass incorrectly, too, but that wasn't
>>a reason to remove that axiom from OWL DL. &nbsp;People just mismodel their
>>worlds, I hope we can offer some advice on both how to do some of these
>>things and how NOT to do it.</tt></font>
>><br>
>><br><font size=2><tt>Anyway, your analysis exposed some important misconceptions,
>>espcially regarding so-called &quot;subject hierarchies&quot; and class
>>hierarchies. &nbsp;I've written a paper or two about the problem, in this
>>one: [http://dx.doi.org/10.1016/S0169-023X(99)90021-6] I basically show
>>that subject taxonomies are actually &quot;part&quot;onomies, or more precisely
>>spatial containment, &nbsp;not subclass (in fact, etymologically, &quot;subject&quot;
>>means to throw under, &quot;topic&quot; is a region, and &quot;about&quot;
>>means near). &nbsp;Some of the initial problems of representing subject
>>taxonomies in DLs are discussed in a paper in the first FOIS conference,
>>which may be hard to find. &nbsp;I can't seem to find a softcopy myself.</tt></font>
>><br>
>><br><font size=2><tt>&gt; The class hierarchy in RDFS/OWL is there to describe
>>hierarchies of classes <br>
>>&gt; of resources. Just because you have a hierarchy of subject descriptors
>><br>
>>&gt; doesn't make it a class hierarchy.<br>
>>&gt; <br>
>>&gt; It seems to be confusing the human way of thinking of analogy and
>>metaphor <br>
>>&gt; (any hierarchy can act as a metaphor for any other hierarchy) with
>>what is <br>
>>&gt; a logical and implementation issue about how to say what we want to
>>say <br>
>>&gt; about our knowledge of our world in a way that machines can process
>>it.<br>
>>&gt; <br>
>>&gt; Thus if PhDThesis is an owl:Class what are the resources that we intend
>>to <br>
>>&gt; belong to it? Probably my PhD Thesis with title &quot;Graph Grammars:
>>an <br>
>>&gt; approach to transfer based MT; exemplified by a Turkish-English system&quot;
>>is <br>
>>&gt; one such resource, but the copy sitting on my bookshelf is probably
>>not.<br>
>>&gt; <br>
>>&gt; Then if that is the case what would we mean by dc:subject linking
>>the <br>
>>&gt; resource of my thesis with this class .... hmmm ... we mean my thisis
>><br>
>>&gt; belongs to that class, i.e. rdf:type.<br>
>>&gt; So if we want to treat this subject hierachy as classes we really
>>also want<br>
>>&gt; <br>
>>&gt; dc:creator rdf:subPropertyOf rdf:type .<br>
>>&gt; <br>
>>&gt; or perhaps<br>
>>&gt; <br>
>>&gt; eg:creator rdf:subPropertyOf rdf:type .<br>
>>&gt; eg:creator rdf:subPropertyOf dc:creator .<br>
>>&gt; <br>
>>&gt; But if we click on dc:creator we get to:<br>
>>&gt; http://purl.org/dc/elements/1.1/subject<br>
>>&gt; <br>
>>&gt; &lt;rdf:Property rdf:about=&quot;http://purl.org/dc/elements/1.1/subject&quot;&gt;<br>
>>&gt; &lt;rdfs:label xml:lang=&quot;en-US&quot;&gt;Subject and Keywords&lt;/rdfs:label&gt;<br>
>>&gt; &lt;rdfs:comment xml:lang=&quot;en-US&quot;&gt;The topic of the content
>>of the <br>
>>&gt; resource.&lt;/rdfs:comment&gt;<br>
>>&gt; &lt;dc:description xml:lang=&quot;en-US&quot;&gt;<br>
>>&gt; Typically, a Subject will be expressed as keywords,<br>
>>&gt; key phrases or classification codes that describe a topic<br>
>>&gt; of the resource. &nbsp;Recommended best practice is to select<br>
>>&gt; a value from a controlled vocabulary or formal<br>
>>&gt; classification scheme.&lt;/dc:description&gt;<br>
>>&gt; &lt;rdfs:isDefinedBy rdf:resource=&quot;http://purl.org/dc/elements/1.1/&quot;/&gt;<br>
>>&gt; &lt;dcterms:issued&gt;1999-07-02&lt;/dcterms:issued&gt;<br>
>>&gt; &lt;dcterms:modified&gt;2002-10-04&lt;/dcterms:modified&gt;<br>
>>&gt; &lt;dc:type <br>
>>&gt; rdf:resource=&quot;http://dublincore.org/usage/documents/principles/#element&quot;/&gt;<br>
>>&gt; &lt;dcterms:hasVersion <br>
>>&gt; rdf:resource=&quot;http://dublincore.org/usage/terms/history/#subject-004&quot;/&gt;<br>
>>&gt; &lt;/rdf:Property&gt;<br>
>>&gt; <br>
>>&gt; and we see that dc:subject should typically be a string from a controlled
>><br>
>>&gt; vocabulary. Thus it seems particularly poor practice to deviate from
>>the <br>
>>&gt; preferred usage of dc:subject in order to (over-)simplify our model.<br>
>>&gt; <br>
>>&gt; This points to the solution I was earlier advocating of using such
>>strings, <br>
>>&gt; using hasValue restrictions to map the strings into classes and then
>>using <br>
>>&gt; the class hierachy on those restrictions to show the hierarchical
>><br>
>>&gt; relationships between the subject vocab terms. To do this well, we
>>probably <br>
>>&gt; want to specialise the dc:subject property with a subproperty eg:subject,
>><br>
>>&gt; specify its range with an owl:Datarange explicitly enumerating the
>><br>
>>&gt; controlled vocabulary, and for each term create a class using a hasValue
>><br>
>>&gt; restriction.<br>
>>&gt; For further clarity and usablility we might want to create two related
>><br>
>>&gt; properties, one indicating the (single) intended subject code, and
>>the <br>
>>&gt; other indicating all implicit subject codes formed from the class
>>hierachy.<br>
>>&gt; The former would be a subproperty of both the latter and dc:subject;
>>the <br>
>>&gt; latter would be used to create the hasValue restrictions.<br>
>>&gt; <br>
>>&gt; Hmmm ... quite a lot of work initially, but the end result is that
>>the <br>
>>&gt; subject indicators are marked up using text strings from an explicit
>><br>
>>&gt; controlled vocab; we conform with the defn of dc:subject, even with
>>the <br>
>>&gt; advertised best practice; we fall within OWL DL with the expectation
>>that <br>
>>&gt; this will give us better reasoning performance, and we have been clearer
>><br>
>>&gt; about we are trying to say. I think the complexity can be hidden from
>>the <br>
>>&gt; end users.<br>
>>&gt; <br>
>>&gt; Jeremy<br>
>>&gt; <br>
>>&gt; <br>
>>&gt; <br>
>>&gt; <br>
>>&gt; <br>
>>&gt; <br>
>>&gt; <br>
>>&gt; <br>
>>&gt; <br>
>>&gt; Bernard Vatant wrote:<br>
>>&gt; &gt; <br>
>>&gt; &gt; *BV<br>
>>&gt; &gt; <br>
>>&gt; &gt;&gt;&gt;- Is it worth the trade-off to switch one's ontology (otherwise
>>DL)<br>
>>&gt; &gt;&gt;&gt;to OWL-Full, just to<br>
>>&gt; &gt;&gt;&gt;allow its classes to be used as objects in 'dc:subject'
>>predicates?<br>
>>&gt; &gt; <br>
>>&gt; &gt; <br>
>>&gt; &gt; *Jim<br>
>>&gt; &gt; <br>
>>&gt; &gt;&gt;That's a weird way to ask the question. &nbsp;You mean, is
>>it worth doing<br>
>>&gt; &gt;&gt;the extra work to break your naturally occuring model just
>>so that<br>
>>&gt; &gt;&gt;you can be in DL?<br>
>>&gt; &gt; <br>
>>&gt; &gt; <br>
>>&gt; &gt; The way I put it might seem weird indeed, but it's the way it
>>was set in the real project<br>
>>&gt; &gt; context (real world is weird). We had an OWL-DL ontology, and
>>wanted to keep it so, and<br>
>>&gt; &gt; suddenly after six months or so some user wants to be able to
>>use a class as a subject of<br>
>>&gt; &gt; a document ... which is one case out of one thousand, the 999
>>others using 'regular'<br>
>>&gt; &gt; subjects. So using a class as subject of a document is not exactly
>>'naturally occuring'.<br>
>>&gt; &gt; It's a borderline case - not to say a weird one :))<br>
>>&gt; &gt; <br>
>>&gt; &gt; *Jim<br>
>>&gt; &gt; <br>
>>&gt; &gt;&gt;I would argue this is indeed a BP issue, but probably for
>>WORLD not<br>
>>&gt; &gt;&gt;for OPEN... we need to explain why and when you would do the
>>extra<br>
>>&gt; &gt;&gt;work (and in every case we have explored it is extra work)
>>to make<br>
>>&gt; &gt;&gt;sure your ontology is in the DL profile of OWL.<br>
>>&gt; &gt; <br>
>>&gt; &gt; <br>
>>&gt; &gt; I suggested it might be in PORT scope, because it deals with
>>the terminology vs ontology<br>
>>&gt; &gt; general issue. For me the heart of the question is to know what
>>it means to 'use a<br>
>>&gt; &gt; concept' defined in a terminology (glossary, thesaurus, subject
>>headings, index...) as a<br>
>>&gt; &gt; class (or a property) in an ontology.<br>
>>&gt; &gt; <br>
>>&gt; &gt; Is 'PhD Thesis' class the same 'subject' (using TM language here,
>>sorry) or 'resource'<br>
>>&gt; &gt; than the original concept? The more I think about it, the more
>>I have to deal with it, and<br>
>>&gt; &gt; the more I tend to say that they are distinct animals. Jim's
>>PhD Thesis is an instance of<br>
>>&gt; &gt; the class, but not of the concept. One subject of 'Social Functions
>>of PhD Thesis in<br>
>>&gt; &gt; Occidental University during 20th century', is the concept of
>>PhD Thesis, not the class.<br>
>>&gt; &gt; <br>
>>&gt; &gt; So it's not just an issue of OWL-DL vs OWL-Full, it's also an
>>issue of making distinct or<br>
>>&gt; &gt; not those two 'things'. This is a core issue in porting thesaurus
>>to the SW, related to<br>
>>&gt; &gt; others of the same kind, like if concepts A and B are interpreted
>>as classes, and there is<br>
>>&gt; &gt; a Broader-Narrower relationship between A and B in the Thesaurus,
>>has it to be interpreted<br>
>>&gt; &gt; as a class-subclass relationship in the ontology etc.<br>
>>&gt; &gt; <br>
>>&gt; &gt; So I think in that case a BP definition would be two-fold<br>
>>&gt; &gt; <br>
>>&gt; &gt; 1. Is it generally a BP to make terminology concepts distinct
>>from ontology classes (and<br>
>>&gt; &gt; properties)?<br>
>>&gt; &gt; 2. If agnostic about 1, what is the trade-off when choosing to
>>make them distinct or to<br>
>>&gt; &gt; merge them ?<br>
>>&gt; &gt; <br>
>>&gt; &gt; FWIW, having tried both terms of the alternative in the course
>>of time, my personal view,<br>
>>&gt; &gt; for above quoted reasons, is that they shoud be kept separate,
>>and it's worth the extra<br>
>>&gt; &gt; work (even before being aware of the DL vs Full issue)<br>
>>&gt; &gt; <br>
>>&gt; &gt; Are there other concrete experience on that, not only theoretical
>>considerations? Seems<br>
>>&gt; &gt; like there are not so many people exploring the terminology-ontology
>>interoperability. Or<br>
>>&gt; &gt; are they?<br>
>>&gt; &gt; <br>
>>&gt; &gt; Bernard Vatant<br>
>>&gt; &gt; Senior Consultant<br>
>>&gt; &gt; Knowledge Engineering<br>
>>&gt; &gt; Mondeca - www.mondeca.com<br>
>>&gt; &gt; bernard.vatant@mondeca.com<br>
>>&gt; &gt; <br>
>>&gt; &gt; <br>
>>&gt; <br>
>></tt></font>
> 
> 

-- 
Free University Amsterdam, Computer Science
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
Tel: +31 20 444 7739/7718
E-mail: schreiber@cs.vu.nl
Home page: http://www.cs.vu.nl/~guus/
Received on Friday, 26 March 2004 09:02:03 UTC