- From: Guus Schreiber <schreiber@cs.vu.nl>
- Date: Fri, 26 Mar 2004 14:54:51 +0100
- To: Ian Horrocks <horrocks@cs.man.ac.uk>, SWBPD list <public-swbp-wg@w3.org>
Ian Horrocks wrote:
> On March 24, Christopher Welty writes:
>
>>Jeremy wrote on 03/24/2004 04:24:16 AM:
>>
>>>Yes, like Bernard, I have been thinking more about this, and about Ian's
>>>insistence in WebOnt that classes-and-instances was almost always raised by
>>>people wanting to mismodel their world. (cc Ian, wondering if I have learnt
>>>my lessons well!, or misrepresented him)
>
> Jeremy,
>
> You can go to the top of the class :-)
>
>>Well, "mismodelling their world" is not limited to classes as instances. I
>>find it rather dangerous to make such statements. People use subclass
>>incorrectly, too, but that wasn't a reason to remove that axiom from OWL
>>DL.
>
> I would say that there is a big difference. Like any part of the
> language, subClass may occasionally be abused, but it is used very
> widely and most people seem able to use it more or less
> correctly. Moreover, subClass fits into a family of logics which are
> theoretically well understood and for which there is considerable
> implementation experience.
>
> In contrast, classes as instances are relatively rarely used (most
> forms of conceptual modelling, databases etc., seem to have managed
> perfectly well without them), and hardly ever used
> "correctly".

After reading this I was about to put my chair hat off and send a flame.
But it generated such well-founded responses that I want to thank you for
starting this dialectic debate. A response to Bernard's question follows....

Guus reading the responses

> Moreover, the resulting logics are much less well
> understood and there is little implementation experience.
>
> Ian
>
>>People just mismodel their worlds, I hope we can offer some advice on
>>both how to do some of these things and how NOT to do it.
>>
>>Anyway, your analysis exposed some important misconceptions, especially
>>regarding so-called "subject hierarchies" and class hierarchies.
>>I've written a paper or two about the problem, in this one:
>>[http://dx.doi.org/10.1016/S0169-023X(99)90021-6] I basically show that
>>subject taxonomies are actually "part"onomies, or more precisely spatial
>>containment, not subclass (in fact, etymologically, "subject" means to
>>throw under, "topic" is a region, and "about" means near). Some of the
>>initial problems of representing subject taxonomies in DLs are discussed
>>in a paper in the first FOIS conference, which may be hard to find. I
>>can't seem to find a softcopy myself.
>>
>>>The class hierarchy in RDFS/OWL is there to describe hierarchies of classes
>>>of resources. Just because you have a hierarchy of subject descriptors
>>>doesn't make it a class hierarchy.
>>>
>>>It seems to be confusing the human way of thinking of analogy and metaphor
>>>(any hierarchy can act as a metaphor for any other hierarchy) with what is
>>>a logical and implementation issue about how to say what we want to say
>>>about our knowledge of our world in a way that machines can process it.
>>>
>>>Thus if PhDThesis is an owl:Class what are the resources that we intend to
>>>belong to it? Probably my PhD Thesis with title "Graph Grammars: an
>>>approach to transfer based MT; exemplified by a Turkish-English system" is
>>>one such resource, but the copy sitting on my bookshelf is probably not.
>>>
>>>Then if that is the case what would we mean by dc:subject linking the
>>>resource of my thesis with this class .... hmmm ... we mean my thesis
>>>belongs to that class, i.e. rdf:type.
>>>So if we want to treat this subject hierarchy as classes we really also want
>>>
>>>dc:subject rdfs:subPropertyOf rdf:type .
>>>
>>>or perhaps
>>>
>>>eg:subject rdfs:subPropertyOf rdf:type .
>>>eg:subject rdfs:subPropertyOf dc:subject .
>>>
>>>But if we click on dc:subject we get to:
>>>http://purl.org/dc/elements/1.1/subject
>>>
>>><rdf:Property rdf:about="http://purl.org/dc/elements/1.1/subject">
>>><rdfs:label xml:lang="en-US">Subject and Keywords</rdfs:label>
>>><rdfs:comment xml:lang="en-US">The topic of the content of the
>>>resource.</rdfs:comment>
>>><dc:description xml:lang="en-US">
>>>Typically, a Subject will be expressed as keywords,
>>>key phrases or classification codes that describe a topic
>>>of the resource. Recommended best practice is to select
>>>a value from a controlled vocabulary or formal
>>>classification scheme.</dc:description>
>>><rdfs:isDefinedBy rdf:resource="http://purl.org/dc/elements/1.1/"/>
>>><dcterms:issued>1999-07-02</dcterms:issued>
>>><dcterms:modified>2002-10-04</dcterms:modified>
>>><dc:type
>>>rdf:resource="http://dublincore.org/usage/documents/principles/#element"/>
>>><dcterms:hasVersion
>>>rdf:resource="http://dublincore.org/usage/terms/history/#subject-004"/>
>>></rdf:Property>
>>>
>>>and we see that dc:subject should typically be a string from a controlled
>>>vocabulary. Thus it seems particularly poor practice to deviate from the
>>>preferred usage of dc:subject in order to (over-)simplify our model.
>>>
>>>This points to the solution I was earlier advocating of using such strings,
>>>using hasValue restrictions to map the strings into classes and then using
>>>the class hierarchy on those restrictions to show the hierarchical
>>>relationships between the subject vocab terms. To do this well, we probably
>>>want to specialise the dc:subject property with a subproperty eg:subject,
>>>specify its range with an owl:DataRange explicitly enumerating the
>>>controlled vocabulary, and for each term create a class using a hasValue
>>>restriction.
>>>For further clarity and usability we might want to create two related
>>>properties, one indicating the (single) intended subject code, and the
>>>other indicating all implicit subject codes formed from the class hierarchy.
>>>The former would be a subproperty of both the latter and dc:subject; the
>>>latter would be used to create the hasValue restrictions.
>>>
>>>Hmmm ... quite a lot of work initially, but the end result is that the
>>>subject indicators are marked up using text strings from an explicit
>>>controlled vocab; we conform with the defn of dc:subject, even with the
>>>advertised best practice; we fall within OWL DL with the expectation that
>>>this will give us better reasoning performance, and we have been clearer
>>>about what we are trying to say. I think the complexity can be hidden from
>>>the end users.
>>>
>>>Jeremy
>>>
>>>Bernard Vatant wrote:
>>>
>>>>*BV
>>>>
>>>>>>- Is it worth the trade-off to switch one's ontology (otherwise DL)
>>>>>>to OWL-Full, just to
>>>>>>allow its classes to be used as objects in 'dc:subject' predicates?
>>>>
>>>>*Jim
>>>>
>>>>>That's a weird way to ask the question. You mean, is it worth doing
>>>>>the extra work to break your naturally occurring model just so that
>>>>>you can be in DL?
>>>>
>>>>The way I put it might seem weird indeed, but it's the way it was set in the real project
>>>>context (real world is weird). We had an OWL-DL ontology, and wanted to keep it so, and
>>>>suddenly after six months or so some user wants to be able to use a class as a subject of
>>>>a document ... which is one case out of one thousand, the 999 others using 'regular'
>>>>subjects. So using a class as subject of a document is not exactly 'naturally occurring'.
>>>>It's a borderline case - not to say a weird one :))
>>>>
>>>>*Jim
>>>>
>>>>>I would argue this is indeed a BP issue, but probably for WORLD not
>>>>>for OPEN... we need to explain why and when you would do the extra
>>>>>work (and in every case we have explored it is extra work) to make
>>>>>sure your ontology is in the DL profile of OWL.
>>>>
>>>>I suggested it might be in PORT scope, because it deals with the terminology vs ontology
>>>>general issue. For me the heart of the question is to know what it means to 'use a
>>>>concept' defined in a terminology (glossary, thesaurus, subject headings, index...) as a
>>>>class (or a property) in an ontology.
>>>>
>>>>Is 'PhD Thesis' class the same 'subject' (using TM language here, sorry) or 'resource'
>>>>as the original concept? The more I think about it, the more I have to deal with it, and
>>>>the more I tend to say that they are distinct animals. Jim's PhD Thesis is an instance of
>>>>the class, but not of the concept. One subject of 'Social Functions of PhD Thesis in
>>>>Occidental University during 20th century', is the concept of PhD Thesis, not the class.
>>>>
>>>>So it's not just an issue of OWL-DL vs OWL-Full, it's also an issue of making distinct or
>>>>not those two 'things'. This is a core issue in porting thesaurus to the SW, related to
>>>>others of the same kind, like if concepts A and B are interpreted as classes, and there is
>>>>a Broader-Narrower relationship between A and B in the Thesaurus, has it to be interpreted
>>>>as a class-subclass relationship in the ontology etc.
>>>>
>>>>So I think in that case a BP definition would be two-fold
>>>>
>>>>1. Is it generally a BP to make terminology concepts distinct from ontology classes (and
>>>>properties)?
>>>>2. If agnostic about 1, what is the trade-off when choosing to make them distinct or to
>>>>merge them?
>>>>
>>>>FWIW, having tried both terms of the alternative in the course of time, my personal view,
>>>>for above quoted reasons, is that they should be kept separate, and it's worth the extra
>>>>work (even before being aware of the DL vs Full issue)
>>>>
>>>>Are there other concrete experience on that, not only theoretical considerations? Seems
>>>>like there are not so many people exploring the terminology-ontology interoperability. Or
>>>>are they?
>>>>
>>>>Bernard Vatant
>>>>Senior Consultant
>>>>Knowledge Engineering
>>>>Mondeca - www.mondeca.com
>>>>bernard.vatant@mondeca.com

--
Free University Amsterdam, Computer Science
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
Tel: +31 20 444 7739/7718
E-mail: schreiber@cs.vu.nl
Home page: http://www.cs.vu.nl/~guus/
Received on Friday, 26 March 2004 09:02:03 UTC
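
A minimal sketch of the two-property idea in Jeremy's quoted proposal (one property for the single intended subject code, one for all implicit codes implied by the hierarchy), written in Python rather than OWL so it can be run directly. The vocabulary codes, the BROADER table, and both function names are assumptions made for illustration; nothing here comes from Dublin Core or from the thread. It only mimics the inference a DL reasoner would be expected to draw from hasValue restriction classes arranged in a subclass hierarchy, and it is not an OWL implementation.

```python
# Hypothetical controlled vocabulary: each subject code maps to its broader
# code (None for a top-level term). These codes are illustrative only.
BROADER = {
    "document": None,
    "thesis": "document",
    "phd-thesis": "thesis",
    "msc-thesis": "thesis",
}


def implicit_subjects(intended: str) -> list[str]:
    """Return the intended code followed by every broader code above it.

    Membership in a narrower hasValue class implies membership in each
    broader class, and hence each broader subject value; this walk up the
    BROADER table imitates that chain of implications.
    """
    if intended not in BROADER:
        # Plays the role of the enumerated data range: only listed strings
        # are acceptable subject codes.
        raise ValueError(f"{intended!r} is not in the controlled vocabulary")
    codes = []
    current = intended
    while current is not None:
        codes.append(current)
        current = BROADER[current]
    return codes


def has_subject(intended: str, query: str) -> bool:
    """True if a resource marked with `intended` also counts under `query`."""
    return query in implicit_subjects(intended)
```

Under these assumptions a resource marked only with "phd-thesis" is also found by a query for "thesis", while the subject values themselves stay plain strings from the controlled vocabulary, which is the property Jeremy argues keeps the model within OWL DL.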