Re: "factoring out" in the construction of FuGO = the search for "universals" from William Bug on 2006-07-08 (public-semweb-lifesci@w3.org from July 2006)

From: William Bug <William.Bug@DrexelMed.edu>
Date: Sat, 8 Jul 2006 12:32:34 -0400
To: "Xiaoshu Wang" <wangxiao@musc.edu>
Cc: w3c semweb hcls <public-semweb-lifesci@w3.org>, Trish Whetzel <whetzel@pcbi.upenn.edu>
Message-Id: <49E283EA-E286-4041-982D-003E76822CC0@DrexelMed.edu>
Please see below for inline comments.

Cheers,
Bill

On Jul 8, 2006, at 8:06 AM, Xiaoshu Wang wrote:

>
>> To use the example you have chosen, for an anatomist to want
>> to use FuGO and be ontologically compatible, they need only use
>> an ontology founded on the same definition of continuants &
>> occurrents.  How does this differ from what I think you are
>> claiming?  An anatomist need not necessarily care about the
>> details of FuGO entities bearing no relation to data they
>> intend to semantically integrate with (such as
>> 'LC_instrument') to remain fundamentally compatible with
>> FuGO.  They only need agree on the definition and use of the
>> foundational levels of the ontology they share in common - in
>> this case 'Continuant', 'Independent_continuant', and
>> 'Object'.  In fact, this would be encouraged by the OBO
>> Foundry participants - use FuGO for formal semantic
>> descriptions of assay/instrument/reagent provenance and use
>> the Foundational Model of Anatomy to deal with anatomy (right
>> now, just human anatomy, but that is being worked on) - both
>> of which are founded on BFO.
>
> I think we have discussed the problem of "web closure"; see Chimezie's
> wiki here: http://esw.w3.org/topic/HCLS/WebClosureSocialConvention.  We
> have to realize that you cannot selectively import a subgraph from an
> RDF model, and using owl:imports, in effect, tightly binds one ontology
> to another.  If ontological terms are not carefully grouped under
> different namespaces, we will eventually end up with a one-for-all
> schema, which is neither practical nor useful.

This is a big concern for what we are trying to do at BIRN - and - as  
I mentioned - what is being proposed under the OBO Foundry principles  
(http://obofoundry.org/).  Both of these are very large, field-wide  
efforts with an increasing amount of resources being invested, so it  
will be important to make these issues clear to the community of  
biomed. ontology informaticists currently using OWL for these purposes.

I ran into exactly the problems you describe when trying to
selectively export branches of our BIRNLex semantic graph.  I tried
applying separate namespaces to make this work.  I really do
understand the technical/engineering issue you are discussing.  I
just wasn't able to make this work under Protégé-OWL, either because:
	1) I'm ignorant of the tools available to support this task;
	2) the tools don't exist (tools for adding namespace qualifications
to subgraphs in an OWL file do exist in Protégé, but I wasn't able to
use them to accomplish this task); or
	3) as others on this list - or in the TCon - have suggested, the
complexity and interdependence of nodes across disparate portions of
the underlying graph really prohibits one from doing this.
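
To make point 3 concrete, here is a minimal sketch of why exporting one
branch tends to drag in nodes from elsewhere in the graph.  The term names
and dependency edges below are hypothetical (they are not taken from BIRNLex
or FuGO), and the traversal is only one plausible way to define a
"self-contained" export, not what Protégé-OWL does.

```python
from collections import deque

# Hypothetical toy graph: each term maps to the terms its definition refers
# to (superclasses, restriction fillers, and so on). Illustrative only.
DEPENDS_ON = {
    "LC_instrument": ["Instrument", "Column"],
    "Instrument": ["Object"],
    "Column": ["Object", "Stationary_phase"],
    "Stationary_phase": ["Material"],
    "Material": ["Independent_continuant"],
    "Object": ["Independent_continuant"],
    "Independent_continuant": ["Continuant"],
    "Continuant": [],
    "Neuron": ["Cell"],
    "Cell": ["Object"],
}


def export_closure(roots, depends_on):
    """Return every term that must accompany `roots` in a self-contained export."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        term = queue.popleft()
        for dep in depends_on.get(term, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


closure = export_closure(["LC_instrument"], DEPENDS_ON)
print(sorted(closure))
```

In this toy graph, asking for the single term 'LC_instrument' pulls in most
of the upper levels and part of a neighbouring branch, while unrelated terms
such as 'Neuron' stay out.  The denser the cross-links, the closer the
export gets to the whole graph.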


>
>> The "modularization" you refer to via namespaces relates to a
>> conceptually different issue, as I see it.  Even though the
>> use of namespace - especially in XML & RDF - has advanced to
>> something considerably beyond its original application of
>> simply avoiding tag/term/name "collisions", the
>> "modularization" namespaces can support in XML space is
>> distinctly different from the "factoring out" of universals
>> to which Trish referred.
>
> When I think of "factoring out", I think of the software engineer's
> term "refactoring": decoupling a hard-coded dependency.  Yes, when I
> replied to Trish, I wanted to state explicitly what I mean by
> "factoring out".

This was clear in your post.  The point I was making is that Trish is
referring to a distinctly different process derived from the OBO
Foundry Principles for ontology construction.

>
>> Using the namespace facility inherent in XML formalisms might
>> be a useful way of enabling people to exchange different
>> world views for sub-regions of the graph, that's true.  But
>> it is by no means a requirement.  One should still expect that,
>> whether or not you are using namespaces to chop up or
>> modularize different portions of the ontological graph, you
>> are still following the same principle of extrapolating to
>> universals, even if the universals for "namespace_a"'s
>> coverage of the cardinal parts of a liquid chromatographic
>> apparatus differed from the universals defined for
>> "namespace_b"'s coverage of the cardinal parts of a liquid
>> chromatographic apparatus.
>
> I think you misunderstood the topic.  The factoring out to
> "universals" etc. is a semantic issue: for instance, what makes an
> "is_a" relationship, or Guarino's OntoClean methodology.

I don't think I misunderstood, as I mention above.

> The factoring out that I mean, a.k.a. modularization, is an
> engineering issue.  The task is not difficult to envision: imagine
> that you are writing an RDF reasoner and are given one assertion,
> say ":foo a fugo:continuant"

Can you provide a more specific example?  In the formalism being
applied in the FuGO curation process, the only nodes which would map
to ':foo' in this N3 triple would be ':independent_continuant' and
':dependent_continuant', and I can't think of any instance data that
will map to those nodes in the graph.

> , try to play around with it and see how many assertions you will
> end up with.  For human beings, there is no such "engineering" issue.
> When we speak one word, we never intend to pull out the entire
> dictionary.  But the semantic web is dealing with machines.

I really do understand that the point is to support algorithmic processing
of formal semantic statements - from determining approximate semantic
equivalence of entities across separate Knowledge Maps up to more
advanced reasoning applications.  I also have a very deep appreciation
for the problems that can arise when one conflates the lexicon with
an ontology - similar to the distinction Nicola Guarino makes between
"a set of extensional relations describing a particular state of
affairs" and the "intensional relations" or "conceptual grid which we
superimpose to various possible state of affairs"
(http://www.formalontology.it/section_4.htm).
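
As a rough illustration of the engineering concern in the quoted text above,
here is a minimal sketch of how much a reasoner might have to load just to
interpret one assertion such as ":foo a fugo:continuant".  Everything in it
is an assumption made for illustration: the import relationships, the extra
"units" ontology, and the triple counts are invented, and real tools may
resolve owl:imports differently.

```python
# Hypothetical ontology documents: each lists its owl:imports targets and a
# made-up triple count. Names, imports, and numbers are illustrative only.
ONTOLOGIES = {
    "fugo": {"imports": ["bfo", "units"], "triples": 5000},
    "bfo": {"imports": [], "triples": 300},
    "units": {"imports": ["bfo"], "triples": 1200},
    "fma": {"imports": ["bfo"], "triples": 70000},
}


def imports_closure(start, ontologies):
    """Collect `start` plus everything reachable through owl:imports."""
    closure = []
    stack = [start]
    while stack:
        name = stack.pop()
        if name in closure:
            continue  # already loaded; also guards against import cycles
        closure.append(name)
        stack.extend(ontologies[name]["imports"])
    return closure


def triples_to_load(start, ontologies):
    """Total triples a reasoner would load for the whole imports closure."""
    return sum(ontologies[n]["triples"] for n in imports_closure(start, ontologies))


loaded = imports_closure("fugo", ONTOLOGIES)
total = triples_to_load("fugo", ONTOLOGIES)
print(loaded, total)
```

The one assertion costs the full closure of the "fugo" document, not just
the single class it names.  In this sketch "fma" stays out only because
nothing in the closure imports it.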

>
>> If namespaces were used as the primary means of "factoring
>> out" universals, you could end up with quite a proliferation
>> of namespaces for FuGO - in the extreme - one for every node
>> in the graph.
>
> Yes, of course. It is a price we have to pay in order to maximize the
> sharing and reuse of an ontology.  Nothing comes for free.
>
> Of course, how to modularize is more of an art than a science.  But
> there are certain principles that we can follow.  One is Gruber's
> "Principle of minimal ontological commitment" (Gruber, Toward
> Principles for the Design of Ontologies Used for Knowledge Sharing,
> Knowledge Systems Laboratory, Stanford University, 1993).  The other
> is the Principles of Orthogonal Domain that I proposed in my
> manuscripts.

Orthogonality is one of the OBO Foundry Principles.  I
will look at the proposal in your manuscript, but can you quickly
take a look at the OBO Foundry Principles and explain how your
proposal differs from what they propose?  This is a very important
issue, because these principles are being applied now across a very
broad field of biomedical ontology development projects.  If you
scale that to the many instance data repositories for which the
ontologies are being - and will continue to be - used to create
knowledge maps/association files, one can see that a very large share of
the data space HCLS semantic web projects will want to tap into will
be fashioned according to these principles.

>
>> I wish I could understand what you describe at
>> http://www.charlestoncore.org/ont/2005/08/o3.htm a little
>> better.  I found your Nature Biotech. paper very clear and
>> extremely helpful in several discussions I've been having on
>> the issue of XML vs. RDF, but I really am having a hard time
>> with your description of O3 on that web page.  Is there a
>> publication where you go into this in more detail?
>
> Yes, I did have a manuscript written and sent to Journal of Applied
> Ontologies recently.  I tried to keep the information on the web
> minimal because I don't want to incur any potential conflict.  But I
> will send you my manuscript in private.

Many thanks, Xiaoshu.

>
> Xiaoshu
>
>

Bill Bug
Senior Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu

Received on Saturday, 8 July 2006 16:32:49 UTC