"factoring out" in the construction of FuGO = the search for "universals" from William Bug on 2006-07-08 (public-semweb-lifesci@w3.org from July 2006)

From: William Bug <William.Bug@DrexelMed.edu>
Date: Sat, 8 Jul 2006 00:55:01 -0400
To: Xiaoshu Wang <wangxiao@musc.edu>
Cc: <public-semweb-lifesci@w3.org>
Message-Id: <DB9B70F0-EC3D-4B86-AA58-5BA8AFEFD8B5@DrexelMed.edu>
Hi All,

After following these discussions over the last few weeks about the
use of ontologies and the bridging of ontologies on SW projects, I
have to say I believe the "impedance mismatch" between what different
folks mean by ontology, and what they see as the practical use for
ontological resources, is way too great to address via email.  As I've
said in several neuroinformatics ontology meetings, we really need to
start off by agreeing on an ontological framework for the concept
ontology.

I'm loath to say this, but maybe this whole issue should be tabled
for an F2F meeting.

I really wouldn't want this issue to hold up the real work the HCLSIG  
has already begun to make useful progress on.  My only fear is these  
fundamental differences may in fact be incommensurable.

As to the issues Xiaoshu refers to here, I think "factoring out" and
"modularizing" are two very distinct issues.

The "factoring out" Trish refers to has a very specific meaning in  
the current practice of ontology development embodied in the OBO  
Foundry principles.  It relates to the search for ontological  
"universals" that can be used to create a subsumption 'is_a'  
hierarchy.  This approach to formal KR has been very effective in
formal ontological philosophical domains, as well as in AI (see
Rodney Brooks' Subsumption Architecture) and computational
linguistics.  In the particular case of the foundational ontology
being used here (BFO), a continuant (or endurant as labeled in other  
formalisms) means "entities in the world that endure through time:  
entities which persist self-identically even while undergoing changes  
of various sorts" (please see section "1.3 Continuants and
Occurrents" in "Biodynamic Ontology"
[http://ontology.buffalo.edu/medo/biodynamic.pdf]).  Similar core
entities are defined for other
foundational ontologies, though the ontological commitments incurred  
by using a specific foundational ontology can be quite different  
(e.g., BFO vs. DOLCE).

To use the example you have chosen, for an anatomist to want to use
FuGO and be ontologically compatible, they need only use an ontology
founded on the same definition of continuants & occurrents.  How does
this differ from what I think you are claiming?  An anatomist need
not necessarily care about the details of FuGO entities bearing no  
relation to data they intend to semantically integrate with (such as  
'LC_instrument') to remain fundamentally compatible with FuGO.  They  
only need agree on the definition and use of the foundational levels  
of the ontology they share in common - in this case 'Continuant',  
'Independent_continuant', and 'Object'.  In fact, this would be
encouraged by the OBO Foundry participants - use FuGO for formal  
semantic descriptions of assay/instrument/reagent provenance and use  
the Foundational Model of Anatomy to deal with anatomy (right now,  
just human anatomy, but that is being worked on) - both of which are  
founded on BFO.
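
A minimal sketch of the point above, assuming toy is_a edges: the class
names 'Continuant', 'Independent_continuant', 'Object' and
'LC_instrument' come from this thread, while 'Anatomical_structure' and
the exact placement of every term are hypothetical and not taken from
the real BFO, FuGO or FMA files.

```python
# Hypothetical is_a edges (child -> parent); the placement of each term is
# an assumption for illustration, not taken from the BFO, FuGO or FMA files.
BFO = {
    "Independent_continuant": "Continuant",
    "Object": "Independent_continuant",
}
FUGO = {"LC_instrument": "Object"}          # FuGO term hung under BFO
FMA = {"Anatomical_structure": "Object"}    # hypothetical FMA-side term


def ancestors(term, *edge_sets):
    """Walk is_a links upward through the given edge sets."""
    merged = {}
    for edges in edge_sets:
        merged.update(edges)
    chain = []
    while term in merged:
        term = merged[term]
        chain.append(term)
    return chain


# The anatomist loads only BFO + FMA; FuGO's LC_instrument is never needed.
anatomy_view = ancestors("Anatomical_structure", BFO, FMA)
assay_view = ancestors("LC_instrument", BFO, FUGO)

# The two views agree on the foundational levels they share.
shared = [t for t in anatomy_view if t in assay_view]
print(shared)
```

In this toy setup the anatomy side and the assay side meet only at the
foundational levels, and the anatomy side never has to load
'LC_instrument'.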

The "modularization" you refer to via namespaces relates to a  
conceptually different issue, as I see it.  Even though the use of
namespaces - especially in XML & RDF - has advanced to something
considerably beyond its original application of simply avoiding
tag/term/name "collisions", the "modularization" namespaces can support
in XML space is distinctly different from the "factoring out" of  
universals to which Trish referred.

Using the namespace facility inherent in XML formalisms might be a  
useful way of enabling people to exchange different world views for  
sub-regions of the graph, that's true.  But it is by no means a  
requirement.  One should still expect that, whether or not you are
using namespaces to chop up or modularize different portions of the
ontological graph, you are still following the same principle of
extrapolating to universals, even if the universals for
"namespace_a"'s coverage of the cardinal parts of a liquid
chromatographic apparatus differed from the universals defined for
"namespace_b"'s coverage of the cardinal parts of a liquid
chromatographic apparatus.

If namespaces were used as the primary means of "factoring out"  
universals, you could end up with quite a proliferation of namespaces  
for FuGO - in the extreme - one for every node in the graph.

It may be of some use to separate the different ontological sources  
using namespaces.   For instance, there may be some utility to  
specifying the BFO elements in their own namespace, so as to clearly  
separate those entities created by the FuGO group, and those derived  
directly from BFO.  In the end, however, the ontology construction  
performed by the FuGO group is very much wedded via subsumption to  
the many ontological commitments made in BFO, so it's not clear to me  
there would be anything to gain from doing this, apart from making  
the layers separable for curatorial purposes.  For instance, if FuGO
v1.0 is built off BFO v1.0, having them in separate namespaces may be
useful should a new v1.1 of BFO be released.  Migrating FuGO v1.0
to use BFO v1.1 could be easier to do, if they were namespace  
distinct, but, of course, in doing so, the altered nature of the  
ontological contract would require you to bump the version of FuGO  
once you've done this AND probably perform quite a bit of manual  
review to determine whether the FuGO graph is still valid when linked  
to v1.1 of BFO.
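
As a rough sketch of the first, purely structural part of such a
review, assuming invented data: the function name, the term sets and
the "v1.1" change below are all hypothetical, and none of it reflects
the actual contents of any BFO or FuGO release.

```python
def dangling_parents(domain_edges, upper_terms):
    """Return domain terms whose is_a parent is missing from the upper ontology."""
    known = set(upper_terms) | set(domain_edges)
    return sorted(
        child for child, parent in domain_edges.items() if parent not in known
    )


# Hypothetical upper-level term sets for two versions of the foundation.
bfo_v1_0 = {"Continuant", "Independent_continuant", "Object"}
bfo_v1_1 = {"Continuant", "Independent_continuant", "Material_object"}  # invented change

# Hypothetical domain is_a edges (child -> parent) built against v1.0.
fugo_v1_0 = {"Instrument": "Object", "LC_instrument": "Instrument"}

ok_against_old = dangling_parents(fugo_v1_0, bfo_v1_0)
broken_against_new = dangling_parents(fugo_v1_0, bfo_v1_1)
print(ok_against_old)
print(broken_against_new)
```

A check like this only flags parents that have disappeared.  It says
nothing about a parent whose definition changed while its name stayed
the same, which is where the manual review would still be needed.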

This is an issue we've begun to examine on the BIRN Ontology Task  
Force.  The BIRNLex knowledge resource we are constructing has an  
even more complex relation to a variety of external community  
ontologies, including BFO, FuGO, FMA, PaTO, etc.

My sense of what you are trying to do with O3 is to separate the
intensional conceptualization from the lexicon used to extend this
view of the world into a specific application space - for instance,
providing a URI for the concepts in the ConcreteOntology.  I am a
strong proponent of separating the lexicon from the ontological  
graph.  Having said that, most of the OBO Foundry ontologies are  
using terms to label the nodes in the graph just to make them more  
human readable.  At least where the basic "factoring out" process is
concerned, the goal is not to be swayed by the lexicon when constructing
the ontological graph, and in that sense, they are already separating
the extensional view of the semantic network from the intensional
conceptualization.

As to the issue of ComplexOntology vs. LocalOntology as you describe  
it, I think this is very much at odds with the OBO Foundry approach.   
All OBO Foundry ontologies are ComplexOntologies in that they are all  
being adjusted to build off the BFO.  Even more troublesome is the  
fact these foundational biomedical ontologies are MEANT to be used to  
create more complex ontologies and DL frameworks through coordination  
via the OBO Relation Ontology.  That would make all ontological  
development being encouraged by the OBO group break the model you are  
recommending for maximal ontology re-use and ease of knowledge map  
integration.

It's also possible O3 is more commensurate with DOLCE than BFO, in  
which case, I suppose mapping what you describe here to the OBO  
Foundry principles would require much more thought (and may, in fact,  
not be possible).

I wish I could understand what you describe at
http://www.charlestoncore.org/ont/2005/08/o3.html a little better.  I found
your Nature Biotech. paper very clear and extremely helpful in  
several discussions I've been having on the issue of XML vs. RDF, but  
I really am having a hard time with your description of O3 on that  
web page.  Is there a publication where you go into this in more detail?

Cheers,
Bill

On Jul 7, 2006, at 4:02 PM, Xiaoshu Wang wrote:

>
> Trish,
>
>> Comments inline.
>>
>>>> Based on that work, I'd like to follow Eric N's penchant for
>>>> "strawmen" and propose the following amendments to the Proposed
>>>> Classes to give focus to the discussion:
>>>>
>>>> Project
>>>> Study
>>>> Hypothesis
>>>> ...
>>>
>>> I honestly think before making the list, we should think about how
>>> ontology should be modulized and how to develop ontologies on various
>>> granualities. I would suggest to start with an ontology that has a very
>>> coarse granuality. And developing more detailed ontologies one step at
>>> a time.
>> This is the idea behind FuGE and to some degree FuGO. The
>> development of FuGE is an effort to factor out the re-usable
>> bits of modelling an experiment that can be used to describe,
>> in particular any functional genomics experiment, but perhaps
>> some sections could be extended to other types of
>> experiments?. The concepts that AJ has included seem to be
>> fine with respect to having something to test the proof of
>> concept of the idea and technology for linking/searching the
>> data. One thing that may need to be added is something to
>> indicate the type of experiment so that searches can be
>> limited in some way. The ways to go about typing the kind of
>> experiment are many so if needed, that can be left for future
>> discussions. Depending on how fine grained this effort goes,
>> perhaps there is some benefit in joining this work to what
>> has been done, at least with functional genomics experiments,
>> to develop data standards and exchange formats as these
>> efforts represent the granularity needed by the community.
>
> When I mean "factoring out", I mean by grouping terms under different
> namespaces.  I just take a brief overlook on FuGO, just taking the first
> look that "LC_instrument" is under the same namespace of "continuant"
> already tells me there is no modulization whatsoever.  (Please correct me
> if I am wrong).  If I were an anatomist, who will conduct experiment, but
> never care about LC_instrument.  But if I want to use FuGO's continuant, I
> would have to buy in the concepts of LC_instrument as well.  Now, what
> about some physicist, electrical engineer, do each scientific community
> should they ever want or forced to accept LC_experiment?
>
> To factor it out, for example, you need to break it at least into two
> ontologies. One top generic concepts and one lower domain specific ones.
> So, at least, people can share the top one without forced to take the
> bottom one.  But the right way in my suggested methodology would be at
> least be three.  For the simple case, there needs to be at least three:
> two LocalOntologies + one Profile (see my classification of ontologies
> and what I mean by ontology normalization at
> http://www.charlestoncore.org/ont/2005/08/o3.html). By this way, if a
> lower level ontologies want to realign its relations to another
> high-level ontology, a new profile can be created to associate them.  The
> rational for the basis of ontology normalization is to separate
> ontologies' dependency so they can gracefully evolve over time... I will
> stop here.  Otherwise, I would have to put out my entire manuscript to
> make it clear.
>
> FuGO, in its current form, would be unable to handle the evolutionary,
> let alone the revolutionary change of its ontological terms.
>
>>> Using ontology implies that if you want to use one assertion of an
>>> ontology, you must agree to all assertions made in the ontology. A
>>> detailed monolithic ontology is what we should avoid. I have thought
>>> of this problem for quite a while.  The BOSS ontology
>>> (http://www.charlestoncore.org/ontology/boss) that I created only has
>>> a three classes (Study, Protocol and Data) and three pairs of inverse
>>> properties.  (Please trust me, I am not trying to promote the BOSS
>>> ontology here.) What I have really hoped is that we should think how
>>> the ontologies will be shared before start building the ontology.
>>>
>>> The same issue also goes to the overlap between FUGO with the proposed
>>> self-descriptive experiement ontology. In fact, I think all biological
>>> related ontology will perhaps touche on the topic of experiment in one
>>> way or the other.  Hence, if each ontology's designers don't factor
>>> out their ontology design, the eventual result will be a bunch of
>>> overlapping monolithic ontologies.
>
>> The overlap with the self-descriptive experiment ontology is
>> actually with both FuGE and FuGO. FuGE provides a model to
>> capture common components of investigations and to provide a
>> framework to capture laboratory workflows. FuGO is being
>> designed to provide a source of descriptors for the
>> annotation of investigations. The scope of FuGO includes the
>> design of an investigation, the protocols and instrumentation
>> used, the data generated and the types of analysis performed
>> on the data. FuGO is not to model any of these particular
>> items, but to provide annotation terms as needed by the
>> collaborating communities. Of course terms/classes will be
>> needed in the ontology to properly fill out the is_a hierachy
>> and create needed relationships between classes. Therefore,
>> in the FuGE model there will be a reference to use an
>> ontology term from some source and in some cases this will be
>> FuGO that contains the term. In other cases where an existing
>> ontology has already been designed, e.g. Foundational Model
>> of Anatomy, a term from this ontology will be used. FuGE
>> provides a mechanism to state which ontology the term was
>> obtained from so this will be present.
>
> Hiearchy is not modulization.  A tree is a tree, unless its braches can
> exist independently.  Whether an ontology is structured or not, btw, which
> ontology does not have a structure? has nothing to do with if the ontology
> is monolithic or not.
>
> Xiaoshu
>
>

Bill Bug
Senior Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu







Received on Saturday, 8 July 2006 04:55:16 UTC