Revisiting structured datatypes was: Re: Revisiting AllDisjoint (was Proposed (parital) response to Ken Laskey and questions for WG)

Jim,

With all due respect, if we are going to modify OWL based upon the needs of the NCI Ontology (which *is* reasonable) then let's look at an example class definition:

<owl:Class rdf:ID="Aflatoxins">
  <code>C213</code>
  <id>213</id>
  <hasType>primitive</hasType>
  <kind>Chemicals_and_Drugs_Kind</kind>
  <rdfs:subClassOf rdf:resource="#Mycotoxin" />
  <FULL_SYN>
  <![CDATA[
<term-name>Aflatoxins</term-name>
<term-group>PT</term-group>
<term-source>NCI</term-source>
  ]]>
  </FULL_SYN>
  <DEFINITION>
  <![CDATA[
<def-source>NCI-GLOSS</def-source>
<def-definition>(AF-la-TOK-sins) Substances made  by a fungi that is often found on poorly stored grains and nuts.  Aflatoxins have been implicated as a factor in the etiology of primary liver cancer.</def-definition>
  ]]>
  </DEFINITION>
  <DEFINITION>
  <![CDATA[
<def-source>MSH2002_06_01</def-source>
<def-definition>A group of closely related toxic metabolites that are designated mycotoxins. They are produced by Aspergillus flavus and A. parasiticus. Members of the group include AFLATOXIN B1; aflatoxin B2, aflatoxin G1, aflatoxin G2; AFLATOXIN M1; and aflatoxin M2.</def-definition>
  ]]>
  </DEFINITION>
  <CUI>C0001734</CUI>
  <Semantic_Type>Organic Chemical</Semantic_Type>
  <Semantic_Type>Hazardous or Poisonous Substance</Semantic_Type>
  <Preferred_Name>Aflatoxins</Preferred_Name>
  <rdfs:label>Aflatoxins</rdfs:label>
</owl:Class>

Pardon me, but this use of XML inside CDATA sections is, to be blunt, a horrible hack, and its necessity raises serious concerns about the lack of structured datatypes in OWL. Since this idiom seems essential to the ontology (it is repeated over and over), I think that we need to revisit the issue of structured datatypes. Such use of 'XML' (and I mean the quotes, as the XML is inside CDATA sections!) must raise a most serious interoperability issue.
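
To make the interoperability concern concrete, here is a minimal sketch in Python, using a trimmed copy of the FULL_SYN idiom with namespaces omitted. The element name "wrapper" and the variable names are assumptions introduced only for illustration. It shows that an XML parser sees the CDATA content as one opaque string, so the structure can only be recovered by a second, out-of-band parse:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the FULL_SYN idiom from the class definition above.
# Namespaces are omitted to keep the sketch self-contained.
doc = """<Class>
  <FULL_SYN><![CDATA[
<term-name>Aflatoxins</term-name>
<term-group>PT</term-group>
<term-source>NCI</term-source>
  ]]></FULL_SYN>
</Class>"""

full_syn = ET.fromstring(doc).find("FULL_SYN")

# The parser reports no child elements: the markup is just character data.
child_count = len(full_syn)
raw_text = full_syn.text

# Recovering the fields needs a second parse. The fragment has no single
# root element, so a wrapper (hypothetical name) has to be invented.
inner = ET.fromstring("<wrapper>" + raw_text + "</wrapper>")
fields = {child.tag: child.text for child in inner}

print(child_count)
print(fields)
```

Any consumer that does not know about this convention, including an RDF parser, has only the string to work with.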

This is aside from other properties such as <code>, <id> and <CUI>, which I suspect contain identifiers. Ought these not be resources, i.e. URIs?

It looks to me as though these types of 'graphs' have all sorts of extra-RDF arrows between the nodes.

Jonathan

From: Jim Hendler 
  To: Jonathan Borden ; webont 
  Sent: Monday, July 14, 2003 6:17 PM
  Subject: Re: Revisiting AllDisjoint (was Proposed (parital) response to Ken Laskey and questions for WG)


  At 6:05 PM -0400 7/14/03, Jonathan Borden wrote:
  Different kinds of cancer, for example, are hardly disjoint: one person might have any number of different cancers -- really. If we are going to revisit anything that might actually improve something like GALEN, how about qualified cardinalities?

  Jonathan



  Jonathan - there are sets of diseases that do not/cannot co-occur, there are sets of genetic loci that are disjoint and others that aren't, and, in fact, any reasoning system that is going to use any sort of medical ontology for classifying diseases (or patients, or lifestyles, or anything else) is going to have to represent these disjoints, or the fact that its reasoning is complete and sound won't be worth much.  Sometimes these disjoints happen at a "high level" and our argument that there won't be many of them holds (Human v. non-human categorizes a lot of stuff), but sometimes they happen at a low level (all the different mammals are speciated in a disjoint way by nature, except a couple of odd cases like mules) and one would need to express them.   I'd bet you that there are a lot more disjoint statements needed for the NCI ontology [1] than QCRs - will you take that bet??
   -JH
  p.s. And I'll just count the disjoint classes themselves, so I won't take unfair advantage of the N^2 to 1 that disjoints have over QCRs.

  [1] http://www.mindswap.org/2003/CancerOntology/

  ----- Original Message -----

  From: Jim Hendler
  To: webont
  Sent: Monday, July 14, 2003 5:32 PM
  Subject: Revisiting AllDisjoint (was Proposed (parital) response to Ken Laskey and questions for WG)


  I would like to take a moment to see what people think about having to reopen this issue (or possibly move forward over an objection):


  In a conversation (non-electronic) with Ken Laskey, who has again raised the issue of having an owl:allDisjoint construct (mirroring the allDifferent construct), I pointed him to Dan Connolly's [1] earlier response to this issue.  Ken indicated that he was not likely to accept this answer, and in conversation he brought up many use cases where this would be needed.  Basically, he disputes our contention that since this occurs in "class space" it is likely to be just a small number -- as he points out, we already have a number of ontologies in OWL that are quite large (the NCI ontology and the GALEN ontology are two examples).  In these ontologies, there are numerous cases where one would want to take a large set of classes (for example the different kinds of cancers) and make it explicit that some of these are disjoint (and thus others aren't necessarily) -- even though N is comparatively small, say 100 (remember the total number of classes in NCI is about 17000), this takes ~(N^2)/2 = 5,000 (!!) OWL statements.
    Further, Ken points out that even in some of the smaller ontologies we've created, the number of classes and the number of disjoint classes can be almost identical (for example, military ranks are mutually disjoint within services, but not always between - someone cannot be an Army Lieutenant and an Army Captain, but there are rare cases where someone is "dual hatted" as an Army Colonel and a Navy Captain, for example).  Here's the odd thing -- the number of classes in the military ranks ontology would be about 50 (if one includes officers and non-officers), which would require on the order of 1000 disjoint statements -- significantly dwarfing the size of the original ontology!
   Note that in none of these use cases are we talking disjoint unions per se (although I suppose one could create a workaround if one had a disjointunion construct).
   I think the N^2 problem in the size of the ontologies we're already seeing might be evidence to reopen this issue.  Alternatively, if there is a decent workaround, we might want to document that workaround and not add this construct.  We can also try to move forward over Ken's objection, although my preference would be to look for a way not to.

   -JH
  p.s. Please note that this would NOT require any new semantics or any major new syntax - semantically we already have the ability to assert the pairwise disjoints, and syntactically this could be done using the same construct we created for allDifferent.
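
The counts quoted above can be checked with a short sketch in Python. The class names and the helper names pairwise_disjoint and pair_count are hypothetical, introduced only for illustration. It expands one "all disjoint" group into the explicit pairwise statements that are needed today, and counts them for the group sizes mentioned in the message:

```python
from itertools import combinations


def pairwise_disjoint(classes):
    """Expand one 'all disjoint' group into explicit pairwise statements."""
    return [(a, "owl:disjointWith", b) for a, b in combinations(classes, 2)]


def pair_count(n):
    """Number of unordered pairs among n mutually disjoint classes."""
    return n * (n - 1) // 2


# Hypothetical class names for a small group within one service.
ranks = ["ArmyLieutenant", "ArmyCaptain", "ArmyColonel"]
triples = pairwise_disjoint(ranks)

for triple in triples:
    print(triple)

# Group sizes from the message: 100 kinds of cancer, about 50 military ranks.
print(pair_count(100), pair_count(50))
```

For N = 100 this gives 4950 statements, matching the ~5,000 estimate. For 50 classes the figure of 1225 is an upper bound that assumes all 50 are mutually disjoint; since ranks are disjoint within a service but not always between services, the real number would be somewhat lower, consistent with "on the order of 1000".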




  [1] http://lists.w3.org/Archives/Public/public-webont-comments/2003Jun/0038.html


  --
  Professor James Hendler                           hendler@cs.umd.edu
  Director, Semantic Web and Agent Technologies         301-405-2696
  Maryland Information and Network Dynamics Lab.      301-405-6707 (Fax)
  Univ of Maryland, College Park, MD 20742      *** 240-277-3388 (Cell)
  http://www.cs.umd.edu/users/hendler      *** NOTE CHANGED CELL NUMBER ***





Received on Tuesday, 15 July 2003 16:37:56 UTC