Re: updated updated bams model

Looking at this discussion, I was wondering if anyone had some sort of "concept map" visual for ontologies such as BAMS, GO, MeSH, etc.?

Shouldn't be too complicated, and would be useful as a backdrop for the demo, showing how the "data pieces" fit together. Anyone willing to do it?

Eric


On Mar 28, 2007, at 5:20 PM, William Bug wrote:


	Absolutely, John.  I completely agree.

	Just stash that feedback on the page for now.

	Getting a working OWL version of BAMS that best reflects the suggestions Alan, Kei, Luis, Mihail and others have made - one particularly suited to catalyzing an RDF-driven integration of the various BioRDF data sets - is definitely the priority.

	Many thanks for all the work you are doing.  I know this will be an important resource not only for the HCLS demo but also for the community at large.

	Cheers,
	Bill

	On Mar 28, 2007, at 12:45 PM, jbarkley@nist.gov wrote:



		hi bill,

		Thanks very much for your suggestions. I'm deep into doing the conversion of 
		BAMS. I want to make significant enough progress with this before attempting 
		to deal with changing the section. 

		jb


		Quoting William Bug <William.Bug@DrexelMed.edu>:


			Hi John,

			I agree - I think it's important to keep things simple and clear,
			though I also believe Chris's comments are actually very
			helpful in achieving this goal.

			A few thoughts that came to mind when reading Chris's comments:

			1) XML as a database language
			Chris is correct.  XML qua XML is primarily a markup language
			designed for the task of providing an "extensible" data exchange
			markup formalism.  When I read what you say on the page, I thought you
			might have been referring to XML databases - e.g., RDBMS frameworks
			that actually store XML internally OR use XML-based disk files as
			their serialization format.  If that is what you meant, it might be
			useful to state that explicitly.

			2) RDBMS syntax & semantics
			It is important to be clear RDBMS architectures are based on a very
			formal and explicit syntax designed specifically to express a
			set-theoretic view of how data sets inter-relate.  As you say, it's best
			to keep things clear and simple, but given what you are trying to
			explain, I do agree with Chris it is important to be clear RDBMS
			systems are based on very formal representations - they just are
			representations devoid of any explicit semantic entailments beyond
			the most abstract "set X relates to set Y via relation A".
			I believe it's also important to the argument you are making to be
			clear we recognize there are long-standing RDBMS approaches that do
			attempt to take semantics into account - i.e., "Semantic Data
			Models" (http://portal.acm.org/citation.cfm?id=509264).  These do
			provide a means of defining a local, application-specific semantic
			description of the data held in a relational data model, but they do
			not provide an explicit externalized semantics expressed in a common,
			standard formalism such as what is provided by RDF & OWL.

			3) SQL "standard"
			It would be useful to simply list "SQL 92", "SQL 99", "SQL 2003",
			if that is what you mean.  You could also mention there is
			considerable variation in the ways in which a given RDBMS framework -
			e.g., Oracle, PostgreSQL, Ingres, DB2, etc. - implements the
			"optional" portions of these specs and extends the available calculus
			beyond the SQL standard.  This means that in addition to there being
			no explicit statement of semantic-to-syntactic mapping, there is
			also considerable variation at the implementation level even in the
			syntax.
			As Chris says, the underlying relational algebra on which all of
			these systems are based does provide a solid, formal basis for each
			implementation, but in the context of the point you are making on
			this page, this does not provide an explicit and shared formalism for
			representing the underlying semantics - AND - the variety in formal
			syntactic implementations adds to the cost and the ultimate
			"brittleness" of trying to provide such semantic mapping as an
			adjunct to the underlying relational syntax.

			4) Documentation
			I suppose what Chris is asking on this front is simply to be clear
			that the issue is not that "documentation" is required to support the
			applications one constructs; that holds whether you are using XML, an
			RDBMS, or SemWeb tools to build your application.  The point I believe
			you are trying to make here is that with XML & RDBMS approaches, the
			documentation describing the semantic "mapping" is an absolute
			prerequisite to fully describing the semantic content of the
			information, and this is essentially opaque to the algorithms one
			creates to parse the information - therefore, the algorithms have no
			direct access to the semantic assertions and entailments.

			5) Qualified Relations
			To some extent, what you are trying to express regarding the use of
			Domain & Range when defining RDF predicate relations can be expressed
			in an RDBMS idiom - especially if one includes Object-Relational
			systems in this category.  In an ORDBMS, the table "class" containing
			the PK becomes the domain for a relation, and the set of all tables
			(and their sub-classes) whose tuples include the corresponding FK is
			equivalent to the range for the relation.  Of course, the underlying
			formalism provides no explicit support for algorithmic
			manipulation or interpretation of the semantic entailments of such
			relation(s).  This is where the model-theoretic underpinnings of OWL
			certainly provide considerably more support for this activity.  Even
			outside the ORDBMS frameworks, one can provide SQL DDL models where
			relations are "qualified".  Without such modeling patterns, it would
			be impossible to represent the full expressiveness of MeSH or UMLS in
			an RDBMS backend.  These implementations in an RDBMS framework,
			however, tend to get very complex and brittle and require specialized
			RDBMS skills to implement effectively.  They can also be MUCH more
			complicated to access and manipulate when using a particular language
			to access the data stored in such models.  I do think one can argue
			the standard tools growing up around RDF & OWL provide a much more
			powerful, less fragile, and ultimately less complicated (at least
			measured in lines of code) means to manipulate the semantic
			assertions & entailments expressed in the underlying data relations.
			There is also the issue of "directionality" that you bring up,
			which to my mind is explicitly defined both for XML graphs and
			relational systems, but I think you mean to capture more than simply
			the directionality of a semantic entailment in this argument re: use
			of domain & range.
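
A minimal sketch of the contrast drawn in point 5, assuming a hypothetical two-table schema (`region`, `projection`) and a hypothetical predicate `projectsTo`; none of these names come from BAMS or the demo page. On the relational side the foreign key behaves as a constraint and rejects a row whose target is unknown. On the RDFS side a domain/range declaration is read as a rule, so a program can derive the types of the subject and object from it.

```python
import sqlite3

# --- Relational side: a foreign key acts as a constraint ---
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE region (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE projection ("
    " source TEXT NOT NULL REFERENCES region(id),"
    " target TEXT NOT NULL REFERENCES region(id))"
)
conn.execute("INSERT INTO region VALUES ('CA1')")

try:
    # 'Subiculum' is not in region, so the constraint rejects the row.
    conn.execute("INSERT INTO projection VALUES ('CA1', 'Subiculum')")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True


# --- RDFS-style side: domain/range act as inference rules ---
def infer_types(triples, domain, range_):
    """Return the rdf:type facts entailed by domain/range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in domain:
            inferred.add((s, "rdf:type", domain[p]))
        if p in range_:
            inferred.add((o, "rdf:type", range_[p]))
    return inferred


# Hypothetical data: one statement using the illustrative predicate.
triples = {("CA1", "projectsTo", "Subiculum")}
entailed = infer_types(
    triples,
    domain={"projectsTo": "BrainRegion"},
    range_={"projectsTo": "BrainRegion"},
)

print(fk_rejected)
print(sorted(entailed))
```

The same declaration is therefore rejected in one setting and used to derive new facts in the other, which is one way to read the remark that the relational formalism gives no direct handle on entailments.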

			6) RDFS and/or OWL compared to XML Schema & SQL DDL
			Chris is definitely correct here.  Even if you don't go into the
			details, these are the correct, more specific comparisons to be
			making in terms of the inherent ability of these formalisms to
			explicitly represent semantic assertions and entailments.
			It would also be useful to be more explicit regarding both the
			expressivity and computability of semantic assertions encoded using
			XML Schema, RDBMS formalisms, ORDBMS formalisms, and systems that
			convolve XML & RDBMS together.  When compared with the formalism and
			tools provided for performing these same tasks with RDF & OWL, one
			would hope the result of such a comparison would strongly indicate
			RDF & OWL provide a significant advantage when representing
			real-world entities in a semantically meaningful way.

			Sorry - I've only had a brief moment to capture some of these
			thoughts.  The idea is to follow up on Chris's suggestion there is a
			need to do more to define "the strength of the OWL/RDF approach
			(over) a traditional XML or SQL approach".  XML "databases", ORDBMS,
			Semantic Data Models - these are all tools likely to be cited as
			addressing some of the requirements for handling semantically
			qualified data, and it's worth placing them in these arguments
			somewhere.

			Hope this helps a little - and doesn't make things worse.

			Cheers,
			Bill

			
			
			On Mar 27, 2007, at 8:19 AM, John Barkley wrote:



				chris,

				I appreciate your comments, and I agree that if the demo is to show  
				the superiority of the semantic web approach, then that section  
				should be more carefully worded. I was trying to create something  
				that would be (reasonably) readable by RDB and XML practitioners  
				who are likely not to appreciate subtleties of differences. I will  
				try to redo the section.

				jb


				----- Original Message -----
				From: "Chris Mungall" <cjm@fruitfly.org>
				To: <jbarkley@nist.gov>
				Cc: <public-semweb-lifesci@w3.org>
				Sent: Monday, March 26, 2007 11:06 AM
				Subject: Re: updated updated bams model





					I have some comments on:
					http://esw.w3.org/topic/HCLS/HCLSIG_DemoHomePage_HCLSIG_Demo#head-50710462ea5aac416fd063dce8621ce0354d2d5a


					Formal Definition of Semantics

					OWL and RDF have a formal definition for the semantics of an
					OWL/RDF knowledge base, i.e., given a knowledge base, associated
					semantics are primarily provided explicitly within the knowledge
					base itself. Commonly used database languages, e.g., XML and
					relational database (RDB), have at most a semi-formal definition.


					XML is a way of standardising syntax, not semantics. XML isn't a
					database language; I'm not sure why it's classified as such here.

					It's not quite correct to state that an RDB (which is not a
					database language either) has only a semi-formal definition. The
					strength of the relational model is precisely the formal
					definition - either as relational algebra or relational calculus.
					How much more formal do you want?

					Of course, existing databases use various extensions to the
					relational model, and, regrettably, departures from it. But this
					may well be the case for practical OWL/RDF implementations. I
					think it's a fairly minor point, and not something you want to
					base your argument on.


					XML is a grammar writing system with no defined relationship
					between a given schema and its semantic meaning. An XML schema
					is simply a grammar. Any semantics represented by that schema
					and its associated documents are specified external to those
					representations, e.g., in documentation.

					RDB has more than one semi-formal definition, e.g., the ISO
					Standard SQL [sql].


					You state there is >1 formal definition, give the SQL standard as
					an example of one - can you give an example of another? Perhaps
					you mean successive iterations of the SQL standard? Again,
					variations from this are relatively minor. Relational algebra
					precedes the ISO SQL standard and forms the basis for all
					relational databases.


					Thus, given an RDB schema and repository, it is not possible to
					know from those which definition of semantics, if any, was used.
					In common use, a given RDB database and repository may make use
					of no semi-formal definition of semantics or borrow from
					several different ones.


					What is a repository in this context?


					As with XML, other means external to the schema and repository,
					such as documentation, describe the semantics.


					So OWL/RDF dispenses with documentation?


					For example, consider how a relation between two sets would be
					represented in OWL/RDF, XML, and RDB. In OWL/RDF, the semantics
					of a relation is formally defined similar to the mathematical
					definition, i.e., as a subset of the cross product of the domain
					and range. Because the relation is a cross product, it has a
					direction. An element of the domain is related to an element of
					the range, but not necessarily the other way around. In an XML
					schema, there are many different ways of representing a relation
					using elements, subelements, and attributes. Similarly, in an
					RDB schema, depending on which semi-formal definition of RDB
					semantics is used, there are multiple ways to represent a
					relation. How a relation is represented in an XML or RDB
					schema/repository can only be known external to the
					schema/repository itself.


					I'm afraid I can't make head nor tail of this.

					  "In OWL/RDF, the semantics of a relation is formally defined
					similar to the mathematical definition, i.e., as a subset of the
					cross product of the domain and range."

					Actually, I think you are talking about mathematical functions,
					not relations. As OWL/RDF is restricted to binary relations, the
					terminology of functions makes sense (i.e., we can call the first
					argument domain the domain, and the second the range).

					So you seem to be stating a strength of OWL/RDF is that you can
					state the domain and range of a relation? Note that in the
					relational model you can of course state the domain of every
					argument of the relation.

					  "Because the relation is a cross product, it has a direction. An
					element of the domain is related to an element of the range, but
					not necessarily the other way around"

					Can you elaborate on this? I don't understand this at all.

					  "in an RDB schema, depending on which semi-formal definition of
					RDB semantics is used, there are multiple ways to represent a
					relation"

					??

					Are we talking about mathematical relations? As far as I
					understand this, this is simply false. Using the relational model
					you would represent a relation using, ummm, a relation. A
					relation is a subset of the cross-product of the domains of each
					argument. It would seem that an RDB relation is much closer to a
					mathematical relation than the OWL/RDF equivalent. (For one thing,
					there is no restriction to binary relations forcing use of n-ary
					patterns). This is true for all RDBs, even ones that fall short of
					the ideal relational model. Can you give an example of two
					different definitions of RDB semantics that would give different
					answers here?
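
To make the n-ary point concrete, here is a small sketch in Python, assuming a hypothetical ternary relation `connection(source, target, strength)`; the region and strength values are illustrative only. It checks that the relation is a subset of the cross-product of its argument domains, then rewrites each tuple as binary triples attached to an intermediate node, which is roughly what an n-ary pattern has to do when only binary relations are available.

```python
from itertools import product

regions = {"CA1", "CA3", "Subiculum"}
strengths = {"weak", "strong"}

# A ternary relation is a set of tuples drawn from the cross-product
# of its argument domains (hypothetical data).
connection = {
    ("CA3", "CA1", "strong"),
    ("CA1", "Subiculum", "weak"),
}
cross = set(product(regions, regions, strengths))
is_subset = connection <= cross


def to_binary_triples(relation, name, roles):
    """Rewrite each n-ary tuple as binary triples about a fresh node."""
    triples = []
    for i, row in enumerate(sorted(relation)):
        node = f"_:{name}{i}"  # blank-node style identifier
        triples.append((node, "rdf:type", name))
        for role, value in zip(roles, row):
            triples.append((node, role, value))
    return triples


triples = to_binary_triples(
    connection, "Connection", ("source", "target", "strength")
)

print(is_subset, len(cross))
for t in triples:
    print(t)
```

In the relational form each fact is one tuple; in the binary form the same fact is spread over four statements, so any query has to reassemble it through the intermediate node.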


					If this demo is to convince people of the strength of the OWL/RDF  
					approach as opposed to a traditional XML or SQL approach, then  
					this section needs some work.

					I would not lump XML in with the relational model - the
					relational model has more in common with logic-based approaches
					than with XML (it's unfortunate for both camps they do not yet
					have more in common).

					I think it would be more appropriate to compare and contrast the
					expressivity of, say, XML Schema with OWL than, say, XML with
					OWL/RDF. Make sure you are comparing like with like. Similarly, I
					would compare the expressivity of standard SQL DDL with OWL,
					perhaps using an example - e.g. a simple one with class
					subsumption. If you're going to use the term semantics, give a
					definition. Note that both relational algebra and OWL's model
					theoretic semantics are rock-solid and formal (I'll leave others
					to comment on the semantics of OWL layered on RDF/RDFS).
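
As one possible version of the class-subsumption example suggested above, the sketch below stores a hypothetical three-class hierarchy in an SQLite table and recovers the superclasses of one class with a recursive query. The table and class names are assumptions made for illustration. What it tries to show is that the DDL records the parent links but says nothing about their transitivity; that meaning has to be written into the query, whereas `rdfs:subClassOf` is transitive by the RDFS semantics itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The DDL only says "a class has a parent"; transitivity is not stated.
conn.execute("CREATE TABLE subclass_of (child TEXT, parent TEXT)")
conn.executemany(
    "INSERT INTO subclass_of VALUES (?, ?)",
    [
        ("CA1", "HippocampalRegion"),
        ("HippocampalRegion", "BrainRegion"),
    ],
)

# The subsumption semantics has to be written into each query,
# here as a recursive common table expression.
rows = conn.execute(
    """
    WITH RECURSIVE ancestors(name) AS (
        SELECT parent FROM subclass_of WHERE child = ?
        UNION
        SELECT s.parent FROM subclass_of AS s
        JOIN ancestors AS a ON s.child = a.name
    )
    SELECT name FROM ancestors ORDER BY name
    """,
    ("CA1",),
).fetchall()
superclasses = [name for (name,) in rows]
print(superclasses)
```

An OWL or RDFS reasoner would be expected to return both superclasses of `CA1` from the two asserted links alone, without the query author spelling out the recursion.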

					I think the point you want to make is that OWL (arguably) provides
					a more expressive (and perhaps agile?) framework for
					representations of real-world entities. Although you
					simultaneously seem to be making the case for RDF too, which
					makes your task harder.

					Cheers
					Chris






			Bill Bug
			Senior Research Analyst/Ontological Engineer

			Laboratory for Bioimaging  & Anatomical Informatics
			www.neuroterrain.org
			Department of Neurobiology & Anatomy
			Drexel University College of Medicine
			2900 Queen Lane
			Philadelphia, PA    19129
			215 991 8430 (ph)
			610 457 0443 (mobile)
			215 843 9367 (fax)


			Please Note: I now have a new email - William.Bug@DrexelMed.edu












	

Received on Wednesday, 28 March 2007 22:27:00 UTC