Re: updated updated bams model

Hi John,

I agree - I think it's important to keep things simple and clear,
though I also believe Chris's comments are actually very helpful in
achieving this goal.

A few thoughts that came to mind when reading Chris's comments:

	1) XML as a database language
		Chris is correct.  XML qua XML is primarily a markup language
designed to provide an "extensible" data exchange markup formalism.
When I read what you say on the page, I thought you might have been
referring to XML databases - e.g., RDBMS frameworks that actually
store XML internally OR use XML-based disk files as their
serialization format.  If that is what you meant, it might be useful
to state that explicitly.

	2) RDBMS syntax & semantics
		It is important to be clear that RDBMS architectures are based on a
very formal and explicit syntax designed specifically to express a
set-theoretic view of how data sets inter-relate.  As you say, it's
best to keep things clear and simple, but given what you are trying to
explain, I do agree with Chris that it is important to be clear RDBMS
systems are based on very formal representations - they are just
representations devoid of any explicit semantic entailments beyond
the most abstract "set X relates to set Y via relation A".
		I believe it's also important to the argument you are making to be
clear that we recognize there are long-standing RDBMS approaches that
do attempt to take semantics into account - i.e., "Semantic Data
Models" (http://portal.acm.org/citation.cfm?id=509264).  These do
provide a means of defining a local, application-specific semantic
description of the data held in a relational data model, but they do
not provide an explicit, externalized semantics expressed in a common,
standard formalism such as that provided by RDF & OWL.
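		As a rough illustration of the "set X relates to set Y via relation
A" point (my own toy sketch in Python, not anything taken from the
demo page; the region and cell-type names are made up), a relation in
this set-theoretic view is nothing more than a subset of a Cartesian
product:

```python
from itertools import product

# Hypothetical toy data; the names are invented for illustration only.
regions = {"CA1", "CA3", "DG"}          # set X
cell_types = {"pyramidal", "granule"}   # set Y

# Relation A between X and Y: a set of tuples drawn from X x Y.
contains = {("CA1", "pyramidal"), ("CA3", "pyramidal"), ("DG", "granule")}

# The formal guarantee: the relation is a subset of the cross product.
assert contains <= set(product(regions, cell_types))

# Selection and projection are ordinary set operations on the tuples.
regions_with_pyramidal = {r for (r, c) in contains if c == "pyramidal"}
print(sorted(regions_with_pyramidal))  # ['CA1', 'CA3']
```

The manipulation is fully formal, but nothing in the structure itself
says what "contains" means; that lives in the variable name and in
whatever documentation accompanies it.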

	3) SQL "standard"
		It would be useful to simply list "SQL 92", "SQL 99", "SQL 2003",  
if that is what you mean.  You could also mention there is  
considerable variation in the ways in which a given RDBMS framework -  
e.g., Oracle, PostgreSQL, Ingres, DB2, etc. - implements the  
"optional" portions of these specs and extends the available calculus  
beyond the SQL standard.  This means that in addition to there being
no explicit statement of the semantic-to-syntactic mapping, there is
also considerable variation at the implementation level, even in the
syntax.
		As Chris says, the underlying relational algebra on which all of  
these systems are based does provide a solid, formal basis for each  
implementation, but in the context of the point you are making on  
this page, this does not provide an explicit and shared formalism for  
representing the underlying semantics - AND - the variety in formal  
syntactic implementations adds to the cost and the ultimate  
"brittleness" of trying to provide such semantic mapping as an  
adjunct to the underlying relational syntax.

	4) Documentation
		I suppose what Chris is asking on this front is simply to be clear
that the issue is not that "documentation" is required to support the
applications one constructs - that is true whether you are using XML,
an RDBMS, or SemWeb tools to build your application.  The point I
believe you are trying to make here is that with XML & RDBMS
approaches, the documentation describing the semantic "mapping" is an
absolute prerequisite to fully describing the semantic content of the
information, and this documentation is essentially opaque to the
algorithms one creates to parse the information - therefore, the
algorithms have no direct access to the semantic assertions and
entailments.

	5) Qualified Relations
		To some extent, what you are trying to express regarding the use of
Domain & Range when defining RDF predicate relations can be expressed
in an RDBMS idiom - especially if one includes Object-Relational
systems in this category.  In an ORDBMS, the table "class" containing
the PK becomes the domain for a relation, and the set of all tables
(and their sub-classes) whose tuples include the corresponding FK is
equivalent to the range for the relation.  Of course, the underlying
formalism provides no explicit support for algorithmic manipulation
or interpretation of the semantic entailments of such relation(s).
This is where the model-theoretic underpinnings of OWL certainly
provide considerably more support for this activity.  Even outside
the ORDBMS frameworks, one can provide SQL DDL models where relations
are "qualified".  Without such modeling patterns, it would be
impossible to represent the full expressiveness of MeSH or UMLS in an
RDBMS backend.  These implementations in an RDBMS framework,
however, tend to get very complex and brittle and require specialized  
RDBMS skills to implement effectively.  They can also be MUCH more  
complicated to access and manipulate when using a particular language  
to access the data stored in such models.  I do think one can argue  
the standard tools growing up around RDF & OWL provide a much more  
powerful, less fragile, and ultimately less complicated (at least  
measured in lines of code) means to manipulate the semantic  
assertions & entailments expressed in the underlying data relations.
		There is also the issue of "directionality" that you bring up,  
which to my mind is explicitly defined both for XML graphs and  
relational systems, but I think you mean to capture more than simply  
the directionality of a semantic entailment in this argument re: use  
of domain & range.
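		To make the "qualified relation" pattern a bit more concrete, here
is a minimal sketch using SQLite through Python's standard sqlite3
module.  The table and column names are hypothetical and are not the
actual MeSH or UMLS schemas; the idea is only that the relation type
becomes data in its own table, and the link table carries it alongside
the two FKs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE concept (              -- hypothetical table holding the PK
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE relation_type (        -- the qualifier, stored as data
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE concept_relation (     -- the "qualified" relation
    subject_id INTEGER NOT NULL REFERENCES concept(id),
    type_id    INTEGER NOT NULL REFERENCES relation_type(id),
    object_id  INTEGER NOT NULL REFERENCES concept(id),
    PRIMARY KEY (subject_id, type_id, object_id)
);
""")
conn.executemany("INSERT INTO concept VALUES (?, ?)",
                 [(1, "hippocampus"), (2, "CA1"), (3, "brain")])
conn.executemany("INSERT INTO relation_type VALUES (?, ?)",
                 [(1, "part_of"), (2, "is_a")])
conn.executemany("INSERT INTO concept_relation VALUES (?, ?, ?)",
                 [(2, 1, 1), (1, 1, 3)])

# Reading back one human-readable statement already takes a three-way join.
rows = conn.execute("""
    SELECT s.name, t.name, o.name
    FROM concept_relation r
    JOIN concept s       ON s.id = r.subject_id
    JOIN relation_type t ON t.id = r.type_id
    JOIN concept o       ON o.id = r.object_id
    ORDER BY s.name
""").fetchall()
print(rows)

# The FK constraint does police which rows may take part in the relation...
try:
    conn.execute("INSERT INTO concept_relation VALUES (2, 1, 99)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
print(fk_rejected)
```

The constraint checking is there, but the database has no notion that
"part_of" might be transitive; deriving "CA1 part_of brain" has to be
coded by hand in the application or in dialect-specific recursive SQL.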

	6) RDFS and/or OWL compared to XML Schema & SQL DDL
		Chris is definitely correct here.  Even if you don't go into the
details, these are the correct, more specific comparisons to be
making in terms of the inherent ability of these formalisms to
explicitly represent semantic assertions and entailments.
		It would also be useful to be more explicit regarding both the
expressivity and computability of semantic assertions encoded using
XML Schema, RDBMS formalisms, ORDBMS formalisms, and systems that
combine XML & RDBMS.  When compared with the formalisms and tools
provided for performing these same tasks with RDF & OWL, one would
hope the result of such a comparison would strongly indicate that
RDF & OWL provide a significant advantage when representing
real-world entities in a semantically meaningful way.
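		For the class subsumption case, a small sketch may help show what
"computable" entailment means in practice.  This is a hand-rolled toy
in plain Python, not a real RDF library or reasoner, and all the "ex:"
terms are hypothetical.  It applies only two of the RDFS entailment
rules (rdfs9 and rdfs11) to a set of triples:

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Triples as (subject, predicate, object); every "ex:" term is made up.
triples = {
    ("ex:PyramidalNeuron", SUBCLASS, "ex:Neuron"),
    ("ex:Neuron", SUBCLASS, "ex:Cell"),
    ("ex:cell42", TYPE, "ex:PyramidalNeuron"),
}

def rdfs_closure(graph):
    """Apply rules rdfs9 and rdfs11 repeatedly until nothing new appears."""
    inferred = set(graph)
    while True:
        new = set()
        for (a, p, b) in inferred:
            if p != SUBCLASS:
                continue
            for (c, q, d) in inferred:
                # rdfs11: subClassOf is transitive.
                if q == SUBCLASS and c == b:
                    new.add((a, SUBCLASS, d))
                # rdfs9: an instance of a subclass is an instance of
                # the superclass.
                if q == TYPE and d == a:
                    new.add((c, TYPE, b))
        if new <= inferred:
            return inferred
        inferred |= new

closure = rdfs_closure(triples)
print(("ex:cell42", TYPE, "ex:Cell") in closure)  # True
```

The point of the sketch is that the entailment "cell42 is a Cell" is
derived from the data and the shared rules alone, with no
application-specific mapping document involved.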

Sorry - I've only had a brief moment to capture some of these
thoughts.  The idea is to follow up on Chris's suggestion that there
is a need to do more to define "the strength of the OWL/RDF approach
(over) a traditional XML or SQL approach".  XML "databases", ORDBMS,
Semantic Data Models - these are all tools likely to be cited as
addressing some of the requirements for handling semantically
qualified data, and it's worth placing them in these arguments
somewhere.

Hope this helps a little - and doesn't make things worse.

Cheers,
Bill

	
On Mar 27, 2007, at 8:19 AM, John Barkley wrote:

>
> chris,
>
> I appreciate your comments, and I agree that if the demo is to show  
> the superiority of the semantic web approach, then that section  
> should be more carefully worded. I was trying to create something  
> that would be (reasonably) readable by RDB and XML practitioners  
> who are likely not to appreciate subtleties of differences. I will  
> try to redo the section.
>
> jb
>
>
> ----- Original Message ----- From: "Chris Mungall" <cjm@fruitfly.org>
> To: <jbarkley@nist.gov>
> Cc: <public-semweb-lifesci@w3.org>
> Sent: Monday, March 26, 2007 11:06 AM
> Subject: Re: updated updated bams model
>
>
>>
>>
>> I have some comments on:
>> http://esw.w3.org/topic/HCLS/HCLSIG_DemoHomePage_HCLSIG_Demo#head-50710462ea5aac416fd063dce8621ce0354d2d5a
>>
>>> Formal Definition of Semantics
>>>
>>> OWL and RDF have a formal definition for the semantics of an OWL/  
>>> RDF knowledge base, i.e., given a knowledge base, associated   
>>> semantics are primarily provided explicitly within the knowledge   
>>> base itself. Commonly used database languages, e.g., XML and   
>>> relational database (RDB), have at most a semi-formal definition.
>>
>> XML is a way of standardising syntax, not semantics. XML isn't a   
>> database language, I'm not sure why it's classified as such here.
>>
>> It's not quite correct to state that an RDB (which is not a  
>> database language either) has only a semi-formal definition. The  
>> strength of  the relational model is precisely the formal  
>> definition - either as relational algebra or relational calculus.  
>> How much more formal do  you want?
>>
>> Of course, existing databases use various extensions to the   
>> relational model, and, regrettably, departures from it. But this  
>> may  well be the case for practical OWL/RDF implementations. I  
>> think it's  a fairly minor point, and not something you want to  
>> base your  argument on.
>>
>>> XML is a grammar writing system with no defined relationship   
>>> between a given schema and its semantic meaning. An XML schema  
>>> is  simply a grammar. Any semantics represented by that schema  
>>> and its  associated documents are specified external to those   
>>> representations, e.g., in documentation.
>>>
>>> RDB has more than one semi-formal definition, e.g., the ISO   
>>> Standard SQL [sql].
>>
>> You state there is >1 formal definition, give the SQL standard as  
>> an example of one - can you give an example of another? Perhaps  
>> you mean successive iterations of the SQL standard? Again,  
>> variations from  this are relatively minor. Relational algebra  
>> precedes the ISO SQL  standard and forms the basis for all  
>> relational databases.
>>
>>> Thus, given an RDB schema and repository, it is not possible to   
>>> know from those which definition of semantics, if any, was used.  
>>> In  common use, a given RDB database and repository may make use  
>>> of no  semi-formal definition of semantics or borrow from  
>>> several  different ones.
>>
>> What is a repository in this context?
>>
>>> Like XML, other means, such as, documentation, external to the   
>>> schema and repository describes the semantics.
>>
>> So OWL/RDF dispenses with documentation?
>>
>>> For example, consider how a relation between two sets would be  
>>> represented in OWL/RDF, XML, and RDB. In OWL/RDF, the semantics  
>>> of  a relation is formally defined similar to the mathematical   
>>> definition, i.e., as a subset of the cross product of the domain   
>>> and range. Because the relation is a cross product, it has a   
>>> direction. An element of the domain is related to an element of  
>>> the  range, but not necessarily the other way around. In an XML  
>>> schema,  there are many different ways of representing a relation  
>>> using  elements, subelements, and attributes. Similarly, in an  
>>> RDB schema,  depending on which semi-formal definition of RDB  
>>> semantics is used,  there are multiple ways to represent a  
>>> relation. How a relation is  represented in an XML or RDB schema/ 
>>> repository can only be known  external to the schema/repository  
>>> itself.
>>
>> I'm afraid I can't make head nor tail of this.
>>
>>   "In OWL/RDF, the semantics of a relation is formally defined   
>> similar to the mathematical definition, i.e., as a subset of the   
>> cross product of the domain and range."
>>
>> Actually, I think you are talking about mathematical functions,  
>> not relations. As OWL/RDF is restricted to binary relations the   
>> terminology of functions makes sense (ie we can call the first   
>> argument domain the domain, and the second the range)
>>
>> So you seem to be stating a strength of OWL/RDF is that you can  
>> state  the domain and range of a relation? Note that in the  
>> relational model  you can of course state the domain of every  
>> argument of the relation.
>>
>>   "Because the relation is a cross product, it has a direction. An  
>> element of the domain is related to an element of the range, but  
>> not necessarily the other way around"
>>
>> Can you elaborate on this? I don't understand this at all.
>>
>>   "in an RDB schema, depending on which semi-formal definition of   
>> RDB semantics is used, there are   multiple ways to represent a   
>> relation"
>>
>> ??
>>
>> Are we talking about mathematical relations? As far as I  
>> understand  this, this is simply false. Using the relational model  
>> you would  represent a relation using, ummm, a relation. A  
>> relation is the cross- product of the domains of each argument. It  
>> would seem that an RDB  relation is much closer to a mathematical  
>> relation than the OWL/RDF  equivalent. (For one thing, there is no  
>> restriction to binary  relations forcing use of n-ary patterns).  
>> This is true for all RDBs,  even ones that fall short of the ideal  
>> relational model. Can you give  an example of two different  
>> definitions of RDB semantics that would  give different answers here?
>>
>>
>> If this demo is to convince people of the strength of the OWL/RDF  
>> approach as opposed to a traditional XML or SQL approach, then  
>> this section needs some work.
>>
>> I would not lump XML in with the relational model - the  
>> relational  model has more in common with logic-based approaches  
>> than with XML  (it's unfortunate for both camps they do not yet  
>> have more in common)
>>
>> I think it would be more appropriate to compare and contrast the  
>> expressivity of, say, XML Schema with OWL than, say, XML with OWL/  
>> RDF. Make sure you are comparing like with like. Similarly, I  
>> would  compare the expressivity of standard SQL DDL with OWL,  
>> perhaps using  an example - e.g. a simple one with class  
>> subsumption. If you're  going to use the term semantics, give a  
>> definition. Note that both  relational algebra and OWL's model  
>> theoretic semantics are rock-solid  and formal (I'll leave others  
>> to comment on the semantics of OWL  layered on RDF/RDFS).
>>
>> I think the point you want to make is that OWL (arguably) provides  
>> a  more expressive (and perhaps agile?) framework for  
>> representations of real-world entities. Although you  
>> simultaneously seem to be making  the case for RDF too, which  
>> makes your task harder.
>>
>> Cheers
>> Chris
>>
>
>
>

Bill Bug
Senior Research Analyst/Ontological Engineer

Laboratory for Bioimaging  & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA    19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)


Please Note: I now have a new email - William.Bug@DrexelMed.edu

Received on Tuesday, 27 March 2007 14:05:04 UTC