- From: William Bug <William.Bug@DrexelMed.edu>
- Date: Tue, 27 Mar 2007 10:04:36 -0400
- To: "John Barkley" <jbarkley@nist.gov>
- Cc: "Chris Mungall" <cjm@fruitfly.org>, <public-semweb-lifesci@w3.org>
- Message-Id: <D14485B9-5A63-4917-95FA-288CD0EBF51D@DrexelMed.edu>
Hi John,

I agree - I think it's important to keep things simple and clear, though I also believe Chris's comments are actually very helpful in achieving this goal.

A few thoughts that came to mind when reading Chris's comments:

1) XML as a database language
Chris is correct. XML qua XML is primarily a markup language designed for the task of providing an "extensible" data exchange markup formalism. When I read what you say on the page, I thought you might have been referring to XML databases - e.g., RDBMS frameworks that actually store XML internally OR use XML-based disk files as their serialization format. If that is what you meant, it might be useful to state that explicitly.

2) RDBMS syntax & semantics
It is important to be clear RDBMS architectures are based on a very formal and explicit syntax designed specifically to express a set-theoretic view of how data sets inter-relate. As you say, it's best to keep things clear and simple, but given what you are trying to explain, I do agree with Chris it is important to be clear RDBMS systems are based on very formal representations - they just are representations devoid of any explicit semantic entailments beyond the most abstract "set X relates to set Y via relation A". I believe it's also important to the argument you are making to be clear we recognize there are long-standing RDBMS approaches that do attempt to take semantics into account - i.e., "Semantic Data Models" (http://portal.acm.org/citation.cfm?id=509264). These do provide a means of defining a local, application-specific semantic description of the data held in a relational data model, but they do not provide an explicit externalized semantics expressed in a common, standard formalism such as what is provided by RDF & OWL.

3) SQL "standard"
It would be useful to simply list "SQL 92", "SQL 99", "SQL 2003", if that is what you mean.
You could also mention there is considerable variation in the ways in which a given RDBMS framework - e.g., Oracle, PostgreSQL, Ingres, DB2, etc. - implements the "optional" portions of these specs and extends the available calculus beyond the SQL standard. This means that in addition to there being no explicit statement of semantic-to-syntactic mapping, there is also considerable variation at the implementation level even in the syntax. As Chris says, the underlying relational algebra on which all of these systems are based does provide a solid, formal basis for each implementation, but in the context of the point you are making on this page, this does not provide an explicit and shared formalism for representing the underlying semantics - AND - the variety in formal syntactic implementations adds to the cost and the ultimate "brittleness" of trying to provide such semantic mapping as an adjunct to the underlying relational syntax.

4) Documentation
I suppose what Chris is asking on this front is simply that you be clear the issue is not that "documentation" is required to support the applications one constructs; that is true whether you are using XML, an RDBMS, or SemWeb tools to build your application. The point I believe you are trying to make here is that with XML & RDBMS approaches, the documentation describing the semantic "mapping" is an absolute pre-requisite to fully describing the semantic content of the information, and this documentation is essentially opaque to the algorithms one creates to parse the information - therefore, the algorithms have no direct access to the semantic assertions and entailments.

5) Qualified Relations
To some extent, what you are trying to express regarding the use of Domain & Range when defining RDF predicate relations can be expressed in an RDBMS idiom - especially if one includes Object-Relational systems in this category.
In an ORDBMS, the table "class" containing the PK becomes the domain for a relation, and the set of all tables (and their sub-classes) whose tuples include the corresponding FK is equivalent to the range for the relation. Of course, the underlying formalism provides no explicit support for algorithmic manipulation or interpretation of the semantic entailments of such relation(s). This is where the model-theoretic underpinnings of OWL certainly provide considerably more support for this activity.

Even outside the ORDBMS frameworks, one can provide SQL DDL models where relations are "qualified". Without such modeling patterns, it would be impossible to represent the full expressiveness of MeSH or UMLS in an RDBMS backend. These implementations in an RDBMS framework, however, tend to get very complex and brittle and require specialized RDBMS skills to implement effectively. They can also be MUCH more complicated to access and manipulate when using a particular language to access the data stored in such models. I do think one can argue the standard tools growing up around RDF & OWL provide a much more powerful, less fragile, and ultimately less complicated (at least measured in lines of code) means to manipulate the semantic assertions & entailments expressed in the underlying data relations.

There is also the issue of "directionality" that you bring up, which to my mind is explicitly defined both for XML graphs and relational systems, but I think you mean to capture more than simply the directionality of a semantic entailment in this argument re: use of domain & range.

6) RDFS and/or OWL compared to XML Schema & SQL DDL
Chris is definitely correct here. Even if you don't go into the details, these are the correct, more specific comparisons to be making in terms of the inherent ability of these formalisms to explicitly represent semantic assertions and entailments.
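
To make the kind of comparison discussed in point 6 concrete, here is a minimal sketch of a class-subsumption example. It is only an illustration under stated assumptions: the table names (`cell`, `neuron`), the `ex:` identifiers, and the helper `rdfs_closure` are all hypothetical, and the closure function hand-rolls just two RDFS entailment rules (rdfs9 and rdfs11) rather than using any real reasoner. The relational half shows subsumption encoded as a PK/FK convention that a person has to know about; the RDFS half shows the same fact stated in the data and derived by a generic rule.

```python
import sqlite3

# Relational side (hypothetical schema): "a neuron is a cell" is encoded
# only by the convention that neuron.id is a foreign key into cell.id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cell   (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
    CREATE TABLE neuron (id INTEGER PRIMARY KEY REFERENCES cell(id));
    INSERT INTO cell   VALUES (1, 'cell_a');
    INSERT INTO neuron VALUES (1);
""")
# The join below is written by someone who knows that convention;
# nothing in the DDL itself labels the FK as a subclass relation.
neuron_labels = [row[0] for row in conn.execute(
    "SELECT c.label FROM neuron n JOIN cell c ON c.id = n.id")]

# RDFS side: the subsumption is an explicit statement in the data.
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"


def rdfs_closure(triples):
    """Apply RDFS rules rdfs9 and rdfs11 until no new triples appear."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in inferred:
            if p != SUBCLASS:
                continue
            for s2, p2, o2 in inferred:
                # rdfs11: subClassOf is transitive.
                if p2 == SUBCLASS and s2 == o:
                    new.add((s, SUBCLASS, o2))
                # rdfs9: instances of a subclass are instances of the superclass.
                if p2 == TYPE and o2 == s:
                    new.add((s2, TYPE, o))
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


triples = {
    ("ex:PurkinjeCell", SUBCLASS, "ex:Neuron"),
    ("ex:Neuron", SUBCLASS, "ex:Cell"),
    ("ex:cell_a", TYPE, "ex:PurkinjeCell"),
}
closure = rdfs_closure(triples)
```

In this sketch `rdfs_closure` knows nothing about cells or neurons, whereas the SQL join had to be written against one particular schema; that is roughly the difference in "computability" of the semantics being argued here, though a real comparison would of course need a fuller example.
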
It would also be useful to be more explicit regarding both the expressivity and computability of semantic assertions encoded using XML Schema, RDBMS formalisms, ORDBMS formalisms, and systems that convolve XML & RDBMS together. When compared with the formalism and tools provided for performing these same tasks with RDF & OWL, one would hope the result of such a comparison would strongly indicate RDF & OWL provide a significant advantage when representing real-world entities in a semantically meaningful way.

Sorry - I've only had a brief moment to capture some of these thoughts. The idea is to follow up on Chris's suggestion there is a need to do more to define "the strength of the OWL/RDF approach (over) a traditional XML or SQL approach". XML "databases", ORDBMS, Semantic Data Models - these are all tools likely to be cited as addressing some of the requirements for handling semantically qualified data, and it's worth placing them in these arguments somewhere.

Hope this helps a little - and doesn't make things worse.

Cheers,
Bill

On Mar 27, 2007, at 8:19 AM, John Barkley wrote:

>
> chris,
>
> I appreciate your comments, and I agree that if the demo is to show
> the superiority of the semantic web approach, then that section
> should be more carefully worded. I was trying to create something
> that would be (reasonably) readable by RDB and XML practitioners
> who are likely not to appreciate subtleties of differences. I will
> try to redo the section.
>
> jb
>
>
> ----- Original Message -----
> From: "Chris Mungall" <cjm@fruitfly.org>
> To: <jbarkley@nist.gov>
> Cc: <public-semweb-lifesci@w3.org>
> Sent: Monday, March 26, 2007 11:06 AM
> Subject: Re: updated updated bams model
>
>
>>
>> I have some comments on:
>> http://esw.w3.org/topic/HCLS/HCLSIG_DemoHomePage_HCLSIG_Demo#head-50710462ea5aac416fd063dce8621ce0354d2d5a
>>
>>> Formal Definition of Semantics
>>>
>>> OWL and RDF have a formal definition for the semantics of an
>>> OWL/RDF knowledge base, i.e., given a knowledge base, associated
>>> semantics are primarily provided explicitly within the knowledge
>>> base itself. Commonly used database languages, e.g., XML and
>>> relational database (RDB), have at most a semi-formal definition.
>>
>> XML is a way of standardising syntax, not semantics. XML isn't a
>> database language, I'm not sure why it's classified as such here.
>>
>> It's not quite correct to state that an RDB (which is not a
>> database language either) has only a semi-formal definition. The
>> strength of the relational model is precisely the formal
>> definition - either as relational algebra or relational calculus.
>> How much more formal do you want?
>>
>> Of course, existing databases use various extensions to the
>> relational model, and, regrettably, departures from it. But this
>> may well be the case for practical OWL/RDF implementations. I
>> think it's a fairly minor point, and not something you want to
>> base your argument on.
>>
>>> XML is a grammar writing system with no defined relationship
>>> between a given schema and its semantic meaning. An XML schema
>>> is simply a grammar. Any semantics represented by that schema
>>> and its associated documents are specified external to those
>>> representations, e.g., in documentation.
>>>
>>> RDB has more than one semi-formal definition, e.g., the ISO
>>> Standard SQL [sql].
>>
>> You state there is >1 formal definition, give the SQL standard as
>> an example of one - can you give an example of another? Perhaps
>> you mean successive iterations of the SQL standard? Again,
>> variations from this are relatively minor. Relational algebra
>> precedes the ISO SQL standard and forms the basis for all
>> relational databases.
>>
>>> Thus, given an RDB schema and repository, it is not possible to
>>> know from those which definition of semantics, if any, was used.
>>> In common use, a given RDB database and repository may make use
>>> of no semi-formal definition of semantics or borrow from
>>> several different ones.
>>
>> What is a repository in this context?
>>
>>> Like XML, other means, such as, documentation, external to the
>>> schema and repository describes the semantics.
>>
>> So OWL/RDF dispenses with documentation?
>>
>>> For example, consider how a relation between two sets would be
>>> represented in OWL/RDF, XML, and RDB. In OWL/RDF, the semantics
>>> of a relation is formally defined similar to the mathematical
>>> definition, i.e., as a subset of the cross product of the domain
>>> and range. Because the relation is a cross product, it has a
>>> direction. An element of the domain is related to an element of
>>> the range, but not necessarily the other way around. In an XML
>>> schema, there are many different ways of representing a relation
>>> using elements, subelements, and attributes. Similarly, in an
>>> RDB schema, depending on which semi-formal definition of RDB
>>> semantics is used, there are multiple ways to represent a
>>> relation. How a relation is represented in an XML or RDB
>>> schema/repository can only be known external to the
>>> schema/repository itself.
>>
>> I'm afraid I can't make head nor tail of this.
>>
>> "In OWL/RDF, the semantics of a relation is formally defined
>> similar to the mathematical definition, i.e., as a subset of the
>> cross product of the domain and range."
>>
>> Actually, I think you are talking about mathematical functions,
>> not relations. As OWL/RDF is restricted to binary relations the
>> terminology of functions makes sense (ie we can call the first
>> argument domain the domain, and the second the range)
>>
>> So you seem to be stating a strength of OWL/RDF is that you can
>> state the domain and range of a relation? Note that in the
>> relational model you can of course state the domain of every
>> argument of the relation.
>>
>> "Because the relation is a cross product, it has a direction. An
>> element of the domain is related to an element of the range, but
>> not necessarily the other way around"
>>
>> Can you elaborate on this? I don't understand this at all.
>>
>> "in an RDB schema, depending on which semi-formal definition of
>> RDB semantics is used, there are multiple ways to represent a
>> relation"
>>
>> ??
>>
>> Are we talking about mathematical relations? As far as I
>> understand this, this is simply false. Using the relational model
>> you would represent a relation using, ummm, a relation. A
>> relation is the cross-product of the domains of each argument. It
>> would seem that an RDB relation is much closer to a mathematical
>> relation than the OWL/RDF equivalent. (For one thing, there is no
>> restriction to binary relations forcing use of n-ary patterns).
>> This is true for all RDBs, even ones that fall short of the ideal
>> relational model. Can you give an example of two different
>> definitions of RDB semantics that would give different answers here?
>>
>>
>> If this demo is to convince people of the strength of the OWL/RDF
>> approach as opposed to a traditional XML or SQL approach, then
>> this section needs some work.
>>
>> I would not lump XML in with the relational model - the
>> relational model has more in common with logic-based approaches
>> than with XML (it's unfortunate for both camps they do not yet
>> have more in common)
>>
>> I think it would be more appropriate to compare and contrast the
>> expressivity of, say, XML Schema with OWL than, say, XML with
>> OWL/RDF. Make sure you are comparing like with like. Similarly, I
>> would compare the expressivity of standard SQL DDL with OWL,
>> perhaps using an example - e.g. a simple one with class
>> subsumption. If you're going to use the term semantics, give a
>> definition. Note that both relational algebra and OWL's model
>> theoretic semantics are rock-solid and formal (I'll leave others
>> to comment on the semantics of OWL layered on RDF/RDFS).
>>
>> I think the point you want to make is that OWL (arguably) provides
>> a more expressive (and perhaps agile?) framework for
>> representations of real-world entities. Although you
>> simultaneously seem to be making the case for RDF too, which
>> makes your task harder.
>>
>> Cheers
>> Chris
>>
>

Bill Bug
Senior Research Analyst/Ontological Engineer

Laboratory for Bioimaging & Anatomical Informatics
www.neuroterrain.org
Department of Neurobiology & Anatomy
Drexel University College of Medicine
2900 Queen Lane
Philadelphia, PA 19129
215 991 8430 (ph)
610 457 0443 (mobile)
215 843 9367 (fax)

Please Note: I now have a new email - William.Bug@DrexelMed.edu
Received on Tuesday, 27 March 2007 14:05:04 UTC