- From: <jbarkley@nist.gov>
- Date: Sun, 18 Feb 2007 09:40:17 -0500
- To: public-semweb-lifesci@w3.org
- Cc: jbarkley@nist.gov
As requested, enclosed below in wiki format is the initial text for the "Benefits" section of http://esw.w3.org/topic/HCLS/HCLSIG_DemoHomePage_HCLSIG_Demo for tomorrow's telcon. I couldn't put the text on the wiki page because the page is marked "immutable", as are several other wiki pages that I sampled. I don't know why; I had used the wiki edit capability as recently as last Tuesday.

jb

-----------------------------------------------------------------------

=== Benefits of the Semantic Web Illustrated in the Demonstration ===

==== Knowledge Representation ====

The Semantic Web languages OWL and RDF enable knowledge representation by means of a knowledge base, i.e., a knowledge repository, not just a database. A knowledge base consists of two parts: an ontology and individuals, i.e., instances of classes in the ontology. There are two fundamental differences between a knowledge base and a database: formal definition of semantics and automated reasoning. The mathematical rigor of the OWL/RDF formal definition of semantics and of automated reasoning comes from Description Logic [dl handbook].

===== Formal Definition of Semantics =====

OWL and RDF have a formal definition for the semantics of an OWL/RDF knowledge base, i.e., given a knowledge base, the associated semantics are primarily provided explicitly within the knowledge base itself. Commonly used database languages, e.g., XML and relational database (RDB), have at most a semi-formal definition.

XML is a grammar writing system with no defined relationship between a given schema and its semantic meaning. An XML schema is simply a grammar. Any semantics represented by that schema and its associated documents are specified external to those representations, e.g., in documentation.

RDB has more than one semi-formal definition, e.g., the ISO Standard SQL [sql]. Thus, given an RDB schema and repository, it is not possible to know from those which definition of semantics, if any, was used.
In common use, a given RDB database and repository may make use of no semi-formal definition of semantics or borrow from several different ones. As with XML, other means external to the schema and repository, such as documentation, describe the semantics.

For example, consider how a relation between two sets would be represented in OWL/RDF, XML, and RDB. In OWL/RDF, the semantics of a relation are formally defined in a way similar to the mathematical definition, i.e., as a subset of the cross product of the domain and range. Because the relation is a set of ordered pairs drawn from that cross product, it has a direction: an element of the domain is related to an element of the range, but not necessarily the other way around. In an XML schema, there are many different ways of representing a relation using elements, subelements, and attributes. Similarly, in an RDB schema, depending on which semi-formal definition of RDB semantics is used, there are multiple ways to represent a relation. How a relation is represented in an XML or RDB schema/repository can only be known from sources external to the schema/repository itself.

===== Automated Reasoning =====

Given that a knowledge base is represented in OWL/RDF, it becomes amenable to automated reasoning for the purpose of validating and augmenting the knowledge represented. There are three reasoning tasks that can be automated (see section 2.2 of [dl handbook]):

 * Satisfiability: ensures that every defined class can be non-empty.
 * Subsumption: determines class hierarchy.
 * Consistency: identifies class membership for individuals.

These tasks help validate that there are no contradictions in class definitions and class membership. RDB has the capability to define constraints on data within tables in the database. However, there is no capability for automatically checking for contradictions within the set of constraints. With OWL/RDF, satisfiability helps ensure that there are no contradictions among classes (see [nistir] for a simple example).
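As a rough illustration of the satisfiability task, the Python sketch below checks a toy set of class definitions for contradictions. It is not a Description Logic reasoner and does not parse OWL: each defined class is assumed to be a plain intersection of named classes, and the only source of contradiction is a declared disjointness. All class names and the functions `is_satisfiable` and `unsatisfiable_classes` are hypothetical, made up for this example.

```python
from itertools import combinations

# Hypothetical toy ontology: pairs of named classes declared disjoint.
DISJOINT = {frozenset({"Protein", "Gene"})}

# Hypothetical class definitions: each defined class is the intersection
# of the named classes listed for it.
DEFINITIONS = {
    "Kinase": {"Protein", "Enzyme"},
    "GeneProduct": {"Protein", "Gene"},  # contradictory by construction
}


def is_satisfiable(parts, disjoint=DISJOINT):
    """Return False if the intersection contains two classes declared disjoint."""
    return not any(
        frozenset(pair) in disjoint for pair in combinations(sorted(parts), 2)
    )


def unsatisfiable_classes(definitions=DEFINITIONS, disjoint=DISJOINT):
    """List the defined classes that can never have a member."""
    return sorted(
        name
        for name, parts in definitions.items()
        if not is_satisfiable(parts, disjoint)
    )


if __name__ == "__main__":
    print(unsatisfiable_classes())
```

In this toy model `GeneProduct` is reported because it intersects two classes declared disjoint. A real reasoner handles far richer class constructors than this sketch does.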
In addition, if a query is modeled as a class definition, satisfiability ensures that it is possible for that query to return results. RDBs have no automated tools to check that a given query does not contradict constraints on data within tables. A query that contradicts constraints on data within tables will never return anything.

These reasoning tasks also augment the knowledge in the knowledge base. Subsumption computes the class hierarchy for those classes whose position in the hierarchy was not explicitly specified. For those individuals whose class membership was not fully specified, consistency places each individual in the classes of which it is a member. Not only do these reasoning tasks augment knowledge in the knowledge base, they also help ensure the knowledge base's validity. Following the subsumption and consistency tasks, erroneous and unintended class definitions can usually be identified.

For a knowledge base represented in OWL DL, these reasoning tasks are always decidable and fully automated. Not all knowledge is representable in OWL (see section 5.4 of [horrocks]). Furthermore, as a result of the open world semantics of OWL, some knowledge/information constructs are difficult or impossible to represent in OWL (see section 2.2.4.4 of [dl handbook] and [alan]).

Currently, Semantic Web reasoning tools are only capable of fully applying these three reasoning tasks to knowledge bases that can reside in memory. Knowledge bases of that size (the limit can be as small as tens of thousands of triples) are not sufficient for most applications, which require at least hundreds of thousands of triples. For those applications, there are several tools available that enable query inferencing on the full knowledge base, e.g., [sesame], [oracle spatial]. Full reasoning can usually be applied to the ontology itself (perhaps with some sample of individuals) so that some level of validation and augmentation of the knowledge base can be automatically applied.
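The way subsumption and class membership augment a knowledge base can be sketched in a few lines of Python. This is a deliberately simplified model, not how an OWL DL reasoner works: each class is assumed to be a conjunction of atomic features, so one class subsumes another exactly when its required features are a subset of the other's. The class names, individuals, features, and function names below are all hypothetical.

```python
# Hypothetical defined classes: each is a conjunction of atomic features.
CLASSES = {
    "Protein": {"protein"},
    "Enzyme": {"protein", "catalytic"},
    "Kinase": {"protein", "catalytic", "phosphorylates"},
}

# Hypothetical individuals with their asserted features.
INDIVIDUALS = {
    "p53": {"protein"},
    "CDK2": {"protein", "catalytic", "phosphorylates"},
}


def subsumes(general, specific, classes=CLASSES):
    """general subsumes specific if all features it requires are required by specific."""
    return classes[general] <= classes[specific]


def class_hierarchy(classes=CLASSES):
    """Map each class to the sorted list of other classes that subsume it."""
    return {
        c: sorted(d for d in classes if d != c and subsumes(d, c, classes))
        for c in classes
    }


def memberships(individuals=INDIVIDUALS, classes=CLASSES):
    """Map each individual to every class whose required features it has."""
    return {
        name: sorted(c for c, required in classes.items() if required <= features)
        for name, features in individuals.items()
    }


if __name__ == "__main__":
    print(class_hierarchy())
    print(memberships())
```

None of the superclass links or memberships are stated in the toy data; both are computed from the definitions, which is the sense in which these tasks augment the knowledge base. An unexpected entry in either result would point to an erroneous or unintended class definition.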
==== Knowledge Integration ====

Semantic Web technology enables the integration of knowledge bases with each other, and with datasets represented using non-Semantic Web technologies. Integrating such resources requires that the semantics represented within them be harmonized. With Semantic Web technology, the ontology is the method used to harmonize semantics associated with different resources. For some examples, see [eric], chapter 12 in [kei book], [susie], [kei paper], [scott], [kei book].

Integrating datasets using XML is difficult and not consistent with the very dynamic nature of some applications. See [xiaoshu] for details.

With RDB, integrating datasets can be accomplished by creating queries which access tables in different schemas located in disparate databases. The queries become the "ontology" which does the integration. Accomplishing the integration may also require new tables and constraints that relate concepts and terms. As is the case with XML, this process is problematic, and as previously described, neither formal definition of semantics nor automated reasoning is available.

[dl handbook] Baader, F., Calvanese, D., McGuinness, D.L., Nardi, D., Patel-Schneider, P.F. The Description Logic Handbook: Theory, Implementation, and Application (Cambridge University Press, 2003).

[sql] Database Language – SQL, ISO 9075, 1992.

[nistir] John Barkley, Using Semantic Web Methods to Improve Information Resource Quality, NIST Internal Report 7354, September 2006. http://www.itl.nist.gov/div897/staff/barkley/consistency-validation-OWLvXML-RDB-9-29-06.pdf

[horrocks] Horrocks, I., Patel-Schneider, P. F., van Harmelen, F. From SHIQ and RDF to OWL: The making of a web ontology language. J. of Web Semantics, 1(1):7-26, 2003. http://www.cs.man.ac.uk/~horrocks/Publications/download/2003/HoPH03a.pdf

[alan] Alan Ruttenberg, Jonathan A. Rees, Joanne S. Luciano, Experience Using OWL DL for the Exchange of Biological Pathway Information, OWL: Experiences and Directions Workshop, Galway, Ireland, November 2005. http://www.mindswap.org/2005/OWLWorkshop/sub37.pdf

[sesame] http://www.openrdf.org/

[oracle spatial] Oracle® Spatial Resource Description Framework (RDF), 10g Release 2 (10.2), B19307-03. http://download-east.oracle.com/otndocs/tech/semantic_web/pdf/rdfrm.pdf

[eric] Biodash: Eric K. Neumann and Dennis Quan, A Semantic Web Dashboard for Drug Development, Pacific Symposium on Biocomputing 11:176-187 (2006). http://helix-web.stanford.edu/psb06/neumann.pdf

[susie] Stephens, Susie; LaVigna, David; DiLascio, Mike; Luciano, Joanne. Aggregation of Bioinformatics Data Using Semantic Web Technology. In: Journal of Web Semantics, (4)3, 2006. http://www.websemanticsjournal.org/ps/pub/showDoc.Fulltext/document.pdf?lang=en&doc=2006-15&format=pdf&compression=

[kei paper] Hugo Y.K. Lam, Luis Marenco, Tim Clark, Yong Gao, June Kinoshita, Gordon Shepherd, Perry Miller, Elizabeth Wu, Gwen Wong, Nian Liu, Chiquito Crasto, Thomas Morse, Susie Stephens, and Kei-Hoi Cheung, Semantic Web Meets e-Neuroscience: An RDF Use Case, 43rd Annual Technical Meeting Society of Engineering Science (SES2006). http://www.oracle.com/technology/industries/life_sciences/press/semantic_web_meets_eneuroscience.pdf

[scott] M. Scott Marshall, Lennart Post, Marco Roos, Timo M. Breit, Using semantic web tools to integrate experimental measurement data on our own terms, International Workshop on Knowledge Systems in Bioinformatics (KSinBIT'06), Montpellier, France, 2006. http://integrativebioinformatics.nl/docs/MarshallKSinBIT.pdf

[kei book] Baker, Christopher J.O.; Cheung, Kei-Hoi (Eds.), Semantic Web: Revolutionizing Knowledge Discovery in the Life Sciences (Springer 2007).

[xiaoshu] Xiaoshu Wang, Robert Gorlitsky & Jonas S Almeida, From XML to RDF: how semantic web technologies will change the design of 'omic' standards, Nature Biotechnology 23, 1099-1103 (2005). http://www.nature.com/nbt/journal/v23/n9/pdf/nbt1139.pdf
Received on Sunday, 18 February 2007 14:40:30 UTC