Copyright © 2003 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply.
This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.
This is a public (WORKING DRAFT) Working Group Note produced by the W3C Semantic Web Best Practices Working Group, which is part of the W3C Semantic Web activity.
Discussion of this document is invited on the public mailing list www-ws-arch@w3.org (public archives).
Publication as a Working Group Note does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
Please send comments to either Phil Tetlow, IBM, <philip.tetlow@uk.ibm.com> or Jeff Pan, Manchester University, <pan@cs.man.ac.uk>
In all well-established engineering disciplines, modelling a common understanding of domains through a variety of formal and semi-formal notations has proven itself essential to advancing the practice in each such line of work. This has led to large sections of the Software Engineering profession evolving from the concept of constructing models of one form or another as a means to develop, communicate and verify abstract designs in accordance with original requirements. Computer Aided Software Engineering (CASE) and, more recently, Model Driven Architectures (MDA) provide the most prominent examples of this approach. Here models are not only used for design purposes, but associated tools and techniques can be utilised further to generate executable artefacts for later use in the Software Lifecycle. Nevertheless, there has always been a frustrating paradox present with tooling use in Software Engineering. This arises from the range of modelling techniques available and the breadth of systems requiring design: engineering nontrivial systems demands rigour and unambiguous statement of concept, yet the more formal the modelling approach chosen, the more abstract the tools needed, often making methods difficult to implement, limiting the freedom of expression available to the engineer and proving a barrier to communication amongst practitioners with lesser experience. For these reasons less formal approaches have seen mainstream commercial acceptance in recent years, with the Unified Modelling Language (UML) currently being the most favoured amongst professionals.
Even so, approaches like the UML are by no means perfect. Although they are capable of capturing highly complex conceptualisations, current versions are far from semantically rich. Furthermore they can be notoriously ambiguous. A standard isolated model from such a language, no matter how perfect, can still be open to gross misinterpretation by those who are not overly familiar with its source problem space. It is true that supporting annotation and documentation can help alleviate such problems, but traditionally this has still involved a separate, literal, verbose and longwinded activity often disjointed from the production of the actual model itself. Furthermore, MDA does not currently support automated consistency checking.
What is needed instead is a way to incorporate unambiguous, rich semantics into the various semi-formal notations underlying methods like the UML. In so doing, the ontologies inherent to a system's problem space - real world or not - and its various abstract solution spaces could be encapsulated through the very same representations used to engineer its design. This would not only provide a basis for improved communication, conformance verification and automated generation of run-time artefacts, but would also present additional mechanisms for establishing the consistency of deliverables across specification, design and build processes.
In many respects an ontology can be considered as simply a formal model in its own right. Hence, given the semantically rich, unambiguous qualities of information embodiment on the Semantic Web, and the universality of the Semantic Web's XML heritage, there appears a compelling argument to combine the semi-formal, model driven techniques of Software Engineering with approaches common to Information Engineering on the Semantic Web. This may involve the implanting of descriptive ontologies directly into systems' design models themselves, the referencing of separate metadata artefacts by such models or a mixture of both. What is important is that mechanisms are made available to enable cross-referencing and checking between design descriptions and related ontologies in a manner that can be easily engineered and maintained for the betterment of systems' quality and cost.
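As a minimal sketch of such cross-referencing, assume that a design model can be reduced to a mapping from design class names to the ontology concepts they claim to realise, and that an ontology can be reduced to a set of subject-predicate-object triples. Every name below (the `ex:` terms, `DESIGN_MODEL`, `unresolved_references`) is hypothetical and chosen only for illustration; no particular UML tool or RDF library is implied.

```python
# Ontology reduced to a set of (subject, predicate, object) triples.
# The "ex:" terms are hypothetical examples, not part of any real vocabulary.
ONTOLOGY = {
    ("ex:Customer", "rdf:type", "owl:Class"),
    ("ex:Order", "rdf:type", "owl:Class"),
    ("ex:places", "rdf:type", "owl:ObjectProperty"),
}

# Design model reduced to: design class name -> ontology concept it realises.
DESIGN_MODEL = {
    "CustomerAccount": "ex:Customer",
    "PurchaseOrder": "ex:Order",
    "Invoice": "ex:Invoice",  # deliberately not declared in ONTOLOGY
}


def declared_classes(triples):
    """Return every subject that the ontology declares to be a class."""
    return {s for (s, p, o) in triples if p == "rdf:type" and o == "owl:Class"}


def unresolved_references(design_model, triples):
    """List design classes whose referenced concept is not a declared class."""
    known = declared_classes(triples)
    return sorted(
        name for name, concept in design_model.items() if concept not in known
    )


if __name__ == "__main__":
    print(unresolved_references(DESIGN_MODEL, ONTOLOGY))  # ['Invoice']
```

A check of this kind could run whenever either the model or the ontology changes, so that dangling references are reported at design time rather than discovered later in the lifecycle.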
Such mechanisms should be capable of supporting both the interlinking of more broadly related ontologies into grander information corpuses (thereby implying formal similarities and relationships between discrete systems through their design description metadata), and the transformation of design-time ontological artefact relationships into useful runtime bindings. This will, therefore, realise metadata use across a broader spectrum of the software lifecycle. In so doing, this approach carries two obvious implications for Web-based systems employing such techniques:
Having raised the above propositions, a commonly asked question arises, namely: how does one broadly characterise the Semantic Web in terms of Software or Systems' Engineering use? In attempting to answer this question, consensus appears to be forming around two loose definitions:
Primarily such tools and techniques should be viewed as being formally descriptive in character, but there appears little reason to restrict this definition other than standards alignment. Therefore, it may also be relevant, at some appropriate point in the Semantic Web's future, to include prescriptive, invasive and/or other types of approach under this heading.
In such circumstances the Semantic Web can be viewed as a single formalised corpus of interrelated, reusable ontological content, which can further be classified as being either:
By suggesting use of the Semantic Web as a framework for runtime component sharing there is an implicit need to provide means for clearly identifying participating components based on aggregates of characterising semantic properties (name-pair/predicate-object values), and this differs from current Semantic Web schemes for unique identification (e.g. FOAF sha1). In such frameworks the Semantic Web can be seen as a true global relational database and, as with every relational model, issues dealing with composite object identification have to be addressed. This leads to the view that:
This illustrates that the establishment of accurate context is essential if open relational association is to work properly on the Semantic Web. In making such a statement it is acknowledged that the Semantic Web still faces a number of well known issues when attempting to implement public mechanisms for artefact sharing:
As highlighted above, it is also important to note that schemas now exist to provide adequate descriptions for both traditional passive and active content for use as runtime artefacts. So the notion of the Semantic Web as a global database and a usable artefact itself embodies both descriptive and functional notions. Furthermore, given that the concept of runtime sharing has been raised, it is also important to recognise that the querying, discovery and binding of components at run time does not necessarily have to be static, leading to the, currently extreme, notion of dynamically formed 'runtime Semantic Web systems'. This appears to concur with the concepts of 'Semantic Grid Computing' and dynamic Web Service choreography currently gaining popularity in other IT and standards communities.
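The idea of identifying components by aggregates of predicate-object values, and of discovering them at run time rather than binding to them statically, can be sketched as follows. This is an illustration under stated assumptions only: the registry, the component names and the `ex:` terms are all hypothetical, and a real system would query published metadata rather than an in-memory dictionary.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

Pair = Tuple[str, str]  # a (predicate, object) pair describing a component

# Hypothetical registry: component name -> its characterising properties.
REGISTRY: Dict[str, FrozenSet[Pair]] = {
    "component-a": frozenset({
        ("ex:provides", "ex:CurrencyConversion"),
        ("ex:protocol", "ex:SOAP"),
    }),
    "component-b": frozenset({
        ("ex:provides", "ex:CurrencyConversion"),
        ("ex:protocol", "ex:HTTP"),
    }),
    "component-c": frozenset({
        ("ex:provides", "ex:AddressLookup"),
        ("ex:protocol", "ex:HTTP"),
    }),
}


def discover(required: Set[Pair], registry: Dict[str, FrozenSet[Pair]]) -> List[str]:
    """Return all components whose properties include every required pair."""
    return sorted(name for name, props in registry.items() if required <= props)


def identify(required: Set[Pair], registry: Dict[str, FrozenSet[Pair]]) -> str:
    """Return the single matching component, acting like a composite key.

    Raises LookupError when the aggregate matches no component or several,
    i.e. when the supplied properties do not establish enough context.
    """
    matches = discover(required, registry)
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match, found {len(matches)}")
    return matches[0]


if __name__ == "__main__":
    conversion = {("ex:provides", "ex:CurrencyConversion")}
    print(discover(conversion, REGISTRY))  # ['component-a', 'component-b']
    print(identify(conversion | {("ex:protocol", "ex:HTTP")}, REGISTRY))  # component-b
```

The first query is ambiguous on its own; adding a second property narrows it to one component, which mirrors the composite identification problem familiar from relational models.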
It is important to note that the concept of applying Semantic Web technologies and techniques in Systems' and Software Engineering does not solely arise from any desire to increase semantic expressiveness or to access a global resource pool. The amalgamation of existing engineering practices with methods capable of semantic ontology representation adds significant potential for increased model formality in a number of previously weak areas. Semantic Web technologies provide strong mechanisms for first-order predicate logic representation and are capable of second-order logic in keeping with triple notation, and this adds a number of advantages specifically in areas of conformance and consistency checking. Furthermore, with the aid of graphical modelling tools, levels of formal logic that are currently uncommon in everyday practice can be achieved with relative ease.
Many, however, would argue that such approaches have been tried a number of times before with only limited success. This may indeed be true, but it is important to remember that past attempts have always been isolated to some degree. Standards-based ontology representation targeted at hugely open problem spaces, such as the Web, is, however, a new concept and deliberately sets out to remove isolated problem solving from the equation. It not only offers a number of distinct technical advantages, but is also available to a hitherto unprecedented global development community. Furthermore, this community is steeped in a tradition of free and open knowledge exchange and source distribution. If the history of the Web to date is to be used as a benchmark, this community will eventually produce a groundswell of support and enough impetus to kick-start a number of revolutionary changes in systems and software engineering. To recognise this potential and provide early food for thought is hence seen as a significantly worthy initiative.
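To make the consistency-checking point concrete, the following sketch detects one simple kind of inconsistency over triples: an individual asserted to belong to two classes that are declared disjoint. It is an illustrative assumption rather than a description of any particular reasoner; the `ex:` terms and function names are hypothetical, and no inference (for example over subclass relationships) is attempted.

```python
# Hypothetical design facts expressed as (subject, predicate, object) triples.
TRIPLES = {
    ("ex:Interface", "owl:disjointWith", "ex:ConcreteClass"),
    ("ex:Logger", "rdf:type", "ex:Interface"),
    ("ex:Logger", "rdf:type", "ex:ConcreteClass"),  # contradicts disjointness
    ("ex:Parser", "rdf:type", "ex:ConcreteClass"),
}


def types_of(triples):
    """Map each individual to the set of classes it is asserted to belong to."""
    result = {}
    for s, p, o in triples:
        if p == "rdf:type":
            result.setdefault(s, set()).add(o)
    return result


def disjointness_violations(triples):
    """Return (individual, class_a, class_b) for each violated disjointness."""
    types = types_of(triples)
    violations = []
    for a, p, b in triples:
        if p != "owl:disjointWith":
            continue
        for individual, classes in types.items():
            if a in classes and b in classes:
                violations.append((individual, a, b))
    return sorted(violations)


if __name__ == "__main__":
    for individual, a, b in disjointness_violations(TRIPLES):
        print(f"{individual} is typed as both {a} and {b}")
```

Only directly asserted types are compared here; a production-quality check would normally delegate to an ontology reasoner.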