- From: Rick Jelliffe <ricko@gate.sinica.edu.tw>
- Date: Mon, 8 Nov 1999 20:31:44 +0800
- To: <w3c-wai-er-ig@w3.org>, <w3c-wai-pf@w3.org>
Al Gilman kindly asked me to provide a sketch of my Schematron tree-pattern language for this group. I have picked WAI as a good demonstration of some issues; I also hope it may be genuinely useful. The Schematron is the result of more than 10 years' involvement with DTDs, which culminated in a book, and also of various discussions related to the XML Schema effort.

DTDs have three admirable properties: 1) they are terse; 2) they are elegant (the idea of treating a document as a grammar is brilliant); 3) they allow many different data-modeling methodologies on top of them. I have tried to follow this example with the Schematron.

The problems with DTDs are: 1) they do not allow specific, custom error messages; 2) not all useful structures can be modeled using a formal grammar (unfortunately, XML Schemas is taking the grammar track too); 3) regular grammars are too complicated for ordinary users, and they lend themselves to unruly nested forms for which it is difficult to construct user interfaces.

The Schematron seems to be a unique XPath system which can be trivially implemented on top of XSL systems. It is not a transformation language; indeed, it hides XSL as much as possible while exposing as much XPath as possible. Its predominant use is for detecting tree-patterns in a document: the particular use made of the detected patterns (which often indicate the *absence* of a complete pattern) is entirely application dependent. Friendly document validation is thus the foremost application, though automatic tools for creating RDF based on the patterns can be made too.

The basic organization is this:

* A schema is made from patterns. All patterns are checked in parallel, as far as the user is concerned. I am putting in code so that the user can select certain patterns only, to avoid being flooded, if that is a problem.
Contrast this with DTDs, where one early error makes later error reports unreliable, so a DTD user cannot concentrate on fixing problems in some logical sequence: they must fix the document in the sequence in which the errors are reported.

* A pattern contains rules. A single element in a document can only match one rule; the same rule may match against many elements. An XPath is used to determine the context in the document. For example, I can say "find me every table row that is not the first table row in all tables":

    <rule context="tr[position() > 1]">

* Each rule then has multiple <assert> or <report> statements. All are tested. These have XPath expressions, which allow matching some criteria starting from the current context.

* There are other nice bits under development. Schematron is graph-aware: it can currently follow an ID/IDREF link and has the code in place for more general keys (when the underlying XSL implementation supports this). It will also soon have groups to allow variant documents and workflows.

The distinction between a Schematron schema and a DTD is that a DTD tries to fit everything into a grammar. A Schematron schema is based on the idea that there are other kinds of general patterns in a document; sometimes these patterns may relate not to meaning but to usage. But best practice should not be a second-class issue!

I like to think in terms of "definitional schemas" versus "usage schemas". A definitional schema answers the question "what is this element or attribute or record?" while a usage schema answers the question "what constraints are imposed on this data by its context?" WAI is a usage schema issue, and Schematron lends itself to usage schema definitions.

The Schematron can be distinguished from MIX, THETIS and Strudel in that these are all concerned with the definition of fairly atomic elements for the purposes of querying, rather than with making assertions about complex structures.
(A Schematron-like language could be implemented on top of Strudel, though.) Furthermore, those systems are based on database queries or logic, which is over-engineering for the simple needs of usage schemas. The only system that is close in spirit to Schematron is Assertion Grammars, by the W3C's Dave Raggett.

I will be upgrading the WAI guideline application soon, and Al has raised some interesting issues (i.e., that repair is important).

The Schematron home page is at
http://www.ascc.net/xml/resource/schematron/schematron.html

I hope this is of some use to you. I don't think that the W3C Schema language will provide much help in allowing you to formally express some of the WAI constraints. If there is continued interest, I may put Schematron forward as a technical note. I will be very happy if it is of some use to the WAI.

Rick Jelliffe
Academia Sinica (W3C Member)
w3c-i18n-ig member
w3c-xml-schema-wg member

P.S. Hi Judy B: we met in Hong Kong at the APWeb'99 conference and had lunch.
P.P.S. Hi Jason W: do you remember me? We emailed a few years ago.
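
To make the schema / pattern / rule organization described in this message a little more concrete, here is a rough sketch of what a small WAI-flavoured usage schema might look like. Only the <rule context="..."> form and the <assert> and <report> element names come from the message itself. The <schema> and <pattern> wrappers, the "name" and "test" attributes, and the convention that an assert speaks when its test fails while a report speaks when its test succeeds follow later published Schematron documentation, and should be read as assumptions about the version discussed here. The two checks themselves are invented purely for illustration.

```xml
<!-- Illustrative sketch only. The "name" and "test" attributes and the
     message wording are assumptions, not taken from the message above. -->
<schema>

  <!-- One pattern per concern; patterns are checked independently. -->
  <pattern name="Images have text equivalents">
    <!-- The context XPath selects the elements this rule applies to. -->
    <rule context="img">
      <!-- assert: the message is issued when the test is false. -->
      <assert test="@alt">An img element should have an alt attribute.</assert>
      <!-- report: the message is issued when the test is true. -->
      <report test="@alt and normalize-space(@alt) = ''">This img has an empty alt attribute; check that the image is purely decorative.</report>
    </rule>
  </pattern>

  <pattern name="Table rows are regular">
    <!-- Every table row that is not the first row of its table. -->
    <rule context="tr[position() > 1]">
      <!-- Compare this row's cell count with that of the first row. -->
      <assert test="count(td | th) = count(../tr[1]/td | ../tr[1]/th)">A later table row should have as many cells as the first row.</assert>
    </rule>
  </pattern>

</schema>
```

Because each message is attached to a single test, the author of the schema chooses the wording that a document author sees, which is the kind of specific, custom error message that a DTD cannot provide.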
Received on Monday, 8 November 1999 07:32:41 UTC