- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Sun, 06 Mar 2005 03:15:27 +0100
- To: www-archive@w3.org
= @@@ =

== Introduction ==

@@ Why it is exceedingly important to have validators, etc. @@

== Adding support for new formats ==

=== Basic Requirements ===

There is little difference between a Validator and ordinary implementations of a format; in order to properly support the format, the applicable conformance requirements must be clearly understood and implemented in a consistent manner. The Validator processes a document basically as follows:

 * Determine all constraints
 * Check for all constraints
 * Present the result
   * all constraints met
   * one or more constraints not met
   * unknown

==== Determine all constraints ====

The first item requires that the document in some way indicates the applicable constraints. In XML, for example, this is done using the XML declaration: <?xml version="1.0"?> (or no XML declaration) for XML 1.0, <?xml version="1.1"?> for XML 1.1. This requires that documents are written such that this information can be derived from the document. If documents do not provide sufficient information, the Validator would need other means to determine the constraints, e.g. by asking the user to specify them. The latter is typically difficult to implement; it is thus recommended that specifications provide means that guarantee that such information can be derived from all conforming documents.

==== Check for all constraints ====

Once the applicable constraints have been determined, these are checked for. This requires an algorithm that translates all possible inputs into a result (conforming: yes/no/unknown) along with a rationale for the result (e.g., a list of errors). This is the most difficult part, and requires that conformance requirements for documents and (thus) Validators are well-defined. Specifications must provide this information in an easily accessible manner.
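The processing steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Validator's actual code: the names `determine_constraints`, `validate`, `Outcome`, `Result`, `no_nul_characters` and `CHECKS` are hypothetical, the pattern for the XML declaration is deliberately loose, and a single toy check stands in for the full set of constraints.

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class Outcome(Enum):
    CONFORMING = "all constraints met"
    NOT_CONFORMING = "one or more constraints not met"
    UNKNOWN = "unknown"


@dataclass
class Result:
    outcome: Outcome
    rationale: list = field(default_factory=list)


# Hypothetical, deliberately loose pattern for the XML declaration.
XML_DECL = re.compile(r'^<\?xml\s+version\s*=\s*["\']([^"\']*)["\']')


def determine_constraints(document):
    """Step 1: return the XML version whose rules apply, or None if unsupported."""
    match = XML_DECL.match(document)
    if match is None:
        return "1.0"  # no XML declaration means XML 1.0
    version = match.group(1)
    return version if version in ("1.0", "1.1") else None


def no_nul_characters(document):
    """Toy check: U+0000 is not a legal character in XML 1.0 or XML 1.1."""
    return ["U+0000 is not allowed"] if "\x00" in document else []


# Hypothetical registry: the checks to run for each set of constraints.
CHECKS = {"1.0": [no_nul_characters], "1.1": [no_nul_characters]}


def validate(document, checks_by_version=CHECKS):
    """Steps 2 and 3: run every check and reduce the verdicts to one result."""
    version = determine_constraints(document)
    if version is None or version not in checks_by_version:
        return Result(Outcome.UNKNOWN, ["cannot determine applicable constraints"])
    errors, undecided = [], []
    for check in checks_by_version[version]:
        verdict = check(document)  # list of errors, or None if undecidable
        if verdict is None:
            undecided.append(check.__name__ + ": could not be checked")
        else:
            errors.extend(verdict)
    if errors:
        return Result(Outcome.NOT_CONFORMING, errors)
    if undecided:
        return Result(Outcome.UNKNOWN, undecided)
    return Result(Outcome.CONFORMING)
```

In this sketch every result carries its rationale, so a negative or unknown outcome can always be explained to the user.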

It is common for conformance requirements to be spelled out in part in formal languages such as EBNF grammars and schema documents. Basing Validator support for a new format on such formal languages is a reasonable way to reduce the effort needed, as support for this partial validation is often already available. It is, however, also common to define constraints that are not expressed in the formal language (e.g., because using the formal language is not possible or because it is too difficult to use it accordingly). For such requirements additional implementation effort is necessary to fully meet the implementation requirements for Validators. It is thus important that these additional requirements are easy to derive from specifications. For example, specifications can provide a list with all requirements that validation using the formal language would not check for. Some existing specifications clearly identify all errors, e.g. [http://www.w3.org/TR/xslt20/#error-summary XSLT 2.0].

==== Present the result ====

It is important that the users of the Validator fully understand the result of the validation process. There are three states. If not all constraints have been met, the result is negative. If the Validator could not find a constraint that has not been met, the result could be positive, but that is not always the case: the result then depends on whether the Validator fully understood the input, and the format might allow some open-ended extensibility that the Validator does not fully support. For example, the format might allow the inclusion of URI references and require that the URI references are constructed according to the URI specification and the requirements of the URI scheme; if the document then uses a scheme that the Validator does not fully support, it cannot tell whether the document is actually conforming. This must then be clearly indicated in the result.
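The URI case above can be sketched as a three-valued check. The sketch assumes a hypothetical registry `SCHEME_CHECKS` of per-scheme checkers; `check_mailto` is a toy rule and not the real mailto syntax, and relative references are simply reported as unknown because this sketch does not handle them.

```python
from urllib.parse import urlsplit


def check_mailto(ref):
    # Toy rule for illustration only: require an "@" in the address part.
    return "@" in urlsplit(ref).path


# Hypothetical registry of the schemes this sketch can fully check.
SCHEME_CHECKS = {"mailto": check_mailto}


def check_uri_reference(ref):
    """Return True (conforming), False (not conforming) or None (unknown)."""
    scheme = urlsplit(ref).scheme
    if not scheme:
        return None  # relative reference: not handled by this sketch
    checker = SCHEME_CHECKS.get(scheme)
    if checker is None:
        return None  # unsupported scheme: no definitive statement possible
    return checker(ref)


def combine(verdicts):
    """Reduce per-reference verdicts to one overall three-state result."""
    verdicts = list(verdicts)
    if any(v is False for v in verdicts):
        return False  # one or more constraints not met
    if any(v is None for v in verdicts):
        return None   # nothing wrong was found, but not everything was checked
    return True       # all constraints met
```

A single unsupported scheme is enough to turn an otherwise positive result into "unknown", while a definite error always takes precedence.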

As discussed later in this document, there might not be a 1:1 relationship between the validation result and actual conformance; in this case it must be clear to the user how to interpret the result, what is checked for and what is not.

=== Conclusion ===

Specifications ideally treat Validators as first-class implementations and define in the specification the conformance requirements for Validators. This should include making it easy for Validator developers to derive conformance requirements that are not checked for using a formal language (like a schema, if any) and allowing users to clearly understand the requirements for and limitations of conforming Validators.

Proper implementation of such requirements depends on proper conformance requirements in the first place. For each protocol element of a format it must be clear how to determine whether it is conforming or not. This might sound obvious, but it is common that specifications do not get all the details right. For example, consider the [http://www.w3.org/TR/SVG11/script.html#ScriptingContentScriptTypeAttribute contentScriptType] attribute in SVG 1.1 and consider these examples:

{{{
contentScriptType = 'application'
contentScriptType = 'application/ecmascript'
contentScriptType = 'application/ecmascript;version=example'
contentScriptType = 'application/ecmascript;foo=bar'
contentScriptType = 'application/x-ecmascript'
contentScriptType = 'application/perlscript'
contentScriptType = 'Hello World'
contentScriptType = '€'
}}}

Refer to [http://www.ietf.org/internet-drafts/draft-hoehrmann-script-types-02.txt the application/ecmascript draft] for the type registration. Take five minutes to determine for each example whether the Validator is licensed to generate an error or a warning.

@@ need to come up with a better example here... or maybe better no example and better prose...
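To make the difficulty concrete, here is a rough sketch of the only part of this check that is easy to automate. It assumes the attribute value is meant to follow the media type syntax of RFC 2045 (`type "/" subtype` followed by optional `attribute=value` parameters); that reading is an assumption, and the function names are hypothetical. Quoted parameter values are not handled.

```python
# Characters RFC 2045 calls "tspecials"; they may not appear in a token.
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(text):
    """True if text is a non-empty run of printable ASCII without tspecials."""
    return bool(text) and all(
        33 <= ord(ch) <= 126 and ch not in TSPECIALS for ch in text
    )


def has_media_type_syntax(value):
    """Syntactic shape only; says nothing about registration or semantics."""
    essence, *parameters = [part.strip() for part in value.split(";")]
    names = essence.split("/")
    if len(names) != 2 or not all(is_token(name) for name in names):
        return False
    for parameter in parameters:
        attribute, separator, param_value = parameter.partition("=")
        if not separator or not is_token(attribute) or not is_token(param_value):
            return False
    return True


EXAMPLES = [
    "application",
    "application/ecmascript",
    "application/ecmascript;version=example",
    "application/ecmascript;foo=bar",
    "application/x-ecmascript",
    "application/perlscript",
    "Hello World",
    "€",
]

for example in EXAMPLES:
    print(repr(example), has_media_type_syntax(example))
```

Under these assumptions only 'application', 'Hello World' and '€' fail the syntactic test. Whether an unregistered type, an undefined parameter or an unexpected parameter value licenses an error or a warning is not something such a check can decide; that has to follow from the specification.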

== Design Goals ==

 * @@ Validator MUST NOT declare legal content invalid
 * @@ Validator SHOULD NOT declare illegal content valid
 * @@ Validator MUST be reliable (no non-experimental support for a format if the support is known to be incomplete)
 * @@ ...

== Considering Validator Limitations ==

Validators are limited in their abilities to fully determine whether a document meets all requirements of the relevant specifications under all possible circumstances. For example, Validators are typically static user agents; for dynamic content they can consequently only validate a certain state of the document. They are also limited for all protocol elements that allow for extensibility. A common case is a document format that allows the inclusion of URI references.

It must be clear from a specification what Validators are required to check for. For URI references, for example, it must be clear whether URIs that do not conform to the generic URI syntax or to the specific scheme syntax must be detected (i.e., whether illegal use of URI references renders a document non-conforming) and if so, which schemes must be supported. If illegal use of URI references renders a document non-conforming (and Validators are consequently required to detect such use), then for URI schemes that it cannot fully check, the Validator would need to note that the document uses features the Validator does not support and that it consequently cannot make a definitive statement about the compliance of the document.

Proper definition of Validator conformance requirements enables users to clearly understand the results of the Validator and allows Validator developers to focus on the task of providing support for a new format without spending a lot of effort on reading meaning into requirements in specifications and their normative references.

Care must be taken in defining conformance requirements; there are many things that are better avoided.
For example, requirements that depend on unstable external information are difficult to implement. Specification authors might want to encourage specific practices, such as using only registered media types and charset names, but if a Validator is required to check for such constraints it would need to be updated daily to avoid declaring legal use invalid. Other constraints might be impossible to validate; for example, if an attribute value is required to be a "globally unique identifier", the Validator would need to be omniscient to tell whether the requirement has been met. Specification authors should check for each protocol element whether an algorithm can be devised that returns a boolean value indicating whether the requirements have been met.

== Outreach Material ==

bla bla ... branding ... outreach ... community ... positive statements ... terminology ... "Valid XHTML 1.1" undefined ... ValidatorOK != Fully Conforming ... bla bla

 * [http://www.w3.org/Consortium/Legal/logo-usage-20000308.html#public-logos Valid XHTML, Valid CSS, WAI WCAG Levels]
 * [http://www.w3.org/2005/02/3GSM-2005.html MobileOK]
 * ...

== Validator Testing ==

Support for new formats must be thoroughly and automatically tested; in particular, all custom code (as opposed to external code such as schema validation libraries) must be tested. Working Groups should provide test suites and work with the Validator Team to integrate existing test suites into the Validator workflow. The Validator in particular needs test cases that violate constraints; Working Groups should thus ensure that test suites include proper error handling tests. The Validator Team should work with Working Groups to contribute new tests back to the "official" test suite. This requires clear documentation of test suite guidelines that must be available to the Validator Team (i.e., publicly available).

== Support Maintenance ==

 * @@ Timely responses to requests for clarifications
 * @@ Changes in schemas, etc. must be communicated to the Validator Team
 * @@ No informal agreements; changes, clarifications, etc. must be normative before they can be included in the Validator
 * @@ ...

== See Also ==

 * SpecLite, ...

-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Sunday, 6 March 2005 03:04:30 UTC