ValidatorGL draft...

= @@@ = 

== Introduction ==

@@ Why it is exceedingly important to have validators, etc.
@@ 

== Adding support for new formats ==

=== Basic Requirements ===

There is little difference between a Validator and ordinary
implementations of a format; in order to properly support the
format, the applicable conformance requirements must be clearly
understood and implemented in a consistent manner. The Validator
processes a document basically as follows:

 * Determine all constraints
 * Check for all constraints
 * Present the result
  * all constraints met
  * one or more constraints not met
  * unknown
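
The three outcomes listed above can be sketched as a small reduction
over per-constraint findings. This is a minimal illustration in Python,
not a description of any existing Validator code; the names `Outcome`,
`Finding` and `summarize` are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional


class Outcome(Enum):
    CONFORMING = "all constraints met"
    NON_CONFORMING = "one or more constraints not met"
    UNKNOWN = "unknown"


@dataclass(frozen=True)
class Finding:
    """Result of checking a single constraint (hypothetical structure)."""
    constraint: str
    # True = met, False = violated, None = the checker could not decide
    met: Optional[bool]


def summarize(findings: Iterable[Finding]) -> Outcome:
    """Reduce per-constraint findings to one of the three overall results."""
    findings = list(findings)
    # A single known violation is enough for a negative result.
    if any(f.met is False for f in findings):
        return Outcome.NON_CONFORMING
    # No violation found, but at least one constraint could not be decided.
    if any(f.met is None for f in findings):
        return Outcome.UNKNOWN
    return Outcome.CONFORMING
```

In this sketch a known violation takes precedence over an undecided
constraint, since the document is non-conforming either way.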

==== Determine all constraints ====

The first item requires that the document in some way indicates the
applicable constraints. In XML for example this is done using the XML
declaration, <?xml version="1.0"?> (or no XML declaration) for XML 1.0,
<?xml version="1.1"?> for XML 1.1.

This requires that documents are written such that this information can
be derived from the document. If documents do not provide sufficient
information, the Validator would need other means to determine the
constraints, e.g. by asking the user to specify them.

The latter is typically difficult to implement. It is thus recommended
that specifications provide means that guarantee that such information
can be derived from all conforming documents.
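
For the XML example, deriving the applicable version from the document
might look like the following sketch. It assumes the document is already
decoded to a string and uses a simplified pattern for the XML
declaration; the function name `declared_xml_version` is hypothetical.

```python
import re
from typing import Optional

# Simplified pattern for the start of an XML declaration. It does not
# deal with byte order marks or encoding detection.
_XML_DECL = re.compile(
    r"""<\?xml\s+version\s*=\s*(["'])(?P<version>[^"']*)\1"""
)


def declared_xml_version(document: str) -> Optional[str]:
    """Return "1.0" or "1.1", or None if the version cannot be determined."""
    match = _XML_DECL.match(document)
    if match is None:
        if re.match(r"<\?xml\s", document):
            # A declaration is present but was not understood.
            return None
        # No XML declaration: the document is treated as XML 1.0.
        return "1.0"
    version = match.group("version")
    return version if version in ("1.0", "1.1") else None
```

A return value of `None` is the point at which a Validator would have to
fall back to other means, such as asking the user.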

==== Check for all constraints ====

Once the applicable constraints have been determined, these are checked
for. This requires an algorithm that translates all possible inputs into
a result (conforming: yes/no/unknown) along with a rationale for the
result (e.g., a list of errors). This is the most difficult part, and
requires that conformance requirements for documents and (thus)
Validators are well-defined.

Specifications must provide this information in an easily accessible
manner. It is common that conformance requirements are in part spelled
out in formal languages such as EBNF grammars and schema documents.
Building on such formal languages is a reasonable way to reduce the
effort needed to support a new format in the Validator, as support for
this partial validation is often already available.

It is however also common to define constraints that are not expressed
in the formal language (e.g., because the formal language cannot express
them or because doing so is too difficult). For such requirements
additional implementation effort is necessary to fully meet the
implementation requirements for Validators.

It is thus important that these additional requirements are easy to
derive from specifications. For example, specifications can provide a
list with all requirements that validation using the formal language
would not check for. Some existing specifications clearly identify all
errors, e.g. [http://www.w3.org/TR/xslt20/#error-summary XSLT 2.0].
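
A list of such additional requirements maps naturally onto a catalogue
of checks that run alongside schema validation. The sketch below is only
an illustration: the identifiers `EX-001` and `EX-002` and the two
constraints are invented for the example and do not come from any
specification.

```python
from typing import Callable, Dict, List

# A check takes the document text and returns True if the constraint is met.
Check = Callable[[str], bool]

# Hypothetical catalogue of constraints that a schema cannot express,
# keyed by an identifier that the specification would assign.
EXTRA_CHECKS: Dict[str, Check] = {
    "EX-001 document must not be empty": lambda doc: bool(doc.strip()),
    "EX-002 document must not contain NUL characters": lambda doc: "\x00" not in doc,
}


def violated_extra_constraints(document: str) -> List[str]:
    """Return the identifiers of all extra constraints the document violates."""
    return [name for name, check in EXTRA_CHECKS.items() if not check(document)]
```

If the specification assigns a stable identifier to every such
requirement, the Validator can report violations using those
identifiers.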

==== Present the result ====

It is important that the users of the Validator fully understand the
result of the validation process. There are three states. If not all
constraints have been met, the result is negative. If the Validator
could not find a constraint that has not been met, the result could be
positive, but that is not always the case.

In this case the result depends on whether the Validator fully
understood the input, since the format might allow some open-ended
extensibility which is not fully supported by the Validator. If it did
not, the result is unknown.

For example, the format might allow the inclusion of URI references and
require that the URI references are constructed according to the URI
specification and the requirements of the URI scheme; if the document
then uses a scheme that the Validator does not fully support, it cannot
tell whether the document is actually conforming. This must then be
clearly indicated in the result.
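
The URI case can be sketched as follows. The set of supported schemes
and the per-scheme check are assumptions made for the example; in
particular `_check_http` is a deliberately simplified stand-in for the
real scheme rules, and `check_absolute_uri` is a hypothetical name.

```python
from typing import Callable, Dict, Optional
from urllib.parse import urlsplit


def _check_http(uri: str) -> bool:
    # Simplified stand-in for the scheme-specific rules:
    # only require a non-empty host.
    return bool(urlsplit(uri).hostname)


# Schemes whose specific rules this hypothetical validator implements.
SCHEME_CHECKS: Dict[str, Callable[[str], bool]] = {
    "http": _check_http,
    "https": _check_http,
}


def check_absolute_uri(uri: str) -> Optional[bool]:
    """Return True/False for supported schemes, None if the validator cannot tell."""
    try:
        scheme = urlsplit(uri).scheme  # urlsplit lower-cases the scheme
        check = SCHEME_CHECKS.get(scheme)
        if check is None:
            return None  # unsupported scheme: no definitive statement
        return check(uri)
    except ValueError:
        # urlsplit rejects some malformed authorities, e.g. bad IPv6 brackets.
        return False
```

For `urn:isbn:0451450523` the function returns `None`, which the
Validator would have to report as "unknown" rather than as a positive
result.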

As discussed later in this document, there might not be a 1:1
relationship between the validation result and actual conformance. In
this case it must be clear to the user how to interpret the result: what
is checked for and what is not.

=== Conclusion ===

Specifications ideally treat Validators as first-class implementations
and define in the specification the conformance requirements for
Validators. This should include making it easy for Validator developers
to derive conformance requirements that are not checked for using a
formal language (like a schema, if any) and allowing users to clearly
understand the requirements for and limitations of conforming
validators.

Proper implementation of such requirements depends on proper conformance
requirements in the first place. For each protocol element of a format
it must be clear how to determine whether it is conforming or not. This
might sound obvious, but it is common that specifications do not get all
the details right.

For example, consider the
[http://www.w3.org/TR/SVG11/script.html#ScriptingContentScriptTypeAttribute
contentScriptType] attribute in SVG 1.1 and consider these examples:

{{{
  contentScriptType = 'application'
  contentScriptType = 'application/ecmascript'
  contentScriptType = 'application/ecmascript;version=example'
  contentScriptType = 'application/ecmascript;foo=bar'
  contentScriptType = 'application/x-ecmascript'
  contentScriptType = 'application/perlscript'
  contentScriptType = 'Hello World'
  contentScriptType = '&#x20AC;'
}}}

Refer to
[http://www.ietf.org/internet-drafts/draft-hoehrmann-script-types-02.txt
the application/ecmascript draft] for the type registration. Take five
minutes to determine for each example whether the Validator is licensed
to generate an error or a warning.
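
One possible reading, and it is only an assumption here, is that the
value merely has to match the media type syntax of RFC 2045. The sketch
below implements that purely syntactic reading; it handles only token
parameter values (no quoted strings, no whitespace around ";"), and the
name `is_syntactic_media_type` is hypothetical.

```python
import re

# RFC 2045 "token": printable US-ASCII except SPACE, CTLs and tspecials.
_TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+"

# type "/" subtype *(";" attribute "=" value), with token values only.
_MEDIA_TYPE = re.compile(rf"{_TOKEN}/{_TOKEN}(?:;{_TOKEN}={_TOKEN})*")


def is_syntactic_media_type(value: str) -> bool:
    """Check the value against the simplified media type grammar above."""
    return _MEDIA_TYPE.fullmatch(value) is not None
```

Under this reading 'application', 'Hello World' and the euro sign are
rejected, while 'application/ecmascript;foo=bar' and
'application/perlscript' are accepted. Syntax alone therefore does not
settle whether an unknown parameter or an unregistered type is an error.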

@@ need to come up with a better example here... or maybe better no
example and better prose...

== Design Goals ==

 * @@ Validator MUST NOT declare legal content invalid
 * @@ Validator SHOULD NOT declare illegal content valid
 * @@ Validator MUST be reliable (no non-experimental support for a
format if the support is known to be incomplete)
 * @@ ...

== Considering Validator Limitations ==

Validators are limited in their abilities to fully determine whether a
document meets all requirements of the relevant specifications under all
possible circumstances. For example, Validators are typically static
user agents; for dynamic content they can consequently only validate a
certain state of the document. They are also limited for all protocol
elements that allow for extensibility.

A common case is a document format that allows the inclusion of URI
references. It must be clear from a specification what Validators are
required to check for. For URI references, for example, it must be clear
whether URIs that do not conform to the generic URI syntax or to the
specific scheme syntax must be detected (i.e., whether illegal use of
URI references renders a document non-conforming) and if so, which
schemes must be supported.

If illegal use of URI references renders a document non-conforming (and
Validators are consequently required to detect such use), then for URI
schemes that it cannot fully check the Validator would need to note that
the document uses features the Validator does not support, and that it
consequently cannot make a definitive statement about the compliance of
the document.

Proper definition of Validator conformance requirements enables users
to clearly understand the results of the validator and allows Validator
developers to focus on the task of providing support for a new format
without spending a lot of effort on reading meaning into requirements
in specifications and their normative references.

Care must be taken in defining conformance requirements; there are many
things that are better avoided. For example, requirements that depend on
unstable external information are difficult to implement. Specification
authors might want to encourage specific practice such as only using
registered media types or charset names, but if a validator is required
to check for such constraints it would need to be updated daily to avoid
declaring legal use invalid.

Other constraints might be impossible to validate. For example, if an
attribute value is required to be a "globally unique identifier" the
validator would need to be omniscient to tell whether the requirement
has been met. Specification authors should check for each protocol
element whether an algorithm can be devised that returns a boolean value
indicating whether the requirements have been met.
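
By contrast, a requirement scoped to the document itself can be decided.
As a minimal sketch, assuming the identifiers have already been
extracted from the document, uniqueness within a single document reduces
to counting; the function name `duplicate_identifiers` is hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def duplicate_identifiers(identifiers: Iterable[str]) -> List[str]:
    """Return identifiers that occur more than once, in first-seen order."""
    counts = Counter(identifiers)
    return [ident for ident, n in counts.items() if n > 1]
```

An empty result means the document-level requirement has been met. No
comparable algorithm exists for global uniqueness, because the
Validator only ever sees the one document it is given.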

== Outreach Material ==

bla bla ... branding ... outreach ... community ... positive statements
... terminology ... "Valid XHTML 1.1" undefined ... ValidatorOK != Fully
Conforming ... bla bla 

 * [http://www.w3.org/Consortium/Legal/logo-usage-20000308.html#public-logos Valid XHTML, Valid CSS, WAI WCAG Levels]
 * [http://www.w3.org/2005/02/3GSM-2005.html MobileOK]
 * ...

== Validator Testing ==

Support for new formats must be thoroughly and automatically tested; in
particular, all custom code (as opposed to external code such as schema
validation libraries) must be tested. Working Groups should provide test
suites and work with the Validator Team to integrate existing test suites
into the Validator workflow.

The Validator in particular needs test cases that violate constraints.
Working Groups should thus ensure that test suites include proper error
handling tests. The Validator Team should work with Working Groups to
contribute new tests back to the "official" test suite. This requires
clear documentation of test suite guidelines that must be available to
the Validator Team (i.e., publicly available).
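
A test suite that includes error cases can be run automatically with
very little machinery. The following is a minimal sketch under the
assumption that a validator can be called as a function returning
accept or reject; `run_suite` and `toy_validate` are hypothetical names,
and `toy_validate` is only a stand-in for real validation code.

```python
from typing import Callable, List, Tuple

# (description, input document, whether the validator should accept it)
TestCase = Tuple[str, str, bool]


def run_suite(validate: Callable[[str], bool], cases: List[TestCase]) -> List[str]:
    """Return the descriptions of cases where the verdict differs from the expectation."""
    return [desc for desc, doc, expected in cases if validate(doc) != expected]


def toy_validate(document: str) -> bool:
    # Stand-in for a real validator: accepts anything that looks like a tag.
    return document.startswith("<") and document.endswith(">")


CASES: List[TestCase] = [
    ("conforming document is accepted", "<a/>", True),
    ("constraint violation is rejected", "plain text", False),
]
```

Calling `run_suite(toy_validate, CASES)` returns a list of failing case
descriptions, so an empty list means that both the positive and the
negative cases behave as expected.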

== Support Maintenance ==

 * @@ Timely responses to requests for clarifications
 * @@ Changes in schemas, etc. must be communicated to the Validator
Team
 * @@ No informal agreements; changes, clarifications, etc. must be
normative before they can be included in the Validator
 * @@ ...


== See Also ==

 * SpecLite, ...


-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 

Received on Sunday, 6 March 2005 03:04:30 UTC