Proposed new validation section

This might cause disagreement -- IMO most of the section on 'Semantic Validation' applies to syntactic validation too.  My understanding of 'Semantic' or domain-specific validation is that knowledge of a particular application or domain is used to introduce additional checks.  I also note that the term 'Semantic Validation' is rather controversial.  So here's my proposed rewording (again, please excuse the 'boilerplate'):


\subsubsection{Validation}

To help ensure that data is correct and useful, it can be validated automatically, which catches some (though not all) of the errors it may contain.  Validation can be enforced either when metadata is entered into the system or, if entry has occurred elsewhere, during an ingest stage.

It is desirable to validate human-entered data to guard against errors and inconsistencies.  However, there are limits to the type of validation that can be performed: for example, with a controlled vocabulary we can validate that a property value conforms to the vocabulary, but not that it accurately reflects the real world.  Validation should therefore be regarded as a ``best-effort'' measure.
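
As a rough illustration of this limit, the following Python sketch checks a property value against a controlled vocabulary.  The vocabulary contents and all names in the code are hypothetical, chosen only for the example.

```python
# Hypothetical controlled vocabulary for a "language" property.
LANGUAGE_VOCABULARY = {"en", "fr", "de"}


def conforms_to_vocabulary(value, vocabulary):
    """Return True if the value is one of the terms in the vocabulary."""
    return value in vocabulary


# A typing error is caught because "enn" is not a term in the vocabulary.
typo_ok = conforms_to_vocabulary("enn", LANGUAGE_VOCABULARY)

# A wrong but well-formed value passes: the check cannot know that the
# described document is actually written in English rather than French.
wrong_ok = conforms_to_vocabulary("fr", LANGUAGE_VOCABULARY)
```

The second call is the ``best-effort'' case: the value conforms to the vocabulary, so no automatic check of this kind can flag it.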

There are two main types of validation.

\begin{description}

\item[Syntactic validation]  Syntactic validation, mentioned earlier in this section, ensures that instance data conforms to the structure and encoding laid out in a schema.  It may also involve checking that the instance data uses values that correspond to particular data types or controlled vocabularies, where these are intrinsic to the schema.

\item[Domain-specific validation]  This is also termed `Semantic Validation' by some (though this term is not universally accepted).  Domain-specific or Semantic Validation extends syntactic validation with further rules and checks derived from knowledge about a particular application or domain.  For example, in a particular application or domain, instances of a certain class may be required to have certain properties; or, if an instance has a certain property, there may be restrictions on the other properties it can have.

\end{description}
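
The distinction between the two types can be sketched in Python as follows.  This is only an illustration under stated assumptions: the schema, the class and property names, and both domain rules are hypothetical, and running the domain checks only after the syntactic checks pass is one possible design, not a requirement.

```python
def validate_syntactic(record, schema):
    """Check structure and data types against a schema (property -> type)."""
    errors = []
    for prop, value in record.items():
        if prop not in schema:
            errors.append(f"unknown property: {prop}")
        elif not isinstance(value, schema[prop]):
            errors.append(f"wrong type for {prop}")
    return errors


def validate_domain(record):
    """Apply hypothetical application-specific rules on top of the schema."""
    errors = []
    # Rule 1: instances of class "Thesis" must have a "supervisor".
    if record.get("type") == "Thesis" and "supervisor" not in record:
        errors.append("Thesis requires supervisor")
    # Rule 2: a record with "embargo_until" may not also have "public_url".
    if "embargo_until" in record and "public_url" in record:
        errors.append("embargo_until excludes public_url")
    return errors


def validate(record, schema):
    """Run syntactic checks first; run domain checks only if they pass."""
    errors = validate_syntactic(record, schema)
    if errors:
        return errors
    return validate_domain(record)


# Hypothetical schema: every property is an optional string.
SCHEMA = {"type": str, "title": str, "supervisor": str,
          "embargo_until": str, "public_url": str}

# Syntactically valid, but violates the first domain rule.
record = {"type": "Thesis", "title": "On Validation"}
result = validate(record, SCHEMA)
```

Here the record passes the syntactic stage, since every property is known to the schema and has the right type, and is rejected only by a rule that depends on knowledge of the application.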

 Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624

Received on Wednesday, 2 July 2003 16:15:28 UTC