Validation in CC/PP

In my opinion, one key problem with CC/PP as it currently stands is that it
has no notion of profile validation. I've been discussing this with people
here at HP who work on RDF, and they have made a number of suggestions. I'd
like to outline them briefly, as I think they are of general interest to
people working on CC/PP.

So why do we need validation? Experience with existing CC/PP
vocabularies has shown that, even with a small number of profiles, vendors
make mistakes when creating them. First, they get property names
wrong, e.g. they use PixelsAspectRatio instead of PixelAspectRatio. Second,
there is no agreement on property literal values, so two vendors might use
the same literal to indicate different capabilities, or different literals
to indicate the same capability, e.g. "1.2.1/June 2000" and "1.2.1" are both
used to refer to the same capability.

My colleague Andy Seaborne has suggested there are three assumptions you can
make about RDF properties when performing schema validation in order to
solve the first problem:
 
i) Open - the "correct" validation of data against a schema. It can never
actually say anything is wrong, because RDFS does not make any closed world
assumptions or contain negation.
ii) Strict - it must be possible to prove that a resource is of the type
specified, whether by domain/range or by rdf:type declaration.
iii) Exact - the resources must have all and only the declared properties.
This is a crude way of getting robust checking; what is really needed is the
idea of optional/mandatory properties.
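
As a rough illustration of how the three assumptions differ, here is a
minimal Python sketch. It is a simplification and rests on assumptions of my
own: a schema is reduced to a set of declared property names and a profile
to a dictionary of property name to value, so it ignores domain/range and
rdf:type reasoning entirely. The function name validate and the mode strings
are hypothetical.

```python
def validate(profile, schema, mode="strict"):
    """Return a list of problems found in `profile` under the given mode.

    profile: dict mapping property name -> literal value (assumed shape)
    schema:  set of property names declared by the vocabulary (assumed shape)
    """
    if mode not in ("open", "strict", "exact"):
        raise ValueError(f"unknown validation mode: {mode}")

    # Open: with no closed world assumption, nothing can be reported as wrong.
    if mode == "open":
        return []

    used = set(profile)
    declared = set(schema)

    # Strict and Exact: every property used must be declared in the schema.
    problems = [f"undeclared property: {name}"
                for name in sorted(used - declared)]

    # Exact only: every declared property must also be present.
    if mode == "exact":
        problems += [f"missing property: {name}"
                     for name in sorted(declared - used)]
    return problems
```

With a schema declaring PixelAspectRatio and a profile that uses
PixelsAspectRatio, the open mode reports nothing, the strict mode reports
the misspelt name, and the exact mode additionally reports that
PixelAspectRatio is missing.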

So it seems to me that by default CC/PP should use the Strict
assumption, i.e. a property can only be used in a profile if it is defined in
the associated schema. In addition, if a property is associated with one or
more components, then it can only appear in those components. In the future,
we may have vocabularies where a device must supply all the profile
attributes to conform to the vocabulary, i.e. schemas need to be able to
state whether they should be interpreted as Exact.
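
The component rule can be sketched in the same spirit. The following is only
an illustration under stated assumptions: the schema is modelled as a
dictionary from property name to the set of components it may appear in, and
the profile as a dictionary from component name to its properties. The
component assignments and the OSName property are made up for the example
and are not taken from any published vocabulary; check_components is a
hypothetical name.

```python
# Hypothetical schema: property name -> components it is allowed to appear in.
SCHEMA = {
    "PixelAspectRatio": {"HardwarePlatform"},
    "ScreenSize": {"HardwarePlatform"},
    "OSName": {"SoftwarePlatform"},
}


def check_components(profile, schema):
    """Report undefined properties and properties placed in the wrong component.

    profile: dict mapping component name -> dict of property name -> value
    schema:  dict mapping property name -> set of allowed component names
    """
    problems = []
    for component, properties in profile.items():
        for name in properties:
            allowed = schema.get(name)
            if allowed is None:
                # Strict assumption: the property must be defined in the schema.
                problems.append(
                    f"{component}: {name} is not defined in the schema")
            elif component not in allowed:
                # The property is defined, but for a different component.
                problems.append(
                    f"{component}: {name} may only appear in "
                    f"{', '.join(sorted(allowed))}")
    return problems
```

A profile that places ScreenSize under SoftwarePlatform, or that uses the
misspelt PixelsAspectRatio anywhere, would each produce one entry in the
returned list.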

The second problem has also been encountered by the DAML community, who have
explored using XML Schema (XSD) to perform data validation on literal
values. For example, this validator
http://www.daml.org/validator/
supports XML Schema validation, using the Oracle XDK XML Schema
Validation toolkit to verify DAML files.

For an example of how to reference XSD in RDF, see this DAML example
http://www.daml.org/validator/examples/dt4.daml
which uses this XML Schema file
http://www.daml.org/validator/examples/dt1.xsd
via the dt namespace prefix, i.e.

<rdf:RDF 
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" 
xmlns:daml="http://www.daml.org/2001/03/daml+oil#" 
xmlns:xsd="http://www.w3.org/2000/10/XMLSchema#" 
xmlns:dt="http://www.daml.org/validator/examples/dt1.xsd#">
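
To give a feel for what literal validation buys us, here is a small Python
sketch that checks literal values against per-property patterns, loosely
analogous to an XSD pattern facet. This is not how the DAML validator works;
it uses Python regular expressions rather than an XML Schema processor, and
the property name ExampleVersion, the pattern, and the function name
check_literals are all assumptions made for illustration.

```python
import re

# Hypothetical constraint: a version literal is dot-separated digits only.
LITERAL_PATTERNS = {
    "ExampleVersion": re.compile(r"\d+(\.\d+)*"),
}


def check_literals(properties, patterns):
    """Report literals that do not match the pattern given for their property.

    properties: dict mapping property name -> literal value (string)
    patterns:   dict mapping property name -> compiled regular expression
    """
    problems = []
    for name, value in properties.items():
        pattern = patterns.get(name)
        # Properties without a pattern are left unchecked in this sketch.
        if pattern is not None and pattern.fullmatch(value) is None:
            problems.append(
                f"{name}: literal {value!r} does not match {pattern.pattern}")
    return problems
```

Under this pattern the literal "1.2.1" is accepted, while "1.2.1/June 2000"
is reported, so the two spellings of the same capability can no longer both
slip through unnoticed.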

I also found this document, Annotated DAML+OIL, useful
http://www.daml.org/2001/03/daml+oil-walkthru.html

So what do other people think about this? Would providing validation be
useful? Are there any other appropriate methods?

best regards

Mark H. Butler, PhD
Research Scientist                HP Labs Bristol
mark-h_butler@hp.com
Internet: http://www-uk.hpl.hp.com/people/marbut/

Received on Wednesday, 5 June 2002 12:40:19 UTC