RDF validation of vocabulary

> I was trying to validate an RDF document and I just realized that the  
> RDF validator, was just checking
> the document is well-formed and that is a graph, but not that the  
> vocabulary is used appropriately.
...
> 
> The vocabulary which is not part of Dublin Core for example will not  
> be detected.
> <dc:foobar>a foreign element to Dublin Core Vocabulary</dc:foobar>
> 
> Is there a way to check that your vocabulary is consistent.  The  
> answer could be:
> 
> Your document is RDF valid but contains
> 
> * an unknown vocabulary: vocab,

Unknown to whom?  Unknown to the person who wrote the validator? That's
not very useful.  You could, however, have errors like these:

    * Link Checker: Unable to dereference the URI "...."
      (404 Not Found)

or

    * Link Type Checker: No RDFS/OWL vocabulary definition found at URI
      "....".    Perhaps you are using the wrong URI, or perhaps the
      vocabulary publisher is not serving it correctly.

      [ You would get this error from DC, Karl, at least if you used
      the best advice to date from the TAG on how to handle
      dereferencing. ]

> * an element which is not part of dc vocabulary: foobar

That is:

    * Your document used a term which is not used in the vocabulary
      definition document.   You may have made a mistake.

or

    * Your document used a term in a way which is inconsistent with the
      vocabulary definition document.  

This last case is what you'd like, I think, but with OWL it will almost
never happen, and with RDFS misuse is even less likely to be caught.
OWL and RDFS are not there to help you find this kind of syntactic error
in your data.
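
The first of the two checks above (a term that never appears in the
vocabulary definition document) is at least easy to sketch.  What
follows is only an illustration under stated assumptions: triples are
plain Python tuples, the `vocabulary` set is an abbreviated stand-in
for whatever you would get by dereferencing the namespace URI, and
`undefined_terms` is a made-up helper name, not part of any RDF
toolkit.

```python
# Triples are plain (subject, predicate, object) tuples of strings.
DC = "http://purl.org/dc/elements/1.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDF_PROPERTY = "http://www.w3.org/1999/02/22-rdf-syntax-ns#Property"

# Assumption: an abbreviated stand-in for the triples found by
# dereferencing the vocabulary URI. The real definition is much larger.
vocabulary = {
    (DC + "title", RDF_TYPE, RDF_PROPERTY),
    (DC + "creator", RDF_TYPE, RDF_PROPERTY),
}

# The document under test, including the foreign dc:foobar term.
document = {
    ("http://example.org/doc", DC + "title", "An example"),
    ("http://example.org/doc", DC + "foobar", "a foreign element"),
}


def undefined_terms(document, vocabulary, namespace):
    """Return the terms in `namespace` that the document uses (as a
    predicate, or as the class in an rdf:type triple) but that the
    vocabulary definition never describes as a subject."""
    described = {s for (s, _, _) in vocabulary}
    used = set()
    for _, p, o in document:
        if p.startswith(namespace):
            used.add(p)
        if p == RDF_TYPE and o.startswith(namespace):
            used.add(o)
    return sorted(used - described)


for term in undefined_terms(document, vocabulary, DC):
    print(f"Warning: {term} is not described in the vocabulary "
          "definition document. You may have made a mistake.")
```

Note that this can only ever be a warning: nothing in RDF says a term
missing from the definition document is an error.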

What I *suspect* you're looking for here, Karl, is not really something
the Semantic Web Recommendations, alone, can give you.  At least for me,
when I find myself asking questions like this, what I really want is a
tool to check that a document is a proper element of some
interface/protocol.  Have I communicated in this file what I should
communicate in Dublin Core data?  Does this file look right for FOAF
tools to work properly with it?

But there is no way to answer those questions because even though there
are specifications of DC and FOAF, they don't get into those details.
For that you'd need some other specification, giving baseline details
about what information must be present in some DC or FOAF document which
is expected to be used for some purpose, etc.  

And it would be nice to have that specification be machine readable, so
that it could be validated by a machine.  We're not there yet -- no one
is writing specs like that (as far as I know), and there is no standard
language in which to write them, anyway.

Some possible approaches:

   - Document some SPARQL queries and characterize what the responses
     MUST, SHOULD, MAY, and MUST NOT look like.

   - Define an XML Schema for your data, use XML Schema Validation for
     data validation, and GRDDL (or otherwise map) it to RDF.

   - Define an alternate semantics or extension for OWL which allows OWL
     documents to be used in this way.  See [1].
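
To make the first approach a little more concrete, here is a rough
sketch of what such a documented set of expectations might look like.
It rests on several assumptions: in practice each check would be a
SPARQL query (an ASK query, say) run by a real SPARQL engine, whereas
here a plain Python function over a set of triples stands in so the
sketch has no dependencies.  The `PROFILE` format, the `has_property`
and `validate` names, and the particular requirements are all
hypothetical, not taken from any DC or FOAF specification.

```python
DC = "http://purl.org/dc/elements/1.1/"


def has_property(prop):
    """Build a check that is true when at least one triple uses `prop`
    as its predicate (a stand-in for a SPARQL ASK query)."""
    return lambda triples: any(p == prop for (_, p, _) in triples)


# Hypothetical profile: (requirement level, description, check).
PROFILE = [
    ("MUST", "has a dc:title", has_property(DC + "title")),
    ("SHOULD", "has a dc:creator", has_property(DC + "creator")),
    ("MAY", "has a dc:date", has_property(DC + "date")),
    ("MUST NOT", "uses dc:foobar", has_property(DC + "foobar")),
]


def validate(triples, profile):
    """Return (severity, message) pairs for every expectation that the
    triples fail to meet, in profile order."""
    problems = []
    for level, description, check in profile:
        result = check(triples)
        if level == "MUST" and not result:
            problems.append(("error", f"MUST violated: {description}"))
        elif level == "SHOULD" and not result:
            problems.append(("warning", f"SHOULD violated: {description}"))
        elif level == "MUST NOT" and result:
            problems.append(("error", f"MUST NOT violated: {description}"))
        # MAY is purely informative and never produces a problem.
    return problems


document = {
    ("http://example.org/doc", DC + "title", "An example"),
    ("http://example.org/doc", DC + "foobar", "a foreign element"),
}

for severity, message in validate(document, PROFILE):
    print(f"{severity}: {message}")
```

The point of the sketch is only that the expectations live in data
separate from the vocabulary definition itself, which is the piece that
is missing today.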

FWIW, I don't see any easy or quick solution here, and I think this is a
major obstacle for the Semantic Web.   

    -- Sandro

[1] http://www.cs.man.ac.uk/~horrocks/Publications/download/2007/MoHS07a.pdf

Received on Monday, 26 November 2007 15:04:08 UTC