Re: RDF validation of vocabulary from Frederick Giasson on 2007-11-26 (semantic-web@w3.org from November 2007)

From: Frederick Giasson <fred@fgiasson.com>
Date: Mon, 26 Nov 2007 08:05:57 -0500
To: Karl Dubost <karl@w3.org>
Cc: Semantic web list <semantic-web@w3.org>
Message-id: <474AC4B5.5050400@fgiasson.com>

Hi Karl,

>
> Is there a way to check that your vocabulary is consistent.  The 
> answer could be:
>
> Your document is RDF valid but contains
>
> * an unknown vocabulary: vocab,
> * an element which is not part of dc vocabulary: foobar


I don't know whether such a tool currently exists, but it shouldn't be
hard to prototype one.

To keep development simple, you would only have to index such documents 
in a triplestore, and then query it in different ways to answer these 
questions.
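
As a rough illustration of the "index, then query" idea, here is a
minimal sketch in Python. It assumes nothing more than an in-memory set
of (subject, predicate, object) tuples standing in for a real
triplestore; the TinyStore class, its match() method and the
example.org URIs are all made up for the example, and a real system
would use an actual store and SPARQL instead.

```python
# Hypothetical in-memory stand-in for a triplestore (illustration only).

class TinyStore:
    def __init__(self):
        self.triples = set()

    def add(self, s, p, o):
        """Index one (subject, predicate, object) triple."""
        self.triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self.triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


DC = "http://purl.org/dc/elements/1.1/"
DOC = "http://example.org/doc"  # made-up resource

store = TinyStore()
store.add(DOC, DC + "title", "My document")
store.add(DOC, DC + "foobar", "some value")

# One of the "different ways" to query: which properties does DOC use?
used_properties = sorted({p for _, p, _ in store.match(s=DOC)})
print(used_properties)
```

Every question in the validation strategy then becomes one more
pattern (or one more SPARQL query) against the same index.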


First you would check which ontologies are known by the system. Then 
you would check that each class and property used to describe the 
resource belongs to its ontology. Then you could easily check that 
rdfs:domain and rdfs:range are respected for each property: look up in 
the ontology definition the domains and ranges of the properties used, 
dereference each resource that an ObjectProperty refers to, and check 
the type of that related resource to know whether the usage is valid.
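
The steps above could be sketched roughly as follows, in plain Python
over a list of triples. Everything here is an assumption made for the
example: the KNOWN, DOMAIN and RANGE tables stand in for what the
system has learned from ontology definitions, the EX vocabulary and
the example.org resources are invented, and the DC term list is
deliberately partial. Note also that the sketch treats rdfs:domain and
rdfs:range as constraints; under RDFS semantics they license
inferences rather than restrict usage, so this is a design choice.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DC = "http://purl.org/dc/elements/1.1/"
EX = "http://example.org/vocab#"  # hypothetical ontology for this sketch

# What the system "knows": namespace -> local names defined there.
KNOWN = {
    DC: {"title", "creator", "date"},  # partial list, for illustration
    EX: {"Person", "Document", "author"},
}
DOMAIN = {EX + "author": EX + "Document"}
RANGE = {EX + "author": EX + "Person"}


def split_uri(uri):
    """Split a URI into (namespace, local name) at the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/")) + 1
    return uri[:cut], uri[cut:]


def validate(triples):
    """Return a list of human-readable problems found in the triples."""
    # Collect the declared rdf:type(s) of every resource first.
    types = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            types.setdefault(s, set()).add(o)

    report = []
    for s, p, o in triples:
        # For rdf:type check the class used, otherwise the property used.
        term = o if p == RDF_TYPE else p
        ns, local = split_uri(term)
        if ns not in KNOWN:
            report.append("unknown vocabulary: " + ns)
        elif local not in KNOWN[ns]:
            report.append("not part of " + ns + ": " + local)
        # Domain: the subject must have the expected type.
        if p in DOMAIN and DOMAIN[p] not in types.get(s, set()):
            report.append("domain not satisfied for " + p + " on " + s)
        # Range: the related resource must have the expected type.
        if p in RANGE and RANGE[p] not in types.get(o, set()):
            report.append("range not satisfied for " + p + " on " + o)
    return report


DOC = "http://example.org/doc1"
document = [
    (DOC, RDF_TYPE, EX + "Document"),
    (DOC, DC + "title", "A title"),
    (DOC, DC + "foobar", "oops"),
    (DOC, "http://unknown.example/vocab/thing", "x"),
    (DOC, EX + "author", "http://example.org/bob"),  # bob has no type
]

report = validate(document)
for line in report:
    print("*", line)
```

With this made-up input the report contains the two kinds of messages
Karl asked for (an unknown vocabulary, and a term that is not part of
dc), plus one range problem.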

Otherwise, if the system doesn't know a vocab, it can try to fetch the 
definition of the ontology by dereferencing the ontology's namespace. 
Once the system has got hold of the definition, it does what I 
explained above.
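
A minimal sketch of that dereferencing step, using only Python's
standard urllib. It assumes the namespace can be derived by cutting
the term URI at its last '#' or '/', and that the server answers with
RDF/XML when asked for it; neither is guaranteed in practice. The
function names are invented for the example.

```python
import urllib.request


def namespace_of(term_uri):
    """Guess the namespace: everything up to the last '#' or '/'."""
    cut = max(term_uri.rfind("#"), term_uri.rfind("/")) + 1
    return term_uri[:cut]


def build_request(namespace):
    """Prepare an HTTP request asking for an RDF/XML description."""
    # The fragment part ('#...') is never sent to the server, so drop it.
    url = namespace.split("#", 1)[0]
    return urllib.request.Request(
        url, headers={"Accept": "application/rdf+xml"}
    )


def fetch_ontology(term_uri, timeout=10):
    """Try to download the ontology definition; return bytes or None."""
    request = build_request(namespace_of(term_uri))
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.read()
    except OSError:
        # Network and HTTP errors: the namespace is not dereferenceable
        # right now, so the system simply stays without that knowledge.
        return None


request = build_request(namespace_of("http://example.org/vocab#author"))
print(request.full_url, request.get_header("Accept"))
```

Whatever fetch_ontology() returns would still have to be parsed and
indexed in the triplestore before the checks described above can use
it.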


You could even push the experiment further and develop some OWL 
reasoning based on possible OWL descriptions of the ontology 
(constraints, etc).

Such a system would "validate" the description of a resource "to the 
best of its knowledge". This means: with the information it knows, and 
with the information it is able to find (the dereferencing part of the 
strategy).


So this problem comes down to two things: sending SPARQL queries against 
a triple store, and hoping that URIs and namespaces are dereferenceable, 
so that your system can get more knowledge about ontologies and 
resources to help it validate any given resource.



Take care,


Fred

Received on Monday, 26 November 2007 13:07:13 UTC