- From: Butler, Mark <Mark_Butler@hplb.hpl.hp.com>
- Date: Thu, 5 Feb 2004 14:23:21 -0000
- To: www-rdf-dspace@w3.org
Hi team,

I want to explain a problem I have encountered while implementing inferencing in Longwell, and outline a number of possible solutions with their pros and cons. This issue is also relevant to Haystack.

The Artstor data has a number of properties which contain hierarchical terms, e.g.

    Architecture : Artist
    Architecture : Site

A while ago I modified the XSLT transform that creates the RDF version of the Artstor data to insert subclass relations to indicate the hierarchy. In the following examples I've simplified the names to make them easier to read, and numbered them to make them easier to refer to later on:

(1) artstorcontrolledTerm:Architecture
        rdfs:label "Architecture" ;
        rdf:type ArtstorControlledTerm .

(2) artstorcontrolledTerm:Architecture_artists
        rdfs:subClassOf artstorcontrolledTerm:Architecture ;
        rdfs:label "Architecture : Artist" ;
        rdf:type ArtstorControlledTerm .

These terms are then used in the instance data as follows:

(3) artstordata:UCSD001 artstor:subject artstorcontrolledTerm:Architecture_artists .

So I want the inference engine to infer result (4):

(4) artstordata:UCSD001 artstor:subject artstorcontrolledTerm:Architecture .

However, according to RDFS, I can only make this kind of inference if the property is rdf:type, e.g. given

    a rdf:type b .
    b rdfs:subClassOf c .

I can infer

    a rdf:type c .

In (3) the property is artstor:subject rather than rdf:type, so I can't make that inference in this case.

POSSIBLE SOLUTIONS

1. Create a custom rule (which eventually could be described in some kind of rules language) that infers result (4).

Advantages: No changes to the current data model.

Disadvantages: This requires custom rule processors. One of the aims of the semantic web is to avoid custom processors, because they mean the data can only be processed by those processors. Of course, eventually these custom rule processors could be replaced by rules written in a standardised rule language, but we don't have such a language yet.

2. Change the data model so that the value of the property is an instance of the controlled term class, e.g. of type artstorcontrolledTerm:Architecture_artists. Assuming (1) and (2),

    artstordata:UCSD001 artstor:subject [ rdf:type artstorcontrolledTerm:Architecture_artists ] .

then we can infer

    artstordata:UCSD001 artstor:subject [ rdf:type artstorcontrolledTerm:Architecture ] .

Advantages: This will work with standard RDFS inference, so there is no need for custom rule processors.

Disadvantages: This makes the data model more complicated. Secondly, it increases the complexity of the inference task - for example, we now have to run inference over every bNode of type artstorcontrolledTerm:Architecture_artists rather than over the single artstorcontrolledTerm:Architecture_artists resource as in solution (1). Thirdly, it will require changes to both the Haystack and Longwell clients.

3. I'm not so clear on this, but I think we could use owl:hasValue, e.g.

    artstordata:UCSD001 artstor:subject artstorcontrolledTerm:Architecture_artists .

    artstorcontrolledTermClass:Architecture_artists
        rdf:type owl:Class ;
        rdfs:subClassOf artstorcontrolledTermClass:Architecture ;
        owl:equivalentClass [
            rdf:type owl:Restriction ;
            owl:onProperty artstor:subject ;
            owl:hasValue artstorcontrolledTerm:Architecture_artists
        ] .

which allows us to infer

    artstordata:UCSD001
        rdf:type artstorcontrolledTermClass:Architecture_artists ;
        rdf:type artstorcontrolledTermClass:Architecture .

Advantages: This works with standard OWL inference, so there is no need for custom rule processors.

Disadvantages: This adds complexity to the schema / ontology, as every controlled term is now represented by two URIs: one for the controlled term used as the property value, the other for the class representing the type.

As I see it, this is related to a standard modelling question in RDF, i.e. when should we do this:

    a b c ;
      d e ;
      f g .

and when should we do this:

    a rdf:type b_c ;
      rdf:type d_e ;
      rdf:type f_g .

MY RECOMMENDATION

As I am having to use a custom inference engine at the moment anyway, I think solution (1) is the easiest. The proposed RDF standard for thesauri

    http://www.w3c.rl.ac.uk/SWAD/rdfthes.html

also seems to be predicated on the existence of custom processors rather than leveraging OWL and RDFS. At some point we can migrate our vocabularies to use this standard, and then programs that use this standard can also use our (well, the Artstor) vocabularies.

Can anybody else think of any other alternatives here? Do people agree that solution (1) is the way forward, or does anyone have a strong preference for one of the other alternatives, and if so why?

Mark Butler
Research Scientist
HP Labs Bristol
http://www-uk.hpl.hp.com/people/marbut
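A minimal sketch of the kind of custom rule described in solution (1), for illustration only. It assumes triples are held as plain Python tuples of prefixed-name strings; the function names (superclasses, infer_broader_subjects) are hypothetical and are not part of Longwell, Haystack or any RDF toolkit. The rule it applies is: if x artstor:subject t and t rdfs:subClassOf u (directly or transitively), then x artstor:subject u.

```python
# Triples are (subject, predicate, object) tuples of prefixed names.
# The data mirrors the simplified examples (2) and (3) in the message.
SUBCLASS_OF = "rdfs:subClassOf"
SUBJECT = "artstor:subject"


def superclasses(term, triples):
    """Return every term reachable from `term` via rdfs:subClassOf."""
    found = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                frontier.append(o)
    return found


def infer_broader_subjects(triples, prop=SUBJECT):
    """Custom rule: (x prop t), (t rdfs:subClassOf u) => (x prop u).

    Returns only the triples that are not already asserted.
    """
    inferred = set()
    for s, p, o in triples:
        if p == prop:
            for broader in superclasses(o, triples):
                inferred.add((s, prop, broader))
    return inferred - set(triples)


data = [
    ("artstorcontrolledTerm:Architecture_artists", SUBCLASS_OF,
     "artstorcontrolledTerm:Architecture"),
    ("artstordata:UCSD001", SUBJECT,
     "artstorcontrolledTerm:Architecture_artists"),
]

new_triples = infer_broader_subjects(data)
for triple in sorted(new_triples):
    print(triple)
```

Because the rule is tied to one property (artstor:subject by default), it would have to be repeated or parameterised for each property that takes hierarchical controlled terms, which is the sense in which it is a custom processor rather than standard RDFS entailment.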
Received on Thursday, 5 February 2004 09:25:08 UTC