- From: Ross Horne <ross.horne@gmail.com>
- Date: Tue, 23 Feb 2016 17:24:21 +0800
- To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
- Cc: Semantic Web <semantic-web@w3.org>, Pat Hayes <phayes@ihmc.us>
- Message-ID: <CAHBrK_jUikdq1-58y6bGgXay1JS7+AfAjmokEaC8xOir8msHOQ@mail.gmail.com>
Hi AZ,

I agree with your analysis of Bioportal. So would the official line in this situation be to encourage the style used in the provenance ontology, and to exercise caution when performing RDFS inference over Bioportal?

Let me follow up your nice example under the hypothetical "Bioportal"-style definition:

> "Where P has more than one rdfs:range property, then the resources
> denoted by the objects of triples with predicate P are instances of
> *some* class stated by the rdfs:range properties."

You are exactly right with your subclass example. Under the hypothetical *some* definition above, properties become more accommodating as more schema information is discovered, rather than more restrictive. Discovering

  ex:myProperty rdfs:domain ex:Female .

would just confirm something that is already known, i.e. that ex:myProperty can be used with ex:Person, including females. Notice that this analysis assumes the ontology states that ex:Female is a subclass of ex:Person.

However, if

  ex:myProperty rdfs:domain ex:Place .

were discovered, where ex:Place and ex:Person are not related by the subclass relation, then we discover something new. In particular, we discover that instances of both ex:Person and ex:Place may appear in the subject position of a triple with property ex:myProperty.

My follow-up question is whether anyone knows if the more accommodating inference, as implied by Bioportal, was ever discussed during the RDFS standardisation process; and if so, why the more restrictive definition for multiple domains and ranges was chosen. I suspect this question has a simple explanation in model theory, which is why I also copy Pat.

Best regards,

Ross

On 23 February 2016 at 16:37, Antoine Zimmermann <antoine.zimmermann@emse.fr> wrote:

> Ross,
>
> The conclusion here is that Bioportal wrongly uses rdfs:domain. The
> provenance ontology uses it correctly, and if DBpedia does not have
> multiple domains or ranges, then no problem.
>
> There are certainly many more mistaken datasets with this respect, as
> there are many other kinds of errors in datasets. There are also many
> misinterpretations of HTML markups, mistakes in CSS files, and in fact, all
> Web standards are misused to some extent. If the wrong use of multiple
> domains / ranges was largely predominant, it would be a source of concern
> for the standardisation groups of future versions of RDF. But your
> observations in your email are not sufficient to indicate that.
>
> In any case, your suggestion:
>
> > "Where P has more than one rdfs:range property, then the resources
> > denoted by the objects of triples with predicate P are instances of
> > *some* class stated by the rdfs:range properties."
>
> would not work well with the inherent incompleteness of knowledge on the
> Web and with the distributed nature of Web data. If I see:
>
>   ex:myProperty rdfs:domain ex:Person .
>
> somewhere on the Web, I would like to conclude something about those
> individuals who have the property ex:myProperty. Then I may find the
> following:
>
>   ex:myProperty rdfs:domain ex:Female .
>
> Now I know more than before, so I should infer more about those who have
> the property. With your suggestion, every time I would know more about the
> domain of a property, I would know less about those who have the property.
>
> Best,
> AZ
>
> On 23/02/2016 03:36, Ross Horne wrote:
>
>> Hi All,
>>
>> I'm wondering if many people here use multiple rdfs:domain/rdfs:range
>> properties in RDF Schema?
>>
>> The RDF Schema spec is clearly worded: "Where P has more than one
>> rdfs:range property, then the resources denoted by the objects of
>> triples with predicate P are instances of *all* the classes stated by
>> the rdfs:range properties." [similarly for rdfs:domain]
>>
>> However, this doesn't quite match the usage of multiple
>> rdfs:domain/rdfs:range properties in several popular datasets.
>>
>> For example, in Bioportal, the property bpo:has_event has three classes
>> indicated as its domain: bpo:person, bpo:event and
>> bpo:disease_or_disorder. Following the wording of the spec, it would
>> appear that any resource that appears in the subject position of a
>> triple with property bpo:has_event is an instance of all three types
>> bpo:person, bpo:event and bpo:disease_or_disorder. However, common sense
>> says that the resource cannot simultaneously be a person, event and
>> disease.
>>
>> Elsewhere, the provenance ontology avoids the problem by explicitly
>> using owl:unionOf. For example, prov:wasInfluencedBy has rdfs:range such
>> that it is the owl:unionOf the classes prov:Activity, prov:Agent and
>> prov:Entity. DBpedia avoids the problem entirely, since I cannot find
>> any multiple rdfs:domain/rdfs:range properties in their ontologies.
>>
>> The interpretation of multiple rdfs:range properties in the above
>> datasets, either implicitly or explicitly imply an alternative spec such
>> as:
>>
>> "Where P has more than one rdfs:range property, then the resources
>> denoted by the objects of triples with predicate P are instances of
>> *some* class stated by the rdfs:range properties."
>>
>> I'm wondering whether anyone else has observed this mismatch between the
>> spec and real world datasets; and what the official line would be on
>> avoiding this conflict?
>>
>> Regards,
>>
>> Ross
>>
>> Note I'm using the following prefixes in examples:
>> bpo: <http://www.semanticweb.org/ontologies/2010/10/BPO.owl#>
>> prov: <http://www.w3.org/ns/prov#>
>> rdfs: <http://www.w3.org/2000/01/rdf-schema#>
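
To make the two readings discussed in this thread concrete, here is a minimal sketch in Python. It is only an illustration: triples are plain tuples of strings, the function names (declared_domains, infer_all, infer_some) and the resources ex:alice and ex:bob are made up for the example, and no RDF library is involved. infer_all follows the spec's *all* wording (the RDFS domain entailment rule); infer_some models the hypothetical *some* reading as a set of candidate classes per subject.

```python
# Illustrative sketch only: triples are plain tuples of strings, and the
# function names below are hypothetical, not part of any RDF library.

RDFS_DOMAIN = "rdfs:domain"
RDF_TYPE = "rdf:type"


def declared_domains(schema, prop):
    """Return every class stated as an rdfs:domain of `prop`."""
    return {o for (s, p, o) in schema if s == prop and p == RDFS_DOMAIN}


def infer_all(schema, data):
    """Spec reading: a subject is an instance of *all* declared domains."""
    return {
        (s, RDF_TYPE, cls)
        for (s, p, _o) in data
        for cls in declared_domains(schema, p)
    }


def infer_some(schema, data):
    """Hypothetical reading: a subject is an instance of *some* declared domain.

    Maps each (subject, property) pair to its set of candidate classes; a
    definite type is known only when that set has exactly one member.
    """
    result = {}
    for (s, p, _o) in data:
        classes = declared_domains(schema, p)
        if classes:
            result[(s, p)] = classes
    return result


# Made-up data: one triple using ex:myProperty.
data = {("ex:alice", "ex:myProperty", "ex:bob")}
schema_before = {("ex:myProperty", RDFS_DOMAIN, "ex:Person")}
schema_after = schema_before | {("ex:myProperty", RDFS_DOMAIN, "ex:Place")}

# "All" reading: more schema yields more type triples (monotonic).
all_before = infer_all(schema_before, data)
all_after = infer_all(schema_after, data)

# "Some" reading: more schema yields a wider set of candidates, so the
# definite conclusion "ex:alice is an ex:Person" is no longer available.
some_before = infer_some(schema_before, data)
some_after = infer_some(schema_after, data)
```

Under the *all* reading, every conclusion in all_before is still in all_after, which matches AZ's point that knowing more about a domain should let you infer more. Under the *some* reading the candidate set for ex:alice only grows, which is the "more accommodating" behaviour described above.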
Received on Tuesday, 23 February 2016 09:24:50 UTC