Re: Handling multiple rdfs:ranges from Ross Horne on 2016-02-23 (semantic-web@w3.org from February 2016)

From: Ross Horne <ross.horne@gmail.com>
Date: Tue, 23 Feb 2016 17:24:21 +0800
To: Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: Semantic Web <semantic-web@w3.org>, Pat Hayes <phayes@ihmc.us>
Message-ID: <CAHBrK_jUikdq1-58y6bGgXay1JS7+AfAjmokEaC8xOir8msHOQ@mail.gmail.com>
Hi AZ,

I agree with your analysis of Bioportal. So would the official line in this
situation be to encourage the style in the provenance ontology, and exert
caution when performing RDFS inference in Bioportal?

Let me follow up on your nice example using the hypothetical
"Bioportal"-style definition:
>    "Where P has more than one rdfs:range property, then the resources
> denoted by the objects of triples with predicate P are instances of
> *some* class stated by the rdfs:range properties."

You are exactly right with your subclass example. Under the hypothetical
*some* definition above, properties become more accommodating, rather than
more restrictive, as more schema information is discovered. Discovering
ex:myProperty  rdfs:domain  ex:Female  would just confirm something that is
already known, i.e. that ex:myProperty can be used with any ex:Person,
including females. Notice that this analysis assumes that the ontology
states that ex:Female is a subclass of ex:Person.

However, if  ex:myProperty  rdfs:domain  ex:Place  was discovered, where
ex:Place and ex:Person are not related by the subclass relation, then we
discover something new. In particular, we discover that instances of both
ex:Person and ex:Place may appear in the subject position of a triple with
property ex:myProperty.
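
To make the contrast concrete, here is a minimal Python sketch of the two
readings. It is only an illustration under my own assumptions, not a
reasoner: the names SUBCLASS_OF, superclasses, entailed_types_all and
admitted_classes_some are hypothetical, and the toy schema contains just the
subclass statement assumed above.

```python
# Toy schema (assumption): ex:Female rdfs:subClassOf ex:Person.
SUBCLASS_OF = {"ex:Female": {"ex:Person"}}


def superclasses(cls):
    """Return cls together with all of its transitive superclasses."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(SUBCLASS_OF.get(current, ()))
    return seen


def entailed_types_all(domains):
    """RDFS reading: a subject of P is an instance of *all* declared domains
    (and of their superclasses)."""
    types = set()
    for domain in domains:
        types |= superclasses(domain)
    return types


def admitted_classes_some(domains):
    """Hypothetical *some* reading: a subject need only belong to one declared
    domain, so a domain that is a subclass of another declared domain adds
    nothing. Only the most general declared domains are kept."""
    return {
        d for d in domains
        if not any(o != d and o in superclasses(d) for o in domains)
    }


before = {"ex:Person"}
with_female = before | {"ex:Female"}
with_place = before | {"ex:Place"}

# Under the *some* reading, ex:Female changes nothing; ex:Place widens P.
some_female = admitted_classes_some(with_female)
some_place = admitted_classes_some(with_place)

# Under the RDFS *all* reading, every extra domain adds an entailed type.
all_female = entailed_types_all(with_female)
all_place = entailed_types_all(with_place)
```

In this sketch, adding ex:Female leaves the set of admitted classes at
{ex:Person} under the *some* reading, while adding ex:Place widens it to
{ex:Person, ex:Place}. Under the *all* reading, each new domain instead adds
a type that every subject of ex:myProperty is entailed to have.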

My follow-up question is whether anyone knows if the more accommodating
inference, as implied by Bioportal, was ever discussed during the RDFS
standardisation process and, if so, why the more restrictive definition
for multiple domains and ranges was chosen.

I suspect this question has a simple explanation in model theory, which is
why I also copy Pat.

Best regards,

Ross



On 23 February 2016 at 16:37, Antoine Zimmermann
<antoine.zimmermann@emse.fr> wrote:

> Ross,
>
>
> The conclusion here is that Bioportal wrongly uses rdfs:domain. The
> provenance ontology uses it correctly, and if DBpedia does not have
> multiple domains or ranges, then no problem.
>
> There are certainly many more mistaken datasets in this respect, as
> there are many other kinds of errors in datasets. There are also many
> misinterpretations of HTML markups, mistakes in CSS files, and in fact, all
> Web standards are misused to some extent. If the wrong use of multiple
> domains / ranges was largely predominant, it would be a source of concern
> for the standardisation groups of future versions of RDF. But your
> observations in your email are not sufficient to indicate that.
>
> In any case, your suggestion:
>
> >    "Where P has more than one rdfs:range property, then the resources
> > denoted by the objects of triples with predicate P are instances of
> > *some* class stated by the rdfs:range properties."
>
> would not work well with the inherent incompleteness of knowledge on the
> Web and with the distributed nature of Web data. If I see:
>
> ex:myProperty  rdfs:domain  ex:Person .
>
> somewhere on the Web, I would like to conclude something about those
> individuals who have the property ex:myProperty. Then I may find the
> following:
>
> ex:myProperty  rdfs:domain  ex:Female .
>
> Now I know more than before, so I should infer more about those who have
> the property. With your suggestion, every time I would know more about the
> domain of a property, I would know less about those who have the property.
>
>
> Best,
> AZ
>
>
> On 23/02/2016 03:36, Ross Horne wrote:
>
>> Hi All,
>>
>> I'm wondering if many people here use multiple rdfs:domain/rdfs:range
>> properties in RDF Schema?
>>
>> The RDF Schema spec is clearly worded: "Where P has more than one
>> rdfs:range property, then the resources denoted by the objects of
>> triples with predicate P are instances of *all* the classes stated by
>> the rdfs:range properties." [similarly for rdfs:domain]
>>
>> However, this doesn't quite match the usage of multiple
>> rdfs:domain/rdfs:range properties in several popular datasets.
>>
>> For example, in Bioportal, the property bpo:has_event has three classes
>> indicated as its domain: bpo:person, bpo:event and
>> bpo:disease_or_disorder. Following the wording of the spec, it would
>> appear that any resource that appears in the subject position of a
>> triple with property bpo:has_event is an instance of all three types
>> bpo:person, bpo:event and bpo:disease_or_disorder. However, common sense
>> says that the resource cannot simultaneously be a person, event and
>> disease.
>>
>> Elsewhere, the provenance ontology avoids the problem by explicitly
>> using owl:unionOf. For example, prov:wasInfluencedBy has rdfs:range such
>> that it is the owl:unionOf the classes prov:Activity, prov:Agent and
>> prov:Entity. DBpedia avoids the problem entirely, since I cannot find
>> any multiple rdfs:domain/rdfs:range properties in their ontologies.
>>
>> The interpretation of multiple rdfs:range properties in the above
>> datasets, either implicitly or explicitly, implies an alternative spec
>> such as:
>>
>>    "Where P has more than one rdfs:range property, then the resources
>> denoted by the objects of triples with predicate P are instances of
>> *some* class stated by the rdfs:range properties."
>>
>> I'm wondering whether anyone else has observed this mismatch between the
>> spec and real world datasets; and what the official line would be on
>> avoiding this conflict?
>>
>> Regards,
>>
>> Ross
>>
>>
>> Note I'm using the following prefixes in examples:
>> bpo: <http://www.semanticweb.org/ontologies/2010/10/BPO.owl#>
>> prov: <http://www.w3.org/ns/prov#>
>> rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>
>
>
Received on Tuesday, 23 February 2016 09:24:50 UTC