Re: Handling multiple rdfs:ranges

Ross,


A few comments below.

On 23/02/2016 10:24, Ross Horne wrote:
> Hi AZ,
>
> I agree with your analysis of Bioportal. So would the official line in
> this situation be to encourage the style in the provenance ontology, and
> exert caution when performing RDFS inference in Bioportal?
>
> Let me follow up your nice example in the hypothetical "Bioportal" style
> definition:
>>    "Where P has more than one rdfs:range property, then the resources
>> denoted by the objects of triples with predicate P are instances of
>> *some* class stated by the rdfs:range properties."
>
> You are exactly right with your subclass example. Under the hypothetical
> *some* definition above, properties become more accommodating as more
> schema information is discovered rather than more restrictive.
> Discovering ex:myProperty  rdfs:domain  ex:Female  would just confirm
> something that is already known, i.e. that ex:myProperty can be used with
> ex:Person  including females. Notice that this analysis assumes that
>   ex:Female is a sub class of ex:Person is part of the ontology.

There is no assumption about whether ex:Person and ex:Female are related
via a subclass relation. In fact, in my example, I expected
ex:Female to be interpreted as the class of all individuals that have the
female gender, such as a female dog, a female lizard, or a female human. 
The imaginary ex:myProperty would only apply to the females of the human 
species.
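
To illustrate, here is a minimal sketch in Python of the RDFS domain rule
(pattern rdfs2 in RDF 1.1 Semantics) applied to a property with two declared
domains. It assumes triples are plain tuples of strings; ex:alice and
ex:something are made-up names for the illustration, and no subclass triple
between ex:Person and ex:Female is present.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_domain_types(triples):
    """Apply the RDFS domain rule once.

    If (P rdfs:domain C) and (S P O) are both present,
    then (S rdf:type C) is inferred, for every declared C.
    """
    # Collect all declared domains per property.
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)

    # Every subject using the property gets *all* of its domain classes.
    inferred = set()
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


# Hypothetical data: ex:alice and ex:something are assumptions.
graph = {
    ("ex:myProperty", RDFS_DOMAIN, "ex:Person"),
    ("ex:myProperty", RDFS_DOMAIN, "ex:Female"),
    ("ex:alice", "ex:myProperty", "ex:something"),
}

types = infer_domain_types(graph)
for triple in sorted(types):
    print(triple)
```

The subject ends up typed with both classes, that is, in their intersection,
without the two classes being related to each other in any way.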


> However, if ex:myProperty  rdfs:domain  ex:Place  was discovered, where
> ex:Place and ex:Person are not related by the sub class relation, then
> we discover something new. In particular, we discover that instances of
> both ex:Person and ex:Place may appear in the subject position of a
> triple with property ex:myProperty.

If we find that ex:Place is also a domain of myProperty, then anything 
that has this property is a person, is female, and is a place. That's 
all, and that's unrelated to what may or may not appear in triples. To make
my point clearer, consider rdfs:range, rather than rdfs:domain. Take for
example the character string "AZ". RDF 1.1 Semantics tells us that this 
string has the types rdfs:Literal and xsd:string. I may also find 
somewhere such things as:

:i  ex:prop  "AZ" .
ex:prop rdfs:range ex:TwoLettersString .

from which I can infer that "AZ" is an ex:TwoLettersString. However, this does
not say that "AZ" may appear in the subject position of a triple because 
as a matter of fact, it cannot.
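
The same point can be sketched in Python, assuming a tiny hand-rolled
representation in which literals are wrapped in a hypothetical Literal type
and inferred class memberships are kept in a separate mapping instead of
being written back as triples. This is only an illustration of the RDFS
range rule (pattern rdfs3 in RDF 1.1 Semantics), not a full RDFS reasoner.

```python
from typing import NamedTuple

RDFS_RANGE = "rdfs:range"


class Literal(NamedTuple):
    """Hypothetical wrapper marking a value as an RDF literal."""
    lexical: str


def infer_range_types(triples):
    """Map each object to the classes implied by rdfs:range.

    If (P rdfs:range C) and (S P O) are both present,
    then O is an instance of C, for every declared C.
    """
    # Collect all declared ranges per property.
    ranges = {}
    for s, p, o in triples:
        if p == RDFS_RANGE:
            ranges.setdefault(s, set()).add(o)

    # Record class membership without creating any new triple.
    types = {}
    for s, p, o in triples:
        for cls in ranges.get(p, ()):
            types.setdefault(o, set()).add(cls)
    return types


graph = [
    (":i", "ex:prop", Literal("AZ")),
    ("ex:prop", RDFS_RANGE, "ex:TwoLettersString"),
]

range_types = infer_range_types(graph)

# The literal gained a type, yet it never occurs in subject position.
literal_subjects = [s for s, _, _ in graph if isinstance(s, Literal)]
print(range_types)
print(literal_subjects)
```

What is inferred is the class membership of the denoted value; the set of
triples in which the literal occurs is left untouched.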


--AZ

>
> My follow up question is: whether anyone knows whether the more
> accommodating inference, as implied by Bioportal, was ever discussed
> during the RDFS standardisation process; and if so, why the more
> restrictive definition for multiple domains and ranges was chosen.
>
> I suspect this question has a simple explanation in model theory, which
> is why I also copy Pat.
>
> Best regards,
>
> Ross
>
>
>
> On 23 February 2016 at 16:37, Antoine Zimmermann
> <antoine.zimmermann@emse.fr <mailto:antoine.zimmermann@emse.fr>> wrote:
>
>     Ross,
>
>
>     The conclusion here is that Bioportal wrongly uses rdfs:domain. The
>     provenance ontology uses it correctly, and if DBpedia does not have
>     multiple domains or ranges, then no problem.
>
>     There are certainly many more mistaken datasets in this respect,
>     as there are many other kinds of errors in datasets. There are also
>     many misinterpretations of HTML markups, mistakes in CSS files, and
>     in fact, all Web standards are misused to some extent. If the wrong
>     use of multiple domains / ranges was largely predominant, it would
>     be a source of concern for the standardisation groups of future
>     versions of RDF. But your observations in your email are not
>     sufficient to indicate that.
>
>     In any case, your suggestion:
>
>     >    "Where P has more than one rdfs:range property, then the resources
>     > denoted by the objects of triples with predicate P are instances of
>     > *some* class stated by the rdfs:range properties."
>
>     would not work well with the inherent incompleteness of knowledge on
>     the Web and with the distributed nature of Web data. If I see:
>
>     ex:myProperty  rdfs:domain  ex:Person .
>
>     somewhere on the Web, I would like to conclude something about those
>     individuals who have the property ex:myProperty. Then I may find the
>     following:
>
>     ex:myProperty  rdfs:domain  ex:Female .
>
>     Now I know more than before, so I should infer more about those who
>     have the property. With your suggestion, every time I would know
>     more about the domain of a property, I would know less about those
>     who have the property.
>
>
>     Best,
>     AZ
>
>
>     On 23/02/2016 03:36, Ross Horne wrote:
>
>         Hi All,
>
>         I'm wondering if many people here use multiple
>         rdfs:domain/rdfs:range
>         properties in RDF Schema?
>
>         The RDF Schema spec is clearly worded: "Where P has more than one
>         rdfs:range property, then the resources denoted by the objects of
>         triples with predicate P are instances of *all* the classes
>         stated by
>         the rdfs:range properties." [similarly for rdfs:domain]
>
>         However, this doesn't quite match the usage of multiple
>         rdfs:domain/rdfs:range properties in several popular datasets.
>
>         For example, in Bioportal, the property bpo:has_event has three
>         classes
>         indicated as its domain: bpo:person, bpo:event and
>         bpo:disease_or_disorder. Following the wording of the spec, it would
>         appear that any resource that appears in the subject position of a
>         triple with property bpo:has_event is an instance of all three types
>         bpo:person, bpo:event and bpo:disease_or_disorder. However,
>         common sense
>         says that the resource cannot simultaneously be a person, event
>         and disease.
>
>         Elsewhere, the provenance ontology avoids the problem by explicitly
>         using owl:unionOf. For example, prov:wasInfluencedBy has
>         rdfs:range such
>         that it is the owl:unionOf the classes prov:Activity, prov:Agent and
>         prov:Entity. DBpedia avoids the problem entirely, since I cannot
>         find
>         any multiple rdfs:domain/rdfs:range properties in their ontologies.
>
>         The interpretation of multiple rdfs:range properties in the above
>         datasets, either implicitly or explicitly, implies an alternative
>         spec such as:
>
>             "Where P has more than one rdfs:range property, then the
>         resources
>         denoted by the objects of triples with predicate P are instances of
>         *some* class stated by the rdfs:range properties."
>
>         I'm wondering whether anyone else has observed this mismatch
>         between the
>         spec and real world datasets; and what the official line would be on
>         avoiding this conflict?
>
>         Regards,
>
>         Ross
>
>
>         Note I'm using the following prefixes in examples:
>         bpo: <http://www.semanticweb.org/ontologies/2010/10/BPO.owl#>
>         prov: <http://www.w3.org/ns/prov#>
>         rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>
>
>

Received on Tuesday, 23 February 2016 13:26:28 UTC