Re: MuSim Ontology problems (was Re: Share, Like Ontology) from Antoine Zimmermann on 2010-06-15 (public-lod@w3.org from June 2010)

From: Antoine Zimmermann <antoine.zimmermann@deri.org>
Date: Tue, 15 Jun 2010 18:45:52 +0100
To: Yves Raimond <yves.raimond@gmail.com>
CC: Bob Ferris <zazi@elbklang.net>, Kurt J <kurtjx@gmail.com>, Linked Data community <public-lod@w3.org>, Semantic Web <semantic-web@w3.org>, pedantic-web@googlegroups.com
Message-ID: <4C17BC50.3010801@deri.org>
Hello,

On 15/06/2010 17:35, Yves Raimond wrote:
> Hello!
>
>
>> owl:Class is defined as a subclass of rdfs:Class *in the OWL
>> specifications*.  The RDF/RDFS specification does not say anything about
>> owl:Class.  So, from a pure RDFS perspective, owl:Class has as much meaning
>> as, e.g., xyz:abc.  The fact that someone defines *somewhere* that xyz:abc
>> is a subclass of rdfs:Class is irrelevant from a pure RDFS system point of
>> view.  As I said in my example, a SPARQL query would not be able to retrieve
>> the OWL classes or properties that are not directly asserted as RDFS classes
>> or properties (unless the SPARQL engine implements part of the OWL spec,
>> which is rarely the case).
>>
>> Now, that's a small issue but there is no disadvantage of putting the
>> additional types, as far as I know.
>
> Frankly, I don't think this is something we need to change. Yes, the
> rdfs:subClassOf is in the OWL specification, so what? If you follow
> your nose, you'll end up in the RDFS world, so that's OK.

There is nothing wrong with not writing rdfs:Class (similarly to "there
is nothing wrong in writing webpages in HTML 4.01 as opposed to XHTML").
I only point out that declaring rdfs:Class (and rdf:Property) allows you
to play nicely with RDF/RDFS tools, such as SPARQL (similarly to "XHTML
plays nicely with XML tools").

You can obviously choose to ignore this, for various reasons. But here,
the changes are so simple that I don't see any problem in adding them.
Maybe you can tell me where the problem is?

As an example, consider the following SPARQL query:

SELECT ?classOrProp WHERE {
  {?classOrProp a rdfs:Class .} UNION {?classOrProp a rdf:Property .}
}

Most standard SPARQL implementations would return no result at all when
evaluating it against the Similarity ontology (or the Music Ontology).
SPARQL engines have to implement simple graph matching but have no 
obligation to implement any entailment regime. In practice, there are 
extremely few SPARQL engines that go beyond simple RDF and I'm not aware 
of any implementation of SPARQL with built-in OWL entailment. Moreover, 
some Linked Data people are not very concerned about OWL entailment and 
advocate the use of simple RDF/RDFS modelling and tool usage. Yet, they 
want to take advantage of the RDFS underlying an OWL ontology. Let us 
give them what they want (at almost no extra cost).
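
To make the point concrete, here is a minimal sketch in plain Python (no
RDF library). It is only an illustration under stated assumptions: the
triples are toy tuples, "musim:Similarity" is a hypothetical class name
used for the example, and the function names are mine. It mimics the
query above with simple graph matching, then with the rdfs9 rule applied
to the two subclass axioms taken from the OWL specification.

```python
# Toy triple store: plain tuples, no RDF library involved.
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Hypothetical excerpt of an ontology that only uses the OWL vocabulary.
ontology = {
    ("musim:Similarity", RDF_TYPE, "owl:Class"),
    ("musim:distance", RDF_TYPE, "owl:DatatypeProperty"),
}

# Axioms stated in the OWL specification; a pure RDF/RDFS tool does not load them.
owl_axioms = {
    ("owl:Class", SUBCLASS, "rdfs:Class"),
    ("owl:DatatypeProperty", SUBCLASS, "rdf:Property"),
}


def classes_and_properties(graph):
    """Simple graph matching for the UNION query: only asserted types match."""
    wanted = {"rdfs:Class", "rdf:Property"}
    return {s for (s, p, o) in graph if p == RDF_TYPE and o in wanted}


def with_subclass_entailment(graph):
    """Apply rule rdfs9 (type propagation along rdfs:subClassOf) to a fixpoint."""
    inferred = set(graph)
    while True:
        new = {
            (s, RDF_TYPE, sup)
            for (s, p, cls) in inferred if p == RDF_TYPE
            for (sub, q, sup) in inferred if q == SUBCLASS and sub == cls
        }
        if new <= inferred:
            return inferred
        inferred |= new


# Simple matching over the ontology alone finds nothing.
print(classes_and_properties(ontology))  # set()

# Once the OWL axioms and rdfs9 are taken into account, both terms show up.
entailed = with_subclass_entailment(ontology | owl_axioms)
print(sorted(classes_and_properties(entailed)))  # ['musim:Similarity', 'musim:distance']
```

Asserting the rdfs:Class and rdf:Property types directly in the ontology
gives the second result without requiring any reasoning from the engine.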


FYI, please note that FOAF defines all its classes and properties
with both the OWL vocabulary *and* the RDFS vocabulary.
I actually started to see the value of this practice because I
was wondering why there were these two Class types.


Cheers,
AZ.

>
>>
>>>>
>> [...skip...]
>>
>>>>> =====
>>>>> 4) musim:distance and musim:weight
>>>>> =====
>>>>> I notice that you are defining two datatype properties with multiple
>>>>> range
>>>>> restriction:
>>>>>
>>>>> :distance a owl:DatatypeProperty;
>>>>> rdfs:range xsd:float;
>>>>> rdfs:range xsd:int;
>>>>> rdfs:range xsd:double .
>>>>>
>>>>> and
>>>>>
>>>>> :weight a owl:DatatypeProperty;
>>>>> rdfs:range xsd:float;
>>>>> rdfs:range xsd:int;
>>>>> rdfs:range xsd:double .
>>>>>
>>>>> I'm quite sure that it is not what you intend to mean and I imagine
>>>>> that you
>>>>> would like to say that the weight or the distance can be either a
>>>>> float, a
>>>>> double or an int. Here you actually specify that the distance and the
>>>>> weight of something is necessarily a float, an int and a double at the
>>>>> same
>>>>> time.
>>>>>
>>>>> Furthermore, the OWL spec [1] says that:
>>>>>
>>>>> """As specified in XML Schema [XML Schema Datatypes], the value
>>>>> spaces of
>>>>> xsd:double, xsd:float, and xsd:decimal are pairwise disjoint."""
>>>>>
>>>>> This implies that :distance and :weight are in fact empty relations
>>>>> since it
>>>>> is impossible to have a value which is both a float and a double. Using
>>>>> :distance or :weight in the predicate position of any triple would
>>>>> make the
>>>>> knowledge base inconsistent.
>>>>>
>>>>> If you want to say that a distance or weight has to be in *one of*
>>>>> the three
>>>>> datatypes, you should rather say:
>>>>>
>>>>> :weight a owl:DatatypeProperty, rdf:Property;
>>>>> rdfs:range [ owl:unionOf ( xsd:float xsd:int xsd:double ) ] .
>
> Yes, you're right - it should be a union, not an intersection.
>
>>>>>
>>>>> However, I feel unsatisfied by this because it is slightly
>>>>> overconstraining.
>>>>> Why not allow xsd:decimal or even owl:real as well? Or untyped literals
>>>>> such as:
>>>>>
>>>>> ex:a :distance "1879.42" .
>>>>>
>>>>> I imagine that the value for such a distance will be computed
>>>>> automatically
>>>>> and the programme which does it will ensure that it is indeed a number.
>>>>>
>>>>
>>>> another rookie mistake i'm afraid! i think leaving the rdfs:range
>>>> unspecified perhaps makes the most sense - yes it is a common
>>>> occurrence to get a "NaN" distance in audio signal based similarity and
>>>> other similarity calculations.
>>>
>>> Here the issue is that the programme, which computes the number, knows
>>> of course that it is a number, but the reason to define it at least as a
>>> kind of number is to allow these values to be reused.
>>> I'm somehow satisfied with the restriction rdfs:range [ owl:unionOf (
>>> xsd:float xsd:int xsd:double ) ], because it is a well-defined range,
>>> which expresses that the values are numbers. I can't really imagine other
>>> values that might be used here. The XSD namespace is a kind of best
>>> practice for defining the Datatypes.
>>
>> Reusing the value would be straightforward. In practice, the value will be
>> computed in such a way that it is a number (or maybe "NaN", if relevant) and
>> will most likely be given a datatype. In the end, the data will contain
>> something like:
>>
>> ex:sim :distance "389.009"^^xsd:float .
>>
>> There is no problem reusing this value, regardless of the range definition.
>> However, *if* the range constraint is maintained as you suggest, the
>> following triples would be each individually inconsistent wrt the ontology:
>>
>> ex:sim :distance "389.009" .
>> ex:sim :distance "NaN" .
>> ex:sim :distance "389.009"^^xsd:decimal .
>> ex:sim :distance "389.009"^^owl:real .
>> ex:sim :distance "0fb7"^^xsd:hexBinary .
>> ex:sim :distance "6z2b76aa"^^xsd:base64Binary .
>>
>> Yet, it's easy to make a programme that deals equally well with all these
>> values, whereas it is difficult to ensure that everybody will use the three
>> datatypes mentioned in the range assertion.
>>
>> In the absence of range assertion, such values as:
>>
>> ex:sim :distance "very similar" .
>> ex:sim :distance "+++"^^xsd:string .
>>
>> would be consistent wrt the ontology but they can be simply ignored by any
>> programme using these values. In the presence of the range assertion, these
>> triples would be inconsistent wrt the ontology, but this does not prevent
>> anybody from writing them, so they would have to be dealt with somehow too.
>>
>
> This discussion reminds me a bit of
> http://www.w3.org/DesignIssues/InterpretationProperties.html.
> I don't think there's anything wrong with either approach. It is
> perfectly ok to put such a range constraint in the ontology. When
> building an aggregator of MuSim data, it is much easier to know what
> to expect (and what to eventually reject - in case it is inconsistent
> wrt the ontology) rather than committing to support all possible
> datatypes! On the other hand, it's fine to leave it open - you gain in
> flexibility.
>
> Kind regards,
> y
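
For the range discussion quoted above, the difference between several
rdfs:range statements (an intersection) and a single owl:unionOf range
can be sketched as follows. This is a deliberately simplified model and
not how a reasoner works: a literal is represented only by its datatype
name, the derivation chain is limited to the one for xsd:int, the value
spaces of the primitive types are assumed pairwise disjoint as in the
OWL quote, and the helper names are hypothetical.

```python
# Simplified derivation hierarchy: each datatype maps to its base type.
BASE = {
    "xsd:int": "xsd:long",
    "xsd:long": "xsd:integer",
    "xsd:integer": "xsd:decimal",
}

# The three datatypes named in the range assertions under discussion.
RANGES = ["xsd:float", "xsd:int", "xsd:double"]


def value_spaces(datatype):
    """Datatypes whose value space contains values of `datatype` (itself and its bases)."""
    spaces = {datatype}
    while datatype in BASE:
        datatype = BASE[datatype]
        spaces.add(datatype)
    return spaces


def satisfies_all(datatype, ranges):
    """Several rdfs:range statements: the value must lie in every range."""
    return all(r in value_spaces(datatype) for r in ranges)


def satisfies_any(datatype, ranges):
    """A single owl:unionOf range: the value must lie in at least one member."""
    return any(r in value_spaces(datatype) for r in ranges)


for dt in ["xsd:float", "xsd:int", "xsd:double", "xsd:decimal"]:
    print(dt, satisfies_all(dt, RANGES), satisfies_any(dt, RANGES))
```

In this model no datatype satisfies the three separate range statements,
the union accepts float, int and double, and xsd:decimal is rejected by
both, which is the over-constraining effect mentioned in the thread.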


-- 
Antoine Zimmermann
Post-doctoral researcher at:
Digital Enterprise Research Institute
National University of Ireland, Galway
IDA Business Park
Lower Dangan
Galway, Ireland
antoine.zimmermann@deri.org
http://vmgal34.deri.ie/~antzim/
Received on Tuesday, 15 June 2010 17:46:34 UTC