Re: MuSim Ontology problems (was Re: Share, Like Ontology)

On 14/06/2010 15:26, Bob Ferris wrote:
> Hi Antoine,
> Hi Kurt,
> Hi all from the different lists,
>
> On 13.06.2010 22:13, Kurt J wrote:
>> Hi Antoine,

[...skip...]

>>> =====
>>> 2) owl:Class VS rdfs:Class; owl:*Property VS rdf:Property
>>> =====
>>> All your classes and properties are declared using the OWL vocabulary.
>>> It would be good to have, *in addition* to this, a declared type
>>> rdfs:Class and rdf:Property, like this:
>>>
>>> :Similarity a owl:Class, rdfs:Class;
>>> rdfs:label "...";
>>> rdfs:subClassOf ... etc.
>>
>> done, thnx!
>>
>
> I do not really understand the need for rdfs:Class:
> owl:Class is already defined with rdfs:subClassOf rdfs:Class (same thing
> for the properties). So it is a transitivity issue, and it depends on
> the reasoner used to resolve it.

owl:Class is defined as a subclass of rdfs:Class *in the OWL
specifications*.  The RDF/RDFS specification does not say anything about
owl:Class.  So, from a pure RDFS perspective, owl:Class has as much
meaning as, e.g., xyz:abc.  The fact that someone defines *somewhere*
that xyz:abc is a subclass of rdfs:Class is irrelevant from the point of
view of a pure RDFS system.  As I said in my example, a SPARQL query
would not be able to retrieve the OWL classes or properties that are not
directly asserted as RDFS classes or properties (unless the SPARQL engine
implements part of the OWL spec, which is rarely the case).
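
To illustrate, here is a minimal sketch in Python, without any RDF
library, of what a plain pattern match sees.  The toy graph, the class
:OtherClass, the helper match_type and its with_owl_axiom flag are all
made up for the illustration; a real SPARQL engine is of course more
involved.

```python
# Toy graph: triples as (subject, predicate, object) tuples of prefixed names.
# :OtherClass is a hypothetical class, used only for contrast.
GRAPH = {
    (":Similarity", "rdf:type", "owl:Class"),    # OWL typing only
    (":OtherClass", "rdf:type", "owl:Class"),
    (":OtherClass", "rdf:type", "rdfs:Class"),   # explicit RDFS typing as well
}

# The axiom stated in the OWL specification; it is NOT part of the data.
OWL_AXIOM = ("owl:Class", "rdfs:subClassOf", "rdfs:Class")


def match_type(graph, cls, with_owl_axiom=False):
    """Return subjects typed as `cls`, roughly SELECT ?s WHERE { ?s a cls }."""
    triples = set(graph)
    if with_owl_axiom:
        triples.add(OWL_AXIOM)
        # One subclass rule: x type C, C subClassOf D => x type D.
        # It is applied once, which is enough for this toy graph.
        inferred = {
            (s, "rdf:type", d)
            for (s, p, c) in triples if p == "rdf:type"
            for (c2, p2, d) in triples if p2 == "rdfs:subClassOf" and c2 == c
        }
        triples |= inferred
    return sorted(s for (s, p, o) in triples if p == "rdf:type" and o == cls)


plain = match_type(GRAPH, "rdfs:Class")
entailed = match_type(GRAPH, "rdfs:Class", with_owl_axiom=True)
print(plain)
print(entailed)
```

Without the axiom and the rule, only the class that carries the explicit
rdfs:Class type is found; :Similarity shows up only once the system knows
the OWL axiom and applies the subclass rule.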

Now, that's a small issue, but there is no disadvantage in adding the
additional types, as far as I know.

>>
[...skip...]

>>> =====
>>> 4) musim:distance and musim:weight
>>> =====
>>> I notice that you are defining two datatype properties with multiple
>>> range restrictions:
>>>
>>> :distance a owl:DatatypeProperty;
>>> rdfs:range xsd:float;
>>> rdfs:range xsd:int;
>>> rdfs:range xsd:double .
>>>
>>> and
>>>
>>> :weight a owl:DatatypeProperty;
>>> rdfs:range xsd:float;
>>> rdfs:range xsd:int;
>>> rdfs:range xsd:double .
>>>
>>> I'm quite sure that it is not what you intend to mean and I imagine
>>> that you would like to say that the weight or the distance can be
>>> either a float, a double or an int. Here you actually specify that the
>>> distance and the weight of something is necessarily a float, an int and
>>> a double at the same time.
>>>
>>> Furthermore, the OWL spec [1] says that:
>>>
>>> """As specified in XML Schema [XML Schema Datatypes], the value
>>> spaces of xsd:double, xsd:float, and xsd:decimal are pairwise
>>> disjoint."""
>>>
>>> This implies that :distance and :weight are in fact empty relations,
>>> since it is impossible to have a value which is both a float and a
>>> double. Using :distance or :weight in the predicate position of any
>>> triple would make the knowledge base inconsistent.
>>>
>>> If you want to say that a distance or weight has to be in *one of*
>>> the three datatypes, you should rather say:
>>>
>>> :weight a owl:DatatypeProperty, rdf:Property;
>>> rdfs:range [ owl:unionOf ( xsd:float xsd:int xsd:double ) ] .
>>>
>>> However, I feel unsatisfied by this because it is slightly
>>> overconstraining. Why not allow xsd:decimal or even owl:real as well?
>>> Or untyped literals such as:
>>>
>>> ex:a :distance "1879.42" .
>>>
>>> I imagine that the value for such a distance will be computed
>>> automatically, and the programme which does it will ensure that it is
>>> indeed a number.
>>>
>>
>> another rookie mistake i'm afraid! i think leaving the rdfs:range
>> unspecified perhaps makes the most sense - yes it is a common
>> occurrence to get a "NaN" distance in audio signal based similarity and
>> other similarity calculations.
>
> Here the issue is that the programme which computes the number knows,
> of course, that it is a number, but the reason to define it at least as a
> kind of number is so that these values can be reused.
> I'm fairly satisfied with the restriction rdfs:range [ owl:unionOf (
> xsd:float xsd:int xsd:double ) ], because it is a well-defined range
> which expresses that the values are numbers. I can't really imagine other
> values that might be used here. The XSD namespace is a kind of best
> practice for defining datatypes.

Reusing the value would be straightforward. In practice, the value will 
be computed in such a way that it is a number (or maybe "NaN", if 
relevant) and will most likely be given a datatype. In the end, the data 
will contain something like:

ex:sim :distance "389.009"^^xsd:float .

There is no problem reusing this value, regardless of the range
definition.  However, *if* the range constraint is maintained as you
suggest, each of the following triples would individually be
inconsistent with respect to the ontology:

ex:sim :distance "389.009" .
ex:sim :distance "NaN" .
ex:sim :distance "389.009"^^xsd:decimal .
ex:sim :distance "389.009"^^owl:real .
ex:sim :distance "0fb7"^^xsd:hexBinary .
ex:sim :distance "6z2b76aa"^^xsd:base64Binary .

Yet, it's easy to make a programme that deals equally well with all 
these values, whereas it is difficult to ensure that everybody will use 
the three datatypes mentioned in the range assertion.

In the absence of a range assertion, such values as:

ex:sim :distance "very similar" .
ex:sim :distance "+++"^^xsd:string .

would be consistent with respect to the ontology, but they can simply be
ignored by any programme using these values. In the presence of the range
assertion, these triples would be inconsistent with respect to the
ontology, but this does not prevent anybody from writing them, so they
would have to be dealt with somehow too.
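
As a rough illustration of such a programme, here is a minimal sketch in
Python.  The helper names as_number and usable_distances are hypothetical,
and two choices are assumptions of mine: the datatype is ignored and only
the lexical form is parsed, and "NaN" is dropped rather than kept (a
consumer may well prefer to keep it).  Literals with binary datatypes
would need their own decoding, which is left out here.

```python
import math

# (lexical form, datatype or None for a plain literal), from the examples above.
VALUES = [
    ("389.009", "xsd:float"),
    ("389.009", None),
    ("NaN", None),
    ("389.009", "xsd:decimal"),
    ("very similar", None),
    ("+++", "xsd:string"),
]


def as_number(lexical):
    """Return the lexical form as a float, or None if it does not parse."""
    try:
        return float(lexical.strip())
    except ValueError:
        return None


def usable_distances(values):
    """Keep finite numbers; drop NaN and anything that does not parse."""
    numbers = (as_number(lexical) for lexical, _datatype in values)
    return [n for n in numbers if n is not None and math.isfinite(n)]


distances = usable_distances(VALUES)
print(distances)
```

The non-numeric literals are skipped in the same way whether or not the
ontology declares a range, which is the point: the consumer has to filter
its input in either case.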


Regards,
-- 
Antoine Zimmermann
Post-doctoral researcher at:
Digital Enterprise Research Institute
National University of Ireland, Galway
IDA Business Park
Lower Dangan
Galway, Ireland
antoine.zimmermann@deri.org
http://vmgal34.deri.ie/~antzim/

Received on Monday, 14 June 2010 15:52:43 UTC