Re: MuSim Ontology problems (was Re: Share, Like Ontology)

Hello!


>
> owl:Class is defined as a subclass of rdfs:Class *in the OWL
> specifications*.  The RDF/RDFS specification does not say anything about
> owl:Class.  So, from a pure RDFS perspective, owl:Class has as much meaning
> as, e.g., xyz:abc.  The fact that someone defines *somewhere* that xyz:abc
> is a subclass of rdfs:Class is irrelevant from a pure RDFS system point of
> view.  As I said in my example, a SPARQL query would not be able to retrieve
> the OWL classes or properties that are not directly asserted as RDFS classes
> or properties (unless the SPARQL engine implements part of the OWL spec,
> which is rarely the case).
>
> Now, that's a small issue but there is no disadvantage of putting the
> additional types, as far as I know.

Frankly, I don't think this is something we need to change. Yes, the
rdfs:subClassOf statement lives in the OWL specification, but so what?
If you follow your nose from owl:Class, you'll end up in the RDFS
world, so that's OK.

>
>>>
> [...skip...]
>
>>>> =====
>>>> 4) musim:distance and musim:weight
>>>> =====
>>>> I notice that you are defining two datatype properties with multiple
>>>> range restrictions:
>>>>
>>>> :distance a owl:DatatypeProperty;
>>>> rdfs:range xsd:float;
>>>> rdfs:range xsd:int;
>>>> rdfs:range xsd:double .
>>>>
>>>> and
>>>>
>>>> :weight a owl:DatatypeProperty;
>>>> rdfs:range xsd:float;
>>>> rdfs:range xsd:int;
>>>> rdfs:range xsd:double .
>>>>
>>>> I'm quite sure that it is not what you intend to mean and I imagine that
>>>> you would like to say that the weight or the distance can be either a
>>>> float, a double or an int. Here you actually specify that the distance
>>>> and the weight of something is necessarily a float, an int and a double
>>>> at the same time.
>>>>
>>>> Furthermore, the OWL spec [1] says that:
>>>>
>>>> """As specified in XML Schema [XML Schema Datatypes], the value spaces
>>>> of xsd:double, xsd:float, and xsd:decimal are pairwise disjoint."""
>>>>
>>>> This implies that :distance and :weight are in fact empty relations
>>>> since it is impossible to have a value which is both a float and a
>>>> double. Using :distance or :weight in the predicate position of any
>>>> triple would make the knowledge base inconsistent.
>>>>
>>>> If you want to say that a distance or weight has to be in *one of* the
>>>> three datatypes, you should rather say:
>>>>
>>>> :weight a owl:DatatypeProperty, rdf:Property;
>>>> rdfs:range [ owl:unionOf ( xsd:float xsd:int xsd:double ) ] .

Yes, you're right - it should be a union, not an intersection.
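
To illustrate the difference between the two readings, here is a minimal
Python sketch. It is only a toy model: it identifies each value space with
its datatype IRI, which is a simplifying assumption that works here only
because the value spaces in question are disjoint. The function names are
made up for this illustration.

```python
# Toy model: a typed literal carries exactly one datatype, and the value
# spaces involved are treated as pairwise disjoint.
XSD = "http://www.w3.org/2001/XMLSchema#"
RANGES = {XSD + "float", XSD + "int", XSD + "double"}

def in_intersection(datatype, ranges=RANGES):
    """Several rdfs:range statements: the value must be in every range."""
    return all(datatype == r for r in ranges)

def in_union(datatype, ranges=RANGES):
    """A single owl:unionOf range: the value must be in at least one range."""
    return any(datatype == r for r in ranges)

float_literal = XSD + "float"
print(in_intersection(float_literal))  # False
print(in_union(float_literal))         # True
```

Under the first reading no literal can ever satisfy the range, which is
why the property ends up empty.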

>>>>
>>>> However, I feel unsatisfied by this because it is slightly
>>>> overconstraining. Why not allow xsd:decimal or even owl:real as well?
>>>> Or untyped literals such as:
>>>>
>>>> ex:a :distance "1879.42" .
>>>>
>>>> I imagine that the value for such a distance will be computed
>>>> automatically and the programme which does it will ensure that it is
>>>> indeed a number.
>>>>
>>>
>>> another rookie mistake i'm afraid! i think leaving the rdfs:range
>>> unspecified perhaps makes the most sense - yes it is a common
>>> occurrence to get a "NaN" distance in audio signal based similarity and
>>> other similarity calculations.
>>
>> Here the issue is that the programme, which computes the number, knows
>> of course that it is a number, but the reason to define it at least as a
>> kind of number is for reusing these values.
>> I'm somehow satisfied with the restriction rdfs:range [ owl:unionOf (
>> xsd:float xsd:int xsd:double ) ], because it is a well-defined range,
>> which expresses that the values are numbers. I can't really imagine other
>> values that might be used here. The XSD namespace is a kind of best
>> practice for defining the datatypes.
>
> Reusing the value would be straightforward. In practice, the value will be
> computed in such a way that it is a number (or maybe "NaN", if relevant) and
> will most likely be given a datatype. In the end, the data will contain
> something like:
>
> ex:sim :distance "389.009"^^xsd:float .
>
> There is no problem reusing this value, regardless of the range definition.
>  However, *if* the range constraint is maintained as you suggest, the
> following triples would be each individually inconsistent wrt the ontology:
>
> ex:sim :distance "389.009" .
> ex:sim :distance "NaN" .
> ex:sim :distance "389.009"^^xsd:decimal .
> ex:sim :distance "389.009"^^owl:real .
> ex:sim :distance "0fb7"^^xsd:hexBinary .
> ex:sim :distance "6z2b76aa"^^xsd:base64Binary .
>
> Yet, it's easy to make a programme that deals equally well with all these
> values, whereas it is difficult to ensure that everybody will use the three
> datatypes mentioned in the range assertion.
>
> In the absence of range assertion, such values as:
>
> ex:sim :distance "very similar" .
> ex:sim :distance "+++"^^xsd:string .
>
> would be consistent wrt the ontology but they can be simply ignored by any
> programme using these values. In the presence of the range assertion, these
> triples would be inconsistent wrt the ontology, but this does not prevent
> anybody from writing them, so they would have to be dealt with somehow too.
>

This discussion reminds me a bit of
http://www.w3.org/DesignIssues/InterpretationProperties.html.
I don't think there's anything wrong with either approach. It is
perfectly ok to put such a range constraint in the ontology. When
building an aggregator of MuSim data, it is much easier to know what
to expect (and what to eventually reject - in case it is inconsistent
wrt the ontology) rather than committing to support all possible
datatypes! On the other hand, it's fine to leave it open - you gain in
flexibility.
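
As a rough sketch of the two policies, here is how an aggregator might
treat incoming values in each case. This is only an illustration under
stated assumptions: literals are modelled as (lexical form, datatype)
pairs with None standing for a plain literal, and partition and
parse_leniently are hypothetical helper names, not part of any existing
MuSim tooling. The strict check looks only at the datatype IRI and does
not validate the lexical form.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"
ACCEPTED = {XSD + "float", XSD + "int", XSD + "double"}

def partition(literals, accepted=ACCEPTED):
    """Range-constrained policy: keep only literals with an accepted datatype.

    Each literal is a (lexical form, datatype) pair; a datatype of None
    stands for a plain (untyped) literal.
    """
    kept, rejected = [], []
    for lexical, datatype in literals:
        target = kept if datatype in accepted else rejected
        target.append((lexical, datatype))
    return kept, rejected

def parse_leniently(literals):
    """Open-range policy: keep anything whose lexical form parses as a number."""
    values = []
    for lexical, _datatype in literals:
        try:
            values.append(float(lexical))
        except ValueError:
            pass  # e.g. "very similar" is simply ignored
    return values

data = [
    ("389.009", XSD + "float"),
    ("389.009", None),
    ("389.009", XSD + "decimal"),
    ("very similar", None),
]
kept, rejected = partition(data)
numbers = parse_leniently(data)
```

With the range constraint, only the xsd:float value is kept and the other
three are rejected; without it, the three numeric values are all usable
and only the free-text one is dropped.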

Kind regards,
y

Received on Tuesday, 15 June 2010 16:36:10 UTC