Re: OWL-Time - issue with SPARQL endpoints lacking owl reasoner

Simon,


I have several remarks on your message, concerning:
  1) SPARQL engines supporting OWL
  2) numeric values in XSD, RDF and OWL
  3) precision and scientific notation (also related to your following email)

1) SPARQL engines supporting OWL
================================

Most SPARQL engines implement the standard "SPARQL 1.1 Query Language" 
or a subset of it. This standard does not cover reasoning: queries are 
matched against the graph as stated. In fact, you must not do reasoning 
at query time if you want to conform to it, otherwise you would return 
results that the standard considers incorrect. If you want to support 
reasoning as part of the engine, you have to implement a different 
standard: "SPARQL 1.1 Entailment Regimes". Few SPARQL engines implement 
it. Even those that do may not support the OWL entailment regimes, 
because an engine can restrict itself to RDFS, for instance.
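
To make the difference concrete, here is a toy sketch in Python. It is 
not how any real engine is implemented; the graph, the ex: names and 
both function names are made up for illustration, and the second 
function covers only the rdfs:subClassOf part of RDFS entailment.

```python
# Toy graph of (subject, predicate, object) triples. All names are invented.
TRIPLES = {
    ("ex:Dog", "rdfs:subClassOf", "ex:Animal"),
    ("ex:rex", "rdf:type", "ex:Dog"),
}


def instances_simple(graph, cls):
    """Match only the rdf:type triples that are literally in the graph."""
    return {s for (s, p, o) in graph if p == "rdf:type" and o == cls}


def instances_rdfs(graph, cls):
    """Also follow rdfs:subClassOf transitively before matching rdf:type."""
    wanted = {cls}
    changed = True
    while changed:
        changed = False
        for s, p, o in graph:
            if p == "rdfs:subClassOf" and o in wanted and s not in wanted:
                wanted.add(s)
                changed = True
    return {s for (s, p, o) in graph if p == "rdf:type" and o in wanted}


print(instances_simple(TRIPLES, "ex:Animal"))  # nothing is asserted directly
print(instances_rdfs(TRIPLES, "ex:Animal"))    # ex:rex is found via ex:Dog
```

The same query gives two different answers, which is why an engine has 
to say which regime it implements.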


2) numeric values in XSD, RDF and OWL
=====================================

There are weird subtleties in the XML Schema datatypes. First, although 
most xsd:float literals and xsd:double literals have to be interpreted 
as numbers (like all xsd:decimal literals), the value spaces of these 
two datatypes are considered mutually disjoint and disjoint from that of 
xsd:decimal. Second, the value spaces of xsd:float and xsd:double 
contain values that are not numbers, namely "NaN"^^xsd:float and 
"NaN"^^xsd:double. So, ironically, by trying to encompass all forms of 
numbers, you created a datatype "time:Number" that contains things that 
are not numbers.
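
A rough Python sketch of both subtleties, under loud assumptions: the 
two regular expressions only approximate the XSD lexical forms, values 
are tagged with their datatype to mimic disjoint value spaces, and all 
function names are mine, not part of any library.

```python
import math
import re
from decimal import Decimal

# Approximate lexical forms (assumption: close to, but not exactly, XSD).
DECIMAL_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")
FLOAT_RE = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?|-?INF|NaN")


def in_time_number(lexical, datatype):
    """True if the literal belongs to the union of the three datatypes."""
    if datatype == "xsd:decimal":
        return DECIMAL_RE.fullmatch(lexical) is not None
    if datatype in ("xsd:float", "xsd:double"):
        return FLOAT_RE.fullmatch(lexical) is not None
    return False


def value_of(lexical, datatype):
    """Tag the value with its datatype so the value spaces stay disjoint."""
    if datatype == "xsd:decimal":
        return (datatype, Decimal(lexical))
    return (datatype, float(lexical))


def denotes_a_number(lexical, datatype):
    """True if the literal is in the union and its value is not NaN."""
    if not in_time_number(lexical, datatype):
        return False
    value = value_of(lexical, datatype)[1]
    return not (isinstance(value, float) and math.isnan(value))


# Same lexical form, different value spaces: not the same value.
print(value_of("1.0", "xsd:float") == value_of("1.0", "xsd:decimal"))
# NaN is a member of the union, yet it is not a number.
print(in_time_number("NaN", "xsd:double"), denotes_a_number("NaN", "xsd:double"))
```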

With OWL, you can create custom datatypes by combining supported 
datatypes (list given in Sec.4 of the "OWL 2 Structural Specification 
and Functional Style Syntax") with unionOf, intersectionOf, oneOf, 
complementOf and datatype restrictions. However, it is never possible to 
use a custom datatype IRI as the datatype IRI of a literal. That is, the 
following is invalid: "1.2e7"^^time:Number (according to the OWL 2 SS&FSS).
Consequently, the reasoning you can do with a unionOf datatype is very 
limited.

TopBraid Composer probably does not enforce the OWL 2 SS&FSS and 
accepts any RDF graph. What reasoning it does is unclear. Perhaps, when 
you do this in TopBraid:

  FILTER( "123"^^ex:notDefinedDatatype < xsd:decimal(1234) )

it converts everything to strings and compares them lexicographically?
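
For comparison, here is a small Python sketch of the two behaviours. 
The first function follows my reading of SPARQL 1.1, where "<" on an 
operand with an unrecognised datatype is a type error and the FILTER 
then drops the solution. The second is only my guess about a string 
fallback, not a documented TopBraid behaviour. All names are 
hypothetical, and NaN, INF and float rounding are ignored.

```python
from decimal import Decimal

# Datatypes the toy engine knows how to compare numerically (simplified).
NUMERIC = {"xsd:integer", "xsd:decimal", "xsd:float", "xsd:double"}


class SparqlTypeError(Exception):
    """Stands in for a SPARQL expression type error."""


def less_than_strict(a, b):
    """Compare (lexical, datatype) pairs; unknown datatypes are errors."""
    (lex_a, dt_a), (lex_b, dt_b) = a, b
    if dt_a in NUMERIC and dt_b in NUMERIC:
        return Decimal(lex_a) < Decimal(lex_b)
    raise SparqlTypeError(f"cannot compare {dt_a} with {dt_b}")


def less_than_lexicographic(a, b):
    """Guessed fallback: compare the lexical forms as plain strings."""
    return a[0] < b[0]


def filter_keeps(compare, a, b):
    """A FILTER keeps a solution only if the test is true without error."""
    try:
        return compare(a, b)
    except SparqlTypeError:
        return False


odd = ("123", "ex:notDefinedDatatype")
print(filter_keeps(less_than_strict, odd, ("1234", "xsd:decimal")))
print(filter_keeps(less_than_lexicographic, odd, ("1234", "xsd:decimal")))
# String order is not numeric order: "9" sorts after "10".
print(less_than_lexicographic(("9", "ex:notDefinedDatatype"), ("10", "xsd:decimal")))
```

If the string guess is right, the expected results on a test dataset 
could be partly a coincidence of the values chosen.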


3) Precision and scientific notation
====================================

Scientific notation usually has two purposes (as far as I know):
  a) provide a concise notation for big or small numbers
  b) (sometimes) provide an implicit notion of precision

In order to support a), your solution is to allow xsd:float and 
xsd:double. That makes sense, but I would argue that it may not be 
necessary. IMHO, we should not assume that people are going to write 
RDF files by hand, or read them by eye. They will either be programmers 
or end users.
  - Programmers load RDF into memory and save RDF to files with 
programming functions. They don't have to look at the literals in their 
stored form and don't have to write them explicitly.
  - End users will input data values through interfaces that can allow 
things like 1.5e5 to be stored as an xsd:decimal, and that allow the 
users to see that a quantity stored as 45120084650320 is "45.12 
trillion" or "45,120,084,650,320" or some other user-friendly notation. 
Moreover, scientific notation could be displayed as 1.5×10^5 instead.
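
As a minimal sketch of such an interface layer in Python (the function 
names are hypothetical and only the standard decimal module is used): 
the input is accepted in scientific notation, stored in plain 
xsd:decimal form, and reformatted for display.

```python
from decimal import Decimal


def to_xsd_decimal(user_input):
    """Turn input such as '1.5e5' into a lexical form without exponent."""
    return format(Decimal(user_input), "f")


def with_separators(lexical):
    """Show a stored integer value with thousands separators."""
    return f"{int(lexical):,}"


def in_trillions(lexical):
    """Show a stored value as a multiple of 10^12, to two decimal places."""
    return f"{Decimal(lexical) / Decimal(10**12):.2f} trillion"


print(to_xsd_decimal("1.5e5"))            # 150000
print(with_separators("45120084650320"))  # 45,120,084,650,320
print(in_trillions("45120084650320"))     # 45.12 trillion
```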

In order to support b), xsd:float and xsd:double are not sufficient (in 
fact, they are useless for that). A notation like 4e17 may lead some 
scientists to believe that this is an approximation ("roughly" 400 
quadrillion) but this is not how the XSD standard works. 
"4e17"^^xsd:double is *exactly* the value written 
400,000,000,000,000,000 with English-style digit grouping, and 
"4e17"^^xsd:float is likewise one exact value (the closest one that 
single precision can represent), not a range. In order to support 
precision, you would need an extra value (that could be stored as 
xsd:decimal to allow arbitrary precision) such that a pair 
(400000000000000000,100000000000000000) is understood to be "4*10^17 
±10^17".
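
A short Python sketch of both points, with some assumptions: Python's 
float stands in for xsd:double, struct is used to round to a 32-bit 
float as a stand-in for xsd:float, and the Measured class is a 
hypothetical shape for the (value, uncertainty) pair, not something 
defined in OWL-Time. One caveat it brings out: a 32-bit float cannot 
hold 4*10^17 exactly, so the xsd:float literal denotes the nearest 
representable value, which is still one exact value and not a range.

```python
import struct
from dataclasses import dataclass
from decimal import Decimal


def as_double(lexical):
    """Value of the lexical form as a 64-bit IEEE 754 double."""
    return float(lexical)


def as_float32(lexical):
    """Value after rounding to a 32-bit IEEE 754 float (via a double)."""
    return struct.unpack("<f", struct.pack("<f", float(lexical)))[0]


@dataclass(frozen=True)
class Measured:
    """Hypothetical (value, uncertainty) pair, both exact decimals."""
    value: Decimal
    uncertainty: Decimal

    def bounds(self):
        """Lower and upper ends of the interval value ± uncertainty."""
        return (self.value - self.uncertainty, self.value + self.uncertainty)


exact = 4 * 10**17
print(int(as_double("4e17")) == exact)   # the double holds 4*10^17 exactly
print(int(as_float32("4e17")) == exact)  # the 32-bit float does not

age = Measured(Decimal("400000000000000000"), Decimal("100000000000000000"))
print(age.bounds())
```

Neither literal carries any notion of "give or take"; only the 
explicit second value does.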

To conclude, my position is that xsd:float and xsd:double are not really 
needed here and I support getting rid of them, but I would not fight for it.


Hope this helps.
--AZ


On 12/04/2017 05:24, Simon.Cox@csiro.au wrote:
> In the new OWL-Time there is a class time:TimePosition which is expected to have one of
>
>  time:numericPosition - being a number on a time-line, or
>  time:nominalPosition - being a named era from an ordinal reference system
>
> alongside a
>
>  time:hasTRS - which indicates the reference system that the value relates to.
>
> time:numericPosition is intended to support things like Unix time (usually an integer or decimal) or geologic or cosmologic time, which could be a very large number. So we want the option of either xsd:decimal (which provides arbitrary precision) and xsd:double (scientific notation). So I created an OWL2 union datatype defined as follows (Turtle notation), and used it for the rdfs:range of time:numericPosition.
>
> time:Number
>    rdf:type rdfs:Datatype ;
>    rdfs:comment "Generalized number"@en ;
>    rdfs:comment "Note: integer is a specialization of decimal"@en ;
>    rdfs:label "Number"@en ;
>    owl:equivalentClass [
>        rdf:type rdfs:Datatype ;
>        owl:unionOf (
>            xsd:double
>            xsd:float
>            xsd:decimal
>          ) ;
>      ] ;
> .
>
> [Note that OWL2 has types owl:real (but no lexical representation) and owl:rational (use xsd:double for the lexical representation), neither of which meets requirements. ]
>
> A colleague has looked at a test dataset in which I had mixed value with types xsd:float and time:Number which should be OK. We ran SPARQL queries including FILTER expressions like
>
>                 FILTER ( ?targetAge > xsd:decimal(?end) )
>                 FILTER ( ?targetAge < xsd:decimal(?begin) )
>
> My test environment (TopBraid Composer) produced the expected results, but Doug found that for a variety of SPARQL engines that are not OWL2 aware, while the > and < operator succeeded when an xsd:decimal was compared with a xsd:float, they failed when xsd:decimal was compared to a time:Number.
>
> Are we being too clever? How to satisfy the requirement?
>
> Simon
>
> -----Original Message-----
> From: Douglas Fils [mailto:dfils@oceanleadership.org]
> Sent: Wednesday, 12 April, 2017 03:56
> To: Cox, Simon (L&W, Clayton) <Simon.Cox@csiro.au>
> Subject: OWL Time in SPARQL endpoints lacking owl reasoner
>
> Simon,
>    We got a response at https://github.com/blazegraph/database/issues/59
>
>     Looks like lacking a reasoner Blazegraph isn’t going to connect time:Number to a concept it can do calculations with.   Adam’s Virtuoso doesn’t even seem to understand XPATH operators so I couldn’t get to the point of seeing if it does or not.   It looks from the docs and net Virtuoso does do some OWL reasoning but it’s not complete.  So whether is has the coverage of operations needed I do not know.
>
>     I worry that TopBraid is implementing elements that are not perhaps so common in the various SPARQL end points we are seeing in use.  I confess, I don’t have the sample size to back up that statement.
>
>     If true though, I worry this could limit the use of a graph implementing OWL Time in the wild.   I’ll leave it at that.  You have far more experience and understanding of this than I.  I would be very interested in your views and thoughts on this.
>
> Thanks
> Doug
>
>

Received on Wednesday, 12 April 2017 07:33:34 UTC