Re: D-entailment and canonicalization from Birte Glimm on 2010-03-05 (public-rdf-dawg@w3.org from January to March 2010)

From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
Date: Fri, 5 Mar 2010 14:16:17 +0000
To: Axel Polleres <axel.polleres@deri.org>
Cc: Andy Seaborne <andy.seaborne@talis.com>, ivan@w3.org, public-rdf-dawg@w3.org
Message-ID: <492f2b0b1003050616qf92f7b2haddfe67c743e6e5f@mail.gmail.com>
On 5 March 2010 13:33, Axel Polleres <axel.polleres@deri.org> wrote:
>
> On 5 Mar 2010, at 11:27, Birte Glimm wrote:
>
>> Good question indeed. My feeling is that it is not an entailment
>> regime, but rather another source of infinite answers from datatype
>> aware systems.
>
> Datatype aware systems as Andy sketches them will not give infinite answers,
> but just return the canonical representations for each datatype, or am I missing something?

So what exactly does datatype aware mean? Do such systems implement
D-entailment for some datatype map? Otherwise I don't really see a
connection to entailment regimes. For D-entailment the only
interesting thing is that all lexical representations of a literal
value are entailed in my understanding, i.e., if I have
ex:a ex:dp "1.00"^^xsd:decimal .
in the scoping graph and a system that does D-entailment with
xsd:decimal in the supported datatype map, then a BGP
{ ex:a ex:dp ?dv }
has the solutions
?dv/"1"^^xsd:decimal
?dv/"1.0"^^xsd:decimal
?dv/"1.00"^^xsd:decimal
?dv/"1.000"^^xsd:decimal
etc.
if we just take the defined entailment relation without any
conditions. These answers are what I call possible answers.
The restrictions we currently have then mean that systems return only
what they have in the SG after parsing, which could be the original
form or their canonicalised form if the system applies
canonicalisation.
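
To make the distinction concrete, here is a minimal Python sketch of the two notions above: the unrestricted "possible answers" (every lexical form denoting the same value) versus the answers that survive the restriction to forms present in the SG. All function names are hypothetical and chosen for illustration only. The canonical form assumed here is the XSD 1.0 style with a mandatory fractional digit ("1.0"), which matches the examples in this thread; a real system may choose differently.

```python
import re
from decimal import Decimal

# Lexical space of xsd:decimal: optional sign, digits, optional fraction.
_DECIMAL_LEXICAL = re.compile(r"[+-]?([0-9]+(\.[0-9]*)?|\.[0-9]+)")


def canonical_decimal(lexical: str) -> str:
    """Map an xsd:decimal lexical form to an assumed canonical form ("1.0" style)."""
    if not _DECIMAL_LEXICAL.fullmatch(lexical):
        raise ValueError(f"not an xsd:decimal lexical form: {lexical!r}")
    text = format(Decimal(lexical), "f")  # plain notation, no exponent
    if "." not in text:
        text += ".0"
    int_part, frac_part = text.split(".")
    sign = ""
    if int_part.startswith("-"):
        sign, int_part = "-", int_part[1:]
    int_part = int_part.lstrip("0") or "0"
    frac_part = frac_part.rstrip("0") or "0"
    if int_part == "0" and frac_part == "0":
        sign = ""  # zero carries no sign in the canonical form
    return f"{sign}{int_part}.{frac_part}"


def possible_answers(asserted: str, candidates: list[str]) -> list[str]:
    """All candidate lexical forms that denote the same value as `asserted`."""
    value = Decimal(asserted)
    return [c for c in candidates if Decimal(c) == value]


def returned_answers(scoping_graph_forms: set[str], candidates: list[str]) -> list[str]:
    """Only the candidate forms that literally occur in the scoping graph."""
    return [c for c in candidates if c in scoping_graph_forms]


if __name__ == "__main__":
    candidates = ["1", "1.0", "1.00", "1.000"]
    # Unrestricted D-entailment: every equivalent form is a possible answer.
    print(possible_answers("1.00", candidates))
    # System keeping the original form in the SG.
    print(returned_answers({"1.00"}, candidates))
    # System that canonicalises on load, so the SG holds the canonical form.
    print(returned_answers({canonical_decimal("1.00")}, candidates))
```

The candidate list here is finite only for the sake of the demo; under the unrestricted entailment relation the set of equivalent lexical forms is infinite, which is why the restriction to the SG is needed.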

Birte



> best,
> Axel
>
>> For OWL Direct Semantics, this is covered since there we only return
>> asserted data values modulo sub-property entailment. This assumes that
>> the original lexical form is returned. Internally we canonicalise
>> everything (otherwise you cannot do reasoning with facets etc
>> correctly), but we keep the original lexical form anyway to not
>> confuse users by silently changing their data values even if it is to
>> something equivalent.
>> For D-Entailment/OWL RDF-Based Semantics, I am not quite sure what the
>> best solution would be. At the moment, I restrict bindings to values
>> that occur in the skolemised scoping graph. This guarantees
>> finiteness. What is not clear to me is whether that restricts systems
>> so that they have to return the original lexical form or whether the
>> scoping graph is whatever systems build from the input when they parse
>> it. My feeling is that systems can do what they prefer since in any
>> case the result is graph equivalent to the active graph and even for
>> the active graph I am not sure whether anything defines what the
>> active graph actually contains after parsing a document with such
>> datatype triples. E.g., if the input document had the triple
>> ex:a ex:dp "1.00"^^xsd:decimal .
>> then after loading, the active graph could contain
>> ex:a ex:dp "1.0"^^xsd:decimal .
>> I guess. Is that right?
>>
>> The question is do we want to enforce something more specific?
>>
>> Birte
>>
>>
>> On 5 March 2010 10:07, Andy Seaborne <andy.seaborne@talis.com> wrote:
>> > The SPARQL query really starts where the data is already loaded (FROM etc
>> > notwithstanding) so the data as it is loaded may be prepared in some
>> > fashion outside the SPARQL spec.
>> >
>> > When we discussed this last time, we recognized that systems already did
>> > work on loading RDF and did not introduce any text to obstruct them.
>> >
>> > As to whether it's an "entailment regime", if it is then it's finite and
>> > different for each system.  It is done when data is loaded not queried
>> > (think running rules over the data).
>> >
>> >
>> > For example, TDB canonicalizes integers between -2^55 and +2^55-1 but not
>> > outside that range (they have their original lexical form stored). Decimals
>> > have 48 bits of precision and 8 bits of scale and again if outside that
>> > range, the normal node storage is used and the lexical form is not
>> > canonicalised.
>> >
>> > Derived integer types are promoted to integer.
>> >
>> > (This in TDB is all "currently" and planned to change a little).
>> >
>> >        Andy
>> >
>> > On 05/03/2010 9:29 AM, Polleres, Axel wrote:
>> >>
>> >> Thanks Andy, my (maybe naïve) question would then be: is behavior 2
>> >> warranted "as is" by the current spec, or is "canonical datatype
>> >> representation" actually another (commonly used already) "entailment regime"
>> >> that should be defined as such?
>> >>
>> >> Best,
>> >> Axel
>> >>
>> >> ----- Original Message -----
>> >> From: Andy Seaborne<afs@talisplatform.com>
>> >> To: Polleres, Axel
>> >> Cc: ivan@w3.org<ivan@w3.org>;
>> >> public-rdf-dawg@w3.org<public-rdf-dawg@w3.org>
>> >> Sent: Fri Mar 05 09:06:09 2010
>> >> Subject: D-enatilment and canonicalization
>> >>
>> >>
>> >>
>> >> On 05/03/2010 8:45 AM, Polleres, Axel wrote:
>> >>>
>> >>> In my opinion this is a question concerning all entailments from
>> >>> D-entailment "upwards".
>> >>>
>> >>> ----- Original Message -----
>> >>> From: Ivan Herman<ivan@w3.org>
>> >>> To: Polleres, Axel
>> >>> Cc: Birte Glimm<birte.glimm@comlab.ox.ac.uk>; SPARQL Working
>> >>> Group<public-rdf-dawg@w3.org>
>> >>> Sent: Fri Mar 05 08:08:10 2010
>> >>> Subject: Re: [TF-ENT] Condition C2 modifications
>> >>>
>> >>>
>> >>>
>> >>> On 2010-3-5 24:36 , Axel Polleres wrote:
>> >>>>
>> >>>> No objections, but one additional side question:
>> >>>>
>> >>>> Do we have an issue with systems that use canonical forms of datatype
>> >>>> literals internally?
>> >>>>
>> >>>> Say you have:
>> >>>>
>> >>>>   :s :p "1.000"^^xsd:decimal
>> >>>>
>> >>>> is a Datatype-aware system really supposed to return
>> >>>>
>> >>>>   "1.000"^^xsd:decimal
>> >>>>
>> >>>> on { :s :p ?O}
>> >>>>
>> >>>> but not its internal representation?
>> >>>>
>> >>>>
>> >>>
>> >>> This is a good question, I do not know the answer:-(, but is this an
>> >>> entailment specific question? I would expect that to be a question for
>> >>> SPARQL as a whole...
>> >>>
>> >>> Cheers
>> >>>
>> >>> Ivan
>> >>
>> >> There are 2 cases for value aware systems and there are examples of
>> >> systems in each case:
>> >>
>> >> 1/ Data "1.00"^^xsd:decimal,
>> >>     stores "1.00"^^xsd:decimal,
>> >>     matches "1.0"^^xsd:decimal,
>> >>     matches "1.00"^^xsd:decimal,
>> >>     returns "1.00"^^xsd:decimal
>> >>
>> >> i.e. the original term is stored and returned
>> >>
>> >> 2/ Data "1.00"^^xsd:decimal,
>> >>     stores "1.0"^^xsd:decimal,
>> >>     matches "1.0"^^xsd:decimal
>> >>     matches "1.00"^^xsd:decimal (canonicalization applied)
>> >>     returns "1.0"^^xsd:decimal
>> >>
>> >> i.e. the canonicalized term is stored and returned
>> >>
>> >>
>> >> See also "1"^^xsd:byte and "1"^^xsd:integer
>> >>
>> >> I avoided describing them as D-entailment because that really is a set
>> >> of possibilities depending on the datatypes supported and ranges of
>> >> values within the datatypes.  They don't necessarily force D-consistency.
>> >>
>> >>        Andy
>> >>
>> >> Examples:
>> >> 1 - Jena memory model
>> >> 2 - Jena TDB
>> >>
>> >
>> >
>>
>>
>>
>> --
>> Dr. Birte Glimm, Room 306
>> Computing Laboratory
>> Parks Road
>> Oxford
>> OX1 3QD
>> United Kingdom
>> +44 (0)1865 283529
>>
>
>



-- 
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529
Received on Friday, 5 March 2010 14:16:49 UTC