Re: RDF Interfaces - equality, entailment, problems

Ivan Herman wrote:
> Nathan,
> 
> it is not that bad:-)
> 
> Forget about datatype entailment and look at the very pragmatic SPARQL. The fundamental pattern matching operation, which is the basis of a SPARQL query, relies on equality as defined in RDF Concepts. For Literals that means what was said before, i.e., comparison on the lexical values. However, SPARQL also has FILTER, and the definition of an ?a = ?b in a FILTER is, formally, bound to XPath's func-numeric-equal; less formally, that means the comparison is based on value space equality.
> 
> In RDFLib, to take the example I know, a Literal is simply a subclass of the (built-in) Unicode class. Equality is a little bit more convoluted than Unicode equality, because it has to take care of language and datatype URI equality, too (so the implementation of RDFLib.Literal had to override the original methods), but the very fact that it is a subclass of Unicode emphasizes the fact that, well, _it is a string with goodies_. Additionally, though, the RDFLib Literal has a method called 'toPython' that converts the Literal into a native object in Python corresponding to its datatype, i.e., an integer, a float, a boolean, a date, etc. That is, as a user I can compare two literals through their values. (I do not find the name 'toPython', but that is a detail.)
> 
> Hence what I proposed last week. The Literal interface should store both the (original) lexical value and the value in value space. Equality is defined as in RDF Concepts. But if the user wants an application that looks at the 'converted' value, then he/she has to compare with the 'value'. Whether we follow the RDFLib approach and add a method instead of storing an attribute by default, I am not sure. Using a method delays the conversion until it is really needed.
> 
> Does this help?

Yes, although it's less the toNative functionality and more the 
fromNative functionality that I'm concerned about.
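
As a rough sketch of the distinction (every name below is hypothetical and not taken from the RDF Interfaces draft, and the language-tag comparison is simplified to an exact string match), the literal keeps its lexical form, equals() compares terms as in RDF Concepts, and the native value is only computed on request. The fromNative direction is where it gets awkward: a JavaScript number does not remember how it was written, so some lexical form has to be chosen for it. Here that is simply String(value), which is an assumption and not a canonical xsd:double form.

```javascript
// Hypothetical sketch only; not the RDF Interfaces API.
const XSD = "http://www.w3.org/2001/XMLSchema#";

class Literal {
  constructor(lexical, language, datatype) {
    this.lexical = lexical;            // original lexical form, kept untouched
    this.language = language || null;  // language tag, or null
    this.datatype = datatype || null;  // datatype IRI as a plain string, or null
  }

  // Term equality in the spirit of RDF Concepts: lexical form,
  // language tag and datatype IRI must all match.
  equals(other) {
    return other instanceof Literal &&
      this.lexical === other.lexical &&
      this.language === other.language &&
      this.datatype === other.datatype;
  }

  // toNative: conversion is delayed until somebody asks for it.
  toNative() {
    switch (this.datatype) {
      case XSD + "double":
      case XSD + "integer":
        return Number(this.lexical);
      case XSD + "boolean":
        return this.lexical === "true" || this.lexical === "1";
      default:
        return this.lexical;
    }
  }
}

// fromNative: the original spelling (100, 1e2, +1e2, ...) is already gone,
// so one lexical form is picked; String(value) is an assumed choice.
function fromNative(value, datatype) {
  return new Literal(String(value), null, datatype);
}

const parsed = new Literal("1e2", null, XSD + "double"); // e.g. read from Turtle
const created = fromNative(+1e2, XSD + "double");        // lexical form is "100"

console.log(parsed.equals(created));                     // false: different terms
console.log(parsed.toNative() === created.toNative());   // true: same value
```

So with this shape the graph would keep "1e2" and "100" as distinct terms, and anybody who wants value comparison has to go through toNative() explicitly.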

Let us try it and see how it looks; I will reply in due course with code 
and spec.

Best,

Nathan

> Ivan
> 
> P.S. (The implementation of the 'toPython' method in RDFLib is a bit sloppy, though:-( When I implemented the OWL 2 RL layer on top of it and implemented the corresponding datatype entailment, I had to re-write most of those to improve it, essentially to detect errors. But that is another matter...)
> 
> On May 1, 2011, at 16:36 , Nathan wrote:
> 
>> Ivan Herman wrote:
>>> On 24 Apr 2011, at 19:33, Nathan <nathan@webr3.org> wrote:
>>>> Shane McCarron wrote:
>>>>> On 4/24/2011 8:42 AM, Nathan wrote:
>>>>>> Also, just what do we do about literals people are creating? for example:
>>>>>>
>>>>>> createLiteral(100, "xsd:double");
>>>>>> createLiteral(10*10, "xsd:double");
>>>>>> createLiteral(1e2, "xsd:double");
>>>>>> createLiteral(+1e2, "xsd:double");
>>>>>> createLiteral(+100, "xsd:double");
>>>>>>
>>>>>> All of those values are of the type (number) in javascript and have the same value "100" with no access to the original form.
>>>>> To my mind all those are the same.  There is nothing we *can* do.  If you want to put in a note to that effect, it might be reasonable.
>>>> Yes they are all the same, so I guess I'm saying that it feels a little strange to have:
>>>>
>>>> createLiteral(100, "xsd:double").equals( createLiteral(+1e2, "xsd:double") ) === TRUE
>>>>
>>>> whilst if the original source was say turtle, then they would not be considered equal, seems like unexpected functionality to me.
>>>>
>>>> Back to reality, just what do we write in the RDF API specification?
>>>>
>>>> - keep it as is, which appears to work, afaict - compare value if you know the datatype, else compare lexical form
>>>>
>>>> - change to read something like "equality is defined by RDF <link> here"
>>> This is the same discussion as before... We define an API to RDF. We should not define a different form of equality; instead, refer to the relevant RDF spec.
>> Hi Ivan,
>>
>> I've been implementing the RDF Interfaces over the last couple of days, and have to admit that I'm struggling with the lexical-equality.
>>
>> Essentially it's increasingly looking like we'll need to drop all the typed literal conversion, and constrain the definition of createLiteral to be:
>>
>> createLiteral(DOMString value, DOMString language, NamedNode datatype)
>>
>> Which is very sad. Before this equality issue came up the API was quite nice to use, you could do createLiteral(10.3) and that would automatically create the appropriately typed Literal.
>>
>> A key problem is that if a graph is created from a serialization which contains the following:
>>
>>  <a> <b> "100"^^xsd:double .
>>  <a> <b> "1E2"^^xsd:double .
>>  <a> <b> "1e2"^^xsd:double .
>>  <a> <b> "+100"^^xsd:double .
>>  <a> <b> "+1E2"^^xsd:double .
>>  <a> <b> "+1e2"^^xsd:double .
>>
>> and somebody creates a Literal with the code createLiteral(100, "xsd:double"), then:
>>
>> a) there is no lexical representation to compare against for equality
>>
>> b) if equality is done on value space, then it is equal to all those terms already in the graph, and further, all those terms are considered equal by the API so the graph would only contain one triple.
>>
>> Which brings me right back to my initial conclusion above.
>>
>> Additionally, the same applies for every layer of API built on top of the RDF Interfaces as well, at no point could you work with language native types, because you just hit a or b again.
>>
>> This leads on to the topic of entailment from rdf-mt. There is of course D-entailment, but then we're getting into the realms of RDFS and interpretations, and this raises several other issues, such as inconsistencies, unknown datatypes, practical things like different entailment regimes from SPARQL and OWL, where to practically draw the line so things are in fact usable, requiring schema/ontology awareness when working with data, and so forth.
>>
>> Guidance? I'm at a bit of a loss here. The API really isn't very nice to work with when you don't have any language native types, but I don't see how we can just naively map common datatypes to value spaces without creating unexpected functionality at some level, and I'm unsure whether we can get into the full entailment, interpretation and semantics side of things in the API. I guess you can say I'm stuck. Help!
>>
>> Best,
>>
>> Nathan
> 
> 
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Sunday, 1 May 2011 17:59:13 UTC