Re: RDF API updates from Ivan Herman on 2011-04-24 (public-rdfa-wg@w3.org from April 2011)

From: Ivan Herman <ivan@w3.org>
Date: Sun, 24 Apr 2011 09:46:00 +0200
To: nathan@webr3.org
Cc: RDFA Working Group <public-rdfa-wg@w3.org>
Message-Id: <5C70991C-3B6A-4552-9596-0D64D86368BF@w3.org>
On Apr 23, 2011, at 17:29 , Nathan wrote:

> Hi Ivan,
> 
> Thanks for the feedback! In-line from here:
> 
> Ivan Herman wrote:
>> Nathan,
>> Terrific! I only had a cursory look at things right now, I will go through the document in more detail but that may not be before the end of Easter. Some comments, though:
>> 1. A question that just came to me while reading the Literal interface definition (2.3.4). If you look at the definition of datatypes in the RDF Semantics document[1] it talks about value space and lexical space, about the lexical-to-value mapping, etc. I think using these notions in the text would make it clearer. For example, the current description does not make absolutely clear whether the 'value' attribute is the value of the specific literal in the datatype's value space (ie, the result of the value mapping) (which I think it is, but making it clearer might help).
> 
> Yes, the mappings to IDL and ECMAScript are all in the "Datatype Map:" in that section, along with the guidance "The native value of the literal, if the datatype of the literal is not known by the implementation the nodes value MUST be the lexical representation of the value."
> 
>> The term you use 'native value' is not really defined. We do not have to define these things ourselves, b.t.w., we can just refer to the existing RDF spec...
>> Actually... I wonder whether it makes sense for applications to store in the Literal object _both_ the lexical value of the literal and the value in the datatype's value space, making it somehow clear that equality is based on the equality in the value space... Just a thought.
> 
> Perhaps, unsure whether we should require them to do it though? Will look in to the text again and see where it can be clarified (with additional text, or by pointing to other specs).

In the meantime, I checked something. The RDF Concepts document says:

[[[
6.5.1 Literal Equality

Two literals are equal if and only if all of the following hold:

	• The strings of the two lexical forms compare equal, character by character.
	• Either both or neither have language tags.
	• The language tags, if any, compare equal.
	• Either both or neither have datatype URIs.
	• The two datatype URIs, if any, compare equal, character by character.
]]] 

(See [1]).

In other words, the two literals "0000123"^^xsd:int and "123"^^xsd:int are, formally, two different nodes in RDF. I am not even sure whether the RDF Datatype entailment would generate equality for these two, simply because there is no notion of derived equality there. In the case of OWL, and even the minimal RDFS+OWL stuff, ie, OWL RL, the situation is different: if one looks at Table 8 in [2], it says:

[[[
dt-diff		T(lt1, owl:differentFrom, lt2)	for all literals lt1 and lt2 with different data values
]]]

Interestingly, not all RDF environments follow this. In RDFLib, these two literals are considered as equal as Python objects...
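
To make the distinction concrete, here is a minimal TypeScript sketch of the two notions of equality. The LiteralTerm shape and both function names are assumptions made for illustration, not interfaces from the RDF API draft, and the value comparison only covers xsd:int.

```typescript
// Hypothetical literal shape for this sketch; not taken from the RDF API draft.
interface LiteralTerm {
  lexicalForm: string;
  language?: string; // language tag, if any (assumed already normalized)
  datatype?: string; // datatype URI, if any
}

const XSD_INT = "http://www.w3.org/2001/XMLSchema#int";

// Term equality following the five conditions of RDF Concepts 6.5.1.
// Comparing undefined with undefined covers "either both or neither".
function termEquals(a: LiteralTerm, b: LiteralTerm): boolean {
  return a.lexicalForm === b.lexicalForm
    && a.language === b.language
    && a.datatype === b.datatype;
}

// Value-space equality, restricted to xsd:int as a simplifying assumption.
function intValueEquals(a: LiteralTerm, b: LiteralTerm): boolean {
  if (a.datatype !== XSD_INT || b.datatype !== XSD_INT) return false;
  const intPattern = /^[+-]?\d+$/;
  if (!intPattern.test(a.lexicalForm) || !intPattern.test(b.lexicalForm)) return false;
  return parseInt(a.lexicalForm, 10) === parseInt(b.lexicalForm, 10);
}

const padded: LiteralTerm = { lexicalForm: "0000123", datatype: XSD_INT };
const plain: LiteralTerm = { lexicalForm: "123", datatype: XSD_INT };

console.log(termEquals(padded, plain));     // false: two different RDF terms
console.log(intValueEquals(padded, plain)); // true: same value in the value space
```

An environment that treats the two literals as equal is, in effect, exposing something like the second function as its default comparison.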

So... where does it leave us? I believe that

1. The Literal lexical value must be stored alongside the lexical-to-value-space mapping results
2. Formally, the equality of literals should follow 6.5.1 above. But the API can be silent on that and simply refer to the right section of the RDF Concepts document 
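
A rough sketch of point 1, in TypeScript: the literal keeps its lexical form and, next to it, the result of the lexical-to-value mapping, falling back to the lexical form when the datatype is unknown. The class name, the datatypeMap table and the example datatype URI under example.org are all assumptions for illustration; the draft's own Datatype Map is not reproduced here.

```typescript
const XSD = "http://www.w3.org/2001/XMLSchema#";

// Hypothetical datatype map: datatype URI -> lexical-to-value mapping.
const datatypeMap = new Map<string, (lexical: string) => unknown>([
  [XSD + "int", (lexical: string) => parseInt(lexical, 10)],
  [XSD + "boolean", (lexical: string) => lexical === "true" || lexical === "1"],
]);

class SketchLiteral {
  readonly lexicalForm: string;      // always kept exactly as written
  readonly datatype: string | null;
  readonly value: unknown;           // result of the mapping, or the lexical form

  constructor(lexicalForm: string, datatype: string | null = null) {
    this.lexicalForm = lexicalForm;
    this.datatype = datatype;
    const toValue = datatype !== null ? datatypeMap.get(datatype) : undefined;
    // Unknown datatype: the value falls back to the lexical representation.
    this.value = toValue ? toValue(lexicalForm) : lexicalForm;
  }
}

const known = new SketchLiteral("0000123", XSD + "int");
const unknownType = new SketchLiteral("abc", "http://example.org/dt#custom");

console.log(known.lexicalForm, known.value); // 0000123 123
console.log(unknownType.value);              // abc
```

Keeping both fields means the 6.5.1 comparison can still be done on the lexical form, whatever the mapped value is.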

[1] http://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Literal-Equality
[2] http://www.w3.org/TR/owl2-profiles/#Reasoning_in_OWL_2_RL_and_RDF_Graphs_using_Rules 


> 
> Gavin Carothers also raised a related issue, lang_match equality, we haven't catered for it anywhere in the API - as in we cater for exact match, but not for "EN" matching "en-US" and "en-GB", unsure if we should cater for this or not?
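
For reference, the kind of matching meant here can be sketched in a few lines of TypeScript. This is an illustration only, assuming the basic filtering scheme of RFC 4647 (which SPARQL's langMatches also relies on); the function name is made up and nothing like it exists in the current draft.

```typescript
// Basic language-range filtering in the style of RFC 4647, section 3.3.1.
// Hypothetical helper; not part of the RDF API draft.
function langMatches(tag: string, range: string): boolean {
  const t = tag.toLowerCase();
  const r = range.toLowerCase();
  if (r === "*") return t.length > 0; // wildcard matches any non-empty tag
  // Exact match, or the range is a prefix that ends at a subtag boundary.
  return t === r || t.indexOf(r + "-") === 0;
}

console.log(langMatches("en-US", "EN")); // true
console.log(langMatches("en-GB", "EN")); // true
console.log(langMatches("de", "EN"));    // false
```

The subtag boundary check is what keeps "en" from matching a tag such as "eng".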
> 
>> 2. The second issue is, actually, a bit related to this and, again, relates to TripleSets. Sorry if I touch on some nerves, it is really not my intention...
> 
> No problem, I know it's not personal, and fully understand the situation :) You're just doing your job, and I'm glad you are!
> 
>> I think that, at this point, there is no reason to say that a TripleSet is a subclass of Literal. I do not think any of the interfaces would change if we simply say that a TripleSet extends an RDFNode, and that is all we can say at the moment. We do not necessarily define Graph Literals...
>> I feel I owe some sort of an explanation here, to avoid unnecessary bad feelings here. _Personally_ and intellectually I am actually in favour of the concept of a graph literal. (And that is independent of whether a literal can be in subject position or not.) But... beyond the issues I raised at the call on consistency of recommendations, we have to realize that if _we_ define a graph literal (and by subclassing TripleSet from Literal this is what we do) then we have to do much more than just saying this. What we would have to do then is to define a new RDF Datatype properly in the sense of [1]. We have to define exactly what the lexical space is, what the value space is, what the lexical-to-value mapping is (in terms of [1]), we have to define things like what does equality mean in the datatype's value space (bnodes...:-), etc. All this can be done of course, but it _is_ a non-trivial amount of work, and I just do not believe this is something that should be done by an API document.
> 
> I agree, and realize there would be a lot of work - personally I'm fully expecting to have that interface ripped out of the API in the future, I just wanted to ensure that it was:
> a) considered (as in nothing was defined in the spec that would preclude libraries implementing such an interface)
> b) in the document in advance of the RDF WG in the hope that ->
> 
>> Maybe the RDF WG will do it, let us see what happens
> 
> the above!
> 
>> but this is not for this group. By defining a TripleSet as extending an RDFNode, we leave the door open to whoever will handle that. 
> 
> Why don't we just remove it from the API - it's unsparqlable as has been pointed out, it's not defined in RDF anywhere, and the odds of it being defined, are well, you know :) Perhaps if it is then we can add it back in at a later date - one must make compromises where one can.
> 

:-) Yes, accepting compromises is not always easy, I fully appreciate that...

I would, however, add a note to the document somewhere, saying that the current RDF WG is looking at the issue of graph identification and if and when that is resolved and the concepts are clearer, this WG will look at how those concepts can be represented on the API level. In other words, this leaves the issue open and leaves it to the right WG to do the heavy lifting...

That being said: with the existence of the SPARQL concepts (and overall practice, I believe, in RDF environments) it might be possible, and indeed good, to add an optional attribute to Graph, namely an RDFNode with the name, say, context. This means that a URI can be associated with a Graph, something that SPARQL already requires today for its own datasets...
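
As a rough illustration of what such an attribute could look like, here is a TypeScript sketch. Every name in it (SketchNode, SketchGraph, toDataset) is an assumption, and the grouping function only loosely mirrors the SPARQL notion of a dataset with one default graph and a set of named graphs.

```typescript
// Hypothetical, simplified node and triple shapes for this sketch only.
interface SketchNode {
  kind: "NamedNode" | "BlankNode" | "Literal";
  value: string;
}
interface SketchTriple {
  subject: SketchNode;
  property: SketchNode;
  object: SketchNode;
}

class SketchGraph {
  readonly triples: SketchTriple[] = [];
  // Optional name for the graph; "context" follows the suggestion above.
  context?: SketchNode;
}

// Group graphs by their context URI; a graph without a context is
// treated as the default graph (an assumption made for this sketch).
function toDataset(graphs: SketchGraph[]) {
  const named = new Map<string, SketchGraph>();
  let defaultGraph: SketchGraph | undefined;
  for (const g of graphs) {
    if (g.context) named.set(g.context.value, g);
    else defaultGraph = g;
  }
  return { defaultGraph, named };
}

const unnamed = new SketchGraph();
const namedGraph = new SketchGraph();
namedGraph.context = { kind: "NamedNode", value: "http://example.org/graphs/g1" };

const dataset = toDataset([unnamed, namedGraph]);
console.log(dataset.named.has("http://example.org/graphs/g1")); // true
```

Because the attribute is optional, existing code that never sets it keeps working unchanged.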


>> By the way, I think we agreed on adding an editorial note to that part, referring to the discussion at the RDF WG on that matter, and that this interface might change (including its name) as the issue progresses over there
> 
> We do, in section 2.1 - however, as above, will just remove it.
> 
>> 3. (Minor) You have a remark on the UTF-8 version of NTriples, which is of course completely valid. But I would rather see that remark as an explicit note for the reader which draws attention to the fact that NTriples is ASCII based, but that the RDF WG may refresh that, so this detail may change in future releases of the draft
> 
> Okay, will do.
> 
>> 4. The (node,node,node) issue... as far as I remember we agreed that there would be some sort of a constraint somewhere if we want to reinforce the RDF concept restrictions. I have not found that... I looked at the Triple interface, at the createTriple method, ... How do you plan to handle that?
> 
> I'm at a loss as to what to do here, to be honest. As you know, I feel exceptionally strongly about maintaining the (node,node,node) definition, and the only way I can see to constrain it (taking into account that an IDL definition of the createTriple method would be needed) is to add another interface, say RDFResource, which both NamedNode and BlankNode extend, and use that in the subject slot, with NamedNode in the property slot, which basically will preclude node node node usage.

Yes, and I believe this is what we had before, hadn't we?
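
Just to spell out what the constrained variant amounts to, here is a small TypeScript sketch. TypeScript unions stand in for the IDL inheritance, the "kind" discriminant and the isStrictTriple helper are assumptions for illustration, and none of this is text from the draft.

```typescript
interface NamedNode { readonly kind: "NamedNode"; readonly iri: string; }
interface BlankNode { readonly kind: "BlankNode"; readonly label: string; }
interface Literal { readonly kind: "Literal"; readonly lexicalForm: string; }

// RDFResource covers what may appear in the subject slot.
type RDFResource = NamedNode | BlankNode;
type RDFNode = RDFResource | Literal;

interface Triple { subject: RDFResource; property: NamedNode; object: RDFNode; }

// Constrained signature: a Literal subject or a BlankNode property
// is rejected by the type checker.
function createTriple(subject: RDFResource, property: NamedNode, object: RDFNode): Triple {
  return { subject, property, object };
}

// The generalized (node,node,node) form, plus a runtime check that tells
// whether a given generalized triple also satisfies the RDF restrictions.
interface GeneralizedTriple { subject: RDFNode; property: RDFNode; object: RDFNode; }

function isStrictTriple(t: GeneralizedTriple): boolean {
  const subjectOk = t.subject.kind === "NamedNode" || t.subject.kind === "BlankNode";
  return subjectOk && t.property.kind === "NamedNode";
}

const alice: NamedNode = { kind: "NamedNode", iri: "http://example.org/alice" };
const knows: NamedNode = { kind: "NamedNode", iri: "http://xmlns.com/foaf/0.1/knows" };
const someone: BlankNode = { kind: "BlankNode", label: "b0" };
const label: Literal = { kind: "Literal", lexicalForm: "Alice" };

const strict = createTriple(alice, knows, someone);
console.log(isStrictTriple(strict));                                             // true
console.log(isStrictTriple({ subject: label, property: knows, object: alice })); // false
```

The second half shows the alternative: keep the interfaces generalized and check the RDF restrictions only where they matter, for example before serializing.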

> 
> The only thing I could think of for the time being was to add the text at the start of section 2:
> 
> [[
> The RDF Concept Interfaces in this specification provide a low level API for working with RDF data.
> 
> The concepts described in this specification are more generalized than those defined by the RDF Data Model [RDF-CONCEPTS]. Whilst this may appear to be a mismatch, the RDF specification is intended to define a notation for transmitting data on the Web, however this specification defines a set of interfaces for working with that data, behind the public interface, where more generalized notions of Triples are often required by libraries and modules.
> ]]
> 
> Guidance?

Ok, my turn to take a compromise:-)

Let us follow this line. Maybe the only tiny issue is that there should be an explicit issue in the document that makes it clear that this is, well, an issue, and that we explicitly ask the RDF WG to comment on this aspect of the spec. We will see whether there will be objections.

> 
>> 5. I actually found two minor editorial issues on the document, I took the liberty to change the -src.html file. Namely:
>>  5.1 (Minor) in the status section the name of the Working Group should change:-)
>>  5.2. (Minor) a number of people have joined the WG in the last few days[2]. Which is of course great, but that means that Appendix A should be refreshed...
> 
> Thanks and thanks! (Minor, should it also say "the members of the RDF Web Application WG were" ?)

Doh. Of course... I have made the change.

Ivan


> 
>> 6. (Minor) I wonder whether a class diagram would be useful to give a visual representation of the interfaces. Do not deal with this now, it may not be necessary for the FPWD; I may play with that... It helps me in understanding things in the first place:-)
> 
> Sounds like a good idea!
> 
>> And to avoid any misunderstandings here: this document is 99% at the publication ready level. A deep bow towards Scotland...
> 
> :) thanks, and thanks to all of course for all the hard work over the years and recently that's gone in to this.
> 
> Best,
> 
> Nathan
> 
>> Thanks
>> Ivan
>> [1] http://www.w3.org/TR/rdf-mt/#dtype_interp
>> [2] http://www.w3.org/2000/09/dbwg/details?group=44350
>> On Apr 22, 2011, at 19:20 , Nathan wrote:
>>> Hi All,
>>> 
>>> I've just uploaded a new draft of the RDF API, however I haven't quite finished reviewing yet and may have a few more minor changes to the IDL and changes to the prose.
>>> 
>>> link:
>>> http://www.w3.org/2010/02/rdfa/sources/rdf-api/Overview-src.html
>>> 
>>> Changes so far are:
>>> 
>>> changed RDF Concept Interfaces introduction.
>>> 
>>> changed Aliases back to Terms.
>>> 
>>> changed GraphLiteral to TripleSet and added an instability notice.
>>> 
>>> removed Graph.the
>>> 
>>> clarified the difference between Graph.merge and Graph.import
>>> 
>>> added a Graph match(in RDFNode subject?, in RDFNode property?, in RDFNode object?, in optional unsigned long limit) method, which is pretty self explanatory, pass in an optional subject, property, object, and get back a new graph containing all triples which match the given arguments. And it's got a limit feature.
>>> note: There were so many different ways in which this could be defined, the other main alternative I considered was the method accepting a single MatchTriple or array of MatchTriples where a MatchTriple was a Triple where each of subject, property, object was optional - however I thought that was getting too close to query functionality, would require other methods to create matchtriples, and raised lots of questions about AND/OR functionality and the like. Open to discussion and feedback of course.
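
A minimal TypeScript sketch of the match behaviour described above, for readers who want something concrete. Terms are plain strings here, null stands for "not given", and the class and method names other than match are assumptions; how the draft treats a limit of 0 is not stated in this message, so the handling below is a guess.

```typescript
// Simplification for this sketch: terms are plain strings.
interface SimpleTriple { subject: string; property: string; object: string; }

class SimpleGraph {
  constructor(private readonly triples: SimpleTriple[] = []) {}

  size(): number {
    return this.triples.length;
  }

  // Returns a new graph with every triple matching the given pattern.
  // A null argument matches anything; limit, when given, caps the result.
  match(subject: string | null, property: string | null, object: string | null,
        limit?: number): SimpleGraph {
    const out: SimpleTriple[] = [];
    for (const t of this.triples) {
      if (limit !== undefined && out.length >= limit) break;
      if ((subject === null || t.subject === subject)
          && (property === null || t.property === property)
          && (object === null || t.object === object)) {
        out.push(t);
      }
    }
    return new SimpleGraph(out);
  }
}

const people = new SimpleGraph([
  { subject: "a", property: "knows", object: "b" },
  { subject: "a", property: "knows", object: "c" },
  { subject: "b", property: "knows", object: "c" },
  { subject: "a", property: "name", object: "Alice" },
]);

console.log(people.match("a", "knows", null).size()); // 2
console.log(people.match(null, null, "c").size());    // 2
console.log(people.match("a", null, null, 1).size()); // 1
```

Returning a graph rather than an array means matches can be chained, which covers simple AND cases without a separate MatchTriple type.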
>>> 
>>> clarified the description of Graph.apply
>>> 
>>> changed PrefixMap.resolve to return null if prefix not set
>>> 
>>> removed import*FromGraph methods from TermMap, PrefixMap and Profile
>>> 
>>> removed Profile.base
>>> 
>>> removed Parser.profile
>>> 
>>> changed the try method on TripleAction to 'run'
>>> 
>>> got rid of the NCName restrictions. defined prefix/term as string w/ no colon or whitespace
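
Two of the changes in this list, resolve returning null and the relaxed prefix/term rule, can be illustrated with a short TypeScript sketch. The class name, the set method and the choice to throw on an invalid prefix are assumptions for illustration; only the "no colon or whitespace" rule and the null return come from the list above.

```typescript
// A prefix or term is any string with no colon and no whitespace.
// Whether the empty string is acceptable is not stated; this check allows it.
function isValidPrefix(candidate: string): boolean {
  return !/[:\s]/.test(candidate);
}

class SketchPrefixMap {
  private readonly prefixes = new Map<string, string>();

  set(prefix: string, iri: string): void {
    if (!isValidPrefix(prefix)) throw new Error(`invalid prefix: ${prefix}`);
    this.prefixes.set(prefix, iri);
  }

  // Resolves a CURIE such as "foaf:name". Returns null when the prefix
  // is not set; returning null for input without a colon is an assumption.
  resolve(curie: string): string | null {
    const colon = curie.indexOf(":");
    if (colon < 0) return null;
    const base = this.prefixes.get(curie.slice(0, colon));
    return base === undefined ? null : base + curie.slice(colon + 1);
  }
}

const prefixMap = new SketchPrefixMap();
prefixMap.set("foaf", "http://xmlns.com/foaf/0.1/");

console.log(prefixMap.resolve("foaf:name")); // http://xmlns.com/foaf/0.1/name
console.log(prefixMap.resolve("ex:thing"));  // null
```

Returning null lets a caller distinguish "prefix not set" from a successfully resolved IRI without catching an exception.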
>>> 
>>> Best,
>>> 
>>> Nathan
>>> 
>> ----
>> Ivan Herman, W3C Semantic Web Activity Lead
>> Home: http://www.w3.org/People/Ivan/
>> mobile: +31-641044153
>> PGP Key: http://www.ivan-herman.net/pgpkey.html
>> FOAF: http://www.ivan-herman.net/foaf.rdf
> 
> 


----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
Received on Sunday, 24 April 2011 07:45:07 UTC