Re: Comments on the R2RML Editors' draft from Souripriya Das on 2010-12-14 (public-rdb2rdf-wg@w3.org from December 2010)

From: Souripriya Das <souripriya.das@oracle.com>
Date: Tue, 14 Dec 2010 12:00:00 -0500
To: Ivan Herman <ivan@w3.org>
CC: Seema Sundara <seema.sundara@oracle.com>, Richard Cyganiak <richard.cyganiak@deri.org>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Message-ID: <4D07A290.4080406@oracle.com>
Ivan,

Thanks a lot for your comments.
Please see our answers inline below.

Thanks,
- Seema/Souri.

Ivan Herman wrote:
> (Tracker, this should close ACTION-86)
>
> Souri, Seema, Richard
>
> it so happened that I had time today, so I did a review on the editor's draft of R2RML, as promised on the last telco. To be on the safe side, I looked at
>
> http://www.w3.org/2001/sw/rdb2rdf/r2rml/
>
> dated 2010-12-07 at 19:23
>
> Some of the comments might come from the fact that I am a new kid on the block...
>
> I hope it will be helpful!
>
> Cheers
>
> Ivan
>
>
> Status of the document, second paragraph: this is not a first public working draft any more:-)
>
> ----
> Intro, second paragraph on direct mapping: I find the text a little bit 'incomplete' in comparing the two approaches. I changed the last sentence and added one before that:
>
> [[[
> Besides the R2RML language, this working group will also define a fixed "default mapping" from relational databases to RDF. In the default mapping of a database, the structure of the resulting RDF graph directly reflects the structure of the database, the target RDF vocabulary directly reflects the names of database schema elements, and neither structure nor target vocabulary can be changed. To generate a graph using structures and terms that are more appropriate to the final application, graph transformation tools (e.g., SPARQL, RIF) should be used. With R2RML on the other hand, a mapping author can define highly customized views over the relational data and the full transformation is performed by the R2RML engine itself.
> ]]]
>
> This may be better...
[TBD]
>
> ---
> Intro, fourth paragraph on Turtle: the sentence reads as if Turtle is the _only_ RDF syntax that is accepted for R2RML. I see the same comment in the 2nd sentence of 1.2.
>
> I also see that this is still an open issue and is listed in 1.2. Which is fine then; I wonder whether the issue should not be listed in the intro as well, to avoid people asking questions prematurely (I began to write a comment on that until I got later down in the document:-)
>
[TBD]
> ---
> (This comment actually came up at a presentation I gave essentially on this version of R2RML, I am relaying it here)
>
> At the moment, the values of rr:termtype are strings ("BlankNode", etc.). Wouldn't it be more 'Semantic Webish' if some predefined URI-s were used there? I.e.,
>
> [] rr:termtype <URI-for-the-concept-of-blank-node>
>
> I do not have strong feelings about this, but I thought it was worth conveying to the group...
[TBD]
>
> ---
> Shouldn't section 2 be labelled as informative?
Yes. CHANGE DONE.
>
> ---
> Example in 2.1.2, second PredicateObjectMap: I guess it should say
>
> [ ... ; rr:datatype xsd:positiveInteger ]
>
> and not
>
> [ ... ; rr:datatype "xsd:positiveInteger" ]
>
> ie, the value is a datatype (uri) and not a string.
>
Yes. CHANGE DONE.
> ---
> Section 2.2, EMP table and 2.3, LIKES Table
>
> there are references to the empURI, graphURI, etc, as entries generated for the logical table and used everywhere as URI-s. I think it would be better to use "http://example.com/emp" everywhere, ie, include the URI scheme, rather than 'example.com/emp/' as it is now. This then repeats itself in the whole of the appendix and various examples.
>
Yes. CHANGE DONE.
> ---
> Section 3.1, Figure 1.
>
> I like figures in general, and I am also fine with that one except that... (1) it is way too huge and (2) the color scheme being used has very sharp and contrasted colours, which is very different from the colour schemes used elsewhere in the document. Can we try to make these a bit smaller and a bit more, shall we say, mild? (yes, I know, this is a matter of taste...).
We agree. We will work with Boris to change this.
>
> ---
> Issue (maybe to be labelled as such in the tracker and added to the text?): what happens exactly when, say, SubjectMapClass is missing for a TriplesMapClass instance? We may not have to answer this in the document, but label that as an issue to be solved (and I guess those are the connection point to the direct mapping!)
>
Good point. We propose to add text that refers to the relevant section of the 
Direct Mapping draft.
> ---
> Similar question: what happens if the user adds more than one subjectMap? I know there is a table at the end of the document that sets maximum cardinality for things. But there is no statement on what the error response of an R2RML processor should be if those cardinality constraints are breached. Taking into account that an R2RML instance is in RDF, we cannot rely on, say, the order within the specification (ie, something like the second coming wins).
>
> We may just open an issue and label it as such in the document for now, b.t.w.
>
We propose to remove the restriction about max cardinality = 1 so that 
for a single row, one can have multiple subjects. The set of triples 
generated from the row will be associated with each of the subjects.
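For illustration only, a minimal Python sketch of this proposal. The function name, the data shapes and the example IRIs are hypothetical and not part of R2RML; the sketch only shows the row's predicate-object pairs being repeated once per subject.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def triples_for_row(row: Dict[str, str],
                    subject_templates: List[str],
                    predicate_columns: Dict[str, str]) -> List[Triple]:
    """Emit the row's predicate-object pairs once for each subject.

    Assumption: subjects are built from simple string templates and
    predicates map directly to column names.
    """
    triples: List[Triple] = []
    for template in subject_templates:
        subject = template.format(**row)
        for predicate, column in predicate_columns.items():
            triples.append((subject, predicate, row[column]))
    return triples


# Hypothetical EMP row with two subjects defined for it.
row = {"empno": "7369", "ename": "SMITH"}
result = triples_for_row(
    row,
    ["http://example.com/emp/{empno}", "http://example.com/person/{empno}"],
    {"http://example.com/emp#name": "ename"},
)
```

With two subjects and one predicate-object pair, the row yields two triples that differ only in their subject.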
> ---
> 3.3, figure 3: I think the figure is outdated. It uses rr:value and rr:graphValue with ValueMapClass; I guess this was part of an earlier version and rr:template and rr:graphTemplate have replaced these. The same discrepancy holds for Figures 4, 5, 6, 7.
>
We will work on it.
> ---
> 3.2.1.1: I am not sure what the role of table owner is. Is it some sort of a metadata? 
Any database table is owned by a user and hence needs to be referred 
to using the pair <tableOwner, tableName>.
>
> ---
> 3.3.1.1: It is not clear from the text why one can have a _set_ of IRIs and blank nodes. Does it mean that all the triples in a row are, sort of, multiplied with different subjects? If this is indeed the idea, then it should be stated explicitly and maybe an example should be used in the appendix to show its usage
The *set* (i.e., <Set of valid IRIs and blank nodes>) is the range of 
rr:subject. So, a subject is an element of this set and so can be a 
valid IRI or a blank node.
However, if this is causing confusion, we could change it from <Set of 
valid IRIs and blank nodes> to just <valid IRIs and blank nodes>.
>
> ---
> 3.3.1.1: another issue is: what does a blank node mean in this respect? What is the 'scope' of that blank node, ie, which graph does it belong to? I guess it is scoped to the default or named graph where all the triples are put; in which case this should be explained explicitly. But see also my question below on 3.3.1.5: what happens if there are several target graphs? (I guess the warning in the appendix applies...)
>
> I think some more explanatory text is warranted here on this.
In Section 5.1, we have pointed out the issue that may arise with the use of 
blank nodes as subject and sending triples from the same row to multiple 
graphs. We can add a line to state that the scope of a blank node is the 
graph (that is, either the named graph or the default graph) where the 
corresponding triple is being stored.
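As a rough illustration of graph-scoped blank nodes, here is a small Python sketch. The class, its label format and the row keys are hypothetical; it only assumes that a blank node is identified by the pair (graph, row), so the same row sent to two graphs produces two distinct blank nodes.

```python
from typing import Dict, Tuple


class BlankNodeAllocator:
    """Hand out blank node labels that are scoped to a single graph."""

    def __init__(self) -> None:
        self._labels: Dict[Tuple[str, str], str] = {}

    def label(self, graph: str, row_key: str) -> str:
        # Assumption: the blank node identity depends on both the
        # target graph and the row it was generated from.
        key = (graph, row_key)
        if key not in self._labels:
            self._labels[key] = f"_:b{len(self._labels)}"
        return self._labels[key]


alloc = BlankNodeAllocator()
in_default = alloc.label("default", "emp/7369")
in_named = alloc.label("http://example.com/graph/dept10", "emp/7369")
again = alloc.label("default", "emp/7369")
```

Here `in_default` and `again` are the same node, while `in_named` is a different node even though it comes from the same row.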
>
> ---
> 3.3.1.2: what happens if I have both an rr:column and an rr:subject structure in the same SubjectMapClass? Will all of them be valid and will I get all the triples, or does rr:column invalidate rr:subject? It should be stated explicitly somewhere
As pointed out in Section 4 (Table containing Summary of the 
Properties), rr:subject and rr:column cannot be used together for a 
SubjectMap. (Note that, since we are planning to allow multiple 
SubjectMaps, each of them could use either rr:subject or rr:column).
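A minimal Python sketch of such a check follows. It is only an illustration: representing a SubjectMap as a set of property names is an assumption, and raising an error is one possible response, since the draft does not yet say how a processor should react.

```python
from typing import Iterable, Set


def check_subject_map(properties: Iterable[str]) -> None:
    """Reject a SubjectMap that uses rr:subject and rr:column together.

    Assumption: a SubjectMap is represented by the names of the
    properties it uses; the error type is a placeholder.
    """
    used: Set[str] = set(properties)
    if "rr:subject" in used and "rr:column" in used:
        raise ValueError("rr:subject and rr:column cannot be used together")


def is_valid(properties: Iterable[str]) -> bool:
    """Convenience wrapper returning True when the check passes."""
    try:
        check_subject_map(properties)
    except ValueError:
        return False
    return True


# With multiple SubjectMaps, each one is checked on its own.
subject_maps = [["rr:subject"], ["rr:column", "rr:termtype"]]
all_valid = all(is_valid(m) for m in subject_maps)
```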
>
> ---
> 3.3.1.3: see my comment on possibly using URI-s rather than strings here... 
>
> If strings are used: are the strings case-sensitive or case-insensitive? Ie, is "iri" accepted or, God forbid, is "iRi" accepted?
We are okay with adding URIs such as: rr:IRI, rr:BlankNode.
>
> ---
> 3.3.1.5: same question vis-à-vis sets. What does it mean if I give several graph IRI-s here?
Same comment as in 3.3.1.1
>
> ---
> 3.4.1.3 and 3.4.1.4: will the storage of that triple in a graph happen _additionally to_ or _instead of_ the graph storage defined for an entire row? With the knowledge that there might be no graph definition for the row, ie, the triples just go into a default graph by default...
We will clarify that it is done "additionally"; however, we could 
discuss having an option to specify "instead of".
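The two behaviours can be sketched in Python as follows. This is a hypothetical illustration: the function, the mode names and the `DEFAULT_GRAPH` marker are not R2RML terms, and it assumes that a row without a graph definition targets the default graph.

```python
from typing import Optional, Set

DEFAULT_GRAPH = "default"  # placeholder marker for the default graph


def target_graphs(row_graph: Optional[str],
                  triple_graph: Optional[str],
                  mode: str = "additionally") -> Set[str]:
    """Return the graphs in which a single triple is stored."""
    if mode not in ("additionally", "instead_of"):
        raise ValueError(f"unknown mode: {mode}")
    # Assumption: no row-level graph means the default graph.
    row_target = row_graph if row_graph is not None else DEFAULT_GRAPH
    if triple_graph is None:
        return {row_target}
    if mode == "instead_of":
        return {triple_graph}
    return {row_target, triple_graph}


both = target_graphs("http://example.com/g/row", "http://example.com/g/triple")
only_triple = target_graphs("http://example.com/g/row",
                            "http://example.com/g/triple",
                            mode="instead_of")
no_row_graph = target_graphs(None, "http://example.com/g/triple")
```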
>
> ---
> 3.4.1.3. and 3.4.1.4: are these two mutually exclusive, or can I use both? I guess the latter, but it is worth emphasizing it (here or in an introduction somewhere...)
Not mutually exclusive. We will clarify in the text.
>
> ---
> 3.6.1.5: I know this is a bit of a mess these days:-) yes, lang can be used for a plain literal, but I think it should also be usable if the type is set to rdf:PlainLiteral, which is different (it is a datatype but has a language tag in it...)
>
> http://www.w3.org/TR/2009/REC-rdf-plain-literal-20091027/
>
> (Note that the RDF WG coming up next year might make some order in this chaos...)
We will change the text to indicate that rr:language is allowed for 
rdf:PlainLiteral as well.
>
> ---
> 3.7.1.3: section heading is 'rr:graph', and the text says 'this property is similar to rr:graph property'. That is not very informative:-)
>
> Obviously refers to the usage of rr:graph as defined elsewhere, the link is correct:-)
We will fix it.
>
> ---
> 3.8.1.: 
> I am surprised that the rr:column and rr:template are not valid for RefPredicateMapClass. Any reason for that? If so, it may be worth explaining...
Use of rr:template for a foreign key property is not very practical.
>
> ---
> Section 4, column for cardinality: I do not find a min=0 cardinality very informative... I suggest to keep the max cardinality only everywhere, in which case the column header can say that, and the 'max=' string (that is not used consistently in the table) could also be dropped.
>
> ---
> Appendix A.2: it would be nice to have a graph representation (I mean a figure) for the generated graphs. It would help me at least to grasp the results quickly, but I am a visual type...
We will work on it with Boris.
>
> ---
> Appendix A.2.3: I wonder whether this example (which is way simpler than the other two) brings any new aspect to the examples. If not, maybe we can drop it
>
This example illustrates the use of column value as predicate (e.g., in 
vertical tables).
>
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
>
Received on Tuesday, 14 December 2010 17:02:46 UTC