- From: Harry Halpin <hhalpin@w3.org>
- Date: Mon, 8 Nov 2010 23:10:34 -0000 (GMT)
- To: public-rdb2rdf-wg@w3.org
Here's a quick review in a purely personal capacity:

1) Looking at this document [1] from Eric primarily.

- It's very concise. I generally feel like I have understood the zeitgeist of the algorithm by looking at the examples in Section 2, but I'm not always confident of what's going on till I look at Sections 4-5. It is also not much changed since I last looked at it a few months ago.

- There are, as someone else pointed out, always issues encoding things as NCNames in IRIs. I know most people usually just use a simple library call for this, but it would be good to point out precisely where/what this algorithm is. IMHO we need a better way to describe url-encoding of column/row strings for DB implementers than just referring to this [3].

- The real action (and heart of the document) seems to be in the textual description in 2.2 and then in how to form IRIs immediately thereafter. Overall, the presentation should probably present the *standard* cases before presenting the edge cases. In this case, presenting the literal triple case before the reference triple case is a bit odd - talk about the case of the URI being a primary key first, then the case of not having a primary key. It's just hard to follow the English here.

- Then the IRI formation rules in English seem off from the examples. The algorithm seems to suggest that a "hash" be added in between column names and stem in predicate IRIs, where the algorithm has a slash before the column names. This is weird because the subject/object IRIs use a slash between their stem and column names, and then add another "#_" to the end of subject/object URIs. While I see the issue flagged, we should not have a difference between these two cases, nor a divergence between English text and examples.

- I'm going to skip comments on Sections 3-5 for now, and will post them later once we agree on text.

2) Looking at this document [2] from Juan and Marcelo primarily.

- Main impression is that it's more text.
For me, section 2.3 of this document is much easier to follow than section 2.2 of the other document [1].

- However, there are some major differences. Unlike [1], the IRI construction rules in English seem to line up with their examples. However, unlike [1] they don't use "#"s and use ',' rather than '_' between column names. I tend to say '_' makes more sense, but I am still confused about the '#' vs. '/' difference. It seems we should just pick the simplest pattern (i.e. '/' and no fragment ids) and stick with it unless there's a real reason to use #/fragids. Also, document [2] seems to be generating rdf:type triples for Table IRIs, which the first document [1] doesn't do.

- The examples beneath their section 2.3 seem about the same, but there seems to be one missing, the hierarchical tables approach. I remember discussing this on telecons but cannot remember the resolution. Why the discrepancy in examples?

- Ignoring their Section 3 for now, and noting that their sections 4-6 are just a cut and paste of [1]'s sections 3-5.

Both documents:

- Both documents appear to embody the same underlying algorithm with one exception, i.e. the example given by the hierarchical tables (which we need to decide if we rule out or not), plus some oddness about using '_' vs ',' and '#' vs '/' in IRI formation. So it's really a matter of readability and clarity to the implementers and database admins who will use this, and merging the documents should be no big deal once we get some agreement on these trivial things. Overall, I prefer the presentation of the IRI algorithms in document [2]. This is very important: the English needs to precisely detail the algorithm needed to construct IRIs, choose among cases, etc. as much as possible, and catch things implementers may forget, like url-encoding of text. In document [1] the English doesn't line up well with the examples. However, I like the layout of the examples in document [1] more, although I would like to add some more explanatory text from [2].
I would like "English rules" followed by examples in sub-sections in increasing order of complexity.

- In both documents, there is a severe disconnect between the examples and the formal rules, which makes it hard to connect the examples back to the IRI construction algorithms. I suggest that in Section 2 (whichever document) the rules be presented right after their English example, that each example point out exactly which rules (Scala-style or Datalog) it's using, and that the rules then be stated in their full formality at the end if necessary (which for the default document may not be necessary if there's a separate semantics document). While I understand there is some desire to separate normative from informative material, in today's world people will likely ignore all "semantics/formalism" if it's at the end of the document and just code according to the examples. Having some shorthand to connect each example to each rule, both in English and in some formal notation, is necessary.

- While I understand lots has changed, it would be better if we used the same running example in R2RML [3] and both direct graph documents. I guess that would mean more work, but it would make the documents flow much better together.

[1] http://www.w3.org/2001/sw/rdb2rdf/directGraph/
[2] http://www.w3.org/2001/sw/rdb2rdf/directGraph/alt
[3] http://www.w3.org/TR/wsdl#_http:urlEncoded
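
To make the separator and url-encoding questions above concrete, here is a minimal Python sketch. It is not taken from either document: the function names (encode, row_iri), the "stem + table + / + column=value pairs" layout, the example.org stem and the sample table are all assumptions made purely for illustration. It only shows how the '_' vs ',' and "#_" vs no-fragment choices change the resulting string once names and values are percent-encoded.

```python
from urllib.parse import quote


def encode(part: str) -> str:
    """Percent-encode one name or value for use inside a single IRI path segment."""
    # safe="" also escapes "/", which quote() leaves untouched by default.
    return quote(part, safe="")


def row_iri(stem: str, table: str, key: dict[str, str],
            column_sep: str = "_", suffix: str = "") -> str:
    """Hypothetical row IRI: stem + table + "/" + column=value pairs + suffix."""
    pairs = column_sep.join(f"{encode(c)}={encode(v)}" for c, v in key.items())
    return f"{stem}{encode(table)}/{pairs}{suffix}"


# Hypothetical key with characters that must be escaped (space, "/", "&").
key = {"first name": "Ana", "dept/unit": "R&D"}

# Pattern resembling the description of [1]: "_" between columns, "#_" appended.
print(row_iri("http://example.org/db/", "People", key, "_", "#_"))

# Pattern resembling the description of [2]: "," between columns, no fragment.
print(row_iri("http://example.org/db/", "People", key, ",", ""))
```

The call to quote passes safe="" because its default leaves "/" unescaped, which would let a column name or value introduce extra path segments. That is the sort of detail the English text needs to spell out rather than leave to a library default.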
Received on Monday, 8 November 2010 23:10:36 UTC