Fw: Comments on rq24 reorganization

forwarding along Simon's review of rq24

Lee

----- Forwarded by Lee Feigenbaum/Cambridge/IBM on 09/05/2006 09:40 AM -----

Simon Raboczi <raboczi@itee.uq.edu.au> 
09/05/2006 09:38 AM

To
Lee Feigenbaum/Cambridge/IBM@IBMUS
cc

Subject
Comments on rq24 reorganization






Begin forwarded message:

From: Simon Raboczi <raboczi@itee.uq.edu.au>
Date: 5 September 2006 21:36:47 GMT+10:00
To: public-rdf-dawg-comments@w3.org
Subject: Comments on rq24 reorganization

This is my comparison of the relative merits of the current rq23 
specification [1] and its reorganized rq24 variant [2].  It's somewhat 
different from Lee Feigenbaum's previous series of messages [3][4][5] 
which were instead a review of rq24 itself.  What I've tried to do is 
describe what's moved where, and how I feel about each of the changes.

I'll start with the conclusion.  At a structural level, rq24 is definitely 
an improvement on rq23, and is a better way forward.  I don't doubt that 
we should adopt rq24 over rq23.  The only question in my mind is when.  
The reorganization process has left us with areas in which rq24's 
narrative lapses entirely into @@Todo notes, whereas rq23 is rambly but at 
least complete.  For this reason, I'd hesitate to say at this moment that 
rq24 is actually a better document, in the sense of being the best 
exposition of SPARQL for the public.  It certainly doesn't strike me as 
CR-quality.

The key issue seems to be whether it's more important to give the public 
the most polished account of SPARQL (rq23) or the most up-to-date account 
of its evolution (rq24).  I favor choosing the latter and going with rq24 
because I believe that the holes in it won't take too long to patch, 
eliminating the only reason I see for sticking with rq23.  My ideal 
scenario would be for one more editorial pass to take place before adopting 
rq24.  At the very least, I think that sections 1.2 and 2.5 should be 
either filled out or commented out, and that the internal links broken in 
the course of the reorganization should be mechanically checked and then 
corrected [6][7].  However, if the document were to drop back to WD status 
I feel that immediately adopting rq24 as-is would also be acceptable.
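
As a rough illustration of what "mechanically checked" could mean here, the sketch below collects every id/name anchor in an HTML page and reports the same-document links (href="#...") that point at nothing.  It uses only the Python standard library.  The sample markup and the "docConventions" id are made up for the example; only the #Keywords and #syntaxMisc fragments come from the links cited in [6] and [7].  It is a minimal sketch, not a full link checker.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect anchor targets (id/name) and same-document links (href="#...")."""

    def __init__(self):
        super().__init__()
        self.targets = set()
        self.fragment_links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if value is None:
                continue
            if name == "id" or (tag == "a" and name == "name"):
                # Any element id, or a legacy <a name="...">, is a link target.
                self.targets.add(value)
            elif tag == "a" and name == "href" and value.startswith("#"):
                # Record the fragment without its leading '#'.
                self.fragment_links.append(value[1:])


def broken_fragments(html_text):
    """Return same-document fragments that are linked to but never defined."""
    collector = AnchorCollector()
    collector.feed(html_text)
    return sorted(
        {f for f in collector.fragment_links if f and f not in collector.targets}
    )


# Hypothetical markup standing in for the real document.
sample = """
<h2 id="docConventions">Document Conventions</h2>
<p>See <a href="#docConventions">conventions</a>,
<a href="#Keywords">keywords</a> and <a href="#syntaxMisc">abbreviations</a>.</p>
"""
print(broken_fragments(sample))
```

Run against a saved copy of the document, a check along these lines would list each dangling fragment once, which is enough to drive the corrections by hand.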

-----

Now, on to the dull details of how I actually did the comparison.  I drew 
myself a diagram of how the sections in rq23 were permuted to obtain 
rq24.  I tried to group and summarize the changes at a meaningful level, 
trying to reverse engineer the editors' intentions for the reorganization.

* Text from the beginning of rq23.2 moved to rq24.1.2 (Document Outline).  
Having an outline is a good idea, and I agree with Lee that it belongs 
before the section on Document Conventions.  The only issue I take here is 
that another few sentences need to be written to fill out the new section and 
(trivially) the numbering needs to be corrected.

* The previous rq23.2 (Making Simple Queries) and rq23.3 (Working with RDF 
Literals) received the brunt of the changes, reorganized into four new 
chapters.

* The new rq24.2 provides a series of example queries without detailed 
specification.  For the most part this involved migrating 
specification text elsewhere (rq23.2.1.1 through rq23.2.8) and gathering 
the examples from rq23.3.1 and rq23.3.2.  I think the idea of tossing a 
bunch of examples at the reader right at the outset does a lot to frame 
the specifications that follow later.  O'Reilly built a publishing empire 
on this editorial principle.  :)

* The concrete syntax of triple patterns is then specified in rq24.3, with 
material drawn from rq23.2.1.1-4, and the abstract syntax and semantics 
in rq24.4 with material from rq23.2.2-4 and rq23.2.8.4-5.  I personally 
don't think dividing the abstract and concrete syntax into two chapters is 
helpful; I'd prefer to see these two interspersed, with each individual 
feature's concrete and abstract syntax treated together.  Without any 
concrete syntax examples, the entirety of rq24.4 is rather difficult to 
understand.  I'd suggest a better division would be to put the concrete 
and abstract syntax details (currently rq24.3 through rq24.4.2) into 
chapter 3, and the semantics of what qualifies as a solution to a pattern 
(currently rq24.4.3-5) into chapter 4.

* rq23.2.5 (Basic Graph Patterns) becomes its own chapter rq24.5.  The 
semantics of what qualifies as a graph solution is a pretty core topic and 
certainly deserves its own chapter.  I suspect there'd be benefit in 
drawing in material from rq24.4.3-5 on triple pattern solutions into this 
chapter, so that the entire treatment of filtered basic graph patterns is in 
one spot.

* I didn't note any substantial change to chapters rq23.4-6, other than 
being renumbered as rq24.6-8.

* The chapters on datasets rq23.7,8,9 have been consolidated as rq24.9 
(RDF Dataset). Putting all the stuff about querying multiple graphs at 
once into the one chapter seems a definite improvement.

* The preamble to Appendix A (before the EBNF meat of it, in A.7) has been 
reorganized.  This seems to be a modest improvement; I don't have any 
strong opinions about it.

-----

My goal was to compare rq23 and rq24 rather than to proofread either of 
them, but some collateral proofreading happened anyway and might as well 
be captured:

* Section rq24.1.2 appears out of sequence.

* Section rq24.2.2 concludes with the statement "all the variables used in 
the query pattern must be bound in every solution."  I'm not entirely 
certain this is correct; is a variable which has been projected away (in 
this particular case, ?x) still bound in that particular solution?

* There's a forward reference to CONSTRUCT at the beginning of rq24.2.7 
which probably can't be avoided, but at least ought to be linked to 
rq24.10.3.

* EBNF fragments of the grammar appear throughout rq24.3.  It would 
probably help to add a link to the format used for these (XML 1.1 section 
6) in rq24.1.1 (Document Conventions) rather than (or in addition to) the 
link at the beginning of rq24.A.7.  (On closer inspection, I see a @@ note 
in rq24.1.2 which indicates the editors have already thought of this, 
although not quite in the place I expected.)

* The link to #syntaxMisc in section rq24.3.2 is broken: "there are 
abbreviated ways of writing some common triple pattern constructs."  It 
might be helpful to add some text noting that these abbreviations are all 
adapted from Turtle, with a link.

* The grammar rules throughout rq24.10 are empty.

* The comment just before rq24.A.1 "rules A.1 to A.5 apply" should 
probably include A.6 as well.

* A link in rq24.A.7 to #Keywords is now broken, since rq23.A.3 has been 
moved to the beginning of rq24.A.7: "Matching is case-sensitive except as 
noted above for keywords."
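
To make the rq24.2.2 question above concrete, here is a toy sketch of the distinction I have in mind: a solution to the query pattern binds every variable in the pattern, and projection then narrows each solution to the selected variables.  This is not the SPARQL algebra and does not settle what the specification should say.  The graph data, the two triple patterns and the function names are all invented for illustration.

```python
def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must agree with any value it was already bound to.
            if result.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return result


def solutions(patterns, graph):
    """All bindings that satisfy every triple pattern against the graph."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for triple in graph
            if (extended := match(pattern, triple, binding)) is not None
        ]
    return bindings


def project(bindings, variables):
    """Keep only the selected variables in each solution."""
    return [{v: b[v] for v in variables} for b in bindings]


# Hypothetical data and query pattern; ?x is used only to join the patterns.
graph = [
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
]
patterns = [("?x", "foaf:name", "?name"), ("?x", "foaf:mbox", "?mbox")]

pattern_solutions = solutions(patterns, graph)
print(pattern_solutions)                               # ?x is bound here
print(project(pattern_solutions, ["?name", "?mbox"]))  # ?x is absent here
```

In this reading ?x is bound in the pattern solution but simply does not appear in the projected result, so whether the quoted sentence is correct depends on which of the two "solution" means at that point in the text.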

As a final miscellaneous observation, the section headings do not make it 
easy to look up a particular keyword.  It might be more useful to name 
rq24.3.1.1 as "PREFIX and BASE", rq24.7 as "OPTIONAL", rq24.9.2.1 as 
"FROM", rq24.10.3 as "CONSTRUCT" and so forth, or perhaps to use longer 
section titles combining the keyword and description, e.g. "Matching 
Alternatives using UNION".

The first paragraph of the Abstract would fit better in rq24.1 
(Introduction).

-----

References:
[1] http://www.w3.org/2001/sw/DataAccess/rq23/ (revision 1.692)
[2] http://www.w3.org/2001/sw/DataAccess/rq23/rq24.html (revision 1.17)
[3] http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0107.html
[4] http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0108.html
[5] http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JulSep/0109.html
[6] http://www.w3.org/2001/sw/DataAccess/rq23/rq24.html#Keywords
[7] http://www.w3.org/2001/sw/DataAccess/rq23/rq24.html#syntaxMisc

Received on Tuesday, 5 September 2006 13:41:54 UTC