- From: Pat Hayes <phayes@ihmc.us>
- Date: Sun, 18 Mar 2007 23:50:28 -0500
- To: "Seaborne, Andy" <andy.seaborne@hp.com>
- Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Overall comment (important). There is a disconnect between the ideas of dataset and graph, which I think needs to be fixed. Section 8 discusses datasets in great detail with many examples, but it nowhere explicitly defines which RDF graph is the one that BGPs are required to match against. Section 12.3.2 defines matching for BGPs, but speaks of matching to a dataset (mea culpa). Section 12.5 finally introduces and uses the terminology "active graph", but it does not formally define this notion or say how it is computed. (See detailed comments on 12.5 below.) In any case, it is far too late in the document for this idea to be defined. "Active graph" is a basic concept which should be defined in section 8, which should give clear criteria for how to determine it given a query and a dataset. Then 12.3.2 should use this term when defining BGP matching, and the references in 12.3.2 and 12.5 should have internal links to the definition in section 8.

--------

Comments on Section 12

"query as as string" -> "query as a string"

"abstract query comprises operators" -> "abstract query comprising operators"

"this can then be evaluated" Should we only say 'can' here? Suggest "this is then evaluated".

"This section defines the correct behavior for evaluation of graph patterns and solution modifiers, given a query string and an RDF dataset. " But you just said we would cover the process starting with the abstract syntax, not the string. Correct one of these statements.

This whole process seems awfully complicated and unmotivated. Can you give some guidelines on what the differences are between abstract syntax, abstract query and SPARQL algebra (?). This whole topic of converting from one form to another isn't mentioned again until 12.2, and none of the intervening definitions in 12.1 seem to be relevant to it. In fact, I would suggest switching sections 12.1 and 12.2, and maybe merging 12.1 with 12.3.
12.1.1 "IRIs, a subset of RDF URI References that omits spaces." Is that *really* the definition of an IRI? Suggest providing a link to the IRI publication.

The link on the word 'updated' is to http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref. Is this the most appropriate link?

First definition: et -> Let

12.1.2 "each <ui> is an IRI. Each <ui> is distinct." -> "each <ui> is a distinct IRI."

The notation used here seems odd to me. Usually, ordered pairs are indicated using <> as brackets. Why do you need them round the IRI names? Wouldn't it make sense to write it this way: "An RDF dataset is a set: {G, <u1, G1>, ...}..."

12.1.3 "...is a member of an infinite set V which is disjoint from RDF-T. "

You don't give a definition for BGP (it's a set of triple patterns, yes?). I suggest it should be between 12.1.4 and 12.1.5. Right now there is no connection between the material up to 12.1.4 and the stuff starting at 12.1.5. Also, the internal link http://www.w3.org/2001/sw/DataAccess/rq23/rq25.html#rBasicGraphPattern is broken.

12.1.5 seems a bit bare. Where do we find out more about these various kinds of pattern? Can you provide links? (As you do in 12.1.7)

12.1.6 (Question about terminology. Is *every* such mapping a "solution" mapping? Or only the ones which actually are solutions? Right now we say the first, which seems a bit odd to me, because a solution mapping might not be a solution. (Later. Was it me who suggested this terminology? If I did, mea culpa again.))

No need to say from V to T since these are globally fixed. -> "A solution mapping μ is a partial function μ : V -> T"

The note about multisets seems out of place here, since we haven't mentioned matching graph patterns yet, and nothing has been said about there being multiple answers. Suggest moving this to 12.2, and omitting the last sentence "It is described..." which reads like an implementation suggestion and seems out of place. Readers will likely know what a bag is in any case, right?
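To make the 12.1.6 terminology point concrete, here is a minimal sketch (not from the draft) of how something can be a "solution mapping" in the partial-function sense without being a solution for a given pattern. The graph, the `ex:` terms, and the helper `is_solution` are all hypothetical illustrations; the sketch ignores blank nodes and does not check that the mapping is restricted to the variables of the pattern.

```python
# Hypothetical toy data: terms are plain strings, variables start with "?".
graph = {("ex:a", "ex:knows", "ex:b"), ("ex:b", "ex:knows", "ex:c")}
bgp = [("?x", "ex:knows", "?y")]


def is_solution(mu, bgp, graph):
    """True when mu binds every variable in bgp and mu(bgp) is a subset of graph."""
    def subst(term):
        # Replace a variable by its binding, if it has one; leave other terms alone.
        return mu.get(term, term) if term.startswith("?") else term

    triples = {tuple(subst(t) for t in tp) for tp in bgp}
    no_vars_left = all(not t.startswith("?") for tr in triples for t in tr)
    return no_vars_left and triples <= graph


mu_good = {"?x": "ex:a", "?y": "ex:b"}   # a partial function V -> T and a solution
mu_bad = {"?x": "ex:c", "?y": "ex:a"}    # a partial function V -> T, yet not a solution
mu_partial = {"?x": "ex:a"}              # leaves ?y unbound, so not a solution here

print(is_solution(mu_good, bgp, graph))
print(is_solution(mu_bad, bgp, graph))
print(is_solution(mu_partial, bgp, graph))
```

All three dictionaries are "solution mappings" under the current wording, but only the first is a solution for this pattern from this graph.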
12.1.7 What is a solution sequence, that we can have a modifier of it? Is there a missing definition of 'solution sequence'?

----

12.2 "an SPARQL query" -> "a SPARQL query"

This is hard to follow. After parsing, the syntax tree is composed of .. a table?? What is the 'query form' in this table? Is it part of the syntax tree, or just there for reference?

"uses theses symbols" -> "uses these symbols"

What exactly is meant by "mapping" in "The result of mapping a SPARQL query..."? This mapping idea hasn't been mentioned previously or defined (unless you mean solution mapping? Surely not.) Is this mapping the same as "converting"? The early material at the beginning of section 12 talks about a series of 'steps' and of 'turning into', but does not say 'mapping' or 'converting'. Suggest choosing a uniform terminology and sticking to it throughout. Might also be a good idea to review that early material here (unless you put 12.2 before 12.1, as I suggested above).

What is a 'result form' in the definition of abstract query? The internal link is broken.

12.2.1 What does the title of this section mean? (Mapping graph patterns to what?)

Step 2. Second line, remove comma after "GroupGraphPattern".

"replace with a sequence of nested union operators:" => "replace with nested union operators, associated to the left:"

Step 3. Odd change of font. Is it meaningful? Does "Map ... to ..." mean the same as "replace ... by ...."? Suggest using consistent terminology in describing these steps. "Replace ..by.." seems nicely unambiguous.

Step 4. What is the point of the link from the cryptic word "Constraint" in parentheses, without explanation?

What does "Write: "A" for an algebra expression" mean? The earlier steps have been instructions to do something: is this an instruction (imperative) also? If not, what is it? If it is, where does one write "A" exactly?

In box: "for i := 0 ; i < length(SP); i++" Yechhh, do we really want to use C++ in the formal spec?
Couldn't you write this in some kind of readable pseudocode? BTW, what is the scope of this iteration? Is the "If F is nonempty" inside it or after it?

"LeftJoin(G , A, true)" -> "LeftJoin(G, A, true)" (no space after G)

"SP := List " -> "SP := list "

"If G = Join(A1, A2) then G := Filter(F, Join(A1, A2)" -> "If G = Join(A1, A2) then G := Filter(F, Join(A1, A2))" (extra paren at end)

----------

This step 4 is incomprehensible as written, I have to say. I have no idea what it is telling me to do. If that stuff in the box is a procedure, where is A initialized? I can't see how G can ever get rid of a LeftJoin; is this right? What does "Map all sub-patterns contained in this group" mean? Sub-pattern hasn't been defined, and contain hasn't been defined.

Step 5. "join({}, A)" -> "join({ }, A)" (space added)

12.2.3 What is this doing? A word or two would be helpful.

Step 1. "There is no implied ordering to the sequence" OK, but does it have to be fixed? That is, is ToList a real function?

This step says "set M =". Earlier parts of this section have used assignment := or said "replace ... by ...". Later steps in the subsection omit "set" and are written using equality, which is misleading if read as an equation. Suggest using uniform notation and terminology.

Step 2. Where does the list of order conditions come from?

Step 3. What is a 'named variable'? Suggest rephrasing as "all variables occurring in the query".

Step 5. "If the query mentions.." Does this mean the same as "If the query contains.."? If so, suggest using consistent wording.

"defaults to the (size(M)-start)" -> "defaults to (size(M)-start)"

--------------

12.3 (The definitions in this section seem to continue directly on from those in section 12.1, and not be very connected to those in section 12.2.)

"for multiset" -> "for the multiset"

Definition of Compatible mappings.
I'd suggest defining merge explicitly, rather than talking about the set-union of two mappings (tricky idea to get right):

Definition: The merge(mu1, mu2) of two compatible mappings is the mapping which is identical to mu1 on dom(mu1) and to mu2 on dom(mu2).

Delete "Following the terminology of RDF semantics [RDF-MT]"

Make "* An [RDF instance mapping]" with [ ] a hyperlink to http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#definst (Because the rest of the terminology defined here *isn't* in RDF-MT :-)

12.3.1 Delete second sentence (no longer true of the material in this section). Could replace it with a forward reference to section 12.6.

Solution mapping has already been defined, omit definition here.

Delete "P(x) = μ(σ(x))"

Definition of BGP Matching, change to:

-----
Let BGP be a basic graph pattern and G be an RDF graph.
<mu> is a <em>solution</em> for BGP from G when there is a pattern instance mapping P such that P(BGP) is a subset of G and <mu> is the restriction of P to the query variables in BGP.
-----
A <em>solution sequence</em> is some total ordering of the multiset of all solutions for BGP from G, each derived from a distinct pattern instance mapping.
----

(NOTE. I hope this last bit is still right :-)

12.3.2 "as identifying nodes in the dataset." -> "as identifying nodes in the active graph of the dataset."

"understood to be not from DS itself," -> "understood to be not from the active graph of DS itself,"

"which is graph-equivalent to DS but shares no" -> "which is graph-equivalent to the active graph of DS but shares no"

"SPARQL adopts a simple subgraph matching criterion for this. A SPARQL answer is the restriction of a SPARQL instance mapping M to query variables, where M(BGP) is a subset of the scoping graph. There is one answer for each distinct such SPARQL instance mapping M." -> "SPARQL uses the subgraph match criterion to determine the multiset of answers.
There is one answer for each distinct pattern instance mapping from the basic graph pattern to a subset of the active graph."

Next para, "when the dataset is lean" -> "when the active graph of the dataset is [lean]" and put a hyperlink on [lean] to http://www.w3.org/TR/rdf-mt/#deflean

------------

12.4 Definitions here all refer to 'mappings'. As we have defined a number of different mappings, say which one of them is intended.

Defn of Filter: "an expression that has a boolean effective value of true" Is this verbiage really necessary? You haven't used the phrase "boolean effective value" elsewhere. Why not just say "an expression with the value true"?

Is "card[Filter(expr, Ω)](μ) = card[Ω](μ)" really true? Surely the filter can reduce the cardinality, no??

Defn of Join: "sum over μ in (Ω1 set-union Ω2), card[Ω1](μ1)*card[Ω2](μ2)" What does this mean? The sum expression doesn't contain μ.

Defn of Diff: again, is that equation for the cardinality really true? Similarly for the union case: surely one only gets the sum of the cardinalities when the original sets are disjoint.

Is the C in [x | C] a condition on the sequence or on the elements of the sequence?

---------

12.5 What is the range of eval? It's hard to read expressions like "Join(eval(D(G), P1), eval(D(G), P2))" without knowing this :-)

What is the "active graph" exactly? (See first comment.) It's not clear (to me) what it means to say that the active graph is "initially" the default graph. (Initially? How did time get into the question?)

Suggest "eval(D(G), BGP) = multiset of solution mappings" -> "eval(D(G), BGP) = multiset of all distinct solution mappings for BGP from G" (assuming that the earlier suggested changes are made so this makes sense.)

Defn of Evaluation of a Union Pattern. "join" is written in lower case. Should this be "Join"?
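To make the 12.4 cardinality questions concrete, here is a minimal sketch under stated assumptions, not the draft's definitions. It assumes a multiset Ω is represented as a `Counter` from mappings to multiplicities, that Filter keeps the multiplicity of each surviving mapping, that Join sums card[Ω1](μ1)*card[Ω2](μ2) over compatible pairs whose merge is μ, and that Union adds multiplicities. The function names (`freeze`, `compatible`, `merge`, `filter_`, `join`, `union`) are hypothetical.

```python
from collections import Counter


def freeze(mu):
    """Hashable form of a solution mapping given as a dict (variable -> term)."""
    return frozenset(mu.items())


def compatible(mu1, mu2):
    """True when mu1 and mu2 agree on every variable in both domains."""
    return all(mu2[v] == t for v, t in mu1.items() if v in mu2)


def merge(mu1, mu2):
    """For compatible mappings: identical to mu1 on dom(mu1) and to mu2 on dom(mu2)."""
    return {**mu1, **mu2}


def filter_(pred, omega):
    """Keep each mapping that satisfies pred, with its multiplicity unchanged."""
    return Counter({m: n for m, n in omega.items() if pred(dict(m))})


def join(omega1, omega2):
    """Multiplicity of mu = sum of n1 * n2 over compatible pairs that merge to mu."""
    out = Counter()
    for m1, n1 in omega1.items():
        for m2, n2 in omega2.items():
            d1, d2 = dict(m1), dict(m2)
            if compatible(d1, d2):
                out[freeze(merge(d1, d2))] += n1 * n2
    return out


def union(omega1, omega2):
    """Bag union: multiplicities add even when the operands overlap."""
    return omega1 + omega2


a, b = freeze({"x": 1}), freeze({"x": 2})
omega1 = Counter({a: 2, b: 1})
omega2 = Counter({a: 1})
omega3 = Counter({freeze({"y": 5}): 3})

kept = filter_(lambda mu: mu["x"] == 1, omega1)
print(kept[a], sum(kept.values()), sum(omega1.values()))
print(union(omega1, omega2)[a])
print(join(omega1, omega3)[freeze({"x": 1, "y": 5})])
```

On this reading the Filter equation is a statement about each surviving μ, not about the size of the whole multiset, and the sum in Join ranges over pairs (μ1, μ2) rather than over μ itself. Whether the draft intends exactly this is the question being raised.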
BTW, this would all be a lot easier to understand if you used some systematic way of distinguishing the evaluation function from the SPARQL algebra term, say by a font change or something? But it's getting late, so never mind....

---------

12.6 "needless of inappropriate" -> "needless or inappropriate"

"... if and only if the triple (" ends a line, which is a pity.

"consistent source document SD is uniquely specified and is E-equivalent to SD." -> "consistent active graph AG is uniquely specified and is E-equivalent to AG."

"For any basic graph pattern BGP and pattern solution P" -> "For any basic graph pattern BGP and pattern solution mapping P"

"and answer set {P1 ... Pn} " -> "and answer sequence <P1 ... Pn>"

"and where {BGP1 .... BGPn} is a set of basic graph patterns" -> "and where <BGP1 .... BGPn> is a sequence of basic graph patterns"

"guarantee that every BGP and SD" -> "guarantee that every BGP and AG"

"(a) SG will often be graph equivalent to SD" -> "(a) SG will often be graph equivalent to AG"

"that SG share no blank nodes with SD or BGP. In particular, it allows SG to actually be SD." -> "that SG share no blank nodes with AG or BGP. In particular, it allows SG to actually be AG."

"graph-equivalent to SD but shares no blank nodes with SD or BGP" -> "graph-equivalent to AG but shares no blank nodes with AG or BGP"

-----------

Phew.

Pat

--
---------------------------------------------------------------------
IHMC                    (850)434 8903 or (650)494 3973   home
40 South Alcaniz St.    (850)202 4416   office
Pensacola               (850)202 4440   fax
FL 32502                (850)291 0667   cell
phayesAT-SIGNihmc.us    http://www.ihmc.us/users/phayes
Received on Monday, 19 March 2007 04:50:43 UTC