- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Thu, 2 Jun 2005 10:12:17 +0300
- To: public-rdf-dawg-comments@w3.org
On Jun 1, 2005, at 22:18, ext Kendall Clark wrote:

>
> Working Draft: SPARQL Protocol for RDF
>
> The RDF Data Access Working Group has released a second Working
> Draft of the SPARQL Protocol for RDF. The draft describes a
> protocol for conveying RDF queries from clients to query
> services. The protocol is compatible with the SPARQL query
> language (pronounced "sparkle") and may be used to convey
> queries from other RDF query languages as well.
>
> http://www.w3.org/TR/2005/WD-rdf-sparql-protocol-20050527/
>
> The RDF Data Access Working Group is seeking feedback and
> comments on the SPARQL Protocol for RDF from stakeholders,
> interested parties, and potential implementors.
>
> Please direct all feedback to the DAWG comments mailing list:
>
> public-rdf-dawg-comments@w3.org
>
> Thanks,
> Kendall Clark
>

Great work! A few questions/comments:

1. All examples explicitly specify the background graph. That is of course not incorrect, but it would be good if the first few basic examples omitted the background graph, to reflect what will most likely be the most common use case: a SPARQL portal receiving queries that do not specify a background graph and falling back to the portal's own default background graph.

2. The parameter for specifying the background graph should follow the terminology used in the SPARQL spec, and thus should be named 'background-graph-uri' rather than 'default-graph-uri', since the term "default graph" has no definition in SPARQL.

(The above comments of course presume that the parameter for specifying a background graph will remain, which, per my next comment, may not be the case.)

3.
Since the SPARQL language itself provides facilities for explicitly specifying which graph a given query, or components of a query, should be evaluated against, and the query itself can (and if needed, must) include FROM/FROM NAMED qualifications naming the relevant graphs, why is it necessary to redundantly specify any graph URIs in the parameters?

If a specific graph needs to be specified, then it seems to me that it would be better, as in more economical and elegant, to use the existing machinery in the SPARQL query language itself to identify those graphs rather than introducing alternative, and potentially redundant, parameters for doing so. This also removes any chance of conflict between the query and the parameters, such as when they disagree about which graph is the background graph.

It seems to me that the only parameter needed is 'query', and everything else can then be specified as required in the actual SPARQL query provided to the SPARQL service.

If there is sufficient justification for adding these potentially redundant parameters, then that should be discussed in sufficient detail in the specification (otherwise, they should be removed).

4. Related to the above, but actually a comment regarding the SPARQL spec itself: there seems to be a conflict between the FROM construct and the definition of a dataset, since, if the background graph is "unnamed", how could one refer to it with a FROM construct?

I think the problem here is simply with language, not an inherent flaw in SPARQL.

It is my understanding that, while not mandatory, the URIs specified using the FROM and FROM NAMED constructs are often expected/hoped to be resolvable at run time to a graph, by dereferencing such URIs, and that many SPARQL processors when encountering unknown graph names will attempt to retrieve those graphs via their URIs.
That's fine, and demonstrates how well the OFWeb and SemWeb can be integrated on the basis of a shared set of URIs (let's just hope that everyone agrees that graphs are information resources ;-) but the bottom line is that a named graph is a named graph is a named graph, so if one can use FROM to specify the background graph of a dataset, then the background graph of a dataset can be a named graph (even if it need not be named for all queries/applications).

I think that the definition of a dataset should not state that the background graph is necessarily unnamed, but rather that it is simply the background graph, such that any queries evaluated against that dataset, which do not specify any graph, are evaluated against that background graph.

Now, how a given SPARQL processor knows which graph is the background graph for a given query is of course relevant, and I don't see that any major changes are needed to SPARQL to identify the background graph. Namely, if no FROM clause is provided, then it is left up to the SPARQL processor to decide which is the background graph for a given query. If there is a FROM clause provided, then the graph thus specified is the background graph for the query.

Thus, it is not essential to stipulate whether the background graph be either unnamed or named insofar as the definition of a dataset is concerned, only that it is clear to the processor which graph is the background graph of a dataset when evaluating a given query.

This can be fixed easily enough, I think, by changing the single word 'does' to 'need' in section 7 of the SPARQL spec. I.e. change

[
There is one graph, the background graph, which does not have a name, and zero or more named graphs, identified by URI reference.
]

to

[
There is one graph, the background graph, which need not have a name, and zero or more named graphs, identified by URI reference.
]

and then later, add some statement such as

[
If a given query does not specify the background graph by name, using the FROM operator, then the SPARQL processor must decide which background graph is most appropriate for evaluating the query. The SPARQL processor should be consistent in the default background graph used for all queries not specifying a background graph explicitly.
]

Of course, serialization of a dataset introduces some additional issues, as to how to identify the background graph.

My recommendation would be to use any generic RDF serialization which supports named graphs, and define a vocabulary to describe a dataset, which specifies the background and/or named graphs belonging to that particular dataset.

E.g. using TriG, the dataset from Example 1 in section 7.1 of the SPARQL spec could be unambiguously serialized as:

@prefix sparql: <http://www.w3.org/TR/rdf-sparql-query/> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <http://example.com/myDatasetSerialization/> .

:ds a sparql:Dataset ;
    sparql:BackgroundGraph :bg ;
    sparql:NamedGraph <http://example.org/bob> ;
    sparql:NamedGraph <http://example.org/alice>.

:bg {
    <http://example.org/bob> dc:publisher "Bob" .
    <http://example.org/alice> dc:publisher "Alice" .
}

<http://example.org/bob> {
    _:a foaf:name "Bob" .
    _:a foaf:mbox <mailto:bob@oldcorp.example.org> .
}

<http://example.org/alice> {
    _:a foaf:name "Alice" .
    _:a foaf:mbox <mailto:alice@work.example.org> .
}

It's important to note that serializing a dataset with an unnamed background graph in this way requires giving the background graph a name. On the other hand, this approach also means that serialization formats such as TriG and TriX can be used to serialize multiple datasets in a single TriG or TriX instance (if ever useful or necessary to do so), in addition to unambiguously serializing a single dataset.
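The selection rule proposed above (a FROM clause names the background graph; otherwise the processor falls back to one consistent default) can be sketched roughly as follows. This is only an illustration under stated assumptions, not anything taken from the SPARQL drafts: the in-memory STORE, the function names, and the abbreviated predicate strings are all hypothetical, and the graph names simply mirror the TriG example, with ":bg" written out as a full URI.

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical in-memory store: graph name -> triples.
# Predicates are left as prefixed strings purely for readability.
BG = "http://example.com/myDatasetSerialization/bg"
STORE: Dict[str, List[Triple]] = {
    BG: [
        ("http://example.org/bob", "dc:publisher", "Bob"),
        ("http://example.org/alice", "dc:publisher", "Alice"),
    ],
    "http://example.org/bob": [("_:a", "foaf:name", "Bob")],
    "http://example.org/alice": [("_:a", "foaf:name", "Alice")],
}


def choose_background(from_uri: Optional[str],
                      processor_default: str = BG) -> str:
    """Pick the background graph for one query.

    If the query carries a FROM clause, that graph is the background
    graph; otherwise the processor's own default is used, and it is the
    same default for every query that omits FROM.
    """
    return from_uri if from_uri is not None else processor_default


def triples_for_query(from_uri: Optional[str] = None) -> List[Triple]:
    """Return the triples of whichever graph acts as background graph."""
    return STORE[choose_background(from_uri)]
```

Under these assumptions, a query without FROM is evaluated against the graph named by BG, while a query with FROM <http://example.org/alice> is evaluated against Alice's graph. What should happen when FROM names a graph the processor does not hold (for example, dereferencing the URI) is deliberately left out of the sketch.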
(I've been generally uncomfortable with processors naming unnamed graphs, for the sake of round trip integrity and consistency, but I've come to see this approach as the least expensive and disruptive to existing tools and processes, and one which maximally exploits the RDF machinery. Earlier comments regarding serialization were also based on the understanding that background graphs must be unnamed, hence introducing a problem when directly parsing/syndicating a serialization where the background graph has been named -- but as this is actually not the case, and such a conflict would not arise, I feel much more comfortable with this approach.)

Regards,

Patrick

--
Patrick Stickler
Senior Architect
Forum Nokia
Hatanpäänkatu 1 A
33900 Tampere
Finland
phone: +358 40 801 9690
fax: +358 7180 75700
email: patrick.stickler@nokia.com

Forum Nokia provides a wealth of resources to mobile developers. For the latest on mobile tools, devices and technologies, go to http://www.forum.nokia.com
Received on Thursday, 2 June 2005 07:12:41 UTC