- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Mon, 06 Jun 2005 13:11:00 +0100
- To: Patrick Stickler <patrick.stickler@nokia.com>
- CC: public-rdf-dawg-comments@w3.org
Patrick Stickler wrote:
> <snip/>

[Kendal has addressed the protocol parts]

>
> 4. Related to the above, but actually a comment regarding the
> SPARQL spec itself, it seems there is a conflict between
> the FROM construct and the definition of a dataset, since, if
> the background graph is "unnamed", then how could one
> refer to it with a FROM construct? I think the problem here
> is simply with language, not an inherent flaw in SPARQL.
>
> It is my understanding that, while not mandatory, the URIs
> specified using the FROM and FROM NAMED constructs are
> often expected/hoped to be resolvable at run time to a graph,
> by dereferencing such URIs, and that many SPARQL processors
> when encountering unknown graph names will attempt to retrieve
> those graphs via their URIs. That's fine, and demonstrates
> how well the OFWeb and SemWeb can be integrated on the basis
> of a shared set of URIs (let's just hope that everyone agrees
> that graphs are information resources ;-) but the bottom line
> is that a named graph is a named graph is a named graph, so
> if one can use FROM to specify the background graph of a dataset,
> then the background graph of a dataset can be a named graph (even
> if it need not be named for all queries/applications).

Not all graphs have names (natural names, if you like). Examples include
some of the simpler use cases:

UC1: A program reads a file: URI (such URIs are rarely global) that holds
the serialization of a graph. That graph happens to be a copy from the
web stored locally. Here, naming the graph just because it was identified
by a file: URI is misleading.

UC2: A program reads two graphs, performing an RDF merge.

UC3: A program reads a graph, and wishes to query over the RDFS
entailments. [more on this elsewhere, later, as it is more applicable to
a later message]

We could insist that every graph has a name, but I find that people aren't
very diligent in generating globally unique names when the names are used
purely within their own application.
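UC2 above can be made concrete with a minimal Python sketch. It assumes a graph is a plain set of (subject, predicate, object) string triples and that blank nodes are the terms starting with "_:"; the function name rdf_merge and the "_:b0", "_:b1", ... label scheme are hypothetical and do not come from any particular toolkit. The merge keeps blank nodes from different inputs apart, and the result has no natural name of its own.

```python
from itertools import count


def rdf_merge(*graphs):
    """Merge graphs given as sets of (s, p, o) string triples.

    Blank nodes (terms starting with "_:") are relabelled per input
    graph, so "_:a" in one graph never collides with "_:a" in another.
    """
    fresh = count()
    merged = set()
    for graph in graphs:
        renamed = {}  # blank-node label in this graph -> fresh label

        def rename(term):
            if term.startswith("_:"):
                if term not in renamed:
                    renamed[term] = f"_:b{next(fresh)}"
                return renamed[term]
            return term

        for s, p, o in graph:
            merged.add((rename(s), p, rename(o)))
    return merged


bob = {
    ("_:a", "foaf:name", '"Bob"'),
    ("_:a", "foaf:mbox", "<mailto:bob@oldcorp.example.org>"),
}
alice = {
    ("_:a", "foaf:name", '"Alice"'),
    ("_:a", "foaf:mbox", "<mailto:alice@work.example.org>"),
}
merged = rdf_merge(bob, alice)
```

Neither input URI is an obvious name for the merged graph, which is the point of the use case.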
>
> I think that the definition of a dataset should not state that
> the background graph is necessarily unnamed, but rather that it is
> simply the background graph, such that any queries evaluated against
> that dataset, which do not specify any graph, are evaluated
> against that background graph. Now, how a given SPARQL processor
> knows which graph is the background graph for a given query is
> of course relevant, and I don't see that any major changes are
> needed to SPARQL to identify the background graph.
>
> Namely, if no FROM clause is provided, then it is left up to the
> SPARQL processor to decide which is the background graph for a
> given query. If there is a FROM clause provided, then the graph
> thus specified is the background graph for the query. Thus, it
> is not essential to stipulate whether the background graph be
> either unnamed or named insofar as the definition of a dataset
> is concerned, only that it is clear to the processor which
> graph is the background graph of a dataset when evaluating a
> given query.
>
> This can be fixed easily enough, I think, by changing the single word
> 'does' to 'need' in section 7 of the SPARQL spec.
>
> I.e. change
>
> [
> There is one graph, the background graph, which does not
> have a name, and zero or more named graphs, identified by
> URI reference.
> ]
>
> to
>
> [
> There is one graph, the background graph, which need not
> have a name, and zero or more named graphs, identified by
> URI reference.
> ]
>
> and then later, add some statement such as
>
> [
> If a given query does not specify the background graph by
> name, using the FROM operator, then the SPARQL processor
> must decide which background graph is most appropriate
> for evaluating the query. The SPARQL processor should
> be consistent in the default background graph
> used for all queries not specifying a background graph
> explicitly.
> ]
>
> Of course, serialization of a dataset introduces some additional
> issues, as to how to identify the background graph. My recommendation
> would be to use any generic RDF serialization which supports named
> graphs, and define a vocabulary to describe a dataset, which specifies
> the background and/or named graphs belonging to that particular dataset.
>
> E.g. using TriG, the dataset from Example 1 in section 7.1 of
> the SPARQL spec could be unambiguously serialized as:
>
> @prefix sparql: <http://www.w3.org/TR/rdf-sparql-query/> .
> @prefix dc: <http://purl.org/dc/elements/1.1/> .
> @prefix foaf: <http://xmlns.com/foaf/0.1/> .
> @prefix : <http://example.com/myDatasetSerialization/> .
>
> :ds a sparql:Dataset ;
>     sparql:BackgroundGraph :bg ;
>     sparql:NamedGraph <http://example.org/bob> ;
>     sparql:NamedGraph <http://example.org/alice>.
>
> :bg
> {
>     <http://example.org/bob> dc:publisher "Bob" .
>     <http://example.org/alice> dc:publisher "Alice" .
> }

Couldn't that be:

:ds a sparql:Dataset ;
    sparql:BackgroundGraph
        { <http://example.org/bob> dc:publisher "Bob" .
          <http://example.org/alice> dc:publisher "Alice" . } ;
    sparql:NamedGraph <http://example.org/bob> ;
    sparql:NamedGraph <http://example.org/alice>.

The :bg is a way of making a syntactic connection between the
sparql:BackgroundGraph triple and the sub-serialization. The fact that it
is used as a name (externally visible) is one way of doing it. Having a
locally scoped label like "=:bg" would also meet the serialization
requirements.

    Andy

>
> <http://example.org/bob>
> {
>     _:a foaf:name "Bob" .
>     _:a foaf:mbox <mailto:bob@oldcorp.example.org> .
> }
>
> <http://example.org/alice>
> {
>     _:a foaf:name "Alice" .
>     _:a foaf:mbox <mailto:alice@work.example.org> .
> }
>
> It's important to note that, in the case of serializing datasets with
> unnamed background graphs, it is necessary to give the background graph
> a name, but in doing so, it also means that by using this approach,
> serialization formats such as TriG and TriX can be used to serialize
> multiple datasets in a single TriG or TriX instance (if ever useful
> or necessary to do so) in addition to unambiguously serializing a
> single dataset.
>
> (I've been generally uncomfortable with processors naming unnamed
> graphs, for the sake of round trip integrity and consistency, but I've
> come to see this approach as the least expensive and disruptive to
> existing tools and processes, and one which maximally exploits the RDF
> machinery. Earlier comments regarding serialization were also based on
> the understanding that background graphs must be unnamed, hence
> introducing a problem when directly parsing/syndicating a serialization
> where the background graph has been named -- but as this is actually
> not the case, and such a conflict would not arise, I feel much more
> comfortable with this approach)
>
> Regards,
>
> Patrick
>
> --
>
> Patrick Stickler
> Senior Architect
> Forum Nokia
> Hatanpäänkatu 1 A
> 33900 Tampere Finland
>
> phone: +358 40 801 9690
> fax: +358 7180 75700
> email: patrick.stickler@nokia.com
>
> Forum Nokia provides a wealth of resources to mobile
> developers. For the latest on mobile tools, devices and
> technologies, go to http://www.forum.nokia.com
>
>
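The rule proposed in the quoted message for choosing the background graph (FROM selects it, otherwise the processor falls back to one consistent default) can be sketched in Python as follows. This is only an illustration under stated assumptions: the Dataset and Processor classes, the dataset_for method, and the idea of keying every graph by a URI string are all hypothetical, and nothing here is taken from the SPARQL spec or from any existing processor.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """One background graph plus zero or more named graphs."""
    background: frozenset
    named: dict = field(default_factory=dict)


class Processor:
    """Hypothetical processor holding graphs keyed by URI."""

    def __init__(self, graphs, default_uri):
        self.graphs = graphs            # URI -> frozenset of triples
        self.default_uri = default_uri  # the processor's own fixed choice

    def dataset_for(self, from_uri=None, from_named=()):
        # FROM given: that graph is the background graph.
        # No FROM: fall back to the same default for every query.
        chosen = from_uri if from_uri is not None else self.default_uri
        return Dataset(
            background=self.graphs[chosen],
            named={uri: self.graphs[uri] for uri in from_named},
        )


graphs = {
    "http://example.com/myDatasetSerialization/bg": frozenset({
        ("<http://example.org/bob>", "dc:publisher", '"Bob"'),
        ("<http://example.org/alice>", "dc:publisher", '"Alice"'),
    }),
    "http://example.org/bob": frozenset({
        ("_:a", "foaf:name", '"Bob"'),
    }),
}
processor = Processor(graphs, "http://example.com/myDatasetSerialization/bg")
default_ds = processor.dataset_for()
from_ds = processor.dataset_for(from_uri="http://example.org/bob")
```

In this sketch the default background graph is stored under a URI purely for bookkeeping; whether that URI should be visible outside the processor is the open question in the thread.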
Received on Monday, 6 June 2005 12:11:21 UTC