- From: Steve Harris <S.W.Harris@ecs.soton.ac.uk>
- Date: Wed, 23 Mar 2005 10:28:34 +0000
- To: DAWG public list <public-rdf-dawg@w3.org>
Apologies for the length of this mail, but it's a complicated subject and I want to be as clear as possible.

I have been very uncomfortable with the way that the graph data model is specified in the SPARQL working draft. I've been mulling over it and its implications for some time, and it still seems like the wrong thing. As Andy has shown, it is possible for (most?) existing systems to emulate the proposed model, but the emulation doesn't sit right with me, and it is inconvenient for users for what is, in my experience, currently the most common behaviour.

So, I have a slightly different suggestion that I hope is a bit more implementation neutral. Implementation should be little or no effort for background+named graph systems (it requires the ability to identify the background graph with a URI, but I understand cwm can already do this), and it is less effort to support, and less of a departure, for quad-based systems than the current proposal.

The suggestion is two query (protocol/whatever) parameters. I'll call them "use" and "constrain", though those are not good names. They both take lists of URIs, and systems can indicate their defaults, e.g. via the SADDLE scheme. SPARQL does not specify what the defaults should be.

"Use"

Use is a list of URIs that are to be used as GRAPH URIs to match graph patterns in which the GRAPH keyword is not used. Systems using background graphs can set their default use value to some URI representing the background graph. Aggregator-type systems can default it to some value indicating that all known graphs (modulo constraints in "constrain") are to be used. I'm not sure how to invoke this feature: using a legal URI to indicate 'all' is a bit fraught, * seems ugly, and using the empty list is wrong.

"Constrain"

This is a hard limit on the graphs that may be used to answer queries. In an aggregator it could be equivalent to adding

    GRAPH ?gN {...} FILTER ?gN = <uri1> || ?gN = <uri2> ...

to every triple; for dynamic loading systems it's just the list of graphs to load.

All graphs are identified by some URI, even those specified in "use", though it may be urn:x-local:background or something equally vague.

Pros

Hopefully this is more neutral to the implementation of the RDF store's engine. It allows background-like behaviour, but it doesn't limit it to a particular graph in a given instance of a store/query pair, and it doesn't require the graph to be stored twice (even conceptually) if you want both provenance and answers without additional constraints.

Cons

It still has the behaviour that

    SELECT ?x ?y ?z WHERE {?x ?y ?z .}

is different from

    SELECT ?x ?y ?z WHERE GRAPH ?g {?x ?y ?z .}

which makes me uneasy, but I can live with it.

- Steve
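To make the "use"/"constrain" semantics above concrete, here is a minimal Python sketch of one possible reading of them. Everything in it is an assumption made for illustration: the store is modelled as a dict from graph URI to a set of triples, the function names and example URIs are hypothetical, and ALL_GRAPHS is only a stand-in for the "all known graphs" value whose real spelling is left open above. It also assumes that "constrain" limits "use", and that GRAPH ?g ranges over every permitted graph, including the background one.

```python
# Hypothetical model: a store maps graph URIs to sets of (s, p, o) triples.
ALL_GRAPHS = object()  # stand-in for "all known graphs"; not a proposed syntax
BACKGROUND = "urn:x-local:background"


def allowed_graphs(store, constrain=None):
    """Graphs that may be consulted at all; constrain=None means no hard limit."""
    if constrain is None:
        return set(store)
    return set(store) & set(constrain)


def match_plain(store, use, constrain=None):
    """Triples matching {?x ?y ?z .} written without the GRAPH keyword."""
    allowed = allowed_graphs(store, constrain)
    graphs = allowed if use is ALL_GRAPHS else allowed & set(use)
    return set().union(*(store[g] for g in graphs))


def match_graph(store, constrain=None):
    """(graph, triple) pairs matching GRAPH ?g {?x ?y ?z .}; "use" plays no part."""
    return {(g, t) for g in allowed_graphs(store, constrain) for t in store[g]}


store = {
    BACKGROUND: {("ex:a", "ex:p", "ex:b")},
    "http://example.org/g1": {("ex:c", "ex:p", "ex:d")},
    "http://example.org/g2": {("ex:e", "ex:p", "ex:f")},
}

# Background-style system: plain patterns see only the background graph.
background_only = match_plain(store, use=[BACKGROUND])

# Aggregator-style system: plain patterns see every graph that "constrain" permits.
aggregated = match_plain(
    store,
    use=ALL_GRAPHS,
    constrain=["http://example.org/g1", "http://example.org/g2"],
)

# GRAPH ?g patterns report which graph each triple came from.
with_provenance = match_graph(store)
```

With these inputs, background_only holds one triple while with_provenance holds three (graph, triple) pairs, which is the difference between the two SELECT forms noted under "Cons".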
Received on Wednesday, 23 March 2005 10:28:37 UTC