- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Wed, 13 Sep 2006 17:49:23 +0200
- To: Chimezie Ogbuji <ogbujic@bio.ri.ccf.org>
- Cc: Nuutti Kotivuori <naked@iki.fi>, public-sparql-dev@w3.org
Hi Chimezie,

On 13 Sep 2006, at 15:45, Chimezie Ogbuji wrote:
> On Wed, 13 Sep 2006, Richard Cyganiak wrote:
>> Some of your options are not really possible with named graphs
>> because graphs need to be *named*, that is, the name *must* be a
>> URI and not a blank node.
>
> I don't agree. What's the source of this assertion?

The discussion is about SPARQL, so I assumed the definition of Named
Graphs from the SPARQL spec would apply. See also various papers from
Bizer et al., e.g. [1]. As Dan pointed out, there's no community
consensus on whether Named Graphs are a good thing or not, but the
definitions that use this very term seem to require URIs as graph
names. Contexts are not Named Graphs.

[snip]

> Well, Blank nodes used within a graph can't be referred to directly
> but they can still be matched by SPARQL - doesn't make them any
> less useful. The problem isn't the use of Blank nodes for graph
> names but the lack of a mechanism [2] to match the graph name(s)
> associated with a node. Given how closely coupled SPARQL is with
> (admittedly informal) named graph semantics, I would expect to be
> able to answer questions such as:
>
> "What are the graph names in which all the statements about
> <someIRI> are asserted?"

I'm afraid I'm missing the point here. Why not this?

    SELECT DISTINCT ?graph
    WHERE {
      GRAPH ?graph { <someIRI> ?p ?o }
    }

(Now of course the problem is that when I allow blank nodes as graph
labels, then the answer to this query might be: "a blank node, a
blank node, and another blank node".)

[snip]

> If BNodes are used for existential assertions about nodes, why
> wouldn't they be used as existential assertions about graphs?

I can offer my personal and subjective viewpoint: If you extend RDF
triples with a fourth element that works exactly as the others, then
it instantly raises the question: why not add a fifth element? Or a
sixth? I think that three is the sweet spot, but in practice triples
often occur in "bags", and sometimes it's useful to be able to talk
about these "bags". I find that Named Graphs provide exactly the
minimum of machinery necessary to do that, and nothing more.

I'm sure that a full-blown fourth element (and fifth) would offer
lots of interesting possibilities, but personally I haven't come
across any urgent need for it. Named Graphs, as defined in [1] and
SPARQL, work well for me. YMMV, of course.

Yours,
Richard

[1] http://www.wiwiss.fu-berlin.de/suhl/bizer/pub/Carroll_etall-TrustWorkshop-ISWC2004.pdf

> And if there is some semantic consequence, it furthers the argument
> that the formalisms for named graphs should be well articulated
> before they are tightly integrated into a query language.
>
>> I would suggest that Alice and Bob each mint a new URI for the
>> graph containing the statements of unknown origin *in their own
>> store*. Or mint a new URI to hold each individual statement, or
>> anything in between. Since the owner of a URI gets to say what the
>> meaning of the URI is, they can declare that this chunk of URI
>> space is reserved for this purpose (assuming Alice and Bob each
>> own a chunk of URI space).
>>
>> I wonder why you discounted this solution?
>
> I don't think it's an elegant solution when we already have the
> means (within 'vanilla' RDF Model Theory) to express existential
> assertions - which is exactly the scenario here.
>
> If a graph label is nothing but a name associated with a set of
> graphs, why should it not behave the same as the name associated
> with a node within a graph?
>
>> I also question the existence of "statements without a known
>> origin". They surely didn't just pop up magically inside your
>> triple store, eh? I guess it's more like "statements whose origin
>> I don't want to model".
>
> How different is this from "nodes whose names I don't care to
> maintain / model?"
>
> [1] http://ninebynine.org/RDFNotes/UsingContextsWithRDF.html#xtocid-6303976
> [2] http://copia.ogbuji.net/blog/2006-07-14/querying-named-rdf-graph-aggregate
>
> Chimezie Ogbuji
> Lead Systems Analyst
> Thoracic and Cardiovascular Surgery
> Cleveland Clinic Foundation
> 9500 Euclid Avenue/ W26
> Cleveland, Ohio 44195
> Office: (216)444-8593
> ogbujic@ccf.org
>
>> On 11 Sep 2006, at 19:51, Nuutti Kotivuori wrote:
>>
>>> This isn't exactly a SPARQL question, but it is very closely
>>> related. I will first outline the question context.
>>>
>>> Assume an RDF statement store, which has a mechanism for tracking
>>> statement origin (scope, context, graph, source, whatever). Many of
>>> the statements have a distinct origin, or source graph, they were
>>> imported from. But there are also those which either seemingly have
>>> no origin, or the origin is not known. The origin of these
>>> statements has to be handled somehow. We'll come to the specific
>>> choices later on.
>>>
>>> This statement store offers a SPARQL query interface into it. The
>>> facilities for querying named graphs in SPARQL would obviously be
>>> used to query the different origins in the store. But there are two
>>> things to decide. First, how should statements without an origin be
>>> accessed in SPARQL? There are several choices on this, which I will
>>> outline below. And related to the first one, second, what should
>>> the default graph be for the queries if none is given explicitly.
>>>
>>> I will list a few possibilities and mention the problems and
>>> benefits that seem to result from them as a basis for discussion.
>>>
>>> 1. Unknown origin is a distinct node, but separate from all URIs,
>>>    blank nodes or literals. The default graph for the query is the
>>>    graph of the unknown origin nodes.
>>>
>>>    - Separation of identifier spaces, no fear of any overlap. The
>>>      graph of statements with unknown origin is separate from any
>>>      named graph.
>>>    - Since there is no way to represent the unknown origin in
>>>      SPARQL syntax, the default graph is the only way to access the
>>>      nodes in that graph.
>>>    - The nodes in the unknown origin graph are not matched by any
>>>      graph query, since the name of the graph could not be returned
>>>      reasonably. That is:
>>>
>>>        SELECT ?g ?s ?o ?p
>>>        WHERE { GRAPH ?g { ?s ?p ?o } }
>>>
>>>      cannot return ?g for the unknown origin graph.
>>>
>>> 2. Unknown origin is a distinct node, as above. The default graph
>>>    is the RDF merge of all graphs in the store, including the
>>>    statements with an unknown origin.
>>>
>>>    - The problems above.
>>>    - In addition, there is no way to select nodes that explicitly
>>>      have an unknown origin. (Or is there? Could one match all the
>>>      statements for which there is no graph with the same
>>>      statement? In any case, this would be quite contorted.)
>>>
>>> 3. Unknown origin is represented by a distinct blank node; that is,
>>>    every statement has its own blank node as the graph name, which
>>>    is not shared with any of the other statements. The default
>>>    graph is the RDF merge of all graphs in the store, including the
>>>    statements with an unknown origin.
>>>
>>>    - This is probably closest to accurate modelling of the
>>>      situation. We know every statement has an origin, we just
>>>      don't know what it is - a situation commonly modelled with a
>>>      blank node. Also, we don't know which statements might share
>>>      an origin, so until we know better, we make them all distinct.
>>>    - The origin of the statements is nicely queryable with SPARQL
>>>      queries and every statement has an origin, even if unknown.
>>>    - Queries which specify several statements from a single graph
>>>      will not match the statements with unknown origins as it
>>>      cannot be confirmed that they would be from the same graph.
>>>    - There is no way to match the origin of a single statement as
>>>      there is no way to match a certain blank node explicitly. The
>>>      current SPARQL treats it as an open variable(?).
>>>    - There is no way to explicitly match statements that have an
>>>      unknown origin, since the origins are just distinct blank
>>>      nodes.
>>>    - Possibly hard to implement, because of the number of distinct
>>>      blank nodes.
>>>
>>> 4. Unknown origin is represented by a singleton blank node; that
>>>    is, every statement with an unknown origin shares one single
>>>    blank node as the graph name. The default graph is the RDF merge
>>>    of all graphs in the store.
>>>
>>>    - Lumps all statements with an unknown origin under a single
>>>      named graph. Queries which match several statements from a
>>>      single graph will match statement sets from unknown origin as
>>>      well.
>>>    - The origin of the statements is nicely queryable with SPARQL
>>>      queries and every statement has an origin, even if unknown.
>>>    - There is no way to explicitly match statements that have an
>>>      unknown origin, since the origin is a single blank node. If
>>>      the application provided a magic type for this blank node
>>>      (_:x a rdfx:UnknownOrigin), this could be matched with:
>>>
>>>        SELECT ?s ?o ?p
>>>        WHERE { ?g a rdfx:UnknownOrigin .
>>>                GRAPH ?g { ?s ?o ?p } }
>>>
>>>      But this again is quite contorted. (The same could be applied
>>>      to the third case as well, but the implementation of that
>>>      would be really tricky to make efficient.)
>>>
>>> 5. Unknown origin is represented by a singleton blank node as
>>>    above. The default graph is the singleton blank node of unknown
>>>    origin.
>>>
>>>    - Mostly as above, but in the common case, explicitly matching
>>>      statements that have an unknown origin would be easy in just
>>>      matching the statements from the default graph.
>>>
>>> 6. Unknown origin is represented by a well known URI that is shared
>>>    universally. The default graph is the RDF merge of all graphs in
>>>    the store.
>>>
>>>    - Somewhat incorrectly asserts that the statements have a
>>>      certain origin, even though we don't know the origin.
>>>    - The origin of the statements is nicely queryable with SPARQL.
>>>    - Statements with an unknown origin can be easily explicitly
>>>      matched by comparing them against the well known URI.
>>>    - Assigns a special meaning to a URI.
>>>    - Hard to coordinate with a number of people implementing
>>>      similar solutions if not standardized.
>>>
>>> Some other variants of the above were omitted, since their problems
>>> and benefits are easily reasoned.
>>>
>>> On irc, 'chimezie' outlined the problem as such:
>>>
>>> 17:35 chimezie:#swig => Hmm.. well, seems like what is missing is
>>>       a good definition of a 'name for nodes that don't have an
>>>       explicit context'
>>> 17:36 chimezie:#swig => or rather 'a name for the context of nodes
>>>       that aren't assigned to a context explicitly'
>>>
>>> So, I'm out for some input on what might be the sanest route
>>> through this.
>>>
>>> TIA,
>>> -- Naked
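For readers following the thread, the remark about blank-node graph labels can be sketched in a few lines of Python. This is only a toy in-memory model of quads, not SPARQL and not any real RDF library: the `BNode` class, the `graphs_mentioning` function, and the `example.org` IRIs are all assumptions made up for illustration. It mimics the `SELECT DISTINCT ?graph` query suggested in the reply, run against a store that follows option 3 of the original post (one fresh blank node per statement of unknown origin).

```python
from itertools import count

_ids = count(1)


class BNode:
    """Hypothetical stand-in for a blank node: it has identity but no usable name."""

    def __init__(self):
        # Store-internal label; it means nothing outside this store.
        self._id = next(_ids)

    def __repr__(self):
        return "a blank node"


def graphs_mentioning(quads, subject):
    """Toy analogue of SELECT DISTINCT ?graph for statements about `subject`."""
    seen = []
    for graph, s, _p, _o in quads:
        if s == subject and graph not in seen:
            seen.append(graph)
    return seen


SOME_IRI = "http://example.org/someIRI"  # assumed placeholder for <someIRI>

# Quads are (graph label, subject, predicate, object).
quads = [
    ("http://example.org/graphs/alice", SOME_IRI, "http://example.org/p", "1"),
    (BNode(), SOME_IRI, "http://example.org/p", "2"),  # unknown origin
    (BNode(), SOME_IRI, "http://example.org/q", "3"),  # unknown origin
]

result = graphs_mentioning(quads, SOME_IRI)
print(result)
```

Under these assumptions the answer holds one IRI that a later query can name directly, plus two blank nodes that are distinct inside the store but indistinguishable to whoever reads the result. That is the trade-off the thread is weighing against minting a URI for statements of unknown origin.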
Received on Wednesday, 13 September 2006 15:49:31 UTC