- From: Ron Alford <ronwalf@umd.edu>
- Date: Tue, 05 Jul 2005 16:42:38 -0400
- To: "Seaborne, Andy" <andy.seaborne@hp.com>
- CC: "Eric Prud'hommeaux" <eric@w3.org>, Dan Connolly <connolly@w3.org>, public-rdf-dawg-comments@w3.org, Amy Alford <aloomis@glue.umd.edu>
- Message-ID: <42CAF0BE.1050502@umd.edu>
Seaborne, Andy wrote:
[snipping quite a bit]

> The data provider could have chosen to provide a URI to a graph node. By
> using a blank node, they are stopping clients directly addressing that node
> in the graph. Maybe there is a reason for that. There seems to be a tension
> between publisher and consumer of the data here. Why did the data publishers
> choose that data model over, say an rdf:Seq?

This tension is being introduced ex post facto. The use of bnodes has
always restricted linkability, not accessibility. As for why one would
choose lists over a sequence, the answer has to do with modeling. An
rdf:Seq uses an open-ended family of membership properties (rdf:_1,
rdf:_2, ...), and it's impossible to express restrictions on an infinite
number of properties in the current ontology languages.

> The working group has decide to postpone the issue of accessing RDF
> collections:
>
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2005Jun/0016.html
>
> One of the reasons is because there are non-query ways of addressing the
> matter. FOAF's approach is inverse functional properties; inference may also
> be used.

As I've said before, accessing lists is just one repercussion of not
being able to directly address bnodes. Not only that, the proposed
solution is of little use where order needs to be maintained.

There are also containers other than the RDF-sanctified ones. Take
OWL-S, for instance. They define a shadow list class at
http://www.daml.org/services/owl-s/1.1/generic/ObjectList.owl
The rdfs:member predicate won't help at all in these cases. It's a
separate issue, but you can read about why they had to do this here:
http://www.w3.org/Submission/OWL-S/#AppendixB
It boils down to an OWL-DL syntax issue.

As for inverse functional properties, I've already shown many cases
where bnodes are used as syntax in various RDF-based languages. Even in
FOAF, there are many cases where either
a) The IFPs just aren't available on a node.
b) The IFPs are not trustworthy (e.g., foaf:homepage and LiveJournal)

Inferencing doesn't help in any of these three cases (bnodes as syntax,
missing IFPs, untrustworthy IFPs), and actually hurts in the last one.
Also, relying on inferencing is rather painful, because the cases where
bnodes are going to hurt worst are aggregated data sets (search
engines). Adding an inferencing layer on top of such large stores is
going to be costly.

> ==== Protocol
>
> RDQL/Jena has had for some while the ability to pass in values for variables
> at the start of query execution. One use of this is to pass in programming
> language level objects, include bNodes, so that the all solutions of the query
> have that a fixed value for a variable. It's a mechanism akin to SQL client
> templates but done by naming, not position.
>
> This can be extended to the SPARQL protocol:
>
> ?query=SELECT...&varX=bNode:xyz&...
>
> Use
> SELECT ?item ?tail WHERE { ?x rdf:first ?item ; rdf:rest ?tail }
> which becomes at the server:
> SELECT ?item ?tail WHERE { <bnode(xyz)> rdf:first ?item ; rdf:rest ?tail }

Slick, and useful beyond the scope of bnodes. I see several benefits:
a) It's easy for the protocol layer of a SPARQL server to determine for
   itself whether and how it allows bnode access.
b) It lets you have template queries that don't need to be munged. There
   are a wide variety of apps where it's just easier not to be doing
   string substitutions.
c) It provides a limited hook for access control to a SPARQL store. The
   protocol layer can have a list of valid queries for a given access
   level that must string match exactly, but can be parameterized.
d) It provides consistency in selecting bnodes, URIs, and literals.
e) It's implementable, if not efficiently, on top of almost any SPARQL
   query engine. Just filter the results on the way out.

Are there any comments from query implementers and protocol people on
this solution?

> ==== SPARQL Extensibility
>
> SPARQL has two extension points: value functions and DESCRIBE.
>
> == SPARQL Function Extension
>
> (idea from Steve Harris)
>
> Have a custom function that tests the bNode label. This isn't covered by the
> SPARQL value model - it's using the function extension point as a tunnel
> between client and server inside the SPARQL syntax.
>
> FILTER ext:bNodeLabel(?x, "label")
>
> SELECT ?item ?tail WHERE { ?x rdf:first ?item ;
>                               rdf:rest ?tail .
>                            FILTER ext:bNodeLabel(?x, "xyz") }
>

There is a drastic inconsistency here between accessing bnodes and
accessing literals and URIs. This requires a fair amount of query
munging when doing substitutions.

> == DESCRIBE
>
> Accessing list elements one by one isn't nice if the list is of any size so
> get it all at once. Your use case is about a description of the whole recipe

In my use case I was accessing the instructions in chunks, which would
be perfectly fine in many cases (think of web tutorials that split
instructions across multiple pages).

> - this could be the CDB (Concise Bounded Description) of the thing and other
> similar schemes for the information provider to give an answer that the client
> can't completely determine. In SPARQL, the DESCRIBE result form provides a
> hook for this. It enables the server to return the whole recipe in a
> single call.
>
> CDB can be found at http://sw.nokia.com/uriqa/CBD.html

There are several problems with using DESCRIBE. The most obvious is that
it requires parsing and requerying the data on the client side. The
query conditions end up just being a filter on the data. Also, unless
you're just querying about a single URI, it's easy to lose the context
of the original query[1]. Thirdly, the DESCRIBE hook is somewhat
limited, since the only way to provide different result patterns is to
provide different end points. This is unfortunate since, unlike the
function extensions, there is no consistent name across stores for the
client to specify.

> ==== Make nodes addressible
>
> == Dynamically assign identifiers ...
> Some may not like automatically assigning URIs to replace the bNodes.
> True.
> But you want to reference the blank nodes by their identity. Exposing the
> labels is no different.

Then a generic client would have no way of knowing what was and was not
a URI. This solution would break any RDF that was based on the results
of a query.

>
> == Split the label space of bNodes
>
> Use a different prefix to identify the two spaces of bNodes.
>
> _:a for ones that are query bNodes and
> _!:xyz for ones in the target graph.
>
> Pick marker characters to your heart's content.
>
> A variation is to in the space of labels: _:!xyz
>
> This a bit like syntax support for the dynamically assigned identifiers.

Icky, ugly, and confusing, but workable.

> Of these, the protocol approach would appear to fit with your session paradigm
> best. I've used the the local version for sometime.

These solutions are mostly orthogonal to implementing sessions. Sessions
are a means of assuring data stability across multiple queries. This has
more obvious effects on bnodes than on anything else, but it's still
important to have.

I know I'm not alone in needing to reference bnodes[2], and you seem to
have roughed out a workable solution. It would be nice if this were
included in the spec.

-Ron

[1] Here's a simple example where, without the context of the query,
the meaning of the results is lost.

Background graph:
  @prefix : <http://example.com/> .
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  :Ron foaf:knows _:Amy .
  _:Amy foaf:mbox <mailto:aloomis@glue.umd.edu> ;
        foaf:knows _:John .
  _:John foaf:knows _:Amy .

Query:
  PREFIX foaf: <http://xmlns.com/foaf/0.1/>
  PREFIX ex: <http://example.com/>
  DESCRIBE ?person WHERE { ex:Ron foaf:knows ?person . }

Results of CBD:
  @prefix : <http://example.com/> .
  @prefix foaf: <http://xmlns.com/foaf/0.1/> .
  _:Amy foaf:mbox <mailto:aloomis@glue.umd.edu> ;
        foaf:knows _:John .
  _:John foaf:knows _:Amy .

The solution here is to put ex:Ron after DESCRIBE, but this could lead
to quite a bit more data than I needed.

[2] From #swig this morning:
http://ilrt.org/discovery/chatlogs/swig/2005-07-05.html#T10-58-51
SeRQL discussion:
http://www.openrdf.org/issues/browse/SES-40?page=all
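
As a rough illustration of benefit (e) from the protocol discussion
("just filter the results on the way out"), here is a minimal Python
sketch. It is not taken from any existing implementation. The
(kind, value) term representation, the function names, and the "uri:"
prefix are all assumptions made for the example; only the "bNode:xyz"
parameter form comes from the quoted proposal.

```python
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical term model: ("bnode", "xyz"), ("uri", "http://..."),
# or ("literal", "some text"). A real store would have its own types.
Term = Tuple[str, str]
Solution = Dict[str, Term]


def parse_binding(raw: str) -> Term:
    """Parse a protocol parameter value such as 'bNode:xyz'.

    The 'bNode:' form follows the quoted proposal; the 'uri:' form and
    the literal fallback are assumptions made for this sketch.
    """
    if raw.startswith("bNode:"):
        return ("bnode", raw[len("bNode:"):])
    if raw.startswith("uri:"):
        return ("uri", raw[len("uri:"):])
    return ("literal", raw)


def filter_solutions(
    solutions: Iterable[Solution], bound: Dict[str, Term]
) -> Iterator[Solution]:
    """Yield only the solutions that agree with every pre-bound variable."""
    for solution in solutions:
        if all(solution.get(var) == term for var, term in bound.items()):
            yield solution


# Solutions as an unmodified engine might return them for
#   SELECT ?x ?item ?tail WHERE { ?x rdf:first ?item ; rdf:rest ?tail }
solutions = [
    {"x": ("bnode", "xyz"), "item": ("literal", "Preheat oven"),
     "tail": ("bnode", "b2")},
    {"x": ("bnode", "b2"), "item": ("literal", "Mix flour"),
     "tail": ("bnode", "b3")},
]

# Equivalent of passing varX=bNode:xyz alongside the query.
bound = {"x": parse_binding("bNode:xyz")}
matching = list(filter_solutions(solutions, bound))
```

Note that the wrapper has to add ?x to the projection itself, since the
query in the proposal selects only ?item and ?tail. That, plus having
to enumerate every list cell in the store, is why this route is
implementable but not efficient.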
Received on Tuesday, 5 July 2005 20:42:56 UTC