- From: Seaborne, Andy <Andy_Seaborne@hplb.hpl.hp.com>
- Date: Thu, 15 May 2003 20:46:51 +0100
- To: "'Alexander Jerusalem'" <ajeru@vknn.org>
- Cc: "'www-rdf-interest@w3.org'" <www-rdf-interest@w3.org>
Alexander,

The inability to push value processing to the backend is an issue. With some additional restrictions (such as an OWL ontology applied to regularise the data), it could be done. A heavy-duty implementation of RDQL should have such features, especially where the backend data has a specialised database structure over and above an RDF graph layout, making it possible to exploit indexes for sort, group and aggregation.

Once the values are extracted into the table of results, we know from SQL what sort of query facilities to provide. The initial stage of RDF -> result set doesn't fit the relational algebra, so this is the more interesting research. Indeed, some applications don't want tables of results anyway and want a graph, or a sequence of graphs, one per solution, as the result of a query.

> I was asking because there's the complementOf property in OWL and I
> wonder how I can implement it without this kind of negation.

RDQL is "RDF Data Query Language". There is an OWL-level query language, DQL [1].

> Maybe I'm just abusing these technologies when I think of RDF as a flexible
> database format, of OWL as a data modelling language and of RDQL as a data
> query language.

Yes, RDF is sort of like a flexible database format, and that flexibility leads to loosening some elements of SQL, such as the values of a property always being integers. See OWL.

> RDQL seems to suggest an in memory/in process view.

Not really. It can be used with very large datasets, as the triple pattern can be compiled to a single SQL join. One of the reasons for the Jena2 architecture is to make this possible. There is then the issue of processing the values so found: index structures can assist with sort/group, but if there are no indexes it can be hugely expensive. In the general case, in RDF, there are no indexes. There isn't the equivalent of database tuning and design yet.

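
To make the "triple pattern compiled to a single SQL join" point concrete, here is a minimal sketch. It assumes a hypothetical single table `triples(s, p, o)` with one row per statement; that layout, the `ex:` identifiers and the sample data are illustrative assumptions, not Jena2's actual schema.

```python
import sqlite3

# Hypothetical single-table layout: one row per RDF triple.
# This is an illustrative assumption, not Jena2's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (s TEXT, p TEXT, o TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:r1", "my:lastname", "Smith"),
        ("ex:r1", "my:email", "smith@example.org"),
        ("ex:r2", "my:lastname", "Jones"),  # ex:r2 has no email
    ],
)

# Each triple pattern becomes one alias of the table; the shared
# variable ?r becomes the join condition between the two aliases.
sql = """
    SELECT t1.o AS lastname, t2.o AS email
    FROM triples AS t1
    JOIN triples AS t2 ON t1.s = t2.s
    WHERE t1.p = 'my:lastname' AND t2.p = 'my:email'
"""
rows = conn.execute(sql).fetchall()
print(rows)
```

Because this is an inner join, `ex:r2` drops out of the result. An index on `(p, s)` in such a table would be the kind of structure a backend could exploit, but whether one exists depends on the store.
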
Specifically about the optional patterns:

> > > SELECT ?lastname, ?email
> > > WHERE
> > >   (?r, <my:lastname>, ?lastname) ,
> > >   (?r, <my:email>, ?email)
> >
> >Both property values must exist - it is a graph pattern to match
> >against the RDF graph.
>
> So I guess I would need multiple graphs ORed together to get what I want.

This is an important feature to add, and RDF makes it significant, as merged/ad hoc data often has bits missing (it's the vCard problem: retrieve all of an (RDF) vCard, which has optional properties and bNode trees).

	Andy

[1] http://www.daml.org/dql/

PS the SQL world has had a head start on RDF query!

-----Original Message-----
From: Alexander Jerusalem [mailto:ajeru@vknn.org]
Sent: 15 May 2003 20:03
To: Seaborne, Andy
Cc: 'www-rdf-interest@w3.org'
Subject: RE: Some RDQL questions

Thanks a lot for your reply!

> > * Is there any way to specify ordering like with the SQL order by
> > clause?
> > * Am I right to assume that there is no support for aggregate functions?
>
>You are right - RDQL does not have the features to sort or process the
>values returned from a query. In Jena, they are streamed back in the
>order found and this may vary. As RDF does not constrain the data,
>results can be a mix of plain strings, resources or datatyped literals.

My problem with this is that if the database backend doesn't handle sorting, grouping and aggregating, I have to fetch the whole result set from the database process and then do it without access to indexes. That's a problem with large datasets.

> >* Would it be possible to query for all resources that do not have a
> >certain property?
>
>Not really. RDF does not express negation and the triple patterns
>matched on the graph also do not allow tests for the absence of
>something.

I was asking because there's the complementOf property in OWL and I wonder how I can implement it without this kind of negation.

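
Since a triple pattern cannot test for absence, one workaround is for the application to run positive matches and take a set difference itself. The sketch below does that over a hypothetical in-memory list of triples (the data and the `subjects_with` helper are made up for illustration). It treats the data at hand as complete, which RDF itself does not assume, so it is not an implementation of OWL complementOf semantics.

```python
# Hypothetical in-memory triples; names and data are illustrative only.
triples = [
    ("ex:r1", "my:lastname", "Smith"),
    ("ex:r1", "my:email", "smith@example.org"),
    ("ex:r2", "my:lastname", "Jones"),
]


def subjects_with(data, prop):
    """Return the set of subjects that have at least one value for prop."""
    return {s for (s, p, o) in data if p == prop}


# Two positive matches, then a set difference done by the application.
# This treats the data as complete (closed world), which RDF does not.
all_subjects = {s for (s, p, o) in triples}
without_email = sorted(all_subjects - subjects_with(triples, "my:email"))
print(without_email)
```

Note that this fetches every subject into the client, which is exactly the cost on large datasets raised above.
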
> >* If I think of a TriplePatternClause in terms of SQL joins, does it
> >have inner or outer join semantics? For example if I say:
> >
> > SELECT ?lastname, ?email
> > WHERE
> >   (?r, <my:lastname>, ?lastname) ,
> >   (?r, <my:email>, ?email)
>
>Both property values must exist - it is a graph pattern to match
>against the RDF graph.

So I guess I would need multiple graphs ORed together to get what I want.

> > * Does RDQL mandate anything with respect to inference along
> > subPropertyOf/subClassOf lines or is this considered an
> > implementation detail?
>
>The assumption is that inference happens in the triple interface to
>the data being stored. It is not a feature of the query language.

That sounds like a very elegant but hard to implement idea. Gives me something to think about :-)

>RDQL just looks a bit like SQL - it isn't SQL. It is more about the
>handling of the RDF than about handling after the values have been
>extracted from the model.

Maybe I'm just abusing these technologies when I think of RDF as a flexible database format, of OWL as a data modelling language and of RDQL as a data query language. RDQL seems to suggest an in memory/in process view.

-Alexander

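
The inner versus outer join question can be illustrated with a small sketch over a hypothetical in-memory list of triples. The `email_optional` flag is an invented parameter for this example, not RDQL syntax: with it off, both patterns must match (the behaviour described above); with it on, a missing email yields `None`, as an outer join would.

```python
# Hypothetical in-memory triples; names and data are illustrative only.
triples = [
    ("ex:r1", "my:lastname", "Smith"),
    ("ex:r1", "my:email", "smith@example.org"),
    ("ex:r2", "my:lastname", "Jones"),  # ex:r2 has no email
]


def select_lastname_email(data, email_optional=False):
    """Return (lastname, email) rows; email is None if optional and missing."""
    rows = []
    for (r, p, lastname) in data:
        if p != "my:lastname":
            continue
        # Second pattern: same subject ?r, property my:email.
        emails = [o for (s, q, o) in data if s == r and q == "my:email"]
        if emails:
            rows.extend((lastname, e) for e in emails)
        elif email_optional:
            rows.append((lastname, None))  # keep the row, leave ?email unbound
    return rows


strict = select_lastname_email(triples)
loose = select_lastname_email(triples, email_optional=True)
print(strict)
print(loose)
```
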
Received on Thursday, 15 May 2003 15:47:18 UTC