- From: Seaborne, Andy <Andy_Seaborne@hplb.hpl.hp.com>
- Date: Wed, 30 Jan 2002 15:36:32 -0000
- To: "'www-rdf-comments@w3.org'" <www-rdf-comments@w3.org>

In [1] Dan Connolly wrote:
> We've done a few thought experiments about modifying
> our implementation's treatment of literals to be (lex, type) pairs,
> but it gets horribly messy.
>
> But lots of folks would consider our code horribly messy
> as it is, so that's perhaps not much of an argument.
>
> But as Sergey and I pointed out, there seem to be a lot
> of RDF query engines and such deployed that consider
> "abc" a match for "abc".

RDQL, the query system in Jena, which implements SquishQL, does indeed
consider "abc" as a match for "abc".

RDQL does not use any schema information unless the application writer puts
it in the query. This is because it is really just a "graph-access"
mechanism: RDF carries no datatype information unless something has been
encoded into the graph in some way, so dynamic typing (i.e. parse as needed)
was the best solution *at the time*. [Note: I didn't want to require the
presence of RDF Schema.]

Comparison in triple patterns (the WHERE clause) involving literals is by
exact string match; URIs are distinguished from literals. Comparison in the
filters (AND clause) depends on the operator: if it is a numeric operator,
then an attempt is made to turn the value into a number.

Patrick Stickler wrote:
> Of course, I would expect that a query API would be
> based on an abstraction of the "raw" RDF graph, which
> takes datatype context into account, so that a query
> such as above would not be based on string comparison
> of literals, but on comparison of TDL pairings
> (lexical form + datatype).

When RDF gets datatypes, I would plan on doing a new query language (or
changing the old one) that works over the new, improved datatyped literals.
The datatyping may not break APIs which work at the level of the graph
details but, there again, that is no longer what the application writer
would like (IMHO).

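To illustrate the comparison rules described above (exact string match in the WHERE patterns, parse-as-needed in the AND filters), here is a minimal Python sketch. It is not Jena's or RDQL's actual code: the triple representation, the sample data, and the names `match_triple`, `as_number`, `less_than` and `query` are all assumptions made up for this example.

```python
# Hypothetical in-memory graph of (subject, predicate, object) triples.
# Assumption: URIs are written as <...>; anything else is a plain literal.
TRIPLES = [
    ("<foo>", "<age>", "4"),
    ("<foo>", "<dc:Title>", "abc"),
    ("<bar>", "<age>", "04"),
]


def is_var(term):
    """Query variables are written ?name."""
    return term.startswith("?")


def match_triple(pattern, triple, bindings):
    """WHERE-clause matching: constants must be string-identical.

    Returns the extended bindings, or None if the triple does not match.
    """
    new = dict(bindings)
    for p, t in zip(pattern, triple):
        if is_var(p):
            if p in new and new[p] != t:
                return None
            new[p] = t
        elif p != t:  # exact string match, so "4" does not match "04"
            return None
    return new


def as_number(lexical):
    """Parse as needed: try to read a literal as a number."""
    try:
        return float(lexical)
    except ValueError:
        return None


def less_than(lexical, bound):
    """AND-clause filter for a numeric operator; non-numbers fail the filter."""
    n = as_number(lexical)
    return n is not None and n < bound


def query(pattern, var, bound):
    """Match one triple pattern, then keep bindings where var < bound."""
    results = []
    for triple in TRIPLES:
        b = match_triple(pattern, triple, {})
        if b is not None and less_than(b[var], bound):
            results.append(b)
    return results
```

In this sketch the pattern `("?x", "<age>", "4")` matches only the first triple, because "04" is a different string, whereas the filter `?z < 5` accepts both "4" and "04" once they are parsed as numbers.
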
Queries would then be over what the application thinks in terms of, and I
don't think that will be the details of the graph encoding for types, so I
would be aiming for syntactic forms, at least, to avoid this.

What is hard is if there are 2+ ways to encode the same thing (in the
application writer's frame). If the query system has to be aware that the
information could be in one or more local idioms and/or a global idiom, then
it is going to be tedious; making the application writers be aware of this is
worse. Queries will be really ugly and might mean having general disjunction
in the pattern matching, which then opens up the possibility of undefined
variables.

The current situation, no type information, isn't so bad because it is clear
what the rules of the game are. Datatyping would improve the robustness of
queries, avoid the occasional unexpected result, and help storage and
efficiency.

Patrick wrote:
> Hmmm.... Why not just include the range constraints in the
> query? After all, it's knowledge that's in the graph. E.g.
>
>    select ?x ?y ?z ?r
>    where
>       (?x <dc:Title> ?y)
>       (?z <age> ?y)
>       (<dc:Title> <rdfs:range> ?r)
>       (<age> <rdfs:range> ?r)

That works in RDQL, as does:

   SELECT *
   WHERE (<foo>, ?pred, ?z) ,
         (?pred, <rdfs:range>, <integer>)
   AND ?z < 5

to select predicates with a given, fixed range.

	Andy

[1] http://lists.w3.org/Archives/Public/w3c-rdfcore-wg/2002Jan/0358.html

Received on Wednesday, 30 January 2002 10:36:36 UTC