Re: RDQL + DAWG = BRQL from Alberto Reggiori on 2004-06-29 (public-rdf-dawg@w3.org from April to June 2004)

From: Alberto Reggiori <alberto@asemantics.com>
Date: Tue, 29 Jun 2004 19:06:01 +0200
To: Andy Seaborne <Andy_Seaborne@hplb.hpl.hp.com>
Cc: 'Asemantics Staff' <staff@asemantics.com>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <99514392-C9EE-11D8-8D4E-0003939CA324@asemantics.com>
On Jun 17, 2004, at 12:48 PM, Seaborne, Andy wrote:
>
> Out of those discussions, we came up with an outline design that extends
> RDQL to meet the DAWG requirements for a query language. It does not cover
> protocol issues.
>
>     http://jena.hpl.hp.com/~afs/BRQL.html

Dave, Steve, Andy, we like the BRQL proposal a lot, and we find it a
good starting point for the upcoming DAWG design phase. See some
comments inline below (on both syntax and design issues/ideas).

>
> Features:
>
> + New result form: CONSTRUCT

+1 useful

However, we wonder whether this would require full well-formed XML
syntax support in CONSTRUCT, for example when the query output
contains rdf:parseType="Literal" literal values (including entities,
XML escapes, attributes and so on), i.e. something closer to XQuery
constructors. Of course, if CONSTRUCT is only required to "build"
simpler RDF graphs, without requiring those to be in XML format, this
would not be a big issue (or could those be escaped as canonical XML
in N-Triples or Turtle syntax?). A simpler design is perhaps better,
but we might need to reconsider this once users and developers start
using BRQL for real-world Web/XML applications.

On a more syntactic aspect of BRQL, perhaps an AS keyword could be
used as a synonym for CONSTRUCT, for readability.

> + Triple source identification (sometimes known as "quads")

+1 very useful feature

although we wonder whether using the SOURCE keyword for quads would
clash with its current use as a synonym of FROM to specify the input
source to query (e.g. select ?foo ?bar source my-foo-bar.rdf where
(?foo....) using bar for <>.....). Perhaps DOMAIN, GRAPH or NAME
keywords could be used instead for quads?

> + Optional triple matching

+1 very useful

What about using '[]' (square brackets) around the triple-patterns
themselves as an alternative to spelling out the OPTIONAL keyword? Or
using a '?' (question mark) in front of a triple-pattern to express
that it is optional? That would be more UNIX-like and familiar to
developers. See also
http://lists.w3.org/Archives/Public/www-rdf-rules/2003Apr/0030.html for
some old ideas along those lines.

Also, it is not clear at the BRQL syntax level how to "group"
triple-patterns under the "optional" clause, i.e. using nested
parentheses or braces to group sources vs. repeated keywords.
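
To make the grouping question concrete, here is a minimal Python sketch
of one possible reading of optional matching: every required pattern
must match, while each optional group is tried as a unit and either
extends a solution or leaves it unchanged. The tuple representation,
the sample data and the helper names (match, solve, query) are
illustrative assumptions, not semantics taken from the BRQL document.

```python
def match(pattern, triple, binding):
    """Unify one triple pattern with one triple; return the extended binding or None."""
    result = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):              # variable position
            if result.setdefault(term, value) != value:
                return None                   # conflicts with an earlier binding
        elif term != value:                   # constant position must be equal
            return None
    return result


def solve(patterns, triples, binding):
    """Return every binding that satisfies all patterns of one group."""
    solutions = [binding]
    for pattern in patterns:
        solutions = [
            extended
            for b in solutions
            for t in triples
            if (extended := match(pattern, t, b)) is not None
        ]
    return solutions


def query(required, optional_groups, triples):
    """Match the required patterns, then try each optional group as a whole."""
    solutions = solve(required, triples, {})
    for group in optional_groups:
        next_solutions = []
        for b in solutions:
            extended = solve(group, triples, b)
            # Keep the unextended binding when the optional group does not match.
            next_solutions.extend(extended or [b])
        solutions = next_solutions
    return solutions


# Hypothetical sample data: only "a" has an mbox.
triples = [
    ("a", "name", "Alice"),
    ("a", "mbox", "alice@example.org"),
    ("b", "name", "Bob"),
]
rows = query([("?x", "name", "?n")], [[("?x", "mbox", "?m")]], triples)
```

Under this reading, a group of several optional patterns succeeds or
fails together, which is the behaviour that brackets or braces in the
syntax would have to make visible.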

> + Non-existent triple testing

+1 useful

Perhaps allow some alternative syntactic sugar like '~' or '!' in front
of a triple-pattern as a synonym of NOT. Here too, it is not clear
from the syntax how to "group" triple-patterns under the "not" clause,
i.e. using nested parentheses or braces to group sources vs. repeated
keywords, so as to gain readability and user friendliness.

> + Filter functions on values

+1 very useful

Perhaps the document needs to mention a basic set of filter functions
(profiles) to be used on a restricted set of XSD data types (i.e.
integers, doubles and dates), like
http://www.w3.org/TR/xquery-operators/ or the http://exslt.org modules,
and extract a basic list of extensions that a BRQL implementation
*must* support to be compliant, e.g. numeric, math and date
comparisons. The QName function-call-name idea coupled with the USING
clause is very cool, and it should fit nicely with EricP's
extensibility idea as shown in his Algae2 - see
http://lists.w3.org/Archives/Public/public-rdf-dawg/2004AprJun/0688.html
- e.g. adding a GML profile (specific geo-functions)


In addition, here is our wish-list of additions/improvements to the  
BRQL proposal:

+ allow '$' (dollar sign) as an alternative to '?' on variables (or
better, replace '?' altogether, since it clashes with the use of '?' in
SQL interfaces for 'placeholders and bind values' - see for example the
DBI interface http://www.perl.com/lpt/a/2001/03/dbiokay.html#placeholders )
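
The two meanings of '?' can be seen side by side in the short sketch
below. It uses Python's sqlite3 module purely as a stand-in for a
DBI-style interface (an assumption for illustration; the table name
and the query string are made up as well).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE queries (id INTEGER PRIMARY KEY, text TEXT)")

# The '?foo' and '?bar' inside this string are RDQL-style variables.
rdql = "select ?foo where (?foo dc:title ?bar)"

# The bare '?' in the SQL statements below is a placeholder that the
# database interface fills in from the parameter tuple.
conn.execute("INSERT INTO queries (text) VALUES (?)", (rdql,))
stored = conn.execute(
    "SELECT text FROM queries WHERE id = ?", (1,)
).fetchone()[0]
```

A query interface that wanted to offer the same placeholder mechanism
for BRQL would have to tell the two uses of '?' apart, which a '$'
prefix on variables would avoid.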

+ allow the LIKE operator explicitly for free-text/stemming on literals
at the triple-pattern level (not only in the AND/constraints part as in
RDQL today)

    E.g. select all triples whose literals $b contain 'ave'

         select
                  $a $b
         where
                  ($a foo:prop $b LIKE %ave% )
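
As a rough illustration of what LIKE at the triple-pattern level could
mean, the Python sketch below translates an SQL-style pattern into a
regular expression and applies it to the object literal of each
matching triple. The wildcard rules ('%' for any run of characters,
'_' for one character) are borrowed from SQL as an assumption, the data
and helper names are hypothetical, and stemming is not covered.

```python
import re


def like_to_regex(pattern):
    """Translate an SQL-style LIKE pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")            # any run of characters
        elif ch == "_":
            parts.append(".")             # exactly one character
        else:
            parts.append(re.escape(ch))   # literal character
    return re.compile("".join(parts), re.DOTALL)


def select_like(triples, predicate, pattern):
    """Return ($a, $b) pairs for ($a predicate $b LIKE pattern)."""
    regex = like_to_regex(pattern)
    return [
        (s, o)
        for s, p, o in triples
        if p == predicate and regex.fullmatch(o)
    ]


# Hypothetical sample data.
triples = [
    ("ex:1", "foo:prop", "Dave"),
    ("ex:2", "foo:prop", "Andy"),
    ("ex:3", "foo:prop", "waves"),
    ("ex:4", "foo:other", "cave"),
]
rows = select_like(triples, "foo:prop", "%ave%")
```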

+ allow variables on the xml:lang and rdf:datatype literal parts at the
triple-pattern level, e.g. give me all xml:lang values of such and such
graph pattern

	select
		$lang
	where
		($a dc:title $b@$lang)

    or get rdf:datatype

	select
		$dt
	where
		($a dc:foo $b^^$dt)

Here people might argue that $dt or $lang are not nodes in the RDF
graph as such, and ask which data type they are instead; they would
rather be some "string" type intrinsic to the query API. And even
though one could use filter functions to "grep" specific literal parts
in BRQL, it would not generally be possible to extract literal parts
explicitly (ok, this is not a requirement either, true). But if one
considers how this would interact with CONSTRUCT, the following
example starts to make sense in real-world applications:

E.g. use BRQL to map xml:lang fields to dc:language properties (or the  
other way around)

construct
	($foo dc:language $lang)
where
	($foo prop:bar $value@$lang )

This might really take us to the formulation of a new requirement
about "how to select parts of RDF literals (i.e. xml:lang or
rdf:datatype)" in a query.
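
The construct example above can be mimicked in a few lines of Python.
This is only a sketch under stated assumptions: literals are modelled
as a hypothetical Literal tuple with value, lang and datatype fields,
and the language tag is emitted as a plain string, which is exactly the
typing question raised above rather than an answer to it.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Hypothetical model of an RDF literal with its optional parts."""
    value: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


def construct_language(triples, predicate):
    """Mimic: construct ($foo dc:language $lang) where ($foo predicate $value@$lang)."""
    graph = []
    for s, p, o in triples:
        # Only literals that actually carry a language tag can bind $lang.
        if p == predicate and isinstance(o, Literal) and o.lang is not None:
            graph.append((s, "dc:language", o.lang))
    return graph


# Hypothetical sample data.
triples = [
    ("ex:doc1", "prop:bar", Literal("Hello", lang="en")),
    ("ex:doc2", "prop:bar", Literal("Ciao", lang="it")),
    ("ex:doc3", "prop:bar", Literal("42", datatype="xsd:integer")),
]
graph = construct_language(triples, "prop:bar")
```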

+ an ORDER BY keyword when the result set (bindings) is a table (list
of bindings; not sure how it happened that we actually removed that
requirement/design-issue from the UC&R document at some point)

And even if ordering in RDF might not make sense when we are talking
about its graph representation, it would make a lot of sense in
real-world applications, where developers will demand ad-hoc keywords
at the JDBC/ODBC/DBI level to sort results in some meaningful way.
Brute-force Unicode lexical order sorting might be a starting
algorithm, even if it might not cover the most general case of
ordering literals.
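
A minimal sketch of the brute-force approach, assuming bindings are
held as a list of dictionaries keyed by variable name (an assumption
made for illustration; order_by is a hypothetical helper). Python
compares strings by Unicode code point, so the example also shows where
such an ordering departs from what a user would expect.

```python
def order_by(bindings, variable):
    """Sort binding rows by one variable using Unicode code point order."""
    return sorted(bindings, key=lambda row: row[variable])


# Hypothetical result table with a single $title column.
bindings = [
    {"$title": "zebra"},
    {"$title": "Zebra"},
    {"$title": "\u00c4rger"},   # starts with LATIN CAPITAL LETTER A WITH DIAERESIS
    {"$title": "apple"},
]
ordered = [row["$title"] for row in order_by(bindings, "$title")]
# Upper-case ASCII sorts before lower-case ASCII, and the accented
# capital sorts last, after every ASCII letter.
```

Language-aware collation, or numeric comparison for typed literals,
would need more than this.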

> The link above describes these in more detail.  It isn't a finished design;
> most of the features have been implemented somewhere before.

again, good to see this proposal happening, and glad to see it has  
already been partly implemented :)

Yours

Alberto
Received on Tuesday, 29 June 2004 13:09:14 UTC