Re: make regex optional in SPARQL

From: Fred Zemke <fred.zemke@oracle.com>
Date: Mon, 23 Oct 2006 10:13:27 -0700
Message-ID: <453CF837.9020800@oracle.com>
To: Henry Story <Henry.Story@Sun.COM>
CC: public-rdf-dawg@w3.org

I can agree with making regex optional, but it is not true that large
databases cannot implement regular expressions.  Oracle has done so in its
SQL product.  As with any compute-intensive query, it is a question of
giving the user the option to run the query; the user can then decide
whether the performance hit is tolerable.

Also, some regular expressions are much harder to
compute than others.  User documentation can frequently help to point
out to users the best ways to frame their queries, or which subsets of
the complete language are more likely to be performant.

But as I said, I can agree with making it optional.
Or we might define a mandatory subset known to be performant,
leaving the rest as optional.
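
To make the idea of a performant subset concrete, here is a minimal sketch
under one assumption of mine, not something proposed in this thread: the
cheap subset is taken to be anchored literal prefixes such as ^Danny, which
a store could answer with an ordered-index range scan instead of running a
regex engine over every value. The function names are hypothetical.

```python
# Characters that carry special meaning in a regular expression.
_META = set(".^$*+?{}[]()|\\")


def literal_prefix(pattern):
    """Return the literal text if `pattern` has the form ^literal, else None."""
    if not pattern.startswith("^"):
        return None
    body = pattern[1:]
    # Any metacharacter means the pattern is outside the assumed cheap subset.
    if not body or any(ch in _META for ch in body):
        return None
    return body


def prefix_range(prefix):
    """Half-open range [low, high) covering every string starting with `prefix`.

    Assumes plain code-point ordering and that the last character is not
    the maximum code point.
    """
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


def in_cheap_subset(pattern):
    """True if the pattern can be served by a prefix range scan."""
    return literal_prefix(pattern) is not None
```

Under this assumption, ^cat becomes a scan over values v with
"cat" <= v < "cau", while a pattern such as ^(a+)+$ falls outside the
subset and would be left to the optional, full regex support.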

Fred

Henry Story wrote:

>
> Hi,
>
> I am a great SPARQL enthusiast [1], but have become concerned about  
> the regex functionality in the specification. Having worked at  
> AltaVista I have developed a feeling for volume and complexity. When  
> one is dealing with 100 million searches a day or more every cpu  
> instruction counts.
>
> My feeling is that regex is much too powerful for any large database  
> to allow it to be a mandatory part of the spec. (I may very well be  
> wrong, if so please point me to a study that shows this not to be the  
> case). Any such database will just not be able to implement such a  
> query language. Regex therefore must be optional.
>
> If possible it would be good to allow in addition for a simpler form  
> of string query language that has been proven to work with large  
> databases. This would treat words as atomic units and allow their  
> inclusion or exclusion from a string. Things like this
>
>    sparql -cat "Danny Ayers"
>
> which would find Danny's posts that don't involve cats. AltaVista did  
> allow for simple suffix wildcards such as searches on cat* .
>
> I am just posting this because regex raised an alarm flag in my head.  
> Perhaps this has been dealt with before.
>
> Henry
>
> [1] http://blogs.sun.com/bblfish/entry/sparqling_roller among other  
> posts
>
> Home page: http://bblfish.net/
> Sun Blog: http://blogs.sun.com/bblfish/
> Foaf name: http://bblfish.net/people/henry/card#me
>
>
>
>
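
The word-based query form described in the quoted message (words treated as
atomic units, a leading "-" for exclusion, quoted phrases, and a trailing
"*" wildcard) can be sketched as below. This is only an illustration under
my own assumptions: the function names are hypothetical, matching is
case-insensitive, words are split on whitespace with no punctuation
handling, and a real engine would consult an inverted index rather than
scan the text.

```python
import shlex


def parse_query(query):
    """Split a query into required and excluded terms.

    A leading '-' excludes a term; double quotes group words into a phrase.
    """
    required, excluded = [], []
    for token in shlex.split(query):
        if token.startswith("-") and len(token) > 1:
            excluded.append(token[1:].lower())
        else:
            required.append(token.lower())
    return required, excluded


def term_matches(term, words):
    """Check one term against a list of lowercased words."""
    if term.endswith("*"):
        # Trailing wildcard: match any word that starts with the stem.
        stem = term[:-1]
        return any(w.startswith(stem) for w in words)
    # A single word or a phrase: match a run of consecutive whole words.
    parts = term.split()
    n = len(parts)
    return any(words[i:i + n] == parts for i in range(len(words) - n + 1))


def matches(query, text):
    """True if `text` has every required term and no excluded term."""
    words = text.lower().split()
    required, excluded = parse_query(query)
    return (all(term_matches(t, words) for t in required)
            and not any(term_matches(t, words) for t in excluded))
```

With these definitions, the query sparql -cat "Danny Ayers" accepts a text
that contains the word "sparql" and the phrase "Danny Ayers" and rejects
any text that also contains the word "cat".
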
Received on Monday, 23 October 2006 17:14:22 GMT