- From: Henry Story <Henry.Story@Sun.COM>
- Date: Mon, 23 Oct 2006 13:17:03 +0200
- To: public-rdf-dawg@w3.org
Hi,

I am a great SPARQL enthusiast [1], but I have become concerned about the regex functionality in the specification.

Having worked at AltaVista, I have developed a feeling for volume and complexity. When one is dealing with 100 million searches a day or more, every CPU instruction counts. My feeling is that regex is much too powerful for any large database to allow it to be a mandatory part of the spec. (I may very well be wrong; if so, please point me to a study that shows this not to be the case.) Any such database will just not be able to implement such a query language. Regex must therefore be optional.

If possible, it would be good to allow in addition a simpler form of string query language that has been proven to work with large databases. This would treat words as atomic units and allow their inclusion in or exclusion from a string. Things like

    sparql -cat "Danny Ayers"

which would find Danny's posts that don't involve cats. AltaVista did allow simple suffix wildcards, such as searches on cat*.

I am just posting this because regex raised an alarm flag in my head. Perhaps this has been dealt with before.

Henry

[1] http://blogs.sun.com/bblfish/entry/sparqling_roller among other posts

Home page: http://bblfish.net/
Sun Blog: http://blogs.sun.com/bblfish/
Foaf name: http://bblfish.net/people/henry/card#me
Received on Monday, 23 October 2006 11:54:19 UTC