- From: Henry Story <Henry.Story@Sun.COM>
- Date: Mon, 23 Oct 2006 13:17:03 +0200
- To: public-rdf-dawg@w3.org
Hi,
I am a great SPARQL enthusiast [1], but I have become concerned about
the regex functionality in the specification. Having worked at
AltaVista, I have developed a feeling for volume and complexity. When
one is dealing with 100 million searches a day or more, every CPU
instruction counts.
My feeling is that regex is much too powerful to be a mandatory part
of the spec for any large database. (I may very well be wrong; if so,
please point me to a study that shows this not to be the case.) Any
such database will simply not be able to implement such a query
language. Regex therefore must be optional.
If possible, it would also be good to allow a simpler form of string
query language that has been proven to work with large databases.
This would treat words as atomic units and allow a query to require
or exclude them in a string. Things like this:
sparql -cat "Danny Ayers"
which would find Danny's posts that don't involve cats. AltaVista
also allowed simple suffix wildcards, such as a search on cat* .
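To make the proposal concrete, here is a minimal sketch in Python of
what such a word-level matcher might look like. It is only an
illustration under stated assumptions, not AltaVista's syntax or
anything from the SPARQL drafts: the function names are hypothetical,
a leading "-" excludes a word, a trailing "*" matches any word with
that prefix, and a quoted phrase is treated simply as a set of
required words (no adjacency check). No regex is involved.

```python
import shlex
import string


def parse_query(query):
    """Split a query into (included, excluded) lists of lowercase terms.

    Assumptions for this sketch: a leading '-' excludes a term, and a
    quoted phrase is broken into its individual words.
    """
    included, excluded = [], []
    for token in shlex.split(query):  # shlex keeps "Danny Ayers" together
        negate = token.startswith("-")
        words = token.lstrip("-").lower().split()
        (excluded if negate else included).extend(words)
    return included, excluded


def term_matches(term, words):
    """True if the term occurs in the word set; 'cat*' is a prefix match."""
    if term.endswith("*"):
        prefix = term[:-1]
        return any(w.startswith(prefix) for w in words)
    return term in words


def matches(query, text):
    """True if text has every included term and no excluded term."""
    # Words are atomic units: split on whitespace, trim punctuation.
    words = {w.strip(string.punctuation) for w in text.lower().split()}
    included, excluded = parse_query(query)
    return (all(term_matches(t, words) for t in included)
            and not any(term_matches(t, words) for t in excluded))
```

With this sketch, matches('sparql -cat "Danny Ayers"', text) accepts a
text that mentions sparql, Danny and Ayers and rejects it as soon as
the word cat appears. A real engine would answer such queries from an
inverted index of words rather than by scanning each string, which is
the reason this kind of query scales where regex does not.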
I am just posting this because regex raised an alarm flag in my head.
Perhaps this has been dealt with before.
Henry
[1] http://blogs.sun.com/bblfish/entry/sparqling_roller among other
posts
Home page: http://bblfish.net/
Sun Blog: http://blogs.sun.com/bblfish/
Foaf name: http://bblfish.net/people/henry/card#me
Received on Monday, 23 October 2006 11:54:19 UTC