- From: Axel Polleres <axel.polleres@deri.org>
- Date: Mon, 04 May 2009 20:34:31 +0100
- To: Kjetil Kjernsmo <Kjetil.Kjernsmo@computas.com>
- CC: public-rdf-dawg@w3.org

Kjetil, John,

Can your proposal/discussion be summarized (without going into
extensions such as loc: ...) in that what you want is a simpler
full-text "surface syntax for Regular expressions"?

The proposal of the basic expression set looks reasonable and I think
this could fly as a part of SurfaceSyntax, if we find agreement on that.

Opinions?

Axel

Kjetil Kjernsmo wrote:
> John,
>
> Thank you very much for your support!
>
> On Monday 04 May 2009 16:23:12 Clark, John wrote:
>> I agree, and I think it's a useful exercise to try to standardize "general
>> text search", perhaps even for consumption by technologies other than
>> SPARQL.
>
> Possibly, but I care first and foremost about SPARQL :-) If anybody else has
> any use for it, I'd say fine.
>
>>> All we have used so far can be summarised as follows:
>>> 1) Terms shorter than three characters are ignored.
>> So, with this feature, query string "Amazon S3" would be equivalent to
>> "Amazon" and query string "theorems about ?" would be equivalent to
>> "theorems about", correct? This makes me uneasy.
>
> Yeah, it has some drawbacks, clearly. I think it is mostly a practical matter,
> as far as I know, this restriction exists in LARQ, Virtuoso, MySQL to name a
> few I've worked with. It is painful at times, but I guess that it is simply
> too time-consuming to create an index that will match any two-letter
> combinations?
>
>>> 2) a single term is matched exactly against a whole word.
>>> 3) a single term ending in asterisk is matched against words beginning
>>> with the term.
>>> 4) multiple terms with AND matches all words in any order.
>>> 5) multiple terms with OR matches any words in any order.
>>> 6) multiple terms without an operator matches all words in the given
>>> order.
>>>
>>> At some point, we had phrase search too, which is a nice feature but I
>>> think we dropped it.
>> I think this is a reasonable set, but I'd also like to approach it slightly
>> differently and try to standardize what already exists (and thus is
>> reasonably "well understood" by users).
>
> Thank you!
>
>> That is, I'd suggest standardizing
>> generalized text search as "what Google does",
>
> Well, some of what "what Google does" could be
> http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=136861
> and indeed, I think some of that is quite reasonable, but I don't know if it
> is right for us.
>
>> including phrase search with
>> quotes, term negation, and query extensions with syntax like "loc:
>> cleveland, ohio" (e.g. in Google maps).
>
> Hmmm, I think we might end up standardising a bit too much of CQL (which is
> quite nice and a nice complement to SPARQL in many situations):
> http://www.loc.gov/standards/sru/specs/cql.html
> Also, I don't think loc: would belong in the object, since that is a predicate
> for us, and I feel that such specific things belong in an application layer
> that translates to SPARQL. Also, with property paths, we might be able to say
> stuff like "geo:location or any sub properties".
>
> Anyway, I hope we can discuss this a bit further on Wednesday. My agenda here
> is to constrain the feature so that it is a useful feature, yet something
> that will not take a lot of WG time and not a lot of time for implementers.
>
> Kind regards
>
> Kjetil Kjernsmo

--
Dr. Axel Polleres
Digital Enterprise Research Institute, National University of Ireland, Galway
email: axel.polleres@deri.org url: http://www.polleres.net/
Received on Monday, 4 May 2009 19:35:14 UTC
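
Editorial sketch: the six matching rules quoted in the message above can be illustrated with a few lines of Python. This is only a rough reading of the list, not code from any of the systems mentioned (LARQ, Virtuoso, MySQL). The function name `matches` and the helpers are hypothetical. Several behaviours are assumptions the thread does not settle: matching is case-insensitive, a query uses at most one kind of operator, "in the given order" means the words appear in that order but need not be adjacent, the length limit is measured without the trailing asterisk, and a query with no usable terms matches nothing.

```python
import re

MIN_TERM_LENGTH = 3  # rule 1: shorter terms are ignored


def _words(text):
    """Split text into lowercase words (assumption: case-insensitive matching)."""
    return re.findall(r"\w+", text.lower())


def _term_matches(term, word):
    """Rule 3: a trailing asterisk means prefix match; rule 2: otherwise exact."""
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term


def _is_long_enough(term):
    """Rule 1, measured without the trailing asterisk (an assumption)."""
    stem = term[:-1] if term.endswith("*") else term
    return len(stem) >= MIN_TERM_LENGTH


def matches(query, text):
    """Return True if text satisfies query under the six quoted rules."""
    tokens = query.split()
    # Assumption: a query uses AND throughout, OR throughout, or no operator.
    if "AND" in tokens:
        mode = "and"
    elif "OR" in tokens:
        mode = "or"
    else:
        mode = "ordered"

    terms = [t.lower() for t in tokens if t not in ("AND", "OR")]
    terms = [t for t in terms if _is_long_enough(t)]
    if not terms:
        return False  # assumption: nothing left to search for

    words = _words(text)
    if mode == "and":   # rule 4: all terms, any order
        return all(any(_term_matches(t, w) for w in words) for t in terms)
    if mode == "or":    # rule 5: any term, any order
        return any(any(_term_matches(t, w) for w in words) for t in terms)

    # Rule 6: every term must match a word after the previous term's match.
    position = 0
    for term in terms:
        for i in range(position, len(words)):
            if _term_matches(term, words[i]):
                position = i + 1
                break
        else:
            return False
    return True
```

Under these assumptions, `matches("Amazon S3", "Amazon web services")` is true because "S3" is dropped by rule 1, which is exactly the behaviour John Clark says makes him uneasy.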