On Wed, May 22, 2013 at 3:24 PM, Jeremy J Carroll <jjc@syapse.com> wrote:

> filter(regex(?o,'\Wlymp','i'))
> vs.
> filter(regex(?o,'\\Wlymp','i'))
>
> They seem to accept e.g. "\t" rather than "\\t" as a tab.
>
> Is this me misreading the specs, or is it an implementation bug shared
> between everything I have tried so far … :(
>
> My reading of the spec is that
> SPARQL REGEX defers to
> http://www.w3.org/TR/xpath-functions/#func-matches
> which for the \W defers to
> http://www.w3.org/TR/xmlschema-2/#dt-regex
> which includes
> http://www.w3.org/TR/xmlschema-2/#nt-charClass
> http://www.w3.org/TR/xmlschema-2/#nt-charClassEsc
> http://www.w3.org/TR/xmlschema-2/#nt-MultiCharEsc
>

Interesting question. I think the "\W" is undefined.

The second parameter of the regex is a SPARQL string, meaning that the SPARQL parser will read it before providing the string as a parameter to the regex. SPARQL accepts escape sequences according to:
http://www.w3.org/TR/sparql11-query/#grammarEscapes

"\W" is not an accepted escape sequence, though "\t" is. For a parser to leave a sequence starting with "\" untouched in a string would be very strange (IMO), since the accepted way to achieve a '\' character is with the sequence "\\".

Looking at the spec, I don't see anything indicating what should happen to a character preceded by a backslash that is not a known escape character. There may be parsers which leave a "\W" alone in the string, but I expect that this will be implementation dependent. To correctly get a sequence of "\W" into the string that is provided to regex you will need a "\\W" as you have found.

Regards,
Paul Gearon

Received on Wednesday, 22 May 2013 20:22:37 UTC
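
The two-stage reading described in the reply (the SPARQL parser decodes the string literal first, and only then is the result handed to regex) can be sketched roughly as follows. This is a minimal illustration, not how any particular SPARQL engine works: `unescape_sparql_string` is a hypothetical helper, it handles only the single-character escapes from the grammar's escape-sequence table (no \uXXXX codepoint escapes), and rejecting an unknown escape is just one possible behaviour, since the reply notes the spec does not say what should happen. Python's `re` stands in for the XPath/XML Schema regex dialect, which differs in details.

```python
import re

# Single-character escapes allowed inside SPARQL string literals.
ECHAR = {"t": "\t", "b": "\b", "n": "\n", "r": "\r", "f": "\f",
         "\\": "\\", '"': '"', "'": "'"}


def unescape_sparql_string(body: str) -> str:
    """Decode the body of a SPARQL string literal (quotes already removed).

    Hypothetical helper. It raises ValueError for a backslash sequence the
    grammar does not list; a real parser might keep such a sequence as-is.
    """
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ch)
            i += 1
            continue
        # A backslash must be followed by one of the listed escape characters.
        if i + 1 >= len(body) or body[i + 1] not in ECHAR:
            raise ValueError(f"unknown escape at offset {i}: {body[i:i + 2]!r}")
        out.append(ECHAR[body[i + 1]])
        i += 2
    return "".join(out)


# Query text '\\Wlymp' decodes to the regex pattern \Wlymp.
pattern = unescape_sparql_string(r"\\Wlymp")
match = re.search(pattern, "a Lymph node", re.IGNORECASE)

# Query text '\Wlymp' contains an escape the grammar does not define.
try:
    unescape_sparql_string(r"\Wlymp")
    single_backslash_accepted = True
except ValueError:
    single_backslash_accepted = False
```

Under these assumptions only the doubled backslash reaches the regex engine as "\W". A lenient parser that passes the single-backslash form through unchanged would produce the same pattern, which would explain both forms appearing to work in some implementations.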