- From: Jeremy J Carroll <jjc@syapse.com>
- Date: Wed, 22 May 2013 12:24:31 -0700
- To: "semantic-web@w3.org" <semantic-web@w3.org>
- Message-Id: <A87FBE90-1EB2-4D93-8B23-637CFCFA7E89@syapse.com>
I am trying to follow the advice of:

  http://stackoverflow.com/questions/2397574/how-to-find-a-word-within-text-using-xslt-2-0-and-regex-which-doesnt-have-b-w

and apply it to SPARQL REGEX. Specifically, the suggestion is to use "(^|\W)" or "($|\W)" instead of "\b".

I am trying to match literals like "Accessory cervical lymph node" that contain words starting with "lymp", but not literals like "Endolymphatic duct of right membranous labyrinth" that do not.

My reading of the specs is that the SPARQL I want is, e.g.:

  prefix skos: <http://www.w3.org/2004/02/skos/core#>
  select distinct ?o
  where {
    ?s skos:prefLabel|skos:altLabel ?o .
    ?s skos:inScheme <http://syapse.com/vocabularies/fma/anatomical_entity#> .
    filter(regex(?o,'\Wlymp','i'))
  }
  LIMIT 10

(ignoring the initial word issue)

However, the systems I have tried so far (bigdata, bigOWLIM and dydra) all require an additional \, wanting:

  prefix skos: <http://www.w3.org/2004/02/skos/core#>
  select distinct ?o
  where {
    ?s skos:prefLabel|skos:altLabel ?o .
    ?s skos:inScheme <http://syapse.com/vocabularies/fma/anatomical_entity#> .
    filter(regex(?o,'\\Wlymp','i'))
  }
  LIMIT 10

They seem to accept e.g. "\t" rather than "\\t" as a tab.

Is this me misreading the specs, or is it an implementation bug shared between everything I have tried so far … :(

My reading of the spec is that SPARQL REGEX defers to

  http://www.w3.org/TR/xpath-functions/#func-matches

which for the \W defers to

  http://www.w3.org/TR/xmlschema-2/#dt-regex

which includes

  http://www.w3.org/TR/xmlschema-2/#nt-charClass
  http://www.w3.org/TR/xmlschema-2/#nt-charClassEsc
  http://www.w3.org/TR/xmlschema-2/#nt-MultiCharEsc

defining \W (with only one \)

Jeremy J Carroll
Principal Architect
Syapse, Inc.
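
For reference, the intended matching behaviour of the "(^|\W)" workaround can be sketched outside SPARQL. This is only an illustration, assuming Python's re module behaves like the XPath/XML Schema regex flavour for \W and case-insensitive matching on these ASCII labels. The first two labels are the ones quoted above; "Lymph node" is a hypothetical label added to cover the initial word case. It also shows that in a language whose ordinary string literals process backslash escapes, the regex \W is written with a doubled backslash.

```python
import re

# Python's re stands in for the XSD/XPath regex flavour here (an assumption).
# (^|\W) replaces \b: "lymp" must sit at the start of the string or follow
# a non-word character.
WORD_START = re.compile(r"(^|\W)lymp", re.IGNORECASE)

labels = [
    "Accessory cervical lymph node",                     # from the message above
    "Endolymphatic duct of right membranous labyrinth",  # from the message above
    "Lymph node",                                        # hypothetical: word at start
]

matches = [label for label in labels if WORD_START.search(label)]
print(matches)

# String-literal layer: in a non-raw Python literal, the backslash itself
# must be escaped to deliver the two characters \W to the regex engine.
escaped_pattern = "\\Wlymp"
print(escaped_pattern == r"\Wlymp", len(escaped_pattern))
```

The second print only illustrates the two layers of escaping (string literal first, regex second) in Python; it is not a statement about what any particular SPARQL engine does.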
Received on Wednesday, 22 May 2013 19:25:04 UTC