Re: STRBEFORE() etc.

From: Steve Harris <steve.harris@garlik.com>
Date: Mon, 21 May 2012 16:36:30 +0100
Cc: public-rdf-dawg@w3.org
Message-Id: <C84E7280-23C9-4DB9-A137-8A1E79A18B93@garlik.com>
To: Andy Seaborne <andy.seaborne@epimorphics.com>
+1

On 2012-05-21, at 14:32, Andy Seaborne wrote:

> Yes - having a fixed result for the "no match" seems sensible.
> 
> F&O calls out the case of the empty string for the second argument. Reading the -before and -after cases, I think it defines what is meant by a match of the empty string (after returns the whole thing), and it is meant to be a match. The argument type rules would apply in the SPARQL case.
> 
> Proposed change (draft - the editor will wordsmith as needed when the changes are made):
> 
> strbefore:
> 
> """
> The function returns a literal of the same kind
> (simple literal, plain literal same language tag, xsd:string)
> as the first argument arg1. The lexical form of
> the result is the substring of the value of arg1
> that precedes in arg1 the first occurrence of
> the lexical form of arg2;
> otherwise the lexical form of the result is the empty string.
> If the lexical form of arg2 is the empty string,
> the lexical form of the result is the empty string.
> """
> ==>
> """
> For compatible arguments, if the lexical form of the second argument
> occurs as a substring of the lexical form of the first argument,
> the function returns a literal of the same kind as the first argument arg1
> (simple literal, plain literal same language tag, xsd:string).
> The lexical form of the result is the substring of the value of arg1
> that precedes in arg1 the first occurrence of
> the lexical form of arg2. If the lexical form of arg2 is the empty string,
> this is considered to be a match and the lexical form of the result
> is the empty string.
> 
> If there is no such occurrence, an empty simple literal is returned.
> """
> 
> For strafter, change 'precedes' to 'follows'; when arg2 is the empty string, the result is the whole lexical form of arg1.
> 
> + expand the examples
> 
> strbefore("abc"@en, "z"@en) -> ""
> strbefore("abc"@en, "z") -> ""
> strbefore("abc"@en, ""@en) -> ""@en
> strbefore("abc"@en, "") -> ""@en
> 
> strafter("abc"@en, "z"@en) -> ""
> strafter("abc"@en, "z") -> ""
> strafter("abc"@en, ""@en) -> "abc"@en
> strafter("abc"@en, "") -> "abc"@en
> 
> 	Andy
> 
> On 24/04/12 10:29, Steve Harris wrote:
>> Hi all,
>> 
>> [sorry if this is covered in the testcases, I still don't have a harness for them]
>> 
>> So I'm just looking at STRBEFORE(), http://www.w3.org/2009/sparql/docs/query-1.1/rq25.xml#func-strbefore
>> 
>> The text says:
>> 
>> “The function returns a literal of the same kind (simple literal, plain literal same language tag, xsd:string) as the first argument arg1.”
>> 
>> However - does this trump the statement in http://www.w3.org/TR/xpath-functions/#func-substring-before that under some circumstances fn:substring-before “returns the zero-length string”?
> 
> Not so much "trump", because the XSD function does not have to worry about lang tags or xsd:string/simple literals.
> 
>> 
>> I suspect that the right behaviour is for the function to return "" under those conditions - both as it makes more sense logically, and to make error handling in queries simpler.
>> 
>> e.g.
>>    STRBEFORE("foo"@en-GB, "bar") → ""@en-GB
>> is somewhat misleading, and a little tricky to test for e.g. you'd have to use
>>    STR(STRBEFORE(?string, "\t")) = ""
>>  in order to catch non-matches
>> 
>> I /think/ what I'd want as a user is:
>> 
>> arg1       arg2    result   comment
>> ---------  ------  -------  --------------------------------------
>> "foo"@fr   "o"     "f"@fr   "normal" case
>> "foo"@fr   "bar"   ""       exception case
>> "foo"@fr   "foo"   ""@fr    the empty lang-tagged string is before
>> "foo"@fr   ""      ""       exception case
>> 
>> i.e. the return values in the exception cases are as per fn:substring-before, and the "literal of the same kind" only applies to the non-exception cases.
>> 
>> But maybe that's confusing too?
>> 
>> FWIW, I've currently implemented "literal of the same kind" only, and have no end-user experience.
>> 
>> Thoughts?
>> 
>> Cheers,
>>    Steve
>> 
> 
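
For anyone wanting to check an implementation against the proposed wording, here is a rough Python sketch of how the draft text above might be read. It is not taken from any SPARQL engine: the Literal type, the compatible() helper and the use of TypeError for incompatible arguments are all made up for illustration, and the compatibility check is only my reading of the argument type rules (a second argument without a language tag is accepted; a tagged one must carry the same tag as the first argument).

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF string literal (illustration only)."""
    lex: str                        # lexical form
    lang: Optional[str] = None      # language tag, if any
    datatype: Optional[str] = None  # None = simple literal; else e.g. xsd:string IRI


def compatible(arg1: Literal, arg2: Literal) -> bool:
    # Assumed reading of the argument rules: an untagged arg2 is always
    # acceptable; a tagged arg2 needs the same tag on arg1.
    if arg2.lang is None:
        return True
    return arg1.lang is not None and arg1.lang.lower() == arg2.lang.lower()


def strbefore(arg1: Literal, arg2: Literal) -> Literal:
    if not compatible(arg1, arg2):
        raise TypeError("incompatible arguments")  # stands in for a SPARQL error
    i = arg1.lex.find(arg2.lex)  # find("") returns 0, so "" counts as a match
    if i < 0:
        return Literal("")       # no match: empty simple literal
    return arg1._replace(lex=arg1.lex[:i])  # same kind as arg1


def strafter(arg1: Literal, arg2: Literal) -> Literal:
    if not compatible(arg1, arg2):
        raise TypeError("incompatible arguments")
    i = arg1.lex.find(arg2.lex)
    if i < 0:
        return Literal("")       # no match: empty simple literal
    return arg1._replace(lex=arg1.lex[i + len(arg2.lex):])
```

With this reading, a non-match can be tested by comparing the result directly against the empty simple literal, with no STR() wrapper, while a match on the empty string keeps the language tag of arg1.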

-- 
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 653331 VAT # 887 1335 93
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ
Received on Monday, 21 May 2012 15:37:26 GMT
