Re: test for DATATYPE on plain literals, please

On Mon, Nov 14, 2005 at 02:08:34PM +0000, Seaborne, Andy wrote:
> 
> 
> 
> Dan Connolly wrote:
> >"It returns <xsd:string> if the argument is an untyped literal."
> > -- http://www.w3.org/2001/sw/DataAccess/rq23/#func-datatype
> >
> >
> >SPARQLer doesn't seem to agree; I tried this query:
> >
> >SELECT ?book ?title WHERE { ?book dc:title ?title FILTER
> >( DATATYPE(?title)) }
> >
> >http://sparql.org/books?query=PREFIX+books%3A+++%3Chttp%3A%2F%
> >2Fexample.org%2Fbook%2F%3E%0D%0APREFIX+dc%3A++++++%3Chttp%3A%2F%
> >2Fpurl.org%2Fdc%2Felements%2F1.1%2F%3E%0D%0ASELECT+%3Fbook+%3Ftitle
> >+WHERE+%7B+%3Fbook+dc%3Atitle+%3Ftitle+FILTER+%28+DATATYPE%28%3Ftitle%
> >29%29+%7D%0D%0A&stylesheet=xml-to-html.xsl
> >
> >and got no results.
> 
> ARQ 1.0 is broken in this area - I've rewritten the whole area.  But your 
> query still returns nothing :-(
> 
> >
> >I suspect SPARQLer is not quite caught up there. To make sure
> >developers learn about this detail, let's make sure there's a test,
> >please.
> >
> >Steve, would you please add a test or get somebody to do it?
> >Or if one is already there, point me to it?
> >
> >I looked, and the closest I see is something with doubles.
> >http://www.w3.org/2001/sw/DataAccess/tests/#datatype-1
> >
> 
> There are several things going on here as well:
> 
> 1/ Whether DATATYPE(?title) should be xsd:string
> 
> 2/ The EBV rules
> 
> 3/ Whether EBV applies at all.
> 
> - - - - - -
> 
> 1/ Whether DATATYPE(?title) => xsd:string
> 
> It's a D-entailment:
> http://www.w3.org/TR/rdf-mt/#DtypeRules
> 
> xsd 1a  uuu aaa "sss".  		=> uuu aaa "sss"^^xsd:string .
> xsd 1b 	uuu aaa "sss"^^xsd:string . 	=> uuu aaa "sss".
> 
> that equates plain literals and xsd:strings.  It seems reasonable to treat 
> this like any other XSD vocabulary interpretation but I see datatype() as 
> acting on the syntax to extract the datatype of the literal, not acting on 

This seems like a perfectly reasonable principle, but it was tricky to
say what could be returned that was either an IRI or an error if the
argument had no datatype. The consistency checking use case came up,
but I decided to relegate that to extension functions. Downside is
that you lose interop on, and maybe ubiquity of deployment of,
my:fussyDatatype().

The current wording says that it's syntactic plus the
"" -> ""^^xsd:string mapping:
[[
Returns the datatype of arg if arg is a typed literal. It returns
<xsd:string> if arg is an untyped literal.
]]

Nothing in there says that "foo"^^subtype implies "foo"^^supertype.
11.1.1 Type Promotion [PROM] describes the type promotion mechanism
for finding an appropriate numeric function but keeps the datatype
pristine:
[[
Promotion does not change the bindings of variables.
]]

Do we need more text specifically steering folks away from this
potential misconception?
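
To make the two readings concrete, here is a minimal Python sketch. It
assumes a toy Literal class in which datatype=None marks a plain
literal, and it ignores language tags. All names are illustrative
only; fussy_datatype() stands in for the hypothetical
my:fussyDatatype() extension and is not taken from rq23 or ARQ.

```python
from dataclasses import dataclass
from typing import Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


@dataclass(frozen=True)
class Literal:
    """Toy RDF literal; datatype=None marks a plain (untyped) literal."""
    lexical: str
    datatype: Optional[str] = None


def datatype(term: object) -> str:
    """Syntactic DATATYPE plus the plain-literal -> xsd:string mapping."""
    if not isinstance(term, Literal):
        raise TypeError("DATATYPE is only defined on literals")
    return XSD_STRING if term.datatype is None else term.datatype


def fussy_datatype(term: object) -> str:
    """Hypothetical extension: error instead of xsd:string on plain literals."""
    if not isinstance(term, Literal) or term.datatype is None:
        raise TypeError("no datatype on this term")
    return term.datatype
```

With this, datatype(Literal("foo")) gives the xsd:string IRI, while
fussy_datatype(Literal("foo")) raises, which is the behaviour a
consistency checker would want.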

> entailments in the graph.  Concretely, in a graph with both untyped 
> literals and literals with xsd:strings, how can the application sort them 
> out for, say, consistency checking?  Raising an error on a plain literal 
> seems more consistent than returning xsd:string.
> 
> - - - - - -
> 
> 2/ EBV rules
> 
> As Steve pointed out, the boolean effective value rules are "default true".
> 
> I have implemented Boolean Effective Value to test for the enumerated cases 
> in rq23 but on unknown types and things that aren't literals at all it 
> raises an error.  As written, the rules say some sort of coercion is always 
> possible - what if there is an error raised?
> 
> ARQ (in CVS) returns xsd:string on plain literals so still rejects all 
> solutions. Maybe it should return false (xsd:string is known not to be 
> xsd:boolean value true) - ARQ behind SPARQLer is the 1.0 release and is buggy 
> in this area.
> 
> I would prefer the BEV rules to say something like:
> 
> -----------------------------
> 
> 11.2.2 Effective Boolean Value
> 
> When an operand is coerced to xsd:boolean through invoking a function that 
> takes a xsd:boolean argument, the following rules apply:
> 
> If the operand is not a literal, a type error is generated.
> 
> If the literal is known to be the value true the result is true.
> 
> If the literal is of an unknown datatype, the result is false.
> 
> If the literal is an XSD datatype then the result is TRUE unless any of the 
> following are true:
> 
>     * The operand is unbound.
>     * The operand is an xsd:boolean with a FALSE value.
>     * The operand is a 0-length untyped RDF literal or xsd:string.
>     * The operand is any numeric type with a value of 0.
>     * The operand is an xsd:double or xsd:float with a value of NaN
> -----------------------------

The EBV rules come from XPath's EBV [XEBV] which leans on the
fn:boolean constructor in F&O [BOOL]. The SPARQL ones have the
sequence rules removed, resulting in two added TRUE cases:
IRI and bNode.

Are you still motivated to change this, given that it will fall
away from the XQuery semantics?
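
For comparison, the rules you propose above could be sketched roughly
as follows. This is only my reading of them, not ARQ's code: the
Literal and IRI classes are toy stand-ins, None models an unbound
operand (checked first, which is an assumption about ordering), the
set of numeric types is abbreviated, and ill-formed lexical forms are
not handled.

```python
import math
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"
# Abbreviated list of numeric types, for illustration only.
NUMERIC = {XSD + name for name in ("integer", "decimal", "float", "double")}


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None  # None marks a plain (untyped) literal


@dataclass(frozen=True)
class IRI:
    value: str


def ebv(term: object) -> bool:
    """Effective boolean value under the proposed (not the current) rules."""
    if term is None:                      # unbound operand
        return False
    if not isinstance(term, Literal):     # IRIs, bNodes: type error
        raise TypeError("operand is not a literal")
    dt = term.datatype
    if dt is None or dt == XSD + "string":
        return len(term.lexical) > 0      # 0-length string is false
    if dt == XSD + "boolean":
        return term.lexical in ("true", "1")
    if dt in NUMERIC:
        value = float(term.lexical)       # float() accepts "NaN" and "INF"
        return not (value == 0 or math.isnan(value))
    if dt.startswith(XSD):                # any other XSD datatype
        return True
    return False                          # unknown datatype
```

The difference from the XPath-derived rules shows up in the last line
and in the TypeError branch: unknown datatypes become false rather
than true, and non-literals raise instead of counting as TRUE.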

> - - - - - -
> 
> 3/ Whether EBV applies at all?
> 
> rq23 says "When an operand is coerced to xsd:boolean through invoking a 
> function"  FILTER is not a function.  The definition of Value Constraint 
> applies only on boolean valued expressions so somewhere between the two, we 
> need to fix the text.

FILTER operates on an EBV. Any proposed changes to the following?
[[
SPARQL FILTERs restrict the set of solutions according to the given
expression. Specifically, FILTERs eliminate any solutions that, when
substituted into the expression, result in either an effective boolean
value of false or produce a type error. Effective boolean values are
defined in section 11.2.2 Effective Boolean Value, type error is
defined in XQuery 1.0: An XML Query Language [XQUERY] section 2.3.1,
Kinds of Errors.
]]
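
As a rough illustration of that wording applied to Dan's
FILTER ( DATATYPE(?title) ) query, here is a small Python sketch. It
assumes solutions are plain dicts and that the expression and the EBV
rules are passed in as callables. The IRI marker class and the two
toy EBV functions are my own simplifications, not spec text.

```python
from typing import Callable, Dict, Iterable, List

Solution = Dict[str, object]


class IRI(str):
    """Marker type for IRIs in this toy model."""


def apply_filter(solutions: Iterable[Solution],
                 expression: Callable[[Solution], object],
                 ebv: Callable[[object], bool]) -> List[Solution]:
    """Keep a solution unless its EBV is false or a type error is raised."""
    kept = []
    for solution in solutions:
        try:
            if ebv(expression(solution)):
                kept.append(solution)
        except TypeError:
            continue  # a type error eliminates the solution
    return kept


def ebv_iri_true(value: object) -> bool:
    """Toy EBV in which an IRI is one of the TRUE cases."""
    return True if isinstance(value, IRI) else bool(value)


def ebv_iri_error(value: object) -> bool:
    """Toy EBV in which a non-literal operand is a type error."""
    if isinstance(value, IRI):
        raise TypeError("operand is not a literal")
    return bool(value)


# Stand-in for DATATYPE(?title) on a plain literal: always the xsd:string IRI.
def datatype_of_title(solution: Solution) -> IRI:
    return IRI("http://www.w3.org/2001/XMLSchema#string")
```

With one solution binding ?title, apply_filter keeps it under
ebv_iri_true and drops it under ebv_iri_error, which is the
difference between getting every book back and getting no results.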


[PROM] http://www.w3.org/2001/sw/DataAccess/rq23/#promotion
[XEBV] http://www.w3.org/TR/xpath20/#id-ebv
[BOOL] http://www.w3.org/TR/xquery-operators/#func-boolean
[TEST] http://www.w3.org/2001/sw/DataAccess/rq23/#tests
-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Monday, 14 November 2005 15:51:54 UTC