- From: Steve Harris <S.W.Harris@ecs.soton.ac.uk>
- Date: Wed, 13 Oct 2004 23:25:04 +0100
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
On Wed, Oct 13, 2004 at 06:49:14 +0100, Andy Seaborne wrote:
> >Agree about the oddity of having datatype processing in one place and not
> >the other being potentially confusing, but they are different,
>
> They *may* be different - that is the decision we have to make!

Syntactically they are different, whether or not they are semantically.

> On one side (user/application centric), why should pattern matching of XML
> datatypes be different to numeric-equals? On the other side, what's the
> implementation impact? Does it apply to everyone?
>
> >and the
> >implementation complexity of handling datatype objects specially is not
> >small. Also if we require that triple expressions do datatype manipulation
> >then we may require some other means to explicitly turn it off, e.g.
> >1.00000000001f != 1.
> >
> >op:numeric-equal doesn't discuss floating point formats; the result of
> >op:numeric-equal(a, b) where a and b are fp should probably be undefined.
> >With formats like IEEE 754 it is not really possible to answer, and without
> >requiring SPARQL implementations to provide their own software fp bit
> >operations we can't sensibly require a given behaviour. Each VM/processor
> >will give different results (if it's even supported).
>
> I am proposing that the semantics of the operations would be as defined in
> XPath/XQuery Functions and Operators (F&O).
>
> F&O includes NaNs, -INF, +INF, -0 and +0 from IEEE 754-1985. Databases
> support this don't they?

What about denormals? Also there are several NaNs in IEEE, though I don't
think RDF can represent those anyway.

In practical terms, determining if one float is equal to another in any
meaningful sense is extremely hard. (SQL) databases do offer == on floats,
but I don't think the result is any more meaningful than in C, FORTRAN, or
most other systems. I don't have a copy of the SQL '92 spec to check though.

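As a rough illustration of why exact float equality is hard to pin down, here is a minimal Python sketch. It assumes the host uses IEEE 754 doubles (true of CPython on common platforms); the variable names are made up for the example, and it says nothing about how any SPARQL engine or F&O implementation actually behaves.

```python
import math

# Two ways of writing "the same" value that differ after binary rounding.
a = 0.1 + 0.2
b = 0.3

# Exact comparison, as == does in C.
exact_equal = (a == b)

# Tolerance-based comparison; the tolerance is an arbitrary choice.
close_equal = math.isclose(a, b, rel_tol=1e-9)

# IEEE 754 special cases: NaN never equals itself, signed zeros compare equal.
nan = float("nan")
nan_equals_itself = (nan == nan)
zeros_equal = (0.0 == -0.0)

print(exact_equal, close_equal, nan_equals_itself, zeros_equal)
# prints: False True False True
```

Any tolerance-based test needs a tolerance, and choosing one is application specific, which is part of why a single specified answer is awkward.
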
There is a general note about the difficulties of comparing floats here:
http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

It's fine to allow it (as C does), as long as the result is undefined. Java
attempted to provide exact reproducibility between architectures, with
notable failure, cf. http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf

> My reading was that op:numeric-equal applies to floats and doubles.

It's no problem to allow it as long as you never test or specify it :)

> >ditto with some of the Functions on Strings: fn:substring-before,
> >fn:substring-after,
>
> I believe it helps in optimization, else it's regular expression tests for
> these.

OK, that seems reasonable. I still have a preference for fewer functions
rather than more though.

> > fn:string-join (seems like an array operator to me),
> >fn:normalize-space, fn:normalize-unicode (ouch!), fn:escape-uri (hard to
> >define as many systems will want to use the underlying URL escaping
> >features of their environment, not make one to SPARQL spec).
>
> Manipulations could go.
>
> (There are no operations in rq23/ that involve XML sequences by the way).

OK, my reading of some of them was that they did, but I don't understand
the XQuery vocab.

> >fn:matches states perl5 regex, but that seems a bit onerous for systems
> >that are e.g. based on JavaScript; building a complete perl5 regex engine
> >seems like too much work. POSIX would be more reasonable IMHO.
>
> We should go for whatever XML Schema datatypes goes for.
>
> [[ F&O:
> The regular expression syntax used by these functions is defined in terms
> of the regular expression syntax specified in XML Schema (see [XML Schema
> Part 2: Datatypes]), which in turn is based on the established conventions
> of languages such as Perl. However, because XML Schema uses regular
> expressions only for validity checking, it omits some facilities that are
> widely-used with languages such as Perl. This section, therefore, describes
> extensions to the XML Schema regular expressions syntax that reinstate
> these capabilities.
> ]]
>
> I haven't looked recently - what are the differences here?

Perl 5 has a non-greedy operator form, e.g. (.*?). It's very useful, but it
adds a lot to the complexity of the engine, as I understand it. POSIX just
seems like a more stable standard to me than whatever perl 5 happens to say
this month. I can see the value of aligning with XML Schema though.

> >isBound seems like an unnecessary departure from SQL's IS NULL, but I'm not
> >that bothered, just mentioning it.
>
> As discussed on IRC, there are no nulls to help with the issues around
> comparing nulls. An SQL implementation may choose to use SQL NULLs
> internally if that gets the right answers. Defining in terms of NULLs for
> non-SQL engines is unnecessary.

It doesn't have to be NULL, it could be any symbol, but I would prefer that
there was one myself. Probably just a matter of taste.

- Steve
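
For reference, the non-greedy form mentioned above can be shown with a small sketch. It uses Python's re module, which follows Perl-style syntax and stands in here only as an assumption; it is not the XML Schema or F&O regex dialect, and POSIX regular expressions do not define the lazy quantifier at all. The sample string is made up.

```python
import re

text = "<a><b>"

# Greedy: .* grabs as much as possible, so the match runs to the last '>'.
greedy = re.match(r"<(.*)>", text).group(1)

# Non-greedy: .*? grabs as little as possible, so it stops at the first '>'.
lazy = re.match(r"<(.*?)>", text).group(1)

print(greedy)  # prints: a><b
print(lazy)    # prints: a
```
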
Received on Wednesday, 13 October 2004 22:25:08 UTC