Comments on Functions and Operators (30th April) from David Carlisle on 2002-05-21 (public-qt-comments@w3.org from May 2002)

From: David Carlisle <davidc@nag.co.uk>
Date: Tue, 21 May 2002 12:11:05 +0100
To: public-qt-comments@w3.org
Message-Id: <200205211111.MAA08459@penguin.nag.co.uk>
[using public-qt-comments  as some WG members have indicated that was
the intended comment list for F&O in common with the other documents in
this round]



2.6 xf:unique-ID

   I agree with others that there seems little point in this function.  If
   for some reason you needed the functionality, it is just something like
   @*[. instance of xs:ID] isn't it?


3.2 Numeric Constructors

   Having xs:byte and friends is almost (but not quite) reasonable in
   schema where it provides a shorthand to constrain integer values to
   set ranges (if they are the ranges you happen to want) but I see no
   reason not to map all these to a single numeric type (or
   integer/float distinction at most) within Xpath. Standard Xpath
   constructs can be used to ensure that any calculated value is within
   range. (However I'll assume in the comments below that all the types
   are kept)


3.2.1 Returns the decimal value that is represented by the characters
    contained in the value of $srcval. For this constructor, $srcval
    must be a string literal. 

   I commented on this before, as did others I believe. restricting the
   constructors to literals, especially if they use a function syntax is
   unreasonable. Arguments involving optimisation don't really hold as
   any optimising compiler must be able to spot when a function argument
   is literal and so optimise away the function call.  (This is a
   generic comment for all constructors, I only comment on this one.)


3.3 Operators on Numeric Values
The arguments and return types for the arithmetic operators are the
basic numeric types: decimal, float, and double and types derived from
them. For simplicity, each operator is defined to operate on operands of
the same datatype and to return the same datatype 


The type promotion scheme includes only two rules: 

A derived type may be promoted its base type.

decimal may be promoted to float, and float may be promoted to double. 



   It is not clear to me from that wording if the arguments are _always_
   promoted to their base type. The use of "may" suggests not but the
   examples involving integer being promoted to decimal suggest yes.
   What happens if the two arguments are the same (non-base) type?

   In particular I can not tell if 5 div 3 is first converted
   to 5.0 div 3.0 (and so compatible with XPath 1) or, as in earlier
   drafts of this document does integer division returning an integer.


   similarly

3.3.5.1 Examples
op:numeric-mod(10,3) returns 1.

    If 10 and 3 there are integers (which I think they are?) then it
    should be made clear that 1 is not of integer type (if it isn't).

3.3.7 op:numeric-unary-minus
    should this specify what happens on NaN (presumably returns NaN?)



3.4.1 op:numeric-equal
op:numeric-equal(numeric $operand1, numeric $operand2) => boolean
Returns true if and only if $operand1 is exactly equal to
$operand2. This function backs up the "eq" and "ne" operators on numeric
values. 

   As Jeni commented in here comments on this draft, it's not clear that
   eq is particularly useful at the surface Xpath syntax (as opposed to
   the semantics), if it were dropped the last phrase would need
   replacing with one referring to = and !=.



4.1 String Types
The operators described in this section are defined on the following string types.

string 
 normalizedString 

   As I commented on the last draft and above for the numeric types the
   justification for copying over all the schema derived types into
   XPath is rather weak. 

   normalizedString is a particularly bad example it is just a subset of
   string, in many uses of Xpath (XSLT for example) it's rather hard to
   generate a string that _isn't_ in this subset.

   The purpose of the datatype in schema is to instruct the parser on
   the mapping from the input characters to the value space, but this is
   just a parsing issue it has no use at all in XPath/XQuery.



4.2.3 xf:token
   Unless there are some useful functions defined on this datatype that
   are not available to strings, why would anyone use this? 



4.2.8 xf:ID
The semantic correctness of ID values (that they must be unique within a
given document) is not enforced by the xf:ID function.
   

    Since this property is the only useful property that IDs have, there
    seems no point in this function, if values of type ID are not
    guaranteed to be unique to a document  (as they are
    effectively in XPath 1, even for invalid documents) then you may as
    well use string.



[Issue 144: Should the concat function accept sequences as arguments? ] 

    Maybe you don't want to overload concat but a mapconcat that
    takes the string value of every item in a sequence and concatenates
    the result would be useful (of course if you'd added proper higher
    order function  support this could have been done with some
    combination of fold and concat, but....



4.4.5 xf:contains
   How do collations work with respect to substring matching as opposed
   to ordering, if ss and sharp-s collate equal, is "s" a substring of
   sharp-s? (I suspect this is ignorance on my part, but I'm probably
   not the only one, some examples might help, as is done in some detail
   for collations and the comparison operators).


4.4.10.1 Examples
xf:normalize-space(" hello world ") returns "hello world". 

   As came up recently on xsl-list it would be good to have an example
   of   
   xf:normalize-space("     ") returns "". 

  As some readers (surprisingly) managed to construe the XPath 1
  description as allowing a return value of a space.




> A "lower-case letter" is a character whose Unicode General Category
> class includes "Ll". The corresponding upper-case letter is determined
> using [Unicode Case Mappings].  

   In the unicode tables each character is unambiguously upper or lower
   case, but the mappings are locale-dependent.
   so i is lowercase and
   I and LATIN CAPITAL LETTER I WITH DOT ABOVE are both uppercase
   but which of these is the the uppercase of i depends on who you ask.
   (probably it just depends on whether you are in Turkey, but still...)
   


 4.4.15 xf:string-pad

    Would have been more useful in XPath 1 than it is in 2, where it
    can trivially be defined as a user function.

    It would be better to take out all of these "shortcut functions"
    from the core language and perhaps add them again as a "standard
    library" of functions that are defined within Xquery (or/and
    XSLT). Many other languages follow this model, have a small core set
    of functions but a larger collection of useful functions that are
    provided to the end users but are (specified as being) defined using
    the language rather than being a core part of the language.  Of
    course an implementation may or may not use the "obvious" definition
    in the language or a built in optimisation.



4.4.17 xf:replace

    This doesn't even come close to what is needed, but I think
    a future draft will have more on this...



6.1 Duration, Date and Time Types

     This is all so unspeakably horrible. It's not all your fault mind, 
     starting from the mass of date related constructs in W3C schema it
     may be that nothing better could be done. (Sorry this isn't that
     constructive a comment, I'm going to skip this section, and didn't
     want lack of comment to be taken as agreement that it was all
     fine...)




9.1.2 op:base64-binary-equal

      Wouldn't equality on this type be more useful if it ignored
      whitespace? That way equality would imply that you get the same
      binary data if you decode the base64 encoding. If you really want
      to check the _encoded_ string is equal you can cast to string, but
      why would you want that?




11.1.6 xf:deep-equal

   Should not be provided as a core fuction, you always need to write a
   recursive function for the exact notion of equality that you need
   (do attributes count?, namespaces? white space text nodes?)
  
11.1.11 xf:copy
11.1.12 xf:shallow

   Should (at most) be in XQuery and not in the core XPath.


11.2.1 xf:if-absent
11.2.2 xf:if-empty

   Should not be provided in the core (at most it could be provided in a
   standard library) but since presumably the second argument is evaluated
   whether or not the first argument is true, a user would always (?)
   be better advised to use  the conditional if expression.


12.2.5 xf:empty
12.2.6 xf:exists

   these are simply shorthands for simple expressions, should not be in
   the core.

12.3.1 xf:sequence-deep-equal

   see previous comments on deep-equal


[Issue 67: Should duplicates be eliminated for count() and sum()?] 

   No.

12.4.2 xf:avg
Values that equal the empty sequence are discarded.

  I think that that is just a special case of the (implied) operation
  that what is constructed is the sequence of values of the data()
  function.
  the fact that empty sequences vanish is then just a consequence of the
  flattening rules for sequences, and this would then also specify what
  happens if some values are non-empty sequences, which is not fully
  specified at present.


12.4.3 xf:max
  As Jeni commented it would be useful if max could return the _item_
  that has a max value rather than just its value.


12.4.5 xf:sum
  As for count I think that the description on empty sequences should be
  generalised to any sequence. (Also I'm pleased to see that sum() now
  returns 0 on () unlike the last draft)


12.5.1 xf:id
  Have to have this of course for compatibility although key() turns out
  to be a lot more useful. (Which is why it is rather odd that XPath2 has => op
  which is even more restricted than id().)


12.5.2 xf:idref
  Not clear that this is really useful.


12.5.3 xf:filter
[Issue 167: Semantics of xf:filter are underspecified and, perhaps, incorrect.] 

   This function probably should be removed.

David

_____________________________________________________________________
This message has been checked for all known viruses by Star Internet
delivered through the MessageLabs Virus Scanning Service. For further
information visit http://www.star.net.uk/stats.asp or alternatively call
Star Internet for details on the Virus Scanning Service.
Received on Tuesday, 21 May 2002 07:12:14 UTC