RE: Some F&O issues

Thanks for the reply.

Sorry about the (n,m) comment on section 6.4, that got orphaned from its
context during cut/paste.  I think I was referring to section 6.4.16.1,
regular expression syntax, reluctant quantifiers, just making an editorial
comment on this:
"X{n,m}? matches X, at least n times, but not more than m times".
The definition (taken literally) breaks down when n > m.


-----Original Message-----
From: Ashok Malhotra [mailto:ashokma@microsoft.com] 
Sent: Tuesday, February 11, 2003 9:38 AM
To: xquery@attbi.com; public-qt-comments@w3.org
Subject: RE: Some F&O issues


Thank you for your comments.  Please see inline.

All the best, Ashok

-----Original Message-----
From: XQuery [mailto:xquery@attbi.com] 
Sent: Saturday, January 25, 2003 3:59 PM
To: public-qt-comments@w3.org
Subject: Some F&O issues


Apologies if some of these have been previously reported or are open issues;
I did not perform due diligence and check.  (For that matter, I may have
reported some of these already myself, I didn't keep track.)

***
*** All sections
***
- For all functions that accept an optional collation argument: What happens
if a collation is specified but the sequence is not strings? Is the
collation just ignored? [AM] Good point.  I've put it on the F&O telcon
agenda.

- According to the Unicode(tm) website, Unicode(tm) is a trademark and
should be designated so everywhere it is used.  See
http://www.unicode.org/unicode/consortium/logo.html#4
[AM] Good point.  Editorial.

***
*** 6.2
***
The codepoints-to-string() and string-to-codepoints() functions are great,
but they are completely underspecified.

- What happens if the input sequence to codepoints-to-string() is empty?
(Presumably the empty string, but this is not stated.)

- What happens if the codepoints are not valid characters in XQuery strings,
for example codepoints-to-string((0))?

- What happens if the codepoints contain a combining sequence that is not
valid?

- What happens if the codepoints are not valid codepoints (i.e., outside the
range 0-10FFFF inclusive)? [AM] 
[AM] On the agenda.

***
*** 6.4
***

- The following functions perform comparisons, and therefore should probably
accept an optional collation argument (but currently do not): tokenize [TBD]
substring [this one's especially weird, because
substring-after/substring-before do accept an optional collation argument]

- The lower-case and upper-case functions are locale-unaware.  Consider
supplying another argument (collation or otherwise) to allow applications to
handle locale-dependent casing (such as the famous Turkish I).

- Consider adding at least title-case() and possibly also case-fold(). It is
strange to select only two of the four functions described in the Unicode
Case Model (TR #10).  Both the Java and .NET class libraries support these
functions, so there is no barrier to development.

[AM] We have discussed similar suggestions but they are on the agenda again.

- Is it an error if in the quantifier {n,m}, n > m ?[AM] 

[AM] I presume this refers to the translate function.  The semantics for
this case are specified. 

- What happens if n or m exceeds the maximum integer value supported by the
implementation?


***
*** 14.3.1, 14.3.2
***
- It's strange that sequence-deep-equal() and sequence-node-equal() return
the empty sequence if either argument is the empty sequence, but otherwise
return false if their two arguments have different lengths.  I recommend
against special-casing the empty sequence here.  The empty sequence should
compare true with itself, and false with every other sequence. [AM] This is
on the agenda based on other mail.

***
*** 14.4
***
- I found section 14.4 difficult to interpret precisely and correctly. Some
particular problems with the wording: [AM] The text in these sections has
been rewritten.  I hope you will find that the next version of the document
addresses your concerns. Some revision may be necessary now that we have
agreed to preserve the timezone on date/time values. 

***
*** 14.4.3, 14.4.4
***
- How do min()/max() behave on the empty sequence? The text does not say.

- Must all members really be the exact same type?  The other functions
promote all numeric types to a common type.

- What happens if members are subtypes (e.g., subtypes of the duration
types)?  Presumably they're promoted to the appropriate duration type and
the comparison performed there, but the text taken literally precludes this
possibility.

***
*** 14.4.2, 14.4.5
***
- The sum function really ought to be listed immediately before avg. Fine,
they're in alphabetical order, blah blah.  I'm not convinced.

- The constraints for sum() and avg() are strangely not alike.  

- Both functions talk about timezone normalization, even though
date/time/dateTime values are not allowed (because both functions require
that all members be numeric or else the same type and support op:add(). The
only definition of op:add() on non-numeric types in which both operands have
the same types are yearMonthDuration and dayTimeDuration).

- Similarly, both functions talk about support op:add() (and in the case of
avg, division by integer).  But clearly the numeric types and the two
duration types I just mentioned are the only types that fit this bill, so
the description is unnecessarily indirect.



Cheers,

Michael Brundage
xquery@attbi.com

Received on Thursday, 13 February 2003 12:05:55 UTC