FW: Some F&O issues from Ashok Malhotra on 2003-03-17 (public-qt-comments@w3.org from March 2003)

From: Ashok Malhotra <ashokma@microsoft.com>
Date: Mon, 17 Mar 2003 08:10:54 -0800
To: <xquery@attbi.com>, <public-qt-comments@w3.org>
Message-ID: <E5B814702B65CB4DA51644580E4853FB070195D0@red-msg-12.redmond.corp.microsoft.com>
In my 2/11 reply to your note below, a few items were not addressed
because they needed to be taken up by the F&O task force.  These items
were discussed on 3/3 and resolved as described below.  Again, we thank
you for your comments and close reading of the document.

> - For all functions that accept an optional collation argument:
> What happens if a collation is specified but the sequence is not
> strings?
> Is the collation just ignored?
[AM] Agreed.  The next draft will contain appropriate wording.

> - According to the Unicode(tm) website, Unicode(tm) is a trademark and
> should be designated so everywhere it is used.  See
> http://www.unicode.org/unicode/consortium/logo.html#4
[AM] There is no precedent for this in any of the XML specs.  One of the
task force participants said that their lawyers had told him that the
trademark needed to be cited only if there was a contractual dependency
between the partners.
> 
> ***
> *** 6.2
> ***
> The codepoints-to-string() and string-to-codepoints() functions are
> great, but they are completely underspecified.
>
> - What happens if the input sequence to codepoints-to-string() is
> empty?  (Presumably the empty string, but this is not stated.)
>
> - What happens if the codepoints are not valid characters in XQuery
> strings, for example codepoints-to-string((0))?
>
> - What happens if the codepoints contain a combining sequence that is
> not valid?
>
> - What happens if the codepoints are not valid codepoints (i.e.,
> outside the range 0-10FFFF inclusive)?
[AM] Agreed.  Additional semantics will be specified in the next draft.
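
For illustration only, the cases raised above can be sketched in Python.
This is not the wording the next draft will use; it is a minimal model
that assumes the empty sequence maps to the empty string and that any
code point outside the XML 1.0 Char production is rejected.  The names
is_xml_char, codepoints_to_string and string_to_codepoints are
hypothetical stand-ins, and combining sequences are not checked at all.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def codepoints_to_string(codepoints):
    """Hypothetical model: empty input gives "", invalid input raises."""
    cps = list(codepoints)
    for cp in cps:
        # Covers both "not an XML character" (e.g. 0) and
        # "not a code point at all" (e.g. 0x110000 or a negative value).
        if not is_xml_char(cp):
            raise ValueError(f"code point {cp:#x} is not a valid XML character")
    return "".join(chr(cp) for cp in cps)


def string_to_codepoints(s):
    """Inverse direction: one integer per character of the string."""
    return [ord(ch) for ch in s]
```

Under these assumptions codepoints_to_string([]) gives the empty
string, while codepoints_to_string([0]) raises an error instead of
producing a string.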
> 
> ***
> *** 6.4
> ***
> 
> - The following functions perform comparisons, and therefore should
> probably accept an optional collation argument (but currently do
> not):
> tokenize
> substring [this one's especially weird, because
> substring-after/substring-before do accept an optional collation
> argument]
>
> - The lower-case and upper-case functions are locale-unaware.
> Consider supplying another argument (collation or otherwise) to allow
> applications to handle locale-dependent casing (such as the famous
> Turkish I).
[AM] From the minutes of the meeting:

On locale-dependent casing: we will ask the author of the message to
point us to a specification that we can reference to make this change.
(Collations do not have casing information.)
Unicode TR21 defines a table that has locale-insensitive folding and
mentions locale-sensitive folding.
The editors will ask for public feedback in this area.
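
To make the Turkish I case concrete, here is a small Python sketch.  It
is only an illustration of the problem, not a proposal for the F&O
signatures: the lang parameter and the function names upper_case and
lower_case are hypothetical, and only the dotted/dotless I pairs used in
Turkish and Azeri are handled.

```python
DOTTED_CAPITAL_I = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_SMALL_I = "\u0131"    # LATIN SMALL LETTER DOTLESS I


def upper_case(s, lang=None):
    """Upper-case s; with lang 'tr' or 'az', map i to dotted capital I."""
    if lang in ("tr", "az"):
        s = s.replace("i", DOTTED_CAPITAL_I)
    # The default mapping already takes dotless small i to plain I.
    return s.upper()


def lower_case(s, lang=None):
    """Lower-case s; with lang 'tr' or 'az', map I to dotless small i."""
    if lang in ("tr", "az"):
        s = s.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I)
    return s.lower()
```

Without a language argument upper_case("i") is "I"; with lang="tr" it
is the dotted capital U+0130, which is the distinction a
locale-unaware function cannot make.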

> 
> - Consider adding at least title-case() and possibly also case-fold().
> It is strange to select only two of the four functions described in
> the Unicode Case Model (TR #10).  Both the Java and .NET class
> libraries support these functions, so there is no barrier to
> development.
> 
[AM] The task force did not agree to these suggestions.
> ***
> *** 14.3.1, 14.3.2
> ***
> - It's strange that sequence-deep-equal() and sequence-node-equal()
> return the empty sequence if either argument is the empty sequence,
> but otherwise return false if their two arguments have different
> lengths.  I recommend against special-casing the empty sequence here.
> The empty sequence should compare true with itself, and false with
> every other sequence.
[AM] Agreed.  The next draft will contain these changes.
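
As a rough sketch of the behaviour recommended above (not the text of
the next draft), sequences can be modelled as Python lists of atomic
values.  The function name sequence_deep_equal is a hypothetical
stand-in, and plain == takes the place of the real item comparison, so
nodes and collations are not modelled.

```python
def sequence_deep_equal(a, b):
    """Return True if a and b have equal items in the same order.

    The empty sequence gets no special case: it equals itself and
    differs from every non-empty sequence.
    """
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    return all(x == y for x, y in zip(a, b))
```

With this model two empty sequences compare true, and an empty sequence
compared with any non-empty one gives false rather than the empty
sequence.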
> 

All the best, Ashok

> -----Original Message-----
> From: Ashok Malhotra
> Sent: Tuesday, February 11, 2003 9:38 AM
> To: 'xquery@attbi.com'; public-qt-comments@w3.org
> 
> Thank you for your comments.  Please see inline.
> 
> All the best, Ashok
> 
> -----Original Message-----
> From: XQuery [mailto:xquery@attbi.com]
> Sent: Saturday, January 25, 2003 3:59 PM
> To: public-qt-comments@w3.org
> Subject: Some F&O issues
> 
> 
> Apologies if some of these have been previously reported or are open
> issues; I did not perform due diligence and check.  (For that matter,
> I may have reported some of these already myself, I didn't keep
> track.)
> 
> ***
> *** All sections
> ***
> - For all functions that accept an optional collation argument:
> What happens if a collation is specified but the sequence is not
> strings?
> Is the collation just ignored?
> [AM] Good point.  I've put it on the F&O telcon agenda.
> 
> - According to the Unicode(tm) website, Unicode(tm) is a trademark and
> should be designated so everywhere it is used.  See
> http://www.unicode.org/unicode/consortium/logo.html#4
> [AM] Good point.  Editorial.
> 
> ***
> *** 6.2
> ***
> The codepoints-to-string() and string-to-codepoints() functions are
> great, but they are completely underspecified.
>
> - What happens if the input sequence to codepoints-to-string() is
> empty?  (Presumably the empty string, but this is not stated.)
>
> - What happens if the codepoints are not valid characters in XQuery
> strings, for example codepoints-to-string((0))?
>
> - What happens if the codepoints contain a combining sequence that is
> not valid?
>
> - What happens if the codepoints are not valid codepoints (i.e.,
> outside the range 0-10FFFF inclusive)?
> [AM] On the agenda.
> 
> ***
> *** 6.4
> ***
> 
> - The following functions perform comparisons, and therefore should
> probably accept an optional collation argument (but currently do
> not):
> tokenize
> [TBD]
> substring [this one's especially weird, because
> substring-after/substring-before do accept an optional collation
> argument]
>
> - The lower-case and upper-case functions are locale-unaware.
> Consider supplying another argument (collation or otherwise) to allow
> applications to handle locale-dependent casing (such as the famous
> Turkish I).
> 
> - Consider adding at least title-case() and possibly also case-fold().
> It is strange to select only two of the four functions described in
> the Unicode Case Model (TR #10).  Both the Java and .NET class
> libraries support these functions, so there is no barrier to
> development.
> 
> [AM] We have discussed similar suggestions but they are on the agenda
> again.
> 
> - Is it an error if, in the quantifier {n,m}, n > m?
> 
> [AM] I presume this refers to the translate function.  The semantics
> for this case are specified.
>
> - What happens if n or m exceeds the maximum integer value supported
> by the implementation?
> 
> 
> ***
> *** 14.3.1, 14.3.2
> ***
> - It's strange that sequence-deep-equal() and sequence-node-equal()
> return the empty sequence if either argument is the empty sequence,
> but otherwise return false if their two arguments have different
> lengths.  I recommend against special-casing the empty sequence here.
> The empty sequence should compare true with itself, and false with
> every other sequence.
> [AM] This is on the agenda based on other mail.
> 
> ***
> *** 14.4
> ***
> - I found section 14.4 difficult to interpret precisely and correctly.
> Some particular problems with the wording:
> [AM] The text in these sections has been rewritten.  I hope you will
> find that the next version of the document addresses your concerns.
> Some revision may be necessary now that we have agreed to preserve
> the timezone on date/time values.
> 
> ***
> *** 14.4.3, 14.4.4
> ***
> - How do min()/max() behave on the empty sequence? The text does not
> say.
>
> - Must all members really be the exact same type?  The other functions
> promote all numeric types to a common type.
>
> - What happens if members are subtypes (e.g., subtypes of the duration
> types)?  Presumably they're promoted to the appropriate duration type
> and the comparison performed there, but the text taken literally
> precludes this possibility.
> 
> ***
> *** 14.4.2, 14.4.5
> ***
> - The sum function really ought to be listed immediately before avg.
> Fine, they're in alphabetical order, blah blah.  I'm not convinced.
> 
> - The constraints for sum() and avg() are strangely not alike.
> 
> - Both functions talk about timezone normalization, even though
> date/time/dateTime values are not allowed (because both functions
> require that all members be numeric or else the same type and support
> op:add().  The only definition of op:add() on non-numeric types in
> which both operands have the same types are yearMonthDuration and
> dayTimeDuration).
>
> - Similarly, both functions talk about support op:add() (and in the
> case of avg, division by integer).  But clearly the numeric types and
> the two duration types I just mentioned are the only types that fit
> this bill, so the description is unnecessarily indirect.
> 
> 
> 
> Cheers,
> 
> Michael Brundage
> xquery@attbi.com
Received on Monday, 17 March 2003 11:11:13 UTC