Re: Jim Melton: XML Query WG review of RIF Datatypes and Built-Ins 1.0

From: Jim Melton <jim.melton@oracle.com>
Date: Wed, 30 Sep 2009 14:04:34 -0600
Message-Id: <7.0.1.0.2.20090930133422.0b614d58@oracle.com>
To: Sandro Hawke <sandro@w3.org>
Cc: "Jim Melton" <jim.melton@oracle.com>, public-rif-comments@w3.org, w3c-xsl-query@w3.org
Sandro,

Many thanks for your detailed response to the XML Query WG's comments 
(and for the heads-up phone call this afternoon!).  My WG will not 
meet again until Tuesday, 6 October; because you have expressed a 
strong desire to publish tomorrow, I'm taking some liberties here for 
which I hope my WG will forgive me.  That is, I'm responding 
unilaterally, hoping that my response accurately reflects what the WG 
participants would endorse.  If not, then we may have to send you a 
retraction and/or supplementary comments that you would have to 
consider after this publication cycle.

My high-level response is that we appreciate the serious 
consideration that you gave our comments and that we, except as 
indicated below, are satisfied with your responses.  I do not believe 
that we have any reason to object to your progression to CR since you 
have already agreed (below) to further consider a few of our comments 
during the CR period.

My responses below are all preceded by the string "Jim:" for easy 
identification.

At 9/29/2009 10:24 PM, Sandro Hawke wrote:

>Dear Jim,
>
>Thank you and the XML Query WG for your detailed review [1] of RIF
>Datatypes and Builtins.  We appreciate the time you put in, finding
>weaknesses and errors in our draft.
>
>Our responses to your comments are inline below.  It would be most
>helpful if you could let us know very soon whether you find these
>responses satisfactory.  (If you are satisfied, we can go ahead and
>publish as Candidate Recommendation immediately.  Best case, we could
>publish on Thursday October 1.)
>
>Our wiki has the latest version (including the changes made in
>response to your comments):
>
>   http://www.w3.org/2005/rules/wiki/DTB
>
>and a diff of those changes:
>
>   http://www.w3.org/2005/rules/wiki/index.php?title=DTB&diff=11377&oldid=11060
>
> > The XML Query WG has completed its review of RIF Datatypes and
> > Built-Ins 1.0 and has developed some comments.  Please note that
> > Sharon and I initially agreed to submit the XML Query WG's comments and
> > the XSL WG's comments jointly, but my WG objected on the grounds that
> > they had not yet seen the XSL WG's comments and did not want to wait for
> > them.  Consequently, Sharon will submit the XSL WG's comments
> > separately whenever they are ready.
> >
> >
> > <comments>
> >
> >
> > 1) Thanks for giving us the opportunity to review this
> > document.  We were very pleased to see that you have based much of
> > this spec on the Functions and Operators specification that we developed,
> > as well as on the XML Schema Part 2 Datatypes spec on which we also
> > depend.  There are other W3C WGs whose documents made use of the
> > F&O functions, but redefined the functions instead of incorporating
> > them by reference.  Your approach is manifestly appropriate.
> > Thanks!
>
>We were glad to be able to reuse so much of your work.  (I expect our
>users and implementors will be glad, too.  Several implementors have
>said they plan to re-use xpath libraries.)

Jim: That's nice to know -- thanks.


> > 2) We are slightly concerned by the fact that you state in the Overview
> > that "A large part of the definitions of the listed functions and
> > operators are adopted from [XPath-Functions]," but that you define a
> > different namespace (http://www.w3.org/2007/rif-builtin-function#)
> > for the functions instead of using the defined namespace
> > (http://www.w3.org/2005/xpath-functions) of those functions that have
> > been adopted.
> >
> > We note that Section 4 uses the word
> > "adapted" instead of "adopted", which has
> > significantly different connotations.  We have concluded from
> > additional text in the document that "adapted" is the word that
> > you intended to use and recommend that you resolve the discrepancy by
> > correcting the Overview.
>
>We believe "adapted" is the more accurate term here, and have changed
>the document to reflect this.  At the end of section 4 (just before
>4.1) we added this clarifying sentence:
>
>"The differences from the original [XPath-Functions] include the
>handling of errors, the differentiation between predicates and
>functions, and a few specific differences noted in the definitions
>below."

Jim: That sentence is most welcome, as I often find it difficult to 
discover the ways one spec has "adapted" material in another spec.  I 
now have some guidance on what those differences are and how to look for them.


>The different namespace URIs are intended to allow for these changes,
>and are also to save users from having to remember which RIF functions
>are xpath functions and which are xpath operators (for which RIF would
>have to provide a namespace, since xpath does not) or new RIF
>functions.  (RIF does not have the function/operator distinction.)

Jim: I expect that you understand that the XPath "operator functions" 
are for definition purposes only and are not in any way normative for 
implementations.  That, of course, is why we didn't define a 
namespace for them.  Some of our participants are a little 
uncomfortable with a referencing specification choosing to make those 
operator functions normative, but overall we do not object.


> > 3) In section 2.2.1, we read the statement "since xs:duration does
> > not have a well-defined value space."  We believe that
> > mischaracterizes the rationale for the creation of the types
> > xs:dayTimeDuration and xs:yearMonthDuration.  The rationale is
> > actually that the xs:duration data type is not fully ordered, while the
> > two types derived from xs:duration are fully ordered.  It is
> > unlikely that XML Schema will be able to redefine xs:duration in a way
> > that is both compatible and fully ordered.
>
>It looks like this comment about xs:duration in our draft was based on
>the XSD 1.0 definition.  As you suggest, it may have been fixed in XSD
>1.1.  We have decided not to add it, however, since we already agreed
>to have the same datatypes as OWL 2, which is already at Proposed
>Recommendation.

Jim: Thanks for the explanation (and the removal of the incorrect 
explanation).  Do you think it would be worthwhile adding a 
(non-normative) note telling the reader the reasoning behind the 
decision (that is, the relationship with OWL 2)?
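
To make the "not fully ordered" point in comment 3 concrete, here is a minimal Python sketch. It assumes a duration can be modelled as a (months, seconds) pair and compares two durations by adding each to four reference instants, which is my recollection of the XSD 1.0 order relation; the helper names and the reference dates are assumptions for illustration, not text from either spec.

```python
from datetime import datetime, timedelta

# Assumed reference instants (recalled from the XSD 1.0 order relation).
REFS = [datetime(1696, 9, 1), datetime(1697, 2, 1),
        datetime(1903, 3, 1), datetime(1903, 7, 1)]

def add_duration(start, months, seconds):
    """Add a (months, seconds) duration to a first-of-month instant."""
    total = start.year * 12 + (start.month - 1) + months
    shifted = start.replace(year=total // 12, month=total % 12 + 1)
    return shifted + timedelta(seconds=seconds)

def compare(d1, d2):
    """Return -1, 0 or 1 if every reference instant agrees, else None."""
    signs = set()
    for ref in REFS:
        a, b = add_duration(ref, *d1), add_duration(ref, *d2)
        signs.add((a > b) - (a < b))
    return signs.pop() if len(signs) == 1 else None

DAY = 86400
one_month = (1, 0)           # P1M
thirty_days = (0, 30 * DAY)  # P30D
```

Under these assumptions `compare(one_month, thirty_days)` yields `None`, because a month is sometimes shorter than and sometimes equal to 30 days. Pure month durations (yearMonthDuration) or pure second durations (dayTimeDuration) never mix the two components, so they always compare.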


>We have removed the now-incorrect explanation you cite.
>
> > 4) Also, in section 2.2.1, since xs:dateTimeStamp is taken from XSD 1.1,
> > it would also make sense to take xs:dayTimeDuration and
> > xs:yearMonthDuration from XSD 1.1, rather than from XDM. The definitions
> > are equivalent by design. (This also affects section 2.3.)
>
>We would rather avoid this change, at this point, because it would
>increase the risky dependency on XSD 1.1.  The OWL WG was recently
>delayed because of such a dependency, and is left with the awkward
>work-around you see here:
>
>   http://www.w3.org/TR/2009/PR-owl2-overview-20090922/#sotd-xml-dep
>
>If XSD 1.1 makes it to PR before RIF, we can make this change at that
>point.

Jim: Fair enough.


> > 5) In section 2.3, the type hierarchy for integer subtypes
> > appears to be incorrect.  unsignedLong should not be a subtype of
> > positiveInteger (because it allows the value zero). Also the prefix
> > "xs:" is included or omitted indiscriminately.
>
>Thanks for catching this; it's fixed now.
>
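
The hierarchy problem in comment 5 can be checked mechanically. The following is a small Python sketch, assuming each integer type is described only by its inclusive value range; the table and the function name are hypothetical and cover just three of the XSD integer types.

```python
import math

# Inclusive value ranges; math.inf stands for "no upper bound".
RANGES = {
    "nonNegativeInteger": (0, math.inf),
    "positiveInteger": (1, math.inf),
    "unsignedLong": (0, 2**64 - 1),
}

def can_be_subtype(sub, sup):
    """A subtype's range must lie entirely inside its supertype's range."""
    lo_sub, hi_sub = RANGES[sub]
    lo_sup, hi_sup = RANGES[sup]
    return lo_sup <= lo_sub and hi_sub <= hi_sup
```

Because unsignedLong admits zero and positiveInteger does not, `can_be_subtype("unsignedLong", "positiveInteger")` is false, while the same check against nonNegativeInteger succeeds.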
> > 6) In section 4.3, we learn that "Itruth Iexternal( ?arg1;
> > pred:is-literal-not-DATATYPE ( ?arg1 ) )(s1) = t if and only if s1 is in
> > the value space of one of the datatypes in DTS
> > <http://www.w3.org/TR/rif-dtb/#sec-data-types> but not in the
> > value space of the datatype with shortname DATATYPE, and f
> > otherwise."  We believe that means that the predicate
> > pred:is-literal-not-integer returns f if the value of its argument is not
> > in the value space of any datatype in DTS!  If that is true, then it
> > is highly misleading, because returning false implies that the value is a
> > literal of type integer.  We recommend that you reconsider this
> > definition so that the predicate returns true when the value is either
> > (a) not in the value space of any datatype in DTS or (b) is in the value
> > space of some data type in DTS but not in the value space of the
> > specified datatype.
>
>We believe the definition as given is correct, but that the intended
>meaning of negative guards was not clear.  We have added this note to
>the end of section 4.3:
>
>"Note: The semantics of negative guards may be surprising. The
>is-literal-not-String guard essentially asks, "Is this a literal, and
>(if it is) is it something other than a String?" It could also be read
>as "Is this a decimal or a float or a double or a date or a dateTime,
>etc, [for every datatype except string] ?". The negative guards are
>formulated like this to allow for rules which detect, for instance,
>some kinds of bad inputs, while still using the open world assumption
>of some RIF dialects."
>
>Hopefully, that's detailed enough to show that the definition is
>correct.  A more-detailed explanation of why we can't provide
>is-not-String seems out-of-scope for this document.

Jim: Thanks for the explanation. I now understand why the predicate 
has the semantics that it does.  I must say, though, that I find the 
name itself unfortunate because of its counter-intuitiveness.  Full 
disclosure: I have long advised people to not depend on intuition or 
on Webster's Dictionary for the meaning of keywords and function names 
in a programming language, but to depend solely on the language 
spec.  This is obviously a case where I am not following my own 
advice.  But I also advise designers of languages to avoid 
consciously using counter-intuitive terms whenever possible.

Jim: In spite of the conflicting tone of the preceding paragraph, I 
do not ask that you reconsider the name of the predicate, because 
there is great value in having consistency amongst the names used for 
similar purposes in a programming language and that consideration 
probably outweighs the counter-intuitiveness (which might not affect 
every reader anyway).
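
The difference between the two readings of the negative guard in comment 6 can be sketched in a few lines of Python. This assumes a toy DTS with three datatypes, each modelled as a membership test on Python values; the dictionary and both function names are hypothetical and only illustrate the logic.

```python
# Toy stand-in for DTS: datatype short name -> value-space membership test.
DTS = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
    "boolean": lambda v: isinstance(v, bool),
}

def is_literal_not(datatype, value):
    """Reading in the draft: true only for literals of some OTHER datatype."""
    in_some_datatype = any(test(value) for test in DTS.values())
    return in_some_datatype and not DTS[datatype](value)

def is_not(datatype, value):
    """Reading suggested in comment 6: true for anything outside the datatype."""
    return not DTS[datatype](value)

non_literal = object()  # stands for a value outside every datatype in DTS
```

With these definitions the two readings agree on `"abc"` and on `5`, and differ only on `non_literal`: `is_literal_not("integer", non_literal)` is false, whereas `is_not("integer", non_literal)` is true.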


> > 7) In section 4.4.1, we discovered the trivial typographical error
> > "funcitons".  We also noticed the trivial typographical
> > error "ab" (should be "an").
>
>Fixed, thanks.
>
> > 8) In section 4.5.1, Numeric functions, it is not clear whether functions
> > such as func:numeric-add accept arguments of mixed type (e.g. integer
> > plus double).  Although neither sections 1.3, 1.4 and 6.2 of
> > Functions and Operators nor appendix B of XPath 2.0 are wonderfully clear
> > on the point, our reading is that the underlying function
> > op:numeric-add() does not accept mixed arguments; rather, when the XPath
> > "+" operator is applied to an integer and a double, the integer
> > is promoted to a double and the function op:numeric-add(double, double)
> > is called. The operator accepts mixed-type arguments, but the underlying
> > function does not. (Others may disagree with this reading, as it really
> > isn't 100% clear.)
>
>Thanks for pointing out this omission.  We have added text requiring
>types be promoted in RIF, in the Mapping part of section 4.5.1.
>
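
The reading described in comment 8 (the operator promotes, the underlying function does not accept mixed types) might look like this in Python. It is only a sketch: `int`, `Decimal` and `float` stand in for xs:integer, xs:decimal and xs:double, xs:float is omitted, and all function names are hypothetical.

```python
from decimal import Decimal

# Assumed promotion order: integer -> decimal -> double.
_RANK = {int: 0, Decimal: 1, float: 2}

def promote(a, b):
    """Convert both operands to the higher-ranked of their two types."""
    target = max(type(a), type(b), key=_RANK.__getitem__)
    return target(a), target(b)

def numeric_add(a, b):
    """Underlying function: refuses operands of different types."""
    if type(a) is not type(b):
        raise TypeError("numeric_add requires operands of the same type")
    return a + b

def plus(a, b):
    """Operator: promotes first, then calls the same-type function."""
    return numeric_add(*promote(a, b))
```

Here `plus(1, 2.5)` promotes the integer and returns a double, while calling `numeric_add(1, 2.5)` directly raises `TypeError`.
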
> > 9) Section 4.7.1.2. Note that for reasons that are entirely
> > paternalistic, the fn:concat() function requires two or more arguments.
> > Also, the reference to xs:anyAtomicType seems odd: this abstract type
> > doesn't seem to be present in RIF.
>
>Okay, we've gone with two-or-more, and removed anyAtomicType.
>
> > 10) Section 4.11: We suspect there is a fourth difference between RIF
> > Lists and XPath sequences: in RIF, there is no equivalence between an
> > atomic value and a singleton list containing that value. (Otherwise,
> > pred:is-list() would be meaningless).
>
>Fixed, thanks.
>
> > 11) Section 4.11.1. Is it wise to number positions in a list starting
> > from zero, while numbering characters within a string (for example, in
> > the substring() function) from 1?  We think this inconsistency will
> > confuse your readers and users.
>
>We struggled with this some more today, but decided to leave indexing
>as is.  It's a really unfortunate situation, and we can't see any way
>forward which won't confuse users.  Given that lists are substantially
>different from xpath sequences, well, hopefully people will understand
>and tolerate this approach.

Jim: This is most unfortunate, and may be the only thing to which we 
might actually object.  Please note that our comment didn't use the 
example of sequences, because your language doesn't contain that 
concept; we used strings, which your language does have.  Yes, lists 
and character strings are not the same thing, but many application 
programmers will be familiar with treating strings as lists of 
characters.  Numbering lists of characters (that is, strings) 
starting with 1, but numbering other kinds of lists starting with 0 
is, in our opinion, likely to be a serious source of confusion and 
erroneous code.  We strongly urge you to further consider this 
decision.  Assuming that you will choose to publish the CR without 
making a change here, I must advise you that our WG might choose to 
make an additional comment on this point during your CR 
period.  (Personally, I sincerely doubt that any of our members would 
go so far as to raise a formal objection, so you need not fear that outcome.)

Jim: If you do not make changes to resolve this concern, then please 
be sure that the spec clearly points out the different base value for 
the two kinds of positions.  That is probably best done with 
non-normative notes in two places -- where character strings are 
defined and where lists are defined, cross referencing one another.
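
The kind of mistake this invites is easy to show. Below is a minimal Python sketch with two hypothetical helpers: one numbers list positions from 0, as the draft does for lists, and one numbers character positions from 1, as substring() does. The substring helper is simplified and assumes an integer start of at least 1.

```python
def list_get(items, position):
    """List access with positions numbered from 0."""
    return items[position]

def substring(text, start, length):
    """Substring with character positions numbered from 1 (start >= 1)."""
    return text[start - 1:start - 1 + length]

word = "hello"
as_list = list(word)  # the same characters, treated as a list
```

The same index now means two different things: `substring(word, 1, 1)` gives `"h"`, but `list_get(as_list, 1)` gives `"e"`.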


> > 12) Section 4.11.4.11: There is no function fn:union. The link is to
> > op:union, but the RIF function is essentially unrelated to op:union, as
> > it is defined on atomic values rather than nodes. Same applies to
> > 4.11.4.13 fn:intersect and 4.11.4.14 fn:except. XPath contains no
> > functions to manipulate sequences of atomic values in this way: such
> > functions can easily be written by users as explained in F+O appendix
> > E.2.
>
>Our sense was that this difference was within the wiggle-room of
>"adapted from", but I guess we could change it to "inspired by" or
>"contrast with", if you think that's important.  We did want to
>highlight the fact that xpath does have something with the same name.

Jim: I don't think you can get away with the "adapted from" argument 
here.  The semantics (and signature) of the op:union operator from 
XPath and those of the corresponding RIF function are different in 
very important ways.  I believe that my WG's members would argue that 
you should highlight this function as not coming from the F&O spec, 
characterize it as "inspired by" if you wish, and explicitly state 
that XPath has something with the same name, but that it is not the 
same function.  This can all be done in a non-normative note, but I 
believe that it's important to avoid misunderstanding by your readers.
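
For contrast with op:union, which works on nodes, a union over lists of atomic values can be sketched as below. This is a Python illustration only: the function name is hypothetical, and keeping items in order of first appearance is an assumption about the intended ordering, not something the draft text quoted here confirms.

```python
def value_union(*lists):
    """Union of atomic values, deduplicated, in order of first appearance."""
    seen = []
    for lst in lists:
        for item in lst:
            if item not in seen:  # equality on values, not node identity
                seen.append(item)
    return seen
```

For example, `value_union([1, 2, 1], [3, 2])` returns `[1, 2, 3]`. Nothing here depends on document order or node identity, which is what the XPath operator is defined in terms of.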


> > 13) Section 4.11.4.12: it's not clear what "in the same order"
> > means. Order of first appearance, perhaps?
>
>Fixed, thanks.
>
> > 14) In various places in section 4, we read phrases such as "the
> > value of the function is unspecified".  The discussion early in
> > section 4 of that term states that implementations are free to do as they
> > wish, including returning either true or false, as well as aborting
> > evaluation of the containing expression/query.  In the specs for
> > which we are responsible, as well as some well-known international
> > standards, the term "implementation-dependent" is used for the
> > same purpose.  You might consider the use of that term instead.
>
>Thanks, I agree with you on this, but the group hasn't had a chance to
>talk it through.  I'd like to let addressing this wait until the next
>round.

Jim: Fair enough.


> > 15) Near the beginning of section 2.3 and in Appendix 6, we see three
> > places where an unexpected character (a hollow square box) appears.
>
>There is a community, including some of our editors, who use that box
>to signal the end of a formal definition.  Personally, I find it
>confusing, and would prefer we use some CSS styling to set off the
>definition.  The WG hasn't had a chance to talk about this, and I'd
>like to wait until the next round for this, as well.

Jim: I don't personally object to the convention.  But you are 
undoubtedly aware that a great many fonts use that glyph as a signal 
that there is a (Unicode) character for which the font does not have 
a corresponding glyph.  Most readers of your spec will see those 
boxes and assume that they are spurious characters for which the font 
their browser is using does not have the proper glyph.  If you must 
use that convention, then you must clearly tell your readers what it 
means.  I actually think your readers will be better served if you 
look into other W3C specs to see how they deal with formal 
definitions and use those specs as inspiration.


> > </comments>
> >
> > Hope this helps,
> >    Jim
>
>Indeed, thank you again for catching all this, and please let us know
>if our response is satisfactory.
>
>      -- Sandro (on behalf of RIF WG)

Jim: It was our pleasure to help in whatever way we could.

Jim: Going out on that limb, I will assert that we do not object to 
progressing this spec to CR, but that we may submit additional 
comments during the CR period.

Hope this helps,
    Jim



>[1] http://lists.w3.org/Archives/Public/public-rif-comments/2009Sep/0008

========================================================================
Jim Melton --- Editor of ISO/IEC 9075-* (SQL)     Phone: +1.801.942.0144
   Chair, W3C XML Query WG; XQX (etc.) editor       Fax : +1.801.942.3345
Oracle Corporation        Oracle Email: jim dot melton at oracle dot com
1930 Viscounti Drive      Standards email: jim dot melton at acm dot org
Sandy, UT 84093-1063 USA          Personal email: jim at melton dot name
========================================================================
=  Facts are facts.   But any opinions expressed are the opinions      =
=  only of myself and may or may not reflect the opinions of anybody   =
=  else with whom I may or may not have discussed the issues at hand.  =
========================================================================  
Received on Wednesday, 30 September 2009 20:05:24 GMT
