- From: Kay, Michael <Michael.Kay@softwareag.com>
- Date: Thu, 9 Oct 2003 01:21:35 +0200
- To: Ashok Malhotra <ashokma@microsoft.com>, "C. M. Sperberg-McQueen" <cmsmcq@acm.org>, public-qt-comments@w3.org, "Kay, Michael" <Michael.Kay@softwareag.com>
- Cc: W3C XML Schema IG <w3c-xml-schema-ig@w3.org>
- Message-ID: <DFF2AC9E3583D511A21F0008C7E62106073DD19D@daemsg02.software-ag.de>
Just a clarification about the "#" character. The list of "reserved"
characters in RFC 2396 describes characters that have a special role in a
URI. "#" does not have a special role in a URI, but it does have a special
role in a URI-reference. Since we are dealing with URI-references rather
than URIs, it is appropriate to add "#" to the list.
Michael Kay
> -----Original Message-----
> From: Ashok Malhotra [mailto:ashokma@microsoft.com]
> Sent: 07 October 2003 23:14
> To: C. M. Sperberg-McQueen; public-qt-comments@w3.org; Kay, Michael
> Cc: W3C XML Schema IG
> Subject: RE: XML Schema WG comments on Functions and Operators
>
>
>
> This is a response to your comment [2.8] below on
> fn:escape-uri. I'm copying the I18N WG because this response
> also addresses some of material in their comments [67], [68]
> and [69] in
> http://lists.w3.org/Archives/Public/public-qt-comments/2003Jul
> /0105.html.
>
> Essentially, your comment said "use the algorithm in the
> Linking Spec ...". But, as I argue below, the algorithm in
> the F&O is closer to RFC 2396 than the algorithm in the
> Linking Spec. There is one exception to this which is the
> situation with the # character, of which more later.
>
> First, let us discuss the behaviour where escape-reserved =
> 'true'. I believe this is the algorithm discussed in the
> Linking Spec. The Linking spec says "the disallowed
> characters include all non-ASCII characters, plus the
> excluded characters listed in Section 2.4 of [IETF RFC 2396],
> except for the number sign (#) and percent sign (%) and the
> square bracket characters re-allowed in [IETF RFC 2732]. "
>
> However RFC 2396 says
> " Data characters that are allowed in a URI but do not have
> a reserved
> purpose are called unreserved. These include upper and lower case
> letters, decimal digits, and a limited set of punctuation marks and
> symbols.
>
> unreserved = alphanum | mark
>
> mark = "-" | "_" | "." | "!" | "~" | "*" | "'" |
> "(" | ")"
>
> Unreserved characters can be escaped without changing the semantics
> of the URI, but this should not be done unless the URI is
> being used
> in a context that does not allow the unescaped character
> to appear."
>
> Thus, our reading of the above is that all characters except
> the above should be escaped and, in particular, the marks
> should not be escaped.
>
> A little later RFC 2396 says
>
> " Because the percent "%" character always has the reserved purpose of
> being the escape indicator, it must be escaped as "%25" in order to
> be used as data within a URI."
>
> Our reading of this rule is that the % must be escaped unless
> it is the start of an escape sequence %HH.
>
> This reading of 2396 was the basis of the rule in the F&O which says
>
> "If $escape-reserved is true, all characters are escaped
> other than lower case letters a-z, upper case letters A-Z,
> digits 0-9 and the characters referred to in [RFC 2396] as
> "marks": specifically, HYPHEN-MINUS ("-"), LOW LINE ("_"),
> FULL STOP ".", EXCLAMATION MARK "!", TILDE "~", ASTERISK "*",
> APOSTROPHE "'", LEFT PARENTHESIS "(", and RIGHT PARENTHESIS
> ")". The PERCENT SIGN "%" character itself is escaped only if
> it is not followed by two hexadecimal digits (that is, 0-9,
> a-f and A-F)."
>
> RFC 2396 says the set of characters included as "reserved"
> can occur in specific contexts only and must be escaped if
> they are not used in these contexts. We interpret this to
> mean that if they are used correctly, reserved characters may
> not be escaped. This is the motivation behind the algorithm
> for escape-reserved='false'. The rule in the F&O says
>
> "If $escape-reserved is false, the behavior differs in that
> characters referred to in [RFC 2396] and [RFC 2732] as
> reserved characters, together with the NUMBER SIGN '#'
> character, are not escaped. These characters are SEMICOLON
> ";", SOLIDUS "/", QUESTION MARK "?", COLON ":", COMMERCIAL AT
> "@", AMPERSAND "&", EQUALS SIGN "=", PLUS SIGN "+", DOLLAR
> SIGN "$", COMMA "," NUMBER SIGN "#", LEFT SQUARE BRACKET "["
> and RIGHT SQUARE BRACKET "]"."
>
> The set of reserved characters in the above is correct
> according to RFC 2396 amended by RFC 2732 except for the
> inclusion of the # character.
>
> Thus, if we have read the background material correctly there
> are two possible actions. Please advise.
>
> POSSIBLE ACTIONS:
>
> 1. Do nothing except remove the # sign from the set of
> reserved characters in the rule for escape-reserved='false'.
> 2. Change the wording to conform to the Linking Spec even
> though it is at variance with RFC 2396.
>
> All the best, Ashok
>
> > -----Original Message-----
> > From: public-qt-comments-request@w3.org [mailto:public-qt-comments-
> > request@w3.org] On Behalf Of C. M. Sperberg-McQueen
> > Sent: Friday, August 01, 2003 7:55 PM
> > To: public-qt-comments@w3.org
> > Cc: W3C XML Schema IG
> > Subject: XML Schema WG comments on Functions and Operators
> >
> >
> > Dear colleagues:
> >
> > The XML Schema Working Group congratulates the XML Query and XSL
> > Working Groups on their progress, and in particular on the
> Last Call
> > draft of "XQuery 1.0 and XPath 2.0 Functions and Operators".
> >
> > We have not been able to review the last call draft in as
> much detail
> > as we would have liked, but for what they are worth our
> comments are
> > at
> http://www.w3.org/XML/Group/2003/07/xmlschema-fo-comments.html (an
> > ASCII version is reproduced below for the convenience of those with
> > access to their email but not to the Web).
> >
> > We apologize for the tardy arrival of these notes.
> >
> > -C. M. Sperberg-McQueen, for the W3C XML Schema WG
> >
> > ................................................................
> >
> >
> > [1]W3C [2]Architecture Domain [3]XML | [4]XML Schema | [5]Member
> > Events | [6]Member-Confidential!
> >
> > W3C XML Schema WG
> >
> > Notes on XQuery 1.0 and XPath 2.0 Functions and Operators
> >
> > 1 August 2003
> >
> > _________________________________________________________________
> >
> > * 1. [7]Schema-related issues
> > + 1.1. [8]Alignment of date/time values
> > + 1.2. [9]The type anyAtomicType
> > + 1.3. [10]The type untypedAtomic
> > + 1.4. [11]Alignment on strings and URIs
> > + 1.5. [12]Whitespace handling and lexical forms
> > + 1.6. [13]Negative zero
> > + 1.7. [14]Totally ordered Booleans
> > * 2. [15]Other technical issues
> > + 2.1. [16]The fn:base-uri property
> > + 2.2. [17]Alignment of references
> > + 2.3. [18]Characters and collation units
> > + 2.4. [19]Surrogate pairs and Unicode scalar values
> > + 2.5. [20]Definition of whitespace
> > + 2.6. [21]Required normalization functionality
> > + 2.7. [22]Case folding
> > + 2.8. [23]Escaping URIs
> > + 2.9. [24]The binary types
> > + 2.10. [25]Minor items
> > o 2.10.1. [26]User control of collations
> > o 2.10.2. [27]Section 7.3.1.1 Examples
> > * 3. [28]Editorial notes
> >
> > _________________________________________________________________
> >
> > This document contains comments on the Last Call draft
> of 2 May 2003
> > of XQuery 1.0 and XPath 2.0 Functions and Operators
> transmitted to the
> > XML Query and XSL Working Groups on behalf of the XML
> Schema Working
> > Group. These draft comments have not been reviewed by
> the XML Schema
> > Working Group and do not necessarily command consensus
> within the
> > group; because we will not meet again until 28 August,
> the Working
> > Group directed at its meeting today that these notes should be
> > transmitted to the XML Query and XSL Working Groups
> without awaiting
> > review.
> > In addition to the comments below, please note that
> several of the
> > [29]general comments sent on 14 July relate to the functions and
> > operators and data model specifications. Some of those
> comments sent
> > earlier overlap with some comments below.
> >
> > 1. Schema-related issues
> >
> > The comments in this section relate to the use of XML
> Schema in the
> > F/O specification and thus to the particular area of
> responsibility
> > borne by the XML Schema WG.
> >
> > 1.1. Alignment of date/time values
> >
> > The provision for preserving timezone information in
> the values of
> > xs:dateTime, xs:date, and xs:time continues to concern
> us. We believe
> > that a discrepancy of this kind between F/O and XML
> Schema will hurt
> > users and impede uptake of both specifications.
> >
> > We believe F/O and XML Schema need to align on this,
> either by F/O
> > changing to the XML Schema value space, or by changing
> the value space
> > as part of XML Schema 1.1, or by some other mutually agreed upon
> > solution.
> >
> > 1.2. The type anyAtomicType
> >
> > We reiterate our concern over the introduction of
> anyAtomicType into
> > the type hierarchy. We believe that a discrepancy of
> this kind between
> > F/O and XML Schema will hurt users and impede uptake of both
> > specifications.
> >
> > We believe F/O and XML Schema need to align on this,
> either by F/O
> > aligning with XML Schema 1.0 or by XML Schema 1.1 aligning with
> > F/O.
> >
> > 1.3. The type untypedAtomic
> >
> > We reiterate our concern over the introduction of
> untypedAtomic into
> > the type hierarchy. As with the other discrepancies, we believe
> > alignment of the QT specs and XML Schema is critically
> important.
> > Section 1.3.2 says xdt:untypedAtomic is used wherever
> the PSVI has
> > xs:anySimpleType; please note that in the PSVI, this
> will be the
> > case
> >
> > * when the element or attribute in question was
> declared as having
> > type anySimpleType
> > * when the attribute in question had no declaration
> and the schema
> > processor assumed the simple urtype for it in the
> course of lax
> > validation or error recovery
> >
> > Note that elements will not be assigned the
> anySimpleType as their
> > type property in the course of lax validation or error
> recovery; they
> > will have xs:anyType instead. Your use of xdt:untypedAtomic for
> > xs:anySimpleType but not for elements which (a) lack
> child elements
> > and (b) are assigned to xs:anyType may lead to results
> which puzzle
> > some of your users; we believe you may wish to consider
> changing your
> > mapping rules to assign xsd:untypedAtomic to such elements.
> >
> > 1.4. Alignment on strings and URIs
> >
> > The table at the beginning of section 2, Accessors,
> shows functions
> > which are intended (judging by their names) to return
> URIs and which
> > return values of type xs:string instead of xs:anyURI. Similarly,
> > various functions which accept URIs as arguments are
> given signatures
> > using xs:string as the type, which in turn necessitates
> ad hoc rules
> > of the form "If $collationLiteral is not in the lexical space of
> > xs:anyURI, an error is raised".
> >
> > As you know from our inquiry to you in mid-July, it has
> been suggested
> > that in XML Schema 1.1 the xs:anyURI type be made a
> restriction of
> > xs:string. But for now, there appears to be a
> discrepancy between the
> > use of strings to represent URIs here and the provision
> of a distinct
> > (and, for typing purposes, disjoint) type in XML Schema 1.0.
> > We need to align on this.
> >
> > 1.5. Whitespace handling and lexical forms
> >
> > In section 5.1, paragraph 4 reads in part: "If the argument to a
> > constructor function is a string literal, the literal
> must be a valid
> > lexical form for its type ... Whitespace normalization
> is applied
> > before validation ..."
> >
> > In all the cases which immediately come to mind, if the
> argument is a
> > valid lexical form for a type, there is no need to perform any
> > whitespace normalization on it. In XML Schema, it is
> the result of
> > whitespace normalization, not the input to it, which
> must be a legal
> > lexical form; we believe readers will be less confused
> if your usage
> > of the terms and ours is consistent.
> >
> > A possible rewording: "If the argument to a constructor
> function is a
> > string literal, then whitespace normalization is
> applied as indicated
> > by the whitespace facet for the datatype. The
> whitespace-normalized
> > string must be a valid lexical form for the type, as specified
> > ..."
> >
> > 1.6. Negative zero
> >
> > In section 6, a note explains that the value space of
> xs:float and
> > xs:double has been extended vis-à-vis that given by XML
> Schema, to
> > include a negative zero. The note also explains that
> the negative zero
> > will "never be obtained from the typed value of a node."
> >
> > We believe this discrepancy is untenable, and we are
> not clear why it
> > has proven necessary to introduce it.
> >
> > As far as we can tell by examining the specification, the spec
> > mentions different treatment for positive and negative
> zero only for
> > the functions described in section 6.4 (fn:floor, fn:ceiling,
> > fn:round, and fn:round-half-to-even): in the
> description of each of
> > these functions it is noted that if a zero is given to
> the function as
> > an argument, the sign of the zero returned as the value of the
> > function is the same as the sign of the zero passed in
> as an argument.
> > (The discussion of fn:ceiling mentions other cases when
> negative zero
> > is returned; the discussion of fn:floor passes over the
> analogous
> > cases in silence.) Other mentions of the signed zeroes in this
> > specification invariably specify either that something
> is true both
> > for positive and for negative zero or else that a
> constructor may
> > return either a positive or a negative zero.
> >
> > Could you explain the motive for introducing this
> discrepancy with the
> > value space defined in XML Schema? Would it not suffice
> to observe
> > that IEEE 754 has both positive and negative zeroes,
> which are treated
> > as different machine representations of the same values in the
> > xs:float and xs:double value spaces, and (optionally)
> that the prose
> > occasionally mentions these distinct representations of
> zero in the
> > interests of alignment with IEEE 754, even though
> formally they are
> > the same value?
> >
> > Is it essential to introduce an incompatibility with
> XML Schema here
> > instead of treating positive and negative zeroes as one
> value with two
> > machine representations?
> >
> > 1.7. Totally ordered Booleans
> >
> > We do not believe that it makes sense to impose a user-visible
> > ordering on the Boolean data type. Can you explain the
> rationale?
> > This is a discrepancy between F/O and XML Schema which must, we
> > believe, be aligned.
> >
> > 2. Other technical issues
> >
> > The comments in this section relate to technical issues
> other than the
> > use of XML Schema in the F/O specification; the XML
> Schema WG claims
> > no particular responsibility or expertise on these questions but
> > raises them because they seem to need attention.
> >
> > 2.1. The fn:base-uri property
> >
> > In section 2.5, the first paragraph defines a base-uri
> property for
> > all node types: "Document, element and
> processing-instruction nodes
> > have a base-uri property.... The base-uri of all other
> node types is
> > the empty sequence."
> >
> > The next paragraph begins by explaining what happens
> "If the accessor
> > is called on a node that does not have a base-uri
> property ..." If all
> > nodes have the property, how can such a node exist?
> >
> > 2.2. Alignment of references
> >
> > XML Schema and the Functions and Operators spec should
> refer to the
> > same version of Unicode. At the moment, this appears not to be
> > true.
> >
> > 2.3. Characters and collation units
> >
> > The discussion of collation units in the second note of
> section 7.3
> > says that collation decomposes a string "into a
> sequence of units,
> > each unit consisting of one or more characters", and
> that various
> > comparison operations are performed on these units. The
> functions
> > fn:starts-with, fn:ends-with, fn:substring-before, and
> > fn:substring-after are all mentioned as operating on
> such a segmented
> > string.
> >
> > The list of functions at the beginning of section 7.4, however,
> > describes them as operating on characters, not on the nameless
> > collation units consisting of one or more characters
> each. This looks
> > like a contradiction.
> >
> > We believe that the general level of confusion is best
> minimized, and
> > the world becomes a better place, if in XML-related
> specifications the
> > word character is used always and only for the units of
> the Universal
> > Character Set defined by Unicode and by ISO 10646. The
> word should not
> > be used (however great the temptation becomes at times)
> to denote the
> > culturally specific units of writing systems (e.g.
> letters, symbols,
> > signs, graphemes, or what have you).
> >
> > We suggest recasting the descriptions in 7.4 to
> describe the effect of
> > the functions in terms of the collation units, rather
> than in terms of
> > characters. In order to avoid repeating the phrase "the
> nameless units
> > of one or more characters into which a collation
> segments a string for
> > purposes of comparison", you may wish to define the term letter,
> > grapheme, collation unit, or thingy with that meaning.
> >
> > 2.4. Surrogate pairs and Unicode scalar values
> >
> > Section 7.4.6 (like some others) has a note calling
> attention to the
> > fact that some implementations will represent
> characters with code
> > points higher than xFFFF by using surrogate pairs. You
> quite correctly
> > avoid using the term code point for the things which make up the
> > surrogate pair, since in section 7.1 you have defined
> code point as
> > excluding surrogates. But the term 16-bit values is not
> defined, as
> > far as we can tell.
> >
> > Also, in Unicode 2 and 3 there are (as far as we have
> been able to
> > tell) no rules that forbid a double encoding of
> characters outside the
> > Basic Multilingual Plane (i.e. first representing them
> within the BMP
> > as surrogate pairs, and then encoding the sequence of
> BMP items in
> > UTF-8). Even if it is discouraged (and it is indeed outlawed in
> > Unicode 4.0), surrogate pairs might well show up not
> only in UTF-16
> > but also in UTF-8, where they will presumably be presented by
> > Unicode-oblivious character libraries not as pairs of
> 16-bit values
> > but as four-octet sequences whose intepretation in
> terms of Unicode
> > scalar values requires slightly special rules.
> >
> > Note that the definition of code points given in
> section 7.1 agrees
> > with the definition of Unicode scalar values in Unicode 4.0 in
> > excluding the surrogate range, but not with Unicode 2.0
> (the version
> > cited in your normative references), or Unicode 3,
> which define a
> > Unicode scalar value as "a number N from 0 to
> 10FFF[16]", without
> > leaving any gap for the surrogates.
> >
> > 2.5. Definition of whitespace
> >
> > Section 7.4.10 defines the function fn:normalize-space as doing
> > various things to whitespace, but it does not define the term
> > whitespace. It should, since various definitions are possible.
> > The Unicode character database, for example, lists the following
> > Unicode characters as whitespace in the file PropList-3_1_0.txt:
> >
> > * 0009..000D ; White_space # Cc [5] <control>..<control>
> > * 0020 ; White_space # Zs SPACE
> > * 0085 ; White_space # Cc <control>
> > * 00A0 ; White_space # Zs NO-BREAK SPACE
> > * 1680 ; White_space # Zs OGHAM SPACE MARK
> > * 2000..200A ; White_space # Zs [11] EN QUAD..HAIR SPACE
> > * 2028 ; White_space # Zl LINE SEPARATOR
> > * 2029 ; White_space # Zp PARAGRAPH SEPARATOR
> > * 202F ; White_space # Zs NARROW NO-BREAK SPACE
> > * 3000 ; White_space # Zs IDEOGRAPHIC SPACE
> >
> > The XML specification defines a smaller set of characters as
> > whitespace, for purposes of whitespace normalization.
> >
> > So some definition is definitely needed.
> >
> > 2.6. Required normalization functionality
> >
> > Section 7.4.11 requires conforming implementations to
> support Unicode
> > normalization form NFC.
> >
> > Why is normalization form W3C not also required?
> >
> > 2.7. Case folding
> >
> > Sections 7.4.12 and 7.4.13 define functions for case folding.
> > Since case folding is not consistent across languages
> and locales, we
> > have grave doubts about the wisdom of this inclusion,
> and some members
> > of the WG would advise you to drop these functions,
> which are not and
> > cannot be language- and culture-neutral.
> >
> > There is precedent: the decision to drop case-folding
> of names from
> > the design of XML resulted from the realization that every
> > case-folding algorithm available, including the use of
> the Unicode
> > case mapping tables, has an inherent cultural bias. The
> inclusion of
> > culturally and linguistically biased functions does not
> contribute to
> > achieving the goal of universal accessibility for the Web. Some
> > members of the XML Schema WG believe your spec should
> not go forward
> > with these functions in it.
> >
> > If you retain these functions, you should at the very
> least warn users
> > that
> >
> > * Results may violate user expectations (in Québec,
> for example, the
> > standard uppercase equivalent of "é" is "É", while
> in metropolitan
> > France it is more commonly "E"; only one of these
> is supported by
> > the function as defined).
> > * Many characters of class Ll lack uppercase
> equivalents in the
> > Unicode case mapping tables (we stopped counting at
> 150 or so);
> > many characters of class Lu lack lowercase equivalents.
> > * The two functions are not inverses of each other,
> so that for a
> > string S of upper-case characters,
> fn:upper-case(fn:lower-case(S))
> > is not guaranteed to return S, nor is
> > fn:lower-case(fn:upper-case(S)) for a string S of lower-case
> > characters. Latin small letter dotless i (as used
> in Turkish) is
> > perhaps the most prominent lower-case letter which will not
> > round-trip, as Latin capital letter i with dot
> above is the most
> > prominent upper-case letter which will not round
> trip; there are
> > others.
> >
> > You may also wish to make the case mapping depend on
> the default or a
> > user-specified collation.
> >
> > 2.8. Escaping URIs
> >
> > The rules for escaping URIs should be aligned across all W3C
> > specifications; otherwise, we will drive our users crazy.
> >
> > We think that means that you should reference and implement the
> > algorithm specified in the XML Linking specification
> >
> ([30]http://www.w3.org/TR/2001/REC-xlink-20010627/#link-locators) and
> > referenced by XML Schema, or the algorithm given in the
> W3C Character
> > Model specification (which was the same algorithm the
> last time we
> > looked).
> >
> > In particular, some members of the XML Schema WG were
> surprised to see
> > that your algorithm escapes the percent sign in some
> cases but not
> > others; this does not seem to be a feature of the
> algorithm given by
> > XML Linking and by the Character Model.
> >
> > That said, we believe that you do your readers a good service by
> > listing explicitly the affected characters. By
> suggesting that you
> > refer to the Linking/CharMod algorithm, we do not mean
> to suggest that
> > you should make your spec less useful by omitting these lists.
> > (Editorial note: it would perhaps be useful to some
> readers to have a
> > brief discussion of why the advice given in the last
> paragraph should
> > be followed; our readers did not understand the
> rationale for this
> > advice.)
> >
> > 2.9. The binary types
> >
> > Section 12.1.1 says that op:hexBinary-equal returns true if its
> > arguments "are of the same length and contain the same
> code-points";
> > similarly in 12.1.2 for op:base64Binary-equal.
> >
> > The term code-point was defined in section 7.1 as
> denoting integers
> > between 0 and 1114111 (x10FFFF), with a gap in the
> range where Unicode
> > surrogates occur. It seems to be used here to denote what other
> > specifications refer to as octets (bit strings of length 8).
> >
> > Taking the term code point in the sense of `octet', the
> definition
> > still does not match our intuitions of what an equality
> test on binary
> > data must do: it is not enough that each argument
> contain the same
> > octets; they must contain them in the same order.
> >
> > Suggested rewording: "are identical strings of octets".
> If you wish to
> > avoid the word octet, "are identical bit strings" might
> do, although
> > it omits the relatively important fact that the values
> in question
> > must have 8×n bits for some integer n.
> >
> > 2.10. Minor items
> >
> > 2.10.1. User control of collations
> >
> > Section 7.3 says in part "This specification does not
> use xml:lang to
> > identify the default collation, in part because
> collations should be
> > determined by the user of the data, not (normally) the
> data itself,
> > and because ..."
> >
> > The second reason given is sound. The first (collations
> should not
> > normally be determined by the data) is often advanced
> as a principle,
> > but does not seem to all members of the XML Schema WG to be
> > universally true. We are thus grateful for the
> "(normally)" in the
> > sentence. But in any case, the first reason given here
> leads to a
> > non-sequitur: it would be a reason not to make xml:lang
> determine the
> > collation sequence without possibility of user
> override. But it does
> > not, even on its face, provide a reason not to use xml:lang to
> > identify the default collation. We suggest dropping the
> first reason;
> > the second suffices.
> >
> > 2.10.2. Section 7.3.1.1 Examples
> >
> > The fourth example in section 7.3.1.1 says that
> >
> > fn:compare('Strassen', 'Straße')
> >
> > "returns 1 if and only if the default collation
> includes provisions
> > that equate `ss' and the (German) character `ß'
> (`sharp-s')." Unless
> > we have misunderstood the definition of the function,
> the return value
> > should also be 1 if the default collation sorts "ß"
> (sharp s) before
> > "s". Deleting the phrase "and only if" would remove the error.
> >
> > 3. Editorial notes
> >
> > In the course of our work, some editorial points were
> noted; we list
> > them here for the use of the editors. We do not
> particularly expect
> > formal responses on these comments.
> >
> > 1. Definition of must. Section 1.1 defines must thus:
> >
> > Conforming documents and processors are required
> to behave as
> > described; otherwise, they are non-conformant or in error.
> >
> > Is the "or" inclusive or exclusive, or is "in
> error" intended as a
> > synonym or approximate synonym for
> "non-conformant"? Possible
> > alternatives: "otherwise, they are non-conformant
> and in error",
> > "otherwise, they are either non-conformant or else
> in error",
> > "otherwise, they are non-conformant, i.e. in error".
> >
> > 2. Definition of stable. In section 1.1, the
> definition of stable
> > says, inter alia:"Some other functions ... have an explicit
> > dependency on the dynamic context". Unless this
> means that they
> > accept an argument representing the dynamic
> context, it seems at
> > first glance as if explicit is here used with the meaning
> > `implicit'. Perhaps what is intended is that the
> documentation
> > will explicitly mention this dependency. Perhaps
> the best thing to
> > do would be just to drop the explicit; if you really wish to
> > stress the promise of documentation, perhaps read
> "Some other
> > functions ...have a depencency on the dynamic
> context ... These
> > functions are said to be contextual. [INS:
> Contextual functions
> > are always identified as such in their descriptions. :INS] "
> >
> > 3. The term back up. The phrase back up appears to be
> used several
> > times as a technical term (e.g. last paragraph of
> 1.7). What does
> > it mean?
> >
> > 4. The term QName. Some readers (including some
> members of the XML
> > Schema WG) are likely to find it disorienting for
> the term QName
> > to be used here as a synonym for expanded name or
> universal name,
> > and not with the same meaning QName has in the XML
> Namespaces
> > Recommendation. We recognize, however, that what is
> returned is
> > precisely a member of what XML Schema 1.0 defines
> as the value
> > space of the xs:QName type, so that the use of the
> term xs:QName
> > to denote (for example) the return type of the accessor
> > fn:node-name is not only unexceptionable but necessary for
> > consistency. We don't have a good solution for you
> here; we only
> > note the difficulty. Perhaps a note calling the
> reader's attention
> > to the issue would be in order (similar to the note
> on this topic
> > in the Data Model spec).
> >
> > Some members of the WG suggest that this spec, like the Data
> > Model, should prefer the term expanded QName where
> possible, to
> > stress that what is referred to is the pair in the
> value space,
> > not the colonized Name in the lexical space.
> >
> > 5. No parameters and the empty list of parameters:
> >
> > On first reading, the signatures of fn:string and
> fn:error suggest
> > an ambiguity to some readers: the call fn:error()
> appears to match
> > both the first and the second signatures.
> >
> > Members of the WG who have studied XQuery more
> thoroughly assure
> > the rest of us that there is no ambiguity, so our purpose in
> > making this comment is merely to call your attention to an
> > editorial problem: it might be useful to explain to
> the reader why
> > the dual signatures showing no arguments and
> optional arguments
> > are not in fact ambiguous.
> >
> > 6. Section 2.3, first note: the word this seems to need an
> > antecedent; it is not clear to this reader, at
> least, what that
> > antecedent is. (It's also not clear what problem
> with blanks in
> > fragment identifiers is being adverted to.)
> >
> > 7. Raising errors: Section 3 para 1 reads in part:
> "The occurrence of
> > that phrase [sc. `an error is raised'] implicitly causes the
> > invocation of the fn:error function ..." This
> formulation seems to
> > involve a horrible clash of contexts: the phrase
> "an error is
> > raised" occurs in this document, and it occurs
> continuously from
> > the time of publication until the document ceases
> to exist (if
> > documents can ever cease to exist), while the
> error, one expects,
> > ought to be raised in a software system which
> implements the spec,
> > and should probably not be raised continuously from
> now until the
> > spec ceases to exist, if only because it would make
> it hard for
> > users to get work done. For the occurrence of a
> phrase in the spec
> > to cause the raising of an error in conforming
> software seems to
> > involve a rather unusual kind of action at a
> distance. To speak a
> > bit more seriously: perhaps the relevant part of
> the paragraph
> > could be recast, perhaps along these lines: "the
> phrase `an error
> > is raised' is used to describe the behavior of conforming
> > processors in certain situations. When such
> situations arise in a
> > running system, a conforming implementation of this
> specification
> > must invoke the fn:error function defined in this
> section." This
> > is not perfect, but we hope you get the idea.
> >
> > 8. Type promotion in multiple or single steps: Section
> 6.2 says "As
> > far as possible, the promotions should be done in a
> single step.
> > Specifically, when a decimal is promoted to a
> double, it must not
> > be converted to a float and then to a double, as
> this risks loss
> > of precision." [Emphasis added.] These two
> sentences appear to
> > contradict each other: is the rule about
> single-step conversions
> > required of conforming implementations ("must"), or
> recommended
> > without being required ("should")?
> >
> > 9. Code points: The note in section 7.1 identifies
> code points as
> > Unicode scalar values (which are in turn integers),
> but uses the
> > notation #x0000 and #x10FFFF to refer to the
> minimum and maximum
> > values. It's not terribly confusing in context, but strictly
> > speaking, this notation is defined in the XML
> specification as
> > denoting characters, not integers. I believe conventional
> > representations for hexadecimal numbers would write
> these values
> > as 0, 0H, x0, or 0x, and correspondingly 10FFFFH,
> x10FFFF, or
> > 10FFFFx; there may be other hexadecimal
> representations you will
> > prefer. The Unicode specification writes 10FFFF[16].
> >
> > 10. v and w: Section 7.3 says "`uve' and `uwe' are considered
> > equivalent in some European languages"; this is
> unexpected. Are
> > you sure? Which languages?
> >
> > 11. Section 7.4 para 1: for "function" read
> "functions". Here and
> > elsewhere, we believe that sentences like "Several of these
> > functions use a collation" would do better if "a
> collation" were
> > replaced with a plural: "Several of these functions use a
> > collation." Unless, of course, all of these
> functions always use
> > the same collation.
> >
> > 12. Section 7.4.6.1, final example: forgive this
> observation if it's
> > clueless, but since there does not seem to be any addition
> > operator in the example (did we miss it?), it's not
> immediately
> > obvious what -INF + INF has to do with the
> interpretation of the
> > example.
> >
> > 13. Section 7.4.15, fn:string-pad: this seems an
> unfortunate choice of
> > names for a function which does not (despite its name) pad a
> > string with blanks or some other padding
> character(s), but which
> > simply replicates or copies the string multiple
> times. Could it be
> > renamed without excessive heartburn?
> >
> > 14. Section 7.4.16, fn:escape-uri: It would help
> minimize confusion if
> > the lists of characters which are or are not
> escaped gave the
> > character names as well as the characters
> themselves in quotation
> > marks. (In the paper copy used by one member of our
> review task
> > force, this bit of the spec was almost impossible
> to make out
> > without a magnifying glass.)
> >
> > 15. Section 7.5.3, fn:replace: The description of the
> function seemed
> > unclear:
> >
> > The function returns the xs:string that is obtained by
> > replacing all non-overlapping substrings of $input that
> > match the given $pattern with an occurrence of the
> > $replacement string.
> >
> > Replacing all occurrences of the pattern with an
> occurrence of the
> > replacement string seems to suggest an n for 1
> exchange. For "all"
> > read "each". In the following paragraph, one
> occurrence of $input
> > is not marked as an identifier, one is.
> >
> >
> > References
> >
> > 1. http://www.w3.org/
> > 2. http://www.w3.org/Architecture/
> > 3. http://www.w3.org/XML/Group
> > 4. http://www.w3.org/XML/Group/Schemas
> > 5. http://www.w3.org/Member/Eventscal.html
> > 6. http://www.w3.org/Member/#confidential
> > 7. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e65
> > 8. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e70
> > 9. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e86
> > 10. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e98
> > 11. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e145
> > 12. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e178
> > 13. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e191
> > 14. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e238
> > 15. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e246
> > 16. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e251
> > 17. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e278
> > 18. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e283
> > 19. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e327
> > 20. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e350
> > 21. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e392
> > 22. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e399
> > 23. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e444
> > 24. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e463
> > 25. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e513
> > 26. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e516
> > 27. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e544
> > 28. http://www.w3.org/XML/Group/2003/07/xmlschema-fo-
> > comments.html#d0e577
> > 29.
> http://www.w3.org/XML/Group/2003/07/xmlschema-query-notes.html
> 30. http://www.w3.org/TR/2001/REC-xlink-20010627/#link-locators
>
>
Received on Wednesday, 8 October 2003 19:24:53 UTC