- From: Ashok Malhotra <ashokma@microsoft.com>
- Date: Wed, 9 Jan 2002 14:57:31 -0800
- To: <Davidc@nag.co.uk>
- Cc: <www-xml-query-comments@w3.org>, <xsl-list@lists.mulberrytech.com>, <w3c-xml-query-wg@w3.org>
- Message-ID: <E5B814702B65CB4DA51644580E4853FB019EE73B@red-msg-12.redmond.corp.microsoft.com>
David:

Thank you for your comments on the F&O draft. I've inserted responses in the text of your note below. This is a personal response. It does not constitute an official response from the XML Query WG and has not been approved by the WG.

All the best, Ashok

===========================================================

3.2 numeric constructors

Just wanted to voice strong agreement with issue 149: these should not be restricted to string literals. At the underlying semantic level you need constructors but at the functions-exposed to user level this can be merged with functions casting from strings (or anything else).

[AM] Noted. The thought was that constructors created typed values from literals. You used a cast to return a typed value from an expression. This argument has been diluted by the fact that you cannot now cast to/from derived types.

4.2.1 xf:string

I commented on this in the last draft, the text has changed but it is still contradictory. I can not "correctly perceive" it as a no-op if in the next paragraph it is implied that it does W3C normalisation which is nothing at all like a no-op. Also the example still uses &# notation with text that implies that there will always be an XML parser in the loop which isn't the case for Xquery at present.

[AM] It's meant to be a no-op. I think it's the example and the following note that pertains to the example that are causing confusion. We'll try and fix this.

4.2.2 xf:normalisedString

Is there any use case for this? It seems to be rather a bizarre thing. The normalisation could be done by the user using translate() if desired. The restriction on not having #xD in the argument will be almost impossible to maintain in non XML uses of Xpath.
XML normalises all line ends to #xA but in a non XML setting line ends may well be #xD or #xD#xA pairs, in which case normalising just #xA and declaring #xD an error will mean that an Xquery breaks just by moving the text file containing it from one place to another (unless every host language for xpath does a similar line end normalisation).

[AM] The F&O provides constructors for all built-in XML Schema datatypes. normalizedString is a built-in Schema datatype (derived from string). Of course you could create a string and normalize it but then it would be typed as string not normalizedString.

4.4 xf:lower-case

Is this collation dependent? I couldn't tell from the previous section 4.3 what exactly a collation controlled. (i.e. how do I get that the lowercase of I is dotless i in Turkey?)

[AM] Experts tell me that collations cannot be used to do translations. An example such as the above requires a dictionary that depends on the country, language, etc. It does require extra information but it's not a collation.

xf:match

This seems to be underspecified in cases that the matching regions overlap. If the regexp is aa and the string is aaa do you just get (1) or (1 2)? (This also applies to xf:replace.)

Slightly worried that, since xpath sequences do not nest, this semantic will prevent any future extension to allow sed/emacs/perl style numbered subexpressions. Also it forces the system always to match the entire string, which may be rather long, rather than stopping once a match is found. If instead it just returned the position of the first match a plausible extension would be that if the regexp was \(aa\)xx\(bb\) then what was returned was a sequence consisting of the position of the entire match followed by the positions of each of the subexpressions. A future extension to xf:replace could then use (something equivalent to &1 or $1 or \1 in current regexp languages) to access the matched subexpressions in the replacement text.

[AM] Agree.
There are two issues that request amplification of the semantics.

5.1.3

If this only takes a string literal (as commented above I think all user accessible functions should not have this restriction) then why do a case mapping. If it has to be a literal you may as well demand "TRUE" rather than " true". (Also if it only takes literals it serves no purpose at the user level as it could always be replaced by true() false() or an error.)

[AM] I agree that if it takes a string literal it's useless as true() and false() cover the same ground.

5.2.1 op:boolean and the text says it backs up the "and" operator but I think that has to be backed up as an if clause, to get the correct semantics if one operand could raise an error.

[AM] The only error would be if one or both operands were not booleans. This would be caught when the operands of the operator were type checked.

5.3.2 xf:not3

SQL can treat null specially in three valued logic as it knows that any nulls are there for that purpose. Xpath should not assume any special semantics for an empty sequence. This might be an "unknown value" in which case a three valued logic might give reasonable results, or it might be a fixed default value, or anything else, depending on the document type. For a particular class of documents the user can define not3 if it makes sense, but functions assuming a particular interpretation of () should not be in the core of a general XML query language such as Xpath.

[AM] Noted.

6

I think that all new functions should match the existing xpath naming convention, i.e. lowercase - separated words. When mapping names from other languages that have other naming conventions (e.g. camel case) then some extra - may need to be added, and the names lowercased. So I think dateTime should be date-time throughout, gMonthDay should be g-month-day, etc.

[AM] We tried to do this except where the name includes the name of an XML Schema datatype which uses intercapitalization, hence xf:dateTime().
Especially bad is capital c in get-Century but lowercase h in get-hour.

[AM] That's a bug. I'll fix it.

9

I think I read this as saying that eq compares the base64 encoded string as it appears in the XML (including any white space that would be ignored in the base64 decoding). A more interesting equality is to compare the base 64 encoded strings ignoring white space (which effectively compares the encoded data).

[AM] Noted. Frankly, we are not sure how important functions on base64 and hexBinary are. Would appreciate feedback.

11.1.4 xf:deep-equal

While many queries will need some version of deep equality, the exact details depend very much on the job in hand (ignore comments? white space? element names?). I think it would be better to remove this and have the xquery and xslt drafts give examples of deep equality definitions in their respective user-defined function syntax.

[AM] It was felt that a structural equality function was needed. I agree that we need arguments to indicate whether comments and PIs should be ignored or not.

11.1.7 xf:copy

I commented on this last time, but _please_ change the name of this function, it is massively confusing given that in XSLT copy does a shallow copy.

[AM] deep-copy?

Given the note that XSLT will not support this, it should not be in the core at all and moved into a XQuery specific function library.

[AM] At this time there is but one function library.

Same comments for xf:shallow.

11.2

xf:if-absent appears to be a workaround for the loss of the Xpath 1.0 semantic that one can test for empty node sets (sequences) just by coercing to boolean. I very much regret the loss of this semantic. If it could be restored then if-absent would be redundant.

if-empty is another example of core functions assuming too much about the way data is encoded in XML.
Testing for empty data means different things to different people and all of them are simply expressible with existing Xpath constructs, there is no need for this function and it should be removed.

In both cases having "if" functionality as a function has the bad effect that the operand is always evaluated even in the false case, so a user would be well advised not to use these functions and instead use an if expression.

12.2.11

Why is this sublist and not subsequence?

[AM] Good point!

12.4

If the values of the nodes in the sequence are themselves list valued do all the terms in the individual lists get aggregated, and in the case of avg how many terms is the average over?

[AM] We do not have nested sequences. So, we start by creating a one-level sequence.

The sum of an empty sequence should be 0 not ().

[AM] We follow the SQL definition.

12.5

Do xf:id and idref only operate on the current document? (I assume so, but it isn't stated.)

[AM] There are issues on the scope of these functions.

filter sounds like it may possibly be useful, but it's a bad name.

[AM] The name was generated by the XQuery syntax folks.

document has lost most of the functionality in the xslt 1.0 version, which needs to be restored.

[AM] I agree. There is an issue to that effect.

14

As commented above I believe that casting and constructors should be merged at the user level (although of course they need to be distinct in the formal semantics). Given a function that casts, there is no reason to make available the constructor which has the same functionality but is restricted to having literals as input. The constructor can presumably be optimised but any optimising compiler ought to be able to spot a function call with a literal argument and do the same I'd have thought (but I've never written an optimising compiler:-)
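
The overlap question raised under xf:match (regexp aa against the string aaa) and the suggested "position of the whole match followed by positions of the subexpressions" return value can be made concrete with a small sketch. This is only an illustration in Python using its standard re module, not the F&O definition of xf:match. The function names match_positions and first_match_with_groups are hypothetical, positions are 1-based to follow XPath conventions, and Python writes groups as (aa) rather than \(aa\).

```python
import re
from typing import List


def match_positions(pattern: str, text: str, overlapping: bool = False) -> List[int]:
    """Return 1-based start positions of matches of `pattern` in `text`.

    Hypothetical helper: illustrates the two readings of the draft,
    it is not the xf:match function itself.
    """
    if overlapping:
        # A zero-width lookahead consumes no characters, so the scan can
        # report a match starting at every position.
        regex = re.compile("(?=" + pattern + ")")
    else:
        # Plain scanning resumes after the end of the previous match.
        regex = re.compile(pattern)
    return [m.start() + 1 for m in regex.finditer(text)]


def first_match_with_groups(pattern: str, text: str) -> List[int]:
    """Return the 1-based position of the first match, followed by the
    1-based positions of each parenthesised subexpression.

    Returns an empty list when nothing matches. Assumes every group
    takes part in the match.
    """
    m = re.search(pattern, text)
    if m is None:
        return []
    return [m.start(i) + 1 for i in range(0, regex_group_count(pattern) + 1)]


def regex_group_count(pattern: str) -> int:
    """Number of capturing groups in `pattern`."""
    return re.compile(pattern).groups


print(match_positions("aa", "aaa"))                       # [1]
print(match_positions("aa", "aaa", overlapping=True))     # [1, 2]
print(first_match_with_groups("(aa)xx(bb)", "zzaaxxbb"))  # [3, 3, 7]
```

The first two calls show that both (1) and (1 2) are defensible answers depending on whether matches may overlap, which is why the draft needs to say which one is intended. The third call stops at the first match, so it does not have to scan the rest of a long string.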
Received on Wednesday, 9 January 2002 17:58:04 UTC