- From: Michael Dyck <jmdyck@ibiblio.org>
- Date: Sat, 27 Feb 2016 18:47:13 -0500
- To: public-xsl-query@w3.org
On 16-02-23 04:54 AM, Michael Kay wrote:
> We have for many years had the rule in A.2
>
> "When tokenizing, the longest possible match that is consistent with
> the EBNF is used."

(It used to say "... that is valid in the current context", but we
decided to change it at meeting #541 (2013-05-21), based on discussion
prompted by a message from you:
https://lists.w3.org/Archives/Member/w3c-xsl-query/2013Feb/0059.html )

> and I have often wondered if there were cases where the phrase "that is
> consistent with the EBNF" actually affected the outcome. It suggests that
> the tokenization is sensitive to the grammatical context, which is a
> considerable complication.

For XQuery, tokenization has always had to be sensitive to grammatical
context. E.g., consider:

    let $t := <title>let it be</title> ...

The way that you 'tokenize' the three characters 'l', 'e', 't' differs
depending on the grammatical context. And "Building a Tokenizer for
XPath or XQuery" is complicated precisely *because* tokenization has to
be sensitive to grammatical context.
[https://www.w3.org/TR/xquery-xpath-parsing/]

> I have submitted a test case MapConstructor-025 which does this:
>
> let $m := map{'a':1} return map:size(map{$m?a:true()})
>
> Although Saxon can't handle this, I believe it is permitted according to
> this rule. After the "?", an NCName is consistent with the EBNF but a
> QName containing a colon is not, so the longest token "consistent with
> the EBNF" is "a" rather than "a:true".
>
> Any views on whether this is a correct interpretation of the rules?

Sounds correct to me.

-Michael
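
To make the "longest match consistent with the EBNF" reading concrete, here is a minimal Python sketch. It is only an illustration, not how Saxon or any real XQuery processor tokenizes. The names `NCNAME`, `TOKEN_PATTERNS` and `longest_match` are hypothetical, the name pattern is an ASCII-only simplification of the XML name productions, and the set of token kinds allowed at a position is supplied by hand, where a real parser would derive it from the grammar.

```python
import re

# Assumption: ASCII-only approximation of NCName, much narrower than the XML rules.
NCNAME = r"[A-Za-z_][A-Za-z0-9_.\-]*"

# Hypothetical token table: a plain NCName and a prefixed QName (prefix:local).
TOKEN_PATTERNS = {
    "NCName": re.compile(NCNAME),
    "QName": re.compile(NCNAME + ":" + NCNAME),
}

def longest_match(text, pos, allowed):
    """Return (kind, lexeme) for the longest token at pos, trying only the
    token kinds in `allowed`; return None if none of them matches."""
    best = None
    for kind in allowed:
        m = TOKEN_PATTERNS[kind].match(text, pos)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (kind, m.group())
    return best

source = "$m?a:true()"
after_question_mark = source.index("?") + 1

# Assumed context after "?": an NCName is allowed, a prefixed QName is not.
print(longest_match(source, after_question_mark, ["NCName"]))

# Assumed context where a QName is also allowed: the same characters
# now produce a longer token.
print(longest_match(source, after_question_mark, ["NCName", "QName"]))
```

With the allowed set restricted to NCName, the token after "?" is "a". With QName also allowed, the same input position yields "a:true", which is the difference the test case turns on.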
Received on Saturday, 27 February 2016 23:48:19 UTC