XQuery Formal Semantics: normalization

XQuery 1.0 Formal Semantics 
W3C Working Draft 26 March 2002

2.5.3 Normalization rules
(By the way, this section is missing from the Table of Contents.)

"All normalization rules have the following structure:
     [Expr]
       ==
    coreExpr
The [ Expr ] above the == sign denotes an expression in the XQuery language
before translation and the coreExpr beneath the == sign denotes an
equivalent expression in the XQuery core."

Of course, the right-hand-side of a normalization rule is usually not a core
expression, because it contains "normalization calls" which are obviously
not in the XQuery language. But the intent seems to be that if those calls
are expanded (with respect to a particular subject expression) and so on
recursively, the result will indeed be a core expression. Which suggests
that if I've got a query q, and I want to normalize it, all I have to do
is compute the expansion of [[q]] (using double-brackets as the plain-text
equivalent of the big bold normalization brackets).

There are some problems with this interpretation, though:

(1) In some rules, the RHS contains non-core constructs outside of
    normalization calls. This means that [[q]] isn't necessarily a core
    expression.

    For instance, in 4.3.5, the rules for
        // RelativePathExpr
    and
        StepExpr // RelativePathExpr
    have occurrences of '/' outside normalization calls.
    (Contrast this with the rules in 4.3, where the only RHS-occurrence of
    '/' is within a normalization call.)

    And in 4.5-4.7, many of the right-hand sides use a LetClause that binds
    multiple variables, which is not allowed in the core.

    Are these just mistakes?

(2) Some normalization rules are missing. This means that [[q]] isn't
    necessarily even defined.

    In some cases, the absence of the rules is probably just oversight.
    For instance, 4.10 has a normalization rule for SortSpecList but not
    one for SortExpr.

    And in other cases, you just haven't got there yet (e.g., 4.8.1
    Element Constructors).

    But in other cases, it's quite intentional that the rules are missing.
    For instance, 4.2.1 says  "All literals are core expressions, therefore
    there are no normalization rules for literals." Rather than no rules,
    I believe you need "identity" rules, e.g.
        [[ IntegerLiteral ]] == IntegerLiteral

It's possible that your intent for normalization was more of a term-
rewriting system, in which rewrite rules are applied, repeatedly, anywhere
in the query, until no more rewrites are possible. But most of the evidence
indicates otherwise. Moreover, in such a system, I believe it would be
harder to prove (a) termination, and (b) correctness.

Clarification would be nice.

-Michael Dyck

Received on Wednesday, 15 May 2002 13:35:36 UTC