[Bug 3758] [FS] technical: 4.7.1: losing type information

http://www.w3.org/Bugs/Public/show_bug.cgi?id=3758

           Summary: [FS] technical: 4.7.1: losing type information
           Product: XPath / XQuery / XSLT
           Version: Candidate Recommendation
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: minor
          Priority: P2
         Component: Formal Semantics
        AssignedTo: simeon@us.ibm.com
        ReportedBy: jmdyck@ibiblio.org
         QAContact: public-qt-comments@w3.org


4.7.1 / Norm
"In general, we do not want to convert all atomic values to text nodes,
especially when performing static-type analysis, because we lose useful
type information. [For example,
    <date>{ xs:date("2003-03-18") }</date>
should be normalized to
    element date { xs:date("2003-03-18") }
rather than
    element date { text { "2003-03-18" } }
because the latter loses useful type info.] To preserve useful type
information, we distinguish between direct element constructors that
contain one element-content unit and those that contain more than one..."

    If you perform STA on these two CompElemConstructors (using
    4.7.3.1 / STA / rule (1|2)), you find that
        element date { xs:date("2003-03-18") }
    fails at premise 4, because xs:date is not a subtype of
        attribute *, (element | text | comment | processing-instruction)*

    And even if you fix those premises to handle atomic types, you wind
    up with the same type for the two CompElemConstructors, i.e. either
        element date of type xs:anyType
    or
        element date of type xs:untyped
    depending only on statEnv.constructionMode. The type of the content
    expression doesn't have any effect.
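
    A toy model of that observation, as a sketch only: it assumes the
    premises have already been fixed to accept atomic content, and it
    assumes that "preserve" yields xs:anyType and "strip" yields
    xs:untyped. The function name and the string encoding of types are
    hypothetical and do not come from the FS document.

```python
# Toy model of the static type assigned to a computed element constructor.
# Assumption: construction mode "preserve" gives xs:anyType and "strip"
# gives xs:untyped. All names here are hypothetical, for illustration only.

def static_type_of_comp_elem(qname: str, content_type: str,
                             construction_mode: str) -> str:
    """Return the static type of `element qname { content }`.

    content_type is accepted but deliberately never consulted: the result
    depends only on construction_mode, which is the point being made.
    """
    if construction_mode == "preserve":
        return f"element {qname} of type xs:anyType"
    if construction_mode == "strip":
        return f"element {qname} of type xs:untyped"
    raise ValueError(f"unknown construction mode: {construction_mode!r}")


if __name__ == "__main__":
    # The two normalizations discussed above: atomic content vs. text node.
    for mode in ("preserve", "strip"):
        typed = static_type_of_comp_elem("date", "xs:date", mode)
        as_text = static_type_of_comp_elem("date", "text", mode)
        print(mode, "|", typed, "|", as_text, "| same:", typed == as_text)
```

    In either mode the two calls return the same type, so the choice of
    normalization is not visible in the constructor's static type.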

    Thus, although converting all atomic values to text nodes may lose
    type information, it's information that will be lost anyway. So there
    doesn't seem to be any reason to handle the n=1 case differently.

    (The word "especially" suggests that there are occasions *other* than
    STA when "we do not want to convert all atomic values to text
    nodes". What did you have in mind there?)

    (Section 4.7.1.1 has a similar split between the n=1 and n>1 cases,
    which is even less justified.)

Received on Thursday, 21 September 2006 03:57:51 UTC