- From: Michael Rys <mrys@microsoft.com>
- Date: Wed, 11 Feb 2004 13:22:16 -0800
- To: <antoine.mensch@xquarkgroup.com>, <public-qt-comments@w3.org>
First: We decided a long time ago in the WG that type inference of the result is not a requirement, but a feature that some implementations may provide.

Second: If you provide that, I do not see a problem with inferring a type that defines a content model that accepts the result. In the worst case it is element(C, xs:anyType)...

Best regards
Michael

> -----Original Message-----
> From: Antoine Mensch [mailto:antoine.mensch@xquarkgroup.com]
> Sent: Wednesday, February 11, 2004 1:20 PM
> To: Michael Rys; public-qt-comments@w3.org
> Subject: RE: Serialization (sometimes) needs to include type information
>
> > While I agree with Mike Kay's comments, let me just point out
> > that using a construction will retype your C elements to the type
> > required by the result element and will not preserve the original
> > type annotation.
>
> I've looked at the element constructor semantics: while I've always
> considered it a bit "strange" for a strongly-typed language to lose all
> type information during a simple clone operation, I think the approach is
> OK as long as you can easily retype the cloned nodes by simply defining the
> appropriate XML schema components (possibly reusing components from the
> source schemas). The whole point of my message is to point out an important
> class of XQuery expressions (union-like XPath expressions, including the
> very popular descendant-or-self step) for which such a retyping is very
> difficult without proper support from the serialization, or perhaps I should
> say XQuery processor, since you are right in pointing out that it is at the
> level of the element constructor that the type information is lost.
>
> Without such support, executing even the simplest queries over SOAP would
> be very difficult, because there would be no way for the service to retrieve
> and pass along to the user the type of the data returned by a query such as
>
>   doc("myDocument")//C
>
> as soon as there is more than one type of C element in the document.
>
> > So, if you want to get the original types in the data model
> > instance that your query generates, you would need to write:
> >
> >   for $x in doc("myDocument")//C
> >   return $x
>
> And how can I use this result? If C elements contain either xs:string or
> xs:date, how do I serialize this sequence of nodes without losing the type
> information? I cannot define a content model for this sequence, so it is only
> by looking at the individual nodes that I can retrieve the information.
>
> Best regards,
>
> Antoine Mensch
>
> > -----Original Message-----
> > From: Michael Rys [mailto:mrys@microsoft.com]
> > Sent: Wednesday, February 11, 2004 20:11
> > To: antoine.mensch@xquarkgroup.com; public-qt-comments@w3.org
> > Subject: RE: Serialization (sometimes) needs to include type information
> >
> > While I agree with Mike Kay's comments, let me just point out
> > that using a construction will retype your C elements to the type
> > required by the result element and will not preserve the original
> > type annotation.
> >
> > So, if you want to get the original types in the data model
> > instance that your query generates, you would need to write:
> >
> >   for $x in doc("myDocument")//C
> >   return $x
> >
> > As soon as you add <result> around it, you will get the content
> > retyped (see the section on Typing and element construction in
> > the language document at [1]).
> >
> > Best regards
> > Michael
> >
> > [1] http://www.w3.org/TR/2003/WD-xquery-20031112/#id-type-of-constructed
> >
> > > -----Original Message-----
> > > From: Antoine Mensch [mailto:antoine.mensch@xquarkgroup.com]
> > > Sent: Wednesday, February 11, 2004 2:58 AM
> > > To: public-qt-comments@w3.org; Michael Rys
> > > Subject: RE: Serialization (sometimes) needs to include type information
> > >
> > > > Note that if you validate the result according to an in-scope
> > > > schema component of an element result, then the elements inside
> > > > will be typed according to that type and not the original type.
> > > > So again, there is no type ambivalence and the schema of the
> > > > result element can be easily provided in addition to the data generated.
> > >
> > > I will try to be more concrete:
> > > I have two C elements of type ns1:Type1 and ns1:Type2 in a document. I
> > > assume that all necessary in scope information is available and that I am
> > > using validation mode.
> > >
> > > The element declaration for the result element in the query
> > >
> > > <result>
> > > {
> > >   for $x in doc("myDocument")//C
> > >   return $x
> > > }
> > > </result>
> > >
> > > should be written (if I want to retain type information in the result
> > > document) as:
> > >
> > > <xs:element name="result">
> > >   <xs:complexType>
> > >     <xs:choice maxOccurs="unbounded">
> > >       <xs:element name="C" type="ns1:Type1"/>
> > >       <xs:element name="C" type="ns1:Type2"/>
> > >     </xs:choice>
> > >   </xs:complexType>
> > > </xs:element>
> > >
> > > Unfortunately, this is not a valid schema component. The only solution to
> > > have a valid element declaration is to declare the result element C with a
> > > common supertype of both ns1:Type1 and ns1:Type2 (which will be xs:anyType
> > > in the general case).
> > >
> > > Therefore, in order to retain the original type information (which could
> > > well be something as simple as knowing whether the C element content is a
> > > date or a string, when ns1:Type1 and ns1:Type2 are respectively xs:date
> > > and xs:string), I need to annotate each occurrence of C in the output with
> > > the original type information.
> > >
> > > Best regards,
> > >
> > > Antoine Mensch
> > >
> > > > -----Original Message-----
> > > > From: Michael Rys [mailto:mrys@microsoft.com]
> > > > Sent: Wednesday, February 11, 2004 11:28
> > > > To: antoine.mensch@xquarkgroup.com; public-qt-comments@w3.org
> > > > Subject: RE: Serialization (sometimes) needs to include type information
> > > >
> > > > Note that if you validate the result according to an in-scope
> > > > schema component of an element result, then the elements inside
> > > > will be typed according to that type and not the original type.
> > > > So again, there is no type ambivalence and the schema of the
> > > > result element can be easily provided in addition to the data generated.
> > > >
> > > > Best regards
> > > > Michael
> > > >
> > > > > -----Original Message-----
> > > > > From: public-qt-comments-request@w3.org [mailto:public-qt-comments-request@w3.org] On Behalf Of Antoine Mensch
> > > > > Sent: Wednesday, February 11, 2004 12:54 AM
> > > > > To: public-qt-comments@w3.org
> > > > > Subject: RE: Serialization (sometimes) needs to include type information
> > > > >
> > > > > > First, you need to give us some more information about the in-scope
> > > > > > schema components and validation mode for your query.
> > > > >
> > > > > I would like to have an in-scope schema component allowing me to validate
> > > > > (so I assume strict or lax validation) the "result" element and construct
> > > > > a PSVI that will contain the same (or equivalent) type information as the
> > > > > source data. What I am trying to express is that I cannot write such a
> > > > > valid complex type, due to restrictions in the schema specification
> > > > > (elements with the same name must have the same type in a given content
> > > > > model).
> > > > >
> > > > > > Assuming that you imported the two elements below, have the document
> > > > > > typed with the information and have lax validation mode, then your
> > > > > > result would be an element result of type xdt:untyped since it could not
> > > > > > find a definition in the schema components for the result element. This
> > > > > > then also means that the C elements will be untyped and not preserve
> > > > > > their original type.
> > > > >
> > > > > Yes, I understand that. The point is that I cannot write such a schema
> > > > > component for the "result" element.
> > > > >
> > > > > > So your example does not convey the semantics that you assume it does
> > > > > > and does not require a type serialization.
> > > > >
> > > > > I hope the above clarifies why it does require type serialization.
> > > > >
> > > > > > Also note that XML is primarily late typed data: You have the
> > > > > > self-describing XML document and you associate type information after
> > > > > > creation of a document. Thus, mandating the serialization of type
> > > > > > information and thus making the document early typed seems contrary to
> > > > > > the general XML philosophy.
> > > > >
> > > > > It seems to me that the xsi:type attribute (which I think is not contrary
> > > > > to the general XML philosophy) has been introduced for exactly that purpose.
> > > > >
> > > > > In addition, consider the following extract from the Data Model spec (§4):
> > > > > "Constructing an Infoset from an instance of the data model, for example
> > > > > in order to perform schema validity assessment, is accomplished by
> > > > > serializing the document and parsing it. Implementations are not required
> > > > > to implement this process literally, but they must obtain the same result
> > > > > as if they had."
> > > > >
> > > > > This is impossible if we cannot specify the type of each individual C
> > > > > element in the result (through xsi:type).
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Antoine Mensch
> > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: public-qt-comments-request@w3.org [mailto:public-qt-comments-request@w3.org] On Behalf Of Antoine Mensch
> > > > > > > Sent: Wednesday, February 11, 2004 12:23 AM
> > > > > > > To: public-qt-comments@w3.org
> > > > > > > Subject: Serialization (sometimes) needs to include type information
> > > > > > >
> > > > > > > Consider the following schema fragment:
> > > > > > >
> > > > > > > <xs:element name="A">
> > > > > > >   <xs:complexType>
> > > > > > >     <xs:sequence>
> > > > > > >       <xs:element name="C" type="myns:Type1"/>
> > > > > > >     </xs:sequence>
> > > > > > >   </xs:complexType>
> > > > > > > </xs:element>
> > > > > > >
> > > > > > > <xs:element name="B">
> > > > > > >   <xs:complexType>
> > > > > > >     <xs:sequence>
> > > > > > >       <xs:element name="C" type="myns:Type2"/>
> > > > > > >     </xs:sequence>
> > > > > > >   </xs:complexType>
> > > > > > > </xs:element>
> > > > > > >
> > > > > > > Now if we consider a document (or any other data source)
> > > > > > > containing both A and B elements, the following query
> > > > > > >
> > > > > > > <result>
> > > > > > > { for $x in doc("myDocument")//C
> > > > > > >   return $x
> > > > > > > }
> > > > > > > </result>
> > > > > > >
> > > > > > > returns a result that cannot be strongly typed without losing type
> > > > > > > information by any valid schema, as the schema spec forbids elements
> > > > > > > with the same name and a different type in the same content model.
> > > > > > >
> > > > > > > It seems to me that the only way of retaining type information would
> > > > > > > be to annotate produced C elements with xsi:type. This could be a
> > > > > > > serialization parameter, similar to cdata-section-elements. However,
> > > > > > > this would raise another issue, as anonymous type names would then be
> > > > > > > exposed, and would thus need to be handled in a consistent way by
> > > > > > > different XQuery and XML Schema processors.
> > > > > > >
> > > > > > > This issue is important, especially for tools that perform distributed
> > > > > > > XQuery processing, and that need to retain consistent type information
> > > > > > > when moving XML data from one processing node to another.
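
The xsi:type annotation proposed in the original message can be illustrated with a small sketch. It assumes Python's standard xml.etree.ElementTree rather than any XQuery processor; the helper name serialize_with_types and the (text, type name) pairs standing in for typed C nodes are hypothetical, and the xs:date / xs:string types are taken from the example discussed above. It only shows what a per-element annotation in the serialized result might look like, not how a serialization parameter would be specified.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
XS = "http://www.w3.org/2001/XMLSchema"


def serialize_with_types(items):
    """Serialize (text, type_name) pairs as C elements carrying xsi:type.

    `items` is a hypothetical stand-in for a sequence of typed C nodes:
    each pair holds the element content and the lexical QName of its type.
    """
    # Ask ElementTree to use the conventional "xsi" prefix on output.
    ET.register_namespace("xsi", XSI)
    result = ET.Element("result")
    # Declare the "xs" prefix so the QName values inside xsi:type resolve.
    result.set("xmlns:xs", XS)
    for text, type_name in items:
        c = ET.SubElement(result, "C")
        # Annotate each occurrence of C with its own type.
        c.set(f"{{{XSI}}}type", type_name)
        c.text = text
    return ET.tostring(result, encoding="unicode")


if __name__ == "__main__":
    print(serialize_with_types([("2004-02-11", "xs:date"), ("hello", "xs:string")]))
```

A receiving node could then read the xsi:type attribute of each C element to recover the per-element type, which a single content model for "result" cannot express.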
Received on Wednesday, 11 February 2004 16:22:24 UTC