- From: Michael Rys <mrys@microsoft.com>
- Date: Thu, 31 Oct 2002 09:53:25 -0800
- To: "Kay, Michael" <Michael.Kay@softwareag.com>, "Marko Smiljanic" <markosm@cs.utwente.nl>, <www-ql@w3.org>
I fully agree with Michael. Designing schemata that rely on positional
grouping is normally bad design (there are a few exceptions).

Best regards
Michael

> -----Original Message-----
> From: Kay, Michael [mailto:Michael.Kay@softwareag.com]
> Sent: Thursday, October 31, 2002 8:21 AM
> To: Marko Smiljanic; www-ql@w3.org
> Subject: RE: Things that one can define with XML schema but cannot query with XQuery?
>
> > This is a question about XML Schema and XQuery
> > <!-- Excuse me if this was already discussed, I had no time
> > to carefully read the whole history of this mailing list -->
> >
> > Let's take the part of some XML schema definition:
> >
> > <xs:element name="root">
> >   <xs:complexType>
> >     <xs:sequence maxOccurs="5">
> >       <xs:element name="A"/>
> >       <xs:element name="B" minOccurs="0"/>
> >     </xs:sequence>
> >   </xs:complexType>
> > </xs:element>
> >
> > It defines that <root> can "contain" from 1 up to 5 sequences
> > of two elements <A/> and <B/>. <B> does not have to exist.
> >
> > E.g. an XML instance conforming to the XML schema above:
> >
> > <root>
> >   <A>1</A>
> >   <B>2</B>
> >
> >   <A>3</A>
> >   <!-- non existing element B -->
> >
> >   <A>4</A>
> >   <B>5</B>
> > </root>
> >
> > We can assume that the XML schema designer had a specific
> > semantics in mind when he specified that a sequence <A/><B/>
> > should repeat itself. E.g. <A/> can be the name of a man and
> > <B/> can be his address. We thus have from 1 to 5 pairs of
> > person name / person address (where address does not have to
> > be specified). Note that each single person is represented by
> > one sequence. I can say that sequence has a clear semantics
> > i.e. 1 sequence = 1 person. (and each person has a name and
> > an address)
> >
> > My questions are:
> >
> > a) How can we specify a query in XQuery language saying that
> > I'm interested in the 2nd person's address (i.e. <B>). (The answer
> > should be empty for the example above).
>
> You are quite right that these structures are very difficult to query.
> These come up quite frequently in XSLT; I usually refer to them as
> "positional grouping" problems. There are two classes of solution: one
> involves a recursive function processing the sequence of siblings, the
> other involves treating it as a value-based grouping problem, using the
> ID of the start element of a group as the grouping key. Unfortunately the
> second solution relies heavily on the use of the generate-id() function
> and the sibling axes, neither of which is available in XQuery.
>
> The usual advice is that this is bad XML design. There is a level of
> hierarchy that's missing from the markup, an important object in the data
> model that isn't represented by an element in the XML. I would advise
> anyone to add this missing level (it can be done easily using the
> grouping facilities in XSLT 2.0) before storing the data in a database.
>
> > b) This is similar to the problem in a): is there any way
> > that I can count the number of A,B sequences in an
> > instance XML document (using XQuery). (The example above has
> > 3 sequences).
> >
> > If the answers to those questions are negative, then there is
> > a conflict between XML Schema and XQuery. Sequence, all and
> > choice are structures that can be defined in XML Schema, but
> > are not visible in the XML instance and might not be accessible
> > with XQuery. An XML parser can surely count them, but can
> > XQuery do the same?
>
> I don't regard it as a conflict. Just because Schema provides constructs
> that enable you to check that your data has a particular pattern doesn't
> mean that Query should be able to locate the objects implied by that
> pattern. If you want to use the structure in a query, add another level of
> elements to make it explicit.
>
> Michael Kay
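
For illustration only, the "add the missing level" advice quoted above can be sketched outside XQuery and XSLT. The Python below is a minimal sketch, not the method either Michael describes: it walks the children of <root> in document order, starts a new group at each <A>, and attaches the optional <B> that follows. The function names and the wrapper element name "person" are assumptions chosen for this example; nothing in the thread names them.

```python
import xml.etree.ElementTree as ET

# Instance document from the quoted question (comment omitted).
DOC = """<root>
  <A>1</A>
  <B>2</B>
  <A>3</A>
  <A>4</A>
  <B>5</B>
</root>"""


def group_positionally(root):
    """Pair each <A> with the <B> that immediately follows it, if any."""
    groups = []
    for child in root:
        if child.tag == "A":
            # An <A> always starts a new group.
            groups.append({"A": child, "B": None})
        elif child.tag == "B" and groups and groups[-1]["B"] is None:
            # The optional <B> belongs to the group opened by the last <A>.
            groups[-1]["B"] = child
        else:
            raise ValueError(f"unexpected element <{child.tag}>")
    return groups


def add_missing_level(root, wrapper="person"):
    """Build a new tree with one wrapper element per A/B group.

    The wrapper name "person" is hypothetical; pick whatever the
    data model actually calls this object.
    """
    new_root = ET.Element(root.tag)
    for group in group_positionally(root):
        parent = ET.SubElement(new_root, wrapper)
        for key in ("A", "B"):
            if group[key] is not None:
                ET.SubElement(parent, key).text = group[key].text
    return new_root


root = ET.fromstring(DOC)
groups = group_positionally(root)

count = len(groups)          # question b): number of A,B sequences
second_b = groups[1]["B"]    # question a): 2nd person's address, None if absent
explicit = add_missing_level(root)

print(count)
print(second_b)
print(ET.tostring(explicit, encoding="unicode"))
```

Once the wrapper level exists, both questions reduce to ordinary path expressions over the restructured document, for example count(/root/person) and /root/person[2]/B, which is the point of making the grouping explicit before the data is stored.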
Received on Thursday, 31 October 2002 12:53:58 UTC