
RE: Things that one can define with XML schema but cannot query with XQuery?

From: Michael Rys <mrys@microsoft.com>
Date: Thu, 31 Oct 2002 09:53:25 -0800
Message-ID: <5C39F806F9939046B4B1AFE652500A3A034B387D@RED-MSG-10.redmond.corp.microsoft.com>
To: "Kay, Michael" <Michael.Kay@softwareag.com>, "Marko Smiljanic" <markosm@cs.utwente.nl>, <www-ql@w3.org>

I fully agree with Michael. Designing schemata that rely on positional
grouping is normally bad design (there are a few exceptions). 

Best regards
Michael

> -----Original Message-----
> From: Kay, Michael [mailto:Michael.Kay@softwareag.com]
> Sent: Thursday, October 31, 2002 8:21 AM
> To: Marko Smiljanic; www-ql@w3.org
> Subject: RE: Things that one can define with XML schema but cannot query
> with XQuery?
> 
> 
> > This is a question about XML Schema and XQuery
> > <!-- Excuse me if this was already discussed, I had no time
> > to carefully read the whole history of this mailing list -->
> >
> > Let's take part of an XML Schema definition:
> >
> > <xs:element name="root">
> >     <xs:complexType>
> >         <xs:sequence maxOccurs="5">
> >             <xs:element name="A"/>
> >             <xs:element name="B" minOccurs="0"/>
> >         </xs:sequence>
> >     </xs:complexType>
> > </xs:element>
> >
> > It defines that <root> can "contain" from 1 up to 5 sequences
> > of two elements <A/> and <B/>.  <B> does not have to exist.
> >
> > E.g. an XML instance conforming to the XML schema above:
> >
> > <root>
> >     <A>1</A>
> >     <B>2</B>
> >
> >     <A>3</A>
> >     <!-- non existing element B -->
> >
> >     <A>4</A>
> >     <B>5</B>
> > </root>
> >
> > We can assume that the XML schema designer had specific
> > semantics in mind when he specified that a sequence <A/><B/>
> > should repeat itself. E.g. <A/> can be the name of a man and
> > <B/> can be his address. We thus have from 1 to 5 pairs of
> > person name / person address (where the address does not have
> > to be specified). Note that each single person is represented
> > by one sequence. I can say that the sequence has clear
> > semantics, i.e. 1 sequence = 1 person (and each person has a
> > name and an optional address).
> >
> > My questions are:
> >
> > a) How can we specify a query in the XQuery language saying
> > that I'm interested in the 2nd person's address (i.e. <B>)?
> > (The answer should be empty for the example above.)
> 
> You are quite right that these structures are very difficult to query. These
> come up quite frequently in XSLT, I usually refer to them as "positional
> grouping" problems. There are two classes of solution, one involves a
> recursive function processing the sequence of siblings, the other involves
> treating it as a value-based grouping problem, using the ID of the start
> element of a group as the grouping key. Unfortunately the second solution
> relies heavily on the use of the generate-id() function and the sibling
> axes, neither of which are available in XQuery.
> 
> The usual advice is that this is bad XML design. There is a level of
> hierarchy that's missing from the markup, an important object in the data
> model that isn't represented by an element in the XML. I would advise anyone
> to add this missing level (it can be done easily using the grouping
> facilities in XSLT 2.0) before storing the data in a database.
> 
> >
> > b) This is similar to the problem in a): is there any way
> > that I can count the number of A,B sequences in an
> > instance XML document (using XQuery)? (The example above has
> > 3 sequences.)
> >
> > If the answers to those questions are negative, then there is
> > a conflict between XML Schema and XQuery. Sequence, all and
> > choice are structures that can be defined in XML Schema, but
> > are not visible in the XML instance and might not be accessible
> > with XQuery. An XML parser can surely count them, but can
> > XQuery do the same?
> >
> 
> I don't regard it as a conflict. Just because Schema provides constructs
> that enable you to check that your data has a particular pattern doesn't
> mean that Query should be able to locate the objects implied by that
> pattern. If you want to use the structure in a query, add another level of
> elements to make it explicit.
> 
> Michael Kay
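
The positional grouping discussed in the quoted message can be illustrated
with a minimal sketch. It is written in Python with the standard
xml.etree.ElementTree module rather than in XQuery or XSLT, so it shows only
the idea (each <A> starts a new group), not what either language offers. The
function names and the <person> wrapper element are hypothetical and do not
come from the thread.

```python
import xml.etree.ElementTree as ET

# The instance from the question: three A/B groups, the second one without B.
DOC = """
<root>
    <A>1</A>
    <B>2</B>
    <A>3</A>
    <A>4</A>
    <B>5</B>
</root>
"""


def group_by_start(root, start_tag="A"):
    """Split the children of root into groups, each opened by a start_tag element."""
    groups = []
    for child in root:
        # A start element (or the very first child) opens a new group;
        # every other sibling joins the group that is currently open.
        if child.tag == start_tag or not groups:
            groups.append([child])
        else:
            groups[-1].append(child)
    return groups


def add_missing_level(root, wrapper_tag="person", start_tag="A"):
    """Build a new tree in which each group gets an explicit wrapper element."""
    new_root = ET.Element(root.tag)
    for group in group_by_start(root, start_tag):
        wrapper = ET.SubElement(new_root, wrapper_tag)
        wrapper.extend(group)
    return new_root


root = ET.fromstring(DOC)
groups = group_by_start(root)

# Question (b): how many A,B sequences does the instance contain?
group_count = len(groups)

# Question (a): the address (<B>) of the 2nd person; empty for this instance.
second_b = [e.text for e in groups[1] if e.tag == "B"]

# With the missing level made explicit, the same lookup is a plain path.
wrapped = add_missing_level(root)
second_b_explicit = wrapped.findall("person[2]/B")
```

For this particular schema every repetition of the sequence contains exactly
one <A>, so the number of groups always equals the number of <A> elements.
Once the wrapper level exists, question (a) no longer needs any positional
reasoning over siblings, which is the point of the advice to make the
structure explicit.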
Received on Thursday, 31 October 2002 12:53:58 UTC
