
Re: LC-16 ( LC-132 ): Allow arbitrary order with occurrence > 1

From: William Jamieson <jamieson_william@jpmorgan.com>
Date: Mon, 27 Nov 2000 13:06:59 +0000
Message-ID: <3A225C73.A4835080@jpmorgan.com>
To: www-xml-schema-comments@w3.org

Here is a "concrete use case" with an arbitrary sequence of repeating elements ...

When modeling financial derivatives the risk engineer will typically compose the trade from a toolkit of financial "widgets".
In the part of the model where the refix behaviour is defined he/she will create an arbitrary sequence of formulae, let's give
them names such as "applySpread", "rateToYield", "round", "observeRate", "applyCapFloor", "knockIn", "knockOut" etc.  In an
instance document (in our case a message) the order in which these can be combined is arbitrary, and many of these formulae can
be repeated. In the following I have omitted the (often substantial) content within each formula so as not to obscure the
point...

<refixCashflow>
    <observeMarketRate> ...etc... </observeMarketRate>
    <round> ...etc... </round>
    <yieldToRate> ...etc...</yieldToRate>
    <applySpread> ...etc... </applySpread>
    <round> ...etc... </round>
    <applyCapFloor> ... etc... </applyCapFloor>
    ... etc ...
</refixCashflow>

Tomorrow they may create a trade where the <applyCapFloor> is performed before the <applySpread> and <round> is only
performed on the final blended rate.  The constraints that the proposed "all" group imposes make this type of structure very
cumbersome to model.
This is an instance of a general class of documents that describe a workflow or procedure in which the number and order of the
procedural steps are arbitrary.  Intellectually, for the purposes of validation, imposing strict sequencing on data that
is in a self-describing hierarchical format seems Byzantine and, at a more practical level, the performance-based justification
for it is poor: it simply moves the processing burden.
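
As a minimal sketch of the order-independent validation argued for here, the check below counts the children of
<refixCashflow> and compares each count with a per-element (min, max) bound, ignoring order entirely. The bounds in
BOUNDS and the function name validate_counts are hypothetical, chosen only for illustration; the message does not state
any occurrence limits.

```python
from collections import Counter

# Hypothetical occurrence bounds (min, max); None means unbounded.
# These limits are assumptions for illustration, not taken from the message.
BOUNDS = {
    "observeMarketRate": (1, 1),
    "round": (0, None),
    "yieldToRate": (0, 1),
    "applySpread": (0, None),
    "applyCapFloor": (0, 1),
}

def validate_counts(children, bounds):
    """Return a list of violations; an empty list means the children are valid.

    Only the number of occurrences of each element name is checked,
    so any ordering of the same children gives the same result.
    """
    counts = Counter(children)
    errors = []
    for name in counts:
        if name not in bounds:
            errors.append(f"unexpected element <{name}>")
    for name, (lo, hi) in bounds.items():
        n = counts.get(name, 0)
        if n < lo:
            errors.append(f"<{name}> occurs {n} times, minimum is {lo}")
        if hi is not None and n > hi:
            errors.append(f"<{name}> occurs {n} times, maximum is {hi}")
    return errors

# The element order from the example message above.
today = ["observeMarketRate", "round", "yieldToRate",
         "applySpread", "round", "applyCapFloor"]
# A different trade: cap/floor before the spread, rounding only at the end.
tomorrow = ["observeMarketRate", "yieldToRate", "applyCapFloor",
            "applySpread", "round"]

print(validate_counts(today, BOUNDS))
print(validate_counts(tomorrow, BOUNDS))
```

Both orderings pass under these assumed bounds because only the counts are inspected. The order of the steps still
carries meaning for the application that evaluates the trade; it is just not constrained by validation.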

regards,
WPJ

> Re: LC-16 ( LC-132 ): Allow arbitrary order with occurrence > 1
>
> From: C. M. Sperberg-McQueen (cmsmcq@acm.org)
> Date: Wed, Oct 11 2000
>
>   ------------------------------------------------------------------------
>
> Message-Id: <4.3.2.7.1.20001011090517.00b50848@espanola.com>
> Date: Wed, 11 Oct 2000 09:47:04 -0600
> To: "Martin J. Duerst" <duerst@w3.org>
> From: "C. M. Sperberg-McQueen" <cmsmcq@acm.org>
> Cc: "Martin Gudgin" <marting@develop.com>, "Schema Comments" <www-xml-schema-comments@w3.org>, "Dan Rupe" <Dan_Rupe@go.com>
> Subject: Re: LC-16 ( LC-132 ): Allow arbitrary order with occurrence > 1
>
> At 2000-10-11 02:40, Martin J. Duerst wrote:
> >Hello Martin,
> >
> >In summary, I have to say that I'm not at all satisfied with the
> >decision of the WG, and even less by the justification given below.
>
> I'm sorry to hear that, but thank you for letting us know.
>
> >>1.    complexity for schema processors
> >
> >It's a simple matter of counting, isn't it? I don't understand why
> >this should be difficult. For the current version of the all group,
> >a bit vector is needed to check that each element does occur according
> >to the occurrence constraints. This has to be bumped up to a vector of
> >integers. My guess is that this would take a few minutes in XSV, in fact
> >it may be easier to implement from scratch, because there are no special
> >restrictions on minOccurs and maxOccurs.
>
> A bit vector is one way (I believe a fairly common one) of implementing
> the and-connector; it is, however, not the only way.
>
> Any formalism for defining languages is better at some things than
> at others; adding ad hoc rules for what are thought to be special cases
> is not usually thought to be the way to improve a system.  Is there a
> reason to think that counting occurrences in the way you suggest will be
> an exception to the general rule?  Is there a general rule that suggests
> a reason why we ought to expect this to be a common construct?  Could
> you give a concrete use case for allowing an arbitrary sequence of
> a, b, c, and d elements where (a) the sequence of the elements is
> significant, (b) each element must occur some distinct number of times
> (a one to four times, b exactly once, c ten to thirty times, and d
> exactly three times)?  I have no trouble imagining users who say that
> is what they want; I am having trouble imagining a case where they
> are right.
>
> >If there is something I have missed, please tell me.
>
> Only the general principle that ad hoc solutions lead to odd hack
> systems.
>
> >>2.    the fact that the interpretation usually desired is incompatible with
> >>that of SGML's ampersand connector
> >
> >I'm not sure I understand that. The all group is already different from
> >SGML '&' anyway. And the interpretation is straightforward. The main
> >simplification is provided by the fact that an all group can only
> >occur directly in an element, without any children groups.
> >I'm not at all suggesting to change that.
>
> Every all group currently legal has a straightforward translation into
> an SGML ampersand group which has exactly the same interpretation.
> This is not true of the construct you propose.
>
> >>3.    the feeling on the part of some WG members that this is not a pattern
> >>of document design to be recommended or supported.
> >
> >There are definitely many cases where such a pattern is not desired.
> >But there are definitely also cases where it's very helpful to have
> >them. A typical example is metadata, e.g. the HTML <head> element.
> >There, the <title> element can appear only once, the <meta> element
> >can appear many times, and so on. The same thing can be expressed
> >without this feature, but the resulting content models get clumsy
> >and error-prone. For an example, please see
> >http://lists.w3.org/Archives/Public/xmlschema-dev/2000Aug/0017.html.
>
> With respect, the correct content model here does not seem to me
> clumsy, and once the notion of deterministic content models is
> clearly understood it is not hard to write, either:
>
>    <element name='A'>
>      <complexType content='elementOnly'>
>        <sequence>
>          <element ref='test:B' minOccurs='0' maxOccurs='unbounded'/>
>          <sequence minOccurs="0" maxOccurs="1">
>            <element ref='test:C' minOccurs='1' maxOccurs="1"/>
>            <element ref='test:B' minOccurs='0' maxOccurs='unbounded'/>
>          </sequence>
>        </sequence>
>      </complexType>
>    </element>
>
> Or more compactly:  <!ELEMENT A (B*, (C, B*)?) >
>
> A language which accepts a sequence of A, B, and C elements, with
> at most one A and at most one B is a bit more complex, but not too
> hard to work out.
>
>    (c*, ((a, c*, (b, c*)?) | (b, c*, (a, c*)?))?)
>
> The translation into regular expressions becomes tedious if there are
> more than two items for which the maximum cardinality is bounded but
> larger than, say, three.  If I were aware of lots of cases where such
> languages were The Right Thing, I would be working a lot harder to
> find good ways to integrate support for them into languages like
> XML DTDs and XML Schema.  But so far I don't know any serious examples
> and so I am left cold by the argument that writing a regular expression
> which counts up to various numbers for various child elements is
> too hard.
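
One way to check that the two-element content model quoted above, (c*, ((a, c*, (b, c*)?) | (b, c*, (a, c*)?))?),
accepts the same sequences as the counting rule "at most one a and at most one b" is to compare the two by brute
force. This is only a sketch: each element name is abbreviated to a single letter, the DTD expression is hand-translated
into a Python regular expression, and the function names are made up for the example.

```python
import re
from itertools import product

# Hand translation of the quoted content model into a Python regex,
# one letter per element and the DTD commas dropped:
#   (c*, ((a, c*, (b, c*)?) | (b, c*, (a, c*)?))?)
CONTENT_MODEL = re.compile(r"c*(?:ac*(?:bc*)?|bc*(?:ac*)?)?")

def by_regex(children: str) -> bool:
    """Accept the child sequence if the whole string matches the content model."""
    return CONTENT_MODEL.fullmatch(children) is not None

def by_counting(children: str) -> bool:
    """Accept the child sequence if a and b each occur at most once."""
    return children.count("a") <= 1 and children.count("b") <= 1

def disagreements(max_len: int) -> list[str]:
    """List every sequence over {a, b, c} up to max_len where the two checks differ."""
    found = []
    for n in range(max_len + 1):
        for letters in product("abc", repeat=n):
            s = "".join(letters)
            if by_regex(s) != by_counting(s):
                found.append(s)
    return found

print(disagreements(6))
```

The comparison covers only sequences up to the chosen length, so it is a sanity check rather than a proof. The
counting function stays two comparisons long however the bounds change, while the regular expression has to be
rewritten by hand for each new set of bounds.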
>
> >For another example, please see
> >http://slow1.w3.org/TR/xhtml-modularization/dtd_module_defs.html#a_module_Base:
> ><!ENTITY % head.content
> >     "( %HeadOpts.mix;,
> >      ( ( %title.qname;, %HeadOpts.mix;, ( %base.qname;, %HeadOpts.mix; )? )
> >      | ( %base.qname;, %HeadOpts.mix;, ( %title.qname;, %HeadOpts.mix; ))))"
> > >
>
> A nice example of precisely the pattern shown above.  I don't think this
> is hard to understand; do you?
>
> I agree that it would be simpler to write and the result would be easier to
> understand if the rules against non-deterministic content models were
> eliminated.  But those rules have, in the view of the WG, compensating
> advantages (they enable a guarantee that any schema language can be
> written as an LL(1) language, for example, which means that recursive
> descent parsers are easy to write).
>
> >It is obvious that such things can be avoided for new designs, but
> >it is questionable that this is always desirable, because it is
> >a burden for a user to learn an arbitrary element sequence.
>
> I agree that it is unpleasant for users to have to learn arbitrary
> sequences of elements.  But this is necessary only when using tools
> which have no support for syntax-directed editing.  Any SGML or
> XML editor with schema awareness will remove the necessity for the
> user to learn an arbitrary sequence of elements.
>
> >Also, it is not clear to me why the current all group is considered
> >a recommended or supportable design, whereas the changes I propose
> >are not.
>
> The current all-group closely models the rules for dumping or loading
> rows in a relational table; this is one place where arbitrary order
> has been most consistently desired by users.
>
> >>It would be helpful to us to know whether you are satisfied with the
> >>decision taken by the WG on this issue, or wish your dissent from the
> >>WG's decision to be recorded for consideration by the Director of
> >>the W3C.
> >
> >I not only wish the dissent to be recorded, I wish the decision to
> >be better explained and if possible reverted.
>
> Your dissent has been recorded.  I hope the paragraphs above have
> made the decision clearer.
>
> -Michael Sperberg-McQueen
Received on Monday, 27 November 2000 08:09:19 GMT
