Re: [xml-dev] New release (2.8) of XSV from Henry S. Thompson on 2004-10-08 (xmlschema-dev@w3.org from October 2004)

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Fri, 08 Oct 2004 17:30:50 +0100
To: Jeff Rafter <lists@jeffrafter.com>
Cc: xmlschema-dev@w3.org, xml-dev@lists.xml.org
Message-ID: <f5b8yahqiit.fsf@erasmus.inf.ed.ac.uk>

Jeff Rafter <lists@jeffrafter.com> writes:

> Does this mean that you have modified the approach set forth in
> http://www.ltg.ed.ac.uk/~ht/XML_Europe_2003.html? 

Yes.  That approach unfolds numeric exponents, producing a number of
states which grows linearly with maxOccurs for elements and
non-nesting groups, and grows exponentially with nesting depth for
nesting groups.  For example, if you have (a, b{2,5}, c){1,100} the
number of states is Order(500).
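
A rough way to see where that number comes from is to count one copy
of a particle's body per permitted occurrence.  The sketch below is
only an illustration under that assumption: the Particle class and
unfolded_size are hypothetical names, maxOccurs is assumed finite,
and this is not XSV's actual automaton construction, so the exact
count differs from the real one by a small constant factor.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Particle:
    """Hypothetical content-model particle: an element or a sequence group."""
    name: str
    min_occurs: int = 1
    max_occurs: int = 1          # assumed finite for this illustration
    children: List["Particle"] = field(default_factory=list)


def unfolded_size(p: Particle) -> int:
    """Crude state estimate when every occurrence gets its own copy."""
    # An element contributes one state per copy; a group contributes the
    # sum of its children's unfolded sizes per copy.
    body = sum(unfolded_size(c) for c in p.children) if p.children else 1
    return p.max_occurs * body


# (a, b{2,5}, c){1,100}
model = Particle("group", 1, 100,
                 [Particle("a"), Particle("b", 2, 5), Particle("c")])
print(unfolded_size(model))      # 700, i.e. Order(500)

# Wrapping the same group in one more {1,100} level multiplies again.
outer = Particle("outer", 1, 100, [model])
print(unfolded_size(outer))      # 70000
```

Each extra level of nesting multiplies the estimate by that level's
maxOccurs, which is the blow-up described above.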

The new approach is linear in the number of particles, because it uses
counters, not unfolding.
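
The algorithm itself is not described in this message, so the
following is only a sketch of the general idea of counting
occurrences instead of copying particles.  It is a simple
position-set matcher rather than an FSM, it handles only elements
and sequence groups, and every name in it (Particle, match_ends,
is_valid) is hypothetical.  The point it illustrates is that the
content model is never duplicated, so the size of maxOccurs does not
affect the size of the model.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Particle:
    """Hypothetical particle: an element (no children) or a sequence group."""
    name: str
    min_occurs: int = 1
    max_occurs: int = 1
    children: List["Particle"] = field(default_factory=list)


def match_ends(p: Particle, tokens: List[str], start: int) -> Set[int]:
    """Positions at which p can finish matching when it begins at start."""

    def once(pos: int) -> Set[int]:
        # Match exactly one occurrence of p's body starting at pos.
        if not p.children:
            hit = pos < len(tokens) and tokens[pos] == p.name
            return {pos + 1} if hit else set()
        ends = {pos}
        for child in p.children:
            ends = {e for s in ends for e in match_ends(child, tokens, s)}
        return ends

    results: Set[int] = {start} if p.min_occurs == 0 else set()
    frontier = {start}
    count = 0                      # the counter that replaces unfolding
    while frontier and count < p.max_occurs:
        new = {e for s in frontier for e in once(s)}
        if new == frontier:
            # Body matched nothing new: more repetitions change nothing.
            results |= frontier
            break
        frontier = new
        count += 1
        if count >= p.min_occurs:
            results |= frontier
    return results


def is_valid(model: Particle, tokens: List[str]) -> bool:
    """True if the whole token sequence is accepted by the model."""
    return len(tokens) in match_ends(model, tokens, 0)


# (a, b{2,5}, c){1,1000000000}: a huge exponent costs nothing extra here.
model = Particle("group", 1, 10**9,
                 [Particle("a"), Particle("b", 2, 5), Particle("c")])
print(is_valid(model, "a b b c".split()))            # True
print(is_valid(model, "a b c".split()))              # False (only one b)
print(is_valid(model, "a b b c a b b b c".split()))  # True
```

The loop stops as soon as no position can be extended, so the work
done depends on the length of the input rather than on maxOccurs.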

> If so, do you have any metrics on changes to the time complexity?

Should not change the _runtime_ complexity at all, which was always
the normal FSM complexity.  Changes the _compile-time_ complexity
significantly, from exponential to linear.

The really good news is that this approach doesn't require XSV to punt
in the face of large exponents, which it used to do (i.e. it treated
all numbers > 100 in min/maxOccurs as if they _were_ 100).  I believe
all other existing processors do something similar (that is, punt
above some number).

I hope to write up the new algorithm RSN . . .

ht
-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                   URL: http://www.ltg.ed.ac.uk/~ht/
[mail really from me _always_ has this .sig -- mail without it is forged spam]

Received on Friday, 8 October 2004 16:30:56 UTC