Re: [xml-dev] New release (2.8) of XSV

From: Daniel Veillard <daniel@veillard.com>
Date: Mon, 11 Oct 2004 15:14:27 +0200
To: "Henry S. Thompson" <ht@inf.ed.ac.uk>
Cc: daniel@veillard.com, Jeff Rafter <lists@jeffrafter.com>, xmlschema-dev@w3.org, xml-dev@lists.xml.org
Message-ID: <20041011131427.GC11346@daniel.veillard.com>

On Sun, Oct 10, 2004 at 10:57:32PM +0100, Henry S. Thompson wrote:
> Daniel Veillard <daniel@veillard.com> writes:
> 
> > libxml2 regexps used counters since day 1 for min/maxoccurs implementation.
> > The explosion didn't look a supportable alternative to me as it opens
> > the door to trivial DoS attacks or forces to break the schemas validation
> > which is also a big problem if you consider schemas as a contract between
> > two communicating parties.
> 
> I would be very interested, as I'm sure would others, in a description
> of your algorithm.

  Well, nothing fancy really. You need to add state to the regexp; in that
case the state is the number of times you went through the transition
labelled by the element name:namespace pair. Due to the single particle
rule (Unique Particle Attribution), you never need to roll back the state
to enter a different transition labelled by the same name:namespace pair.
This means the state required grows only linearly with the number of
counted transitions in your regexp: one integer counter per such
transition in the regexp runtime structure is sufficient. What I'm using
is actually closer to automata with state than to pure regexps.
The generated construct for something like x{a,b} involves
3 states (IIRC), a couple of epsilon transitions and the counted regexp,
but it's not really rocket science either; it's very similar to what I
was taught in the classroom.
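
To make the idea concrete, here is a minimal Python sketch of a counter-based
matcher. It is not libxml2's code, only an illustration under assumptions: the
content model is a plain sequence of particles, adjacent particles never
compete for the same name (so no rollback is needed), and the names `Counted`
and `matches` are hypothetical. It keeps one integer counter per counted
particle instead of expanding x{a,b} into a..b copies.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Counted:
    """One particle with occurrence bounds (hypothetical helper type)."""
    name: str            # stands in for the name:namespace pair
    lo: int              # minOccurs
    hi: Optional[int]    # maxOccurs, None meaning unbounded


def matches(particles: List[Counted], names: List[str]) -> bool:
    """Check a list of element names against a sequence of counted particles.

    Assumes the model is deterministic: once a particle is left,
    it is never re-entered, so counters are never rolled back.
    """
    counters = [0] * len(particles)   # one integer per counted transition
    i = 0                             # index of the current particle (state)
    for name in names:
        while True:
            if i == len(particles):
                return False          # input left over after the last particle
            p = particles[i]
            if name == p.name and (p.hi is None or counters[i] < p.hi):
                counters[i] += 1      # take the counted transition
                break
            if counters[i] < p.lo:
                return False          # cannot leave: minimum not reached
            i += 1                    # epsilon move to the next particle
    # At end of input, every remaining particle must have met its minimum.
    return all(counters[j] >= particles[j].lo for j in range(i, len(particles)))


model = [Counted("x", 2, 3), Counted("y", 1, 1)]
print(matches(model, ["x", "x", "y"]))        # accepted
print(matches(model, ["x", "y"]))             # rejected: too few x
print(matches(model, ["x"] * 4 + ["y"]))      # rejected: too many x
```

Memory use depends on the number of particles, not on the values of the
bounds, so a schema with maxOccurs="100000" costs the same as one with
maxOccurs="3".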

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | 
Received on Monday, 11 October 2004 13:15:14 UTC