- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Tue, 10 Aug 2010 18:25:37 +0100
- To: Casey Jordan <casey.jordan@jorsek.com>
- Cc: Michael Kay <mike@saxonica.com>, "Cheney, Edward A SSG RES USAR USARC" <austin.cheney@us.army.mil>, xmlschema-dev@w3.org, www-dom@w3.org
Casey Jordan writes:
> <sequence>
> <element ref="e1"/>
> <choice minOccurs="2">
> <element ref="e2" maxOccurs="2"/>
> <element ref="e3"/>
> </choice>
> <element ref="e4"/>
> </sequence>
>
> This can be seen as a regular expression e1( e2{1,2} | e3 ){2,2} e4. My
> greedy algorithm cannot validate this correctly.
>
> Assuming input like <e1/><e2/><e2/><e4/>
This problem is discussed in [1], which asserts that you need to fall
back to pseudo-parallel or backtracking in such cases. The Python
code in XSV, which I pointed you to yesterday, implements this
fallback.
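
As a rough illustration of the backtracking option, here is a minimal sketch. It is not the XSV code and is not taken from [1]. The names Elem, Seq, Choice, match, match_once and valid are hypothetical, invented for this example, and the model assumes the choice is read as {2,2}, as in the quoted regular expression.

```python
from dataclasses import dataclass
from typing import Iterator, Sequence, Tuple, Union


# Hypothetical particle classes; lo/hi stand in for minOccurs/maxOccurs.
@dataclass(frozen=True)
class Elem:
    name: str
    lo: int = 1
    hi: int = 1


@dataclass(frozen=True)
class Seq:
    items: Tuple["Particle", ...]
    lo: int = 1
    hi: int = 1


@dataclass(frozen=True)
class Choice:
    items: Tuple["Particle", ...]
    lo: int = 1
    hi: int = 1


Particle = Union[Elem, Seq, Choice]


def match_once(p: Particle, toks: Sequence[str], i: int) -> Iterator[int]:
    """Yield every position reachable after ONE occurrence of p from i."""
    if isinstance(p, Elem):
        if i < len(toks) and toks[i] == p.name:
            yield i + 1
    elif isinstance(p, Seq):
        def go(k: int, pos: int) -> Iterator[int]:
            if k == len(p.items):
                yield pos
            else:
                for nxt in match(p.items[k], toks, pos):
                    yield from go(k + 1, nxt)
        yield from go(0, i)
    else:  # Choice: try each alternative in turn
        for alt in p.items:
            yield from match(alt, toks, i)


def match(p: Particle, toks: Sequence[str], i: int, count: int = 0) -> Iterator[int]:
    """Yield every position reachable after lo..hi occurrences of p from i."""
    if count >= p.lo:
        yield i  # enough occurrences seen; stopping here is one option
    if count < p.hi:
        for nxt in match_once(p, toks, i):
            # Skip empty repetitions once lo is satisfied, so we cannot loop.
            if nxt > i or count < p.lo:
                yield from match(p, toks, nxt, count + 1)


def valid(model: Particle, toks: Sequence[str]) -> bool:
    """True if some way of matching the model consumes all of toks."""
    return any(end == len(toks) for end in match(model, toks, 0))


# e1 ( e2{1,2} | e3 ){2,2} e4, following the quoted regular expression
MODEL = Seq((
    Elem("e1"),
    Choice((Elem("e2", 1, 2), Elem("e3")), 2, 2),
    Elem("e4"),
))

print(valid(MODEL, ["e1", "e2", "e2", "e4"]))  # True
print(valid(MODEL, ["e1", "e2", "e4"]))        # False
```

A greedy matcher would let the first occurrence of the choice take both e2 elements and then find nothing left for the second occurrence. The generators above instead hand back every possible end position, so the caller can retry with a single e2 in the first occurrence.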
ht
--
Henry S. Thompson, School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 651-1426, e-mail: ht@inf.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/
[mail from me _always_ has a .sig like this -- mail without it is forged spam]
Received on Tuesday, 10 August 2010 17:26:25 UTC