
Re: Is <pattern value="(.)+\.(gif|jpg|jpeg|bmp)"/> allowed?

From: Ross Thompson <rthompson@contivo.com>
Date: Wed, 17 Oct 2001 15:47:20 -0700
Message-ID: <15310.2680.447480.324093@localhost.localdomain>
To: Stanley Guan <Stanley.Guan@oracle.com>
CC: xmlschema-dev@w3.org
Stanley Guan writes:
 > So, you're saying that
 >   given a <pattern value=".+\.(gif|jpg|jpeg|bmp)"/>
 >   and a string to be validated such as "foo.bmp"
 > The matcher should do something like this:
 >       Use the first pattern piece (i.e., ".+") in the matching.
 >       It matches the whole string "foo.bmp", but other pattern
 >       pieces remain unused, so the first matching try is not
 >       successful.  The matcher then backs up one code point
 >       (i.e., gives "p" back for further matching).
 >       "p" doesn't match the next pattern piece ("\."),
 >       so the matcher backs up one more code point (i.e., gives
 >       "m" back for further matching), and so on.
 >       Not until the matcher gives ".bmp" back for further
 >       matching will it find a good match using
 >       the WHOLE pattern. Namely,
 >           "foo"  matched by ".+",
 >           "." matched by "\.", and
 >           "bmp" matched by "(gif|jpg|jpeg|bmp)".
 > Is that what a matcher is supposed to do?
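
The back-up procedure described above can be sketched in Python. This is
only an illustration hard-coded to this one pattern, not a general regular
expression engine; the names EXTENSIONS and match_with_backup are
hypothetical and do not come from any schema processor.

```python
# Illustration only: hand-coded for .+\.(gif|jpg|jpeg|bmp), not a general engine.
EXTENSIONS = ("gif", "jpg", "jpeg", "bmp")


def match_with_backup(s):
    """Return (prefix, dot, extension) if the WHOLE string matches, else None."""
    # ".+" first takes the whole string, then gives back one code point
    # at a time until the rest of the pattern can match what was given back.
    for end in range(len(s), 0, -1):
        prefix, rest = s[:end], s[end:]
        # In XML Schema patterns "." matches anything except newline and return.
        if "\n" in prefix or "\r" in prefix:
            continue
        # The rest must be a literal "." followed by one of the alternatives.
        if rest.startswith(".") and rest[1:] in EXTENSIONS:
            return prefix, ".", rest[1:]
    return None


print(match_with_backup("foo.bmp"))
```

For "foo.bmp" the loop rejects the candidates "", "p", "mp" and "bmp" for
the rest of the pattern before ".bmp" succeeds, which is the sequence of
back-ups walked through in the quoted text.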

That is a correct description of the functionality.  In fact, regular
expression matchers can be written that accept exactly the same strings
with a more efficient algorithm that does not involve any backup.
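
As a rough illustration of a matcher that never backs up, the Python sketch
below reads the string once, left to right, and keeps the set of all
pattern positions that could be active after each code point. It is
hand-built for .+\.(gif|jpg|jpeg|bmp) only, and the names START, PREFIX,
step and matches are assumptions made for this example, not part of any
real implementation.

```python
# Illustration only: a state-set simulation hand-built for
# .+\.(gif|jpg|jpeg|bmp); it is not a general regular expression engine.
EXTENSIONS = ("gif", "jpg", "jpeg", "bmp")
START = "START"    # nothing consumed yet
PREFIX = "PREFIX"  # ".+" has consumed at least one code point


def step(states, ch):
    """Advance every active state by one code point."""
    nxt = set()
    for st in states:
        if st in (START, PREFIX):
            if ch not in "\n\r":      # "." matches anything but newline/return
                nxt.add(PREFIX)       # ".+" keeps consuming
            if st == PREFIX and ch == ".":
                # This "." may also be the literal "\."; start every alternative.
                nxt.update((ext, 0) for ext in EXTENSIONS)
        else:
            ext, i = st               # i code points of ext matched so far
            if i < len(ext) and ext[i] == ch:
                nxt.add((ext, i + 1))
    return nxt


def matches(s):
    """True if the WHOLE string matches; each code point is read exactly once."""
    states = {START}
    for ch in s:
        states = step(states, ch)
        if not states:
            return False
    return any(isinstance(st, tuple) and st[1] == len(st[0]) for st in states)


print(matches("foo.bmp"))
```

Because all possibilities are tracked at once, the question of whether a
given "." belongs to ".+" or to "\." never has to be guessed and later
undone; the work per code point is bounded by the size of the pattern.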

- Ross

Cynic, n.  A blackguard whose faulty vision sees things as they are,
not as they ought to be.		-- Ambrose Bierce
Received on Wednesday, 17 October 2001 18:47:24 UTC
