- From: Ross Thompson <rthompson@contivo.com>
- Date: Wed, 17 Oct 2001 15:47:20 -0700
- To: Stanley Guan <Stanley.Guan@oracle.com>
- CC: xmlschema-dev@w3.org
Stanley Guan writes:
> So, you're saying that
> given a <pattern value=".+\.(gif|jpg|jpeg|bmp)"/>
> and a string to be validated such as "foo.bmp",
> the matcher should do something like this:
> Use the first pattern piece (i.e., ".+") in the matching.
> It matches the whole string "foo.bmp", but other pattern
> pieces remain unused, so the first matching attempt is
> not successful. The matcher then backs up one code point
> (i.e., gives "p" back for further matching).
>
> "p" doesn't match the next pattern piece ("\."),
> so the matcher backs up one more code point (i.e., gives
> "m" back for further matching), and so on.
>
> Not until the matcher has given ".bmp" back for further
> matching will it find a good match using
> the WHOLE pattern. Namely,
> "foo" matched by ".+",
> "." matched by "\.", and
> "bmp" matched by "(gif|jpg|jpeg|bmp)".
>
> Is that what a matcher is supposed to do?
That is a correct description of the functionality, that is, of the
result a matcher must produce. It need not work that way internally,
though: regular expression matchers can be written that do the right
thing with a more efficient algorithm that does not involve any backup.
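
To make the "no backup" idea concrete, here is a minimal sketch in Python
(not from the original exchange) that checks the pattern above by tracking
the set of states the match could be in after each code point, so no input
is ever re-read. The state names, the `step` and `matches` helpers, and the
hand-built automaton are illustrative assumptions for this one pattern, not
a general regular expression engine and not any particular validator's
implementation.

```python
# Hand-built state-set simulation for the single pattern
#   .+\.(gif|jpg|jpeg|bmp)
# Each input code point is read exactly once; nothing is backed up.
# All names here are hypothetical, chosen only for illustration.

EXTENSIONS = ("gif", "jpg", "jpeg", "bmp")
START, BODY = "start", "body"   # before ".+" / inside ".+"
LINE_BREAKS = "\n\r"            # "." in XML Schema patterns excludes these


def step(state, ch):
    """Return the set of states reachable from `state` after reading `ch`."""
    if state == START or state == BODY:
        if ch in LINE_BREAKS:
            return set()                      # "." cannot match a line break
        if state == BODY and ch == ".":
            # Either ".+" keeps consuming, or this dot is the literal "\.".
            return {BODY, ("ext", "")}
        return {BODY}
    # Otherwise the state is ("ext", prefix): part of an extension was read.
    _, prefix = state
    grown = prefix + ch
    if any(ext.startswith(grown) for ext in EXTENSIONS):
        return {("ext", grown)}
    return set()                              # this possibility dies out


def matches(text):
    """True if the WHOLE of `text` matches .+\\.(gif|jpg|jpeg|bmp)."""
    states = {START}
    for ch in text:
        states = set().union(*(step(s, ch) for s in states))
        if not states:
            return False                      # no possibility survives
    # Accept only if some surviving state has read a complete extension.
    return any(isinstance(s, tuple) and s[1] in EXTENSIONS for s in states)


if __name__ == "__main__":
    for name in ("foo.bmp", "a.b.gif", "foo.bmpx", ".bmp"):
        print(name, matches(name))
```

For "foo.bmp", the sketch keeps both possibilities alive when it reads the
dot (".+" still consuming, or "\." just matched) and simply drops whichever
one fails later, which is why it never has to give characters back.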
- Ross
---
Cynic, n. A blackguard whose faulty vision sees things as they are,
not as they ought to be. -- Ambrose Bierce
Received on Wednesday, 17 October 2001 18:47:24 UTC