[whatwg] Web Forms 2.0 patterns...

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 16 Dec 2004 22:18:09 +0000 (UTC)
Message-ID: <Pine.LNX.4.61.0412162205480.18787@dhalsim.dreamhost.com>

On Mon, 13 Dec 2004, Steve Webster wrote:
> 
> Firstly, this is my first post to this list and I'm not sure if I'm 
> violating any kind of list etiquette by just posting straight out. I'm 
> just a meager web developer, so I'm not even sure that I have any right 
> to be commenting on this specification, but I thought I'd give it a go.

Your input is more than welcome! You are exactly the kind of person whose 
feedback is most valuable. (Having said that, don't take it personally if 
I disagree with some of your comments!)


> I'm concerned about the implicit start-of- and end-of-string anchors 
> that are to be applied to a pattern. While I appreciate that the 
> majority of use cases would likely require exact user input matching, I 
> would argue that developers could not reasonably anticipate that these 
> anchors would be applied. Indeed, I can think of no other implementation 
> of regular expressions that operates in this way, and I fear that this 
> will only serve to confuse developers already familiar with other 
> regular expression implementations.

The thinking behind the requirement is that it is easier to catch mistakes 
made by people who assume that the pattern doesn't include implied anchors 
than mistakes made by people who assume that it does.

In addition, since most patterns are full-string patterns (in fact, I 
can't think of any useful patterns that aren't), it reduces the clutter, 
which is always good for regular expressions.

For instance, if a user wants to require a four-digit PIN, and he doesn't 
know that ^/$ are implied, he would say:

   pattern="^[0-9]{4}$"

...and it would work. Or he might forget them, or assume they would be 
there (most pattern matching in form systems that don't use regexps does 
assume that patterns are full-match):

   pattern="[0-9]{4}"

...and it would work. In both cases, simple tests would show that it 
worked, and the author would be happy.
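
To make those two cases concrete, here is a rough ECMAScript sketch. The 
helper name matchesWholeValue is made up for illustration, and wrapping 
the author's pattern in ^(?:...)$ is only an assumption about how a UA 
might apply the implied anchors:

```javascript
// Hypothetical helper: emulates a pattern attribute with implied ^ and $.
function matchesWholeValue(pattern, value) {
  // The non-capturing group keeps alternations such as "a|b" fully anchored.
  return new RegExp("^(?:" + pattern + ")$").test(value);
}

const explicitAnchors = "^[0-9]{4}$"; // author wrote the anchors himself
const impliedAnchors = "[0-9]{4}";    // author left them out

// Both spellings accept and reject exactly the same values.
for (const pin of ["1234", "12345", "12a4"]) {
  console.log(
    pin,
    matchesWholeValue(explicitAnchors, pin),
    matchesWholeValue(impliedAnchors, pin)
  );
}
```

Under that assumption, redundant explicit anchors are harmless, so neither 
author notices any difference.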

If the author didn't want to have implied ^/$, and didn't know they were 
implied, and wrote a pattern that would match any string that contained 
the word "yes":

   pattern="yes"

...he would immediately find that it didn't work, since it only matches 
that exact string, and even basic testing would catch that.
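
Sketched in ECMAScript, again with a made-up helper (matchesWholeValue) 
that assumes the implied anchors are applied by wrapping the pattern in 
^(?:...)$, the surprise shows up on the very first test string:

```javascript
// Hypothetical helper: emulates a pattern attribute with implied ^ and $.
function matchesWholeValue(pattern, value) {
  return new RegExp("^(?:" + pattern + ")$").test(value);
}

// The author expected a "contains" match, but only the exact string passes.
console.log(matchesWholeValue("yes", "yes"));           // true
console.log(matchesWholeValue("yes", "oh yes please")); // false

// Spelling out the surrounding text gives the intended behaviour.
console.log(matchesWholeValue(".*yes.*", "oh yes please")); // true
```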

On the other hand, if we _didn't_ imply those characters, and an author 
assumed they were there (due to, for instance, experience with other form 
systems), and wanted to match a four-digit PIN:

   pattern="[0-9]{4}"

...he would in simple testing find it worked fine, but would likely miss 
the fact that a five-digit PIN would also be accepted.
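
A quick ECMAScript sketch of that failure mode, using plain RegExp test() 
as a stand-in for a hypothetical pattern attribute with no implied 
anchors:

```javascript
// Unanchored, as the attribute would behave if ^ and $ were NOT implied.
const unanchored = new RegExp("[0-9]{4}");

console.log(unanchored.test("1234"));       // true  (the author's simple test)
console.log(unanchored.test("12345"));      // true  (five digits slip through)
console.log(unanchored.test("abc1234xyz")); // true  (so does surrounding junk)

// Anchored, for comparison.
const anchored = new RegExp("^[0-9]{4}$");
console.log(anchored.test("12345"));        // false
```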


> It could also be argued that developers who might use such an advanced 
> feature are likely to also have good knowledge of (and be using patterns 
> in conjunction with) ECMAScript and its regular expressions, which do 
> not work in the same way as the proposed pattern attribute would.

I believe that, for most people who use this feature, it will be their 
first exposure to regexps.


> I feel that the justification given in the specification for these 
> implicit anchors - chiefly that it is easier to pick up an error in your 
> pattern when anchors are implicitly added - is a little optimistic. It 
> means that the regular expression you see in the source code is no 
> longer a true representation of what will be fed to the regular 
> expression engine, and without prior knowledge that these anchors are 
> implicitly added (and with no realistic hope of browser debug 
> information on how it parsed the pattern) many developers would be left 
> confused as to why their regular expression works in ECMAScript but not 
> in their web form.

Could you give a realistic example of where that could be a problem?

Cheers,
-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 16 December 2004 14:18:09 UTC
