- From: Erik Bruchez <ebruchez@orbeon.com>
- Date: Tue, 12 Sep 2017 22:01:37 -0700
- To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Cc: Steven Pemberton <Steven.Pemberton@cwi.nl>, XForms <public-xformsusers@w3.org>
- Message-ID: <CAAc0PEUMXf721yQJg8YFyf9AvBKFXNim4jVSniSHTkxyAeZ+jw@mail.gmail.com>
A few random thoughts:

When I gave phone numbers a little thought in the past, I was tempted to
separate input and output completely:

- Input would ignore pretty much anything besides digits (except maybe
  for the "+" sign).
- Output would format the number according to a particular country's (or
  form author's) expectations, if the input is valid.

We had a requirement, not implemented yet, to handle UK phone formats
specifically, and this is pretty complicated if you want to validate the
number of digits more strictly. Some pointers here:

https://github.com/orbeon/orbeon-forms/issues/3248

I realize that maybe a phone number type wouldn't have to validate the
number of digits as a first approximation.

-Erik

On Tue, Sep 12, 2017 at 11:38 AM, C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> wrote:

> On Sep 12, 2017, at 3:43 AM, Steven Pemberton <Steven.Pemberton@cwi.nl> wrote:
> >
> > While we're at it, since it is a regularly occurring field in forms,
> > how about adding a telephone number type?
> >
> > I went looking for suitable standards to reference, and I found
> > this:
> >
> > E.123 : Notation for national and international telephone
> > numbers, e-mail addresses and web addresses
> > http://www.itu.int/rec/T-REC-E.123-200102-I/en
> >
> > This is actually about how to represent telephone numbers on printed
> > materials, but seems usable.
> >
> > It doesn't give a formal syntax, only examples and descriptions,
> > from which I extract the following syntax:
> >
> > telephone: international | local.
> > international: "+" digits+
> > local: prefix? digits+
> > prefix: "(" digits+ ")"
> > digits: digit (spacing digit)?
> > digit: ["0"-"9"]
> > spacing: " " | "-"
> >
> > However, looking at
> > https://en.wikipedia.org/wiki/National_conventions_for_writing_telephone_numbers
> > some countries seem to expect to be able to bracket the area code
> > even in an international number,
>
> In some business cards in my collection, parentheses are used not
> around an area code but around the country code, sometimes with the
> leading "+" inside the parentheses. A card (from someone in the EU
> bureaucracy, which I expect means the pattern is not idiosyncratic to
> one individual) has:
>
>   (+352) 999 99-99999
>
> Here and elsewhere 9s replace some digits, in the interests of data
> privacy.
>
> Other examples of this pattern (i.e. "(" "+"? country-code ")")
> include numbers in the Czech Republic and (with no plus sign) Canada.
>
>   (+420) 999 999 999
>   (1) 905 999 9999
>
> Sometimes parentheses are used around a part of the number used in
> some cases but not in others. In all the cases I remember seeing,
> it’s a zero between the country code and the in-country number; the
> parentheses appear to mean “when calling internationally, omit the
> zero, and when calling within the country omit the country code and
> include the zero”. E.g.
>
>   +44 (0) 9999 999999
>
> I think I have seen this most often in German phone numbers, but the
> examples in my collection of business cards include ones from the UK,
> the Netherlands, and Sweden, as well (and most of my German cards
> don't use this pattern, which tells me something but I'm not sure what).
>
> > so we should be a bit laxer; for instance:
> >
> > telephone: prefix? area? digits+
> > prefix: "+" digits+
> > area: "(" digits+ ")"
> > digits: digit (spacing digit)?
> > digit: ["0"-"9"]
> > spacing: " "+ | "-"
> >
> > I think this regexp covers it:
> >
> > "+"? digs+ ("(" digs+ ")" digs+)?
> >
> > where:
> >
> > digs: [0-9] ("-" [0-9])?
> >
> > and after every terminal there may follow spaces.
>
> My collection of business cards includes examples with the following
> characters used as separators in addition to parentheses, blank, and
> '-':
>
>   . (i.e. full stop)
>   · (i.e. mid-dot)
>   /
>   "  " (i.e. two blanks, or extra-wide blank)
>   |
>
> I don’t know how important it might be to support such variations. As
> far as I know, they have no well established meaning, although in my
> examples / is used only (without white space) between the area code
> and the local number and | is sometimes used only (with white space on
> either side) between the country code and the in-country number (and
> in other cases as the sole separator character).
>
> > That produces this monstrosity:
> >
> > <pattern value="^\ *\+?([0-9]\ *(\-\ *[0-9]\ *)?)+(\(([0-9]\ *(\-\ *[0-9]\ *)?)+\)\ *([0-9]\ *(\-\ *[0-9]\ *)?)+)?\ *$"/>
> >
> > which you can try out here (scroll to the bottom):
> > http://homepages.cwi.nl/~steven/forms/tests/email.xml
> >
> > Let me know if you find cases that don't work.
>
> In addition to the examples given above (some of which work and others
> of which don't), some cards in my collection have numbers with the
> patterns:
>
>   800,999.9999 x349
>   (1) 905 999 9999 x99999
>
> If extensions are part of a number, these need to work; if not, not.
>
> Also: it appears to be a consequence of the regex that any two hyphens
> must be separated by at least two decimal digits. So the following
> Czech number attested in my collection is legal (it appears that '2'
> is one of the area codes for Prague)
>
>   +420 (2) 9999 9999
>
> and so is a number with just blanks:
>
>   +420 2 9999 9999
>
> but the following variant is not legal:
>
>   +420-2-9999-9999
>
> I have not seen anything with that last illegal pattern in my
> collection, but the other examples in the collection do provide plenty
> of examples of (1) separators before and after area codes, (2) use of
> whitespace as the only separator, and (3) use of hyphen as the only
> separator.
> So I am inclined to think it might appear in Real Life.
>
> Do any telephone systems in the world still use lettered exchanges?
> The first telephone number I learned as a child began not "366" but
> "EM-6" or "Emerson 6". If that convention is still in use anywhere,
> letters will be needed to represent it.
>
> And of course many commercial organizations use numbers that spell out
> words; the phone-in number for the quiz show "Whad'Ya Know" (now
> defunct), which ran on public radio stations in the U.S. for thirty
> years, was 1-800-WHA-KNOW. Do such numbers need to be supported?
>
> To try to boil it down, the following number patterns are not
> supported by the regex given; whether they should be is in each case a
> policy question:
>
>   a (+352) 999 99-99999
>   b 999.999.9999
>   c 0711/9999-999
>   d 613  999-9999 (two blanks or one-em space after area code)
>   e +33 | 99 99 99 99
>   f 800,999.9999 x349
>   g (1) 905 999 9999 x99999
>   h +420-2-9999-9999
>   i EM6-9999
>   j 1-800-WHA-KNOW
>
> The following number patterns are supported (but use parentheses to
> enclose something other than an area code):
>
>   z (1) 905 999 9999
>   y +44 (0) 9999 999999
>
> Pattern h seems plausible but is not attested in the collection of
> business cards I examined; patterns i and j are not attested but
> specifications of phone numbers in this form are reasonably well
> attested (though for pattern i all the attestations may be decades old
> by now).
>
> I thank you for an entertaining couple of hours. (The project leader
> to whom I will explain in 45 minutes that I got nothing done on that
> project this morning because I was thinking about telephone numbers
> may however be less inclined to thank you. Oh, well.)
>
> ********************************************
> C. M. Sperberg-McQueen
> Black Mesa Technologies LLC
> cmsmcq@blackmesatech.com
> http://www.blackmesatech.com
> ********************************************
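
For anyone who wants to poke at the hyphen behaviour discussed in the quoted message, here is a minimal sketch in Python. It is only an approximation of the quoted pattern: the escaped blanks and hyphens are written as plain characters, the ^ and $ anchors are replaced by re.fullmatch, and the names DIGS, PHONE and is_accepted are made up for this example. An XForms processor's regex dialect may differ in details, so treat the results as indicative rather than authoritative.

```python
import re

# One "digs" unit from the quoted grammar: a digit, optionally followed by
# a hyphen and a second digit, with optional blanks after every terminal.
# (DIGS, PHONE and is_accepted are hypothetical names for this sketch.)
DIGS = r"[0-9] *(?:- *[0-9] *)?"

# "+"? digs+ ("(" digs+ ")" digs+)?  with optional leading/trailing blanks.
PHONE = re.compile(
    rf" *\+?(?:{DIGS})+(?:\((?:{DIGS})+\) *(?:{DIGS})+)? *"
)


def is_accepted(number: str) -> bool:
    """Return True if the whole string matches the translated pattern."""
    return PHONE.fullmatch(number) is not None


SAMPLES = [
    "+420 (2) 9999 9999",   # parenthesised area code
    "+420 2 9999 9999",     # blanks only
    "+420-2-9999-9999",     # only one digit between two hyphens
    "+420-22-9999-9999",    # two digits between the hyphens
    "+44 (0) 9999 999999",  # parenthesised trunk zero
    "999.999.9999",         # full stop as separator
    "0711/9999-999",        # slash as separator
    "1-800-WHA-KNOW",       # letters
]

if __name__ == "__main__":
    for sample in SAMPLES:
        verdict = "accepted" if is_accepted(sample) else "rejected"
        print(f"{sample!r}: {verdict}")
```

The third and fourth samples differ only in the number of digits between the first two hyphens, which makes them a convenient check of the "at least two decimal digits" observation.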
Received on Wednesday, 13 September 2017 05:02:23 UTC