- From: Erik Bruchez <ebruchez@orbeon.com>
- Date: Tue, 12 Sep 2017 22:05:00 -0700
- To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Cc: Steven Pemberton <Steven.Pemberton@cwi.nl>, XForms <public-xformsusers@w3.org>
- Message-ID: <CAAc0PEV3b-T5HNQCMS0Aipndg8J9foe29CDE0pgNQQUcdRKQqw@mail.gmail.com>
One piece of software I hit when doing a bit of research a while back is
this:

https://github.com/googlei18n/libphonenumber

"Google's common Java, C++ and JavaScript library for parsing, formatting,
and validating international phone numbers."

-Erik

On Tue, Sep 12, 2017 at 10:01 PM, Erik Bruchez <ebruchez@orbeon.com> wrote:

> A few random thoughts:
>
> When giving a little bit of thought to phone numbers in the past, I was
> tempted to separate input and output completely:
>
> - Input would ignore pretty much anything besides digits (except maybe for
>   the "+" sign).
> - Output would format the number according to a particular country's (or
>   form author's) expectations, if the input is valid.
>
> We had a requirement, not implemented yet, to handle specifically UK phone
> formats, and this is pretty complicated if you want to validate the number
> of digits more strictly. Some pointers here:
>
> https://github.com/orbeon/orbeon-forms/issues/3248
>
> I realize that maybe a phone number type wouldn't have to validate the
> number of digits as a first approximation.
>
> -Erik
>
>
> On Tue, Sep 12, 2017 at 11:38 AM, C. M. Sperberg-McQueen <
> cmsmcq@blackmesatech.com> wrote:
>
>> On Sep 12, 2017, at 3:43 AM, Steven Pemberton <Steven.Pemberton@cwi.nl>
>> wrote:
>>
>> > While we're at it, since it is a regularly occurring field in forms,
>> > how about adding a telephone number type?
>>
>> > I went looking for suitable standards to reference, and I found
>> > this:
>>
>> > E.123 : Notation for national and international telephone
>> > numbers, e-mail addresses and web addresses
>> > http://www.itu.int/rec/T-REC-E.123-200102-I/en
>>
>> > This is actually about how to represent telephone numbers on printed
>> > materials, but seems usable.
>>
>> > It doesn't give a formal syntax, only examples and descriptions,
>> > from which I extract the following syntax:
>>
>> > telephone: international | local.
>> > international: "+" digits+
>> > local: prefix? digits+
>> > prefix: "(" digits+ ")"
>> > digits: digit (spacing digit)?
>> > digit: ["0"-"9"]
>> > spacing: " " | "-"
>>
>> > However, looking at
>> > https://en.wikipedia.org/wiki/National_conventions_for_writing_telephone_numbers
>> > some countries seem to expect to be able to bracket the area code
>> > even in an international number,
>>
>> In some business cards in my collection, parentheses are used not
>> around an area code but around the country code, sometimes with the
>> leading "+" inside the parentheses. A card from someone in the EU
>> bureaucracy (which I expect means the pattern is not idiosyncratic to
>> one individual) has:
>>
>>     (+352) 999 99-99999
>>
>> Here and elsewhere 9s replace some digits, in the interests of data
>> privacy.
>>
>> Other examples of this pattern (i.e. "(" "+"? country-code ")")
>> include numbers in the Czech Republic and (with no plus sign) Canada.
>>
>>     (+420) 999 999 999
>>     (1) 905 999 9999
>>
>> Sometimes parentheses are used around a part of the number used in
>> some cases but not in others. In all the cases I remember seeing,
>> it’s a zero between the country code and the in-country number; the
>> parentheses appear to mean “when calling internationally, omit the
>> zero, and when calling within the country omit the country code and
>> include the zero”. E.g.
>>
>>     +44 (0) 9999 999999
>>
>> I think I have seen this most often in German phone numbers, but the
>> examples in my collection of business cards include ones from the UK,
>> the Netherlands, and Sweden, as well (and most of my German cards
>> don't use this pattern, which tells me something but I'm not sure what).
>>
>> > so we should be a bit laxer; for instance:
>>
>> > telephone: prefix? area? digits+
>> > prefix: "+" digits+
>> > area: "(" digits+ ")"
>> > digits: digit (spacing digit)?
>> > digit: ["0"-"9"]
>> > spacing: " "+ | "-"
>>
>> > I think this regexp covers it:
>>
>> > "+"? digs+ ("(" digs+ ")" digs+)?
>>
>> > where:
>>
>> > digs: [0-9] ("-" [0-9])?
>>
>> > and after every terminal there may follow spaces.
>>
>> My collection of business cards includes examples with the following
>> characters used as separators in addition to parentheses, blank, and
>> '-':
>>
>>     .    (i.e. full stop)
>>     ·    (i.e. mid-dot)
>>     /
>>     "  " (i.e. two blanks, or extra-wide blank)
>>     |
>>
>> I don’t know how important it might be to support such variations. As
>> far as I know, they have no well established meaning, although in my
>> examples / is used only (without white space) between the area code
>> and the local number and | is sometimes used only (with white space on
>> either side) between the country code and the in-country number (and
>> in other cases as the sole separator character).
>>
>> > That produces this monstrosity:
>>
>> > <pattern value="^\ *\+?([0-9]\ *(\-\ *[0-9]\ *)?)+(\(([0-9]\ *(\-\ *[0-9]\ *)?)+\)\ *([0-9]\ *(\-\ *[0-9]\ *)?)+)?\ *$"/>
>>
>> > which you can try out here (scroll to the bottom):
>>
>> > http://homepages.cwi.nl/~steven/forms/tests/email.xml
>>
>> > Let me know if you find cases that don't work.
>>
>> In addition to the examples given above (some of which work and others
>> of which don't), some cards in my collection have numbers with the
>> patterns:
>>
>>     800,999.9999 x349
>>     (1) 905 999 9999 x99999
>>
>> If extensions are part of a number, these need to work; if not, not.
>>
>> Also: it appears to be a consequence of the regex that any two hyphens
>> must be separated by at least two decimal digits.
>> So the following Czech number attested in my collection is legal (it
>> appears that '2' is one of the area codes for Prague)
>>
>>     +420 (2) 9999 9999
>>
>> and so is a number with just blanks:
>>
>>     +420 2 9999 9999
>>
>> but the following variant is not legal:
>>
>>     +420-2-9999-9999
>>
>> I have not seen anything with that last illegal pattern in my
>> collection, but the other examples in the collection do provide plenty
>> of examples of (1) separators before and after area codes, (2) use of
>> whitespace as the only separator, and (3) use of hyphen as the only
>> separator. So I am inclined to think it might appear in Real Life.
>>
>> Do any telephone systems in the world still use lettered exchanges?
>> The first telephone number I learned as a child began not "366" but
>> "EM-6" or "Emerson 6". If that convention is still in use anywhere,
>> letters will be needed to represent it.
>>
>> And of course many commercial organizations use numbers that spell out
>> words; the phone-in number for the quiz show "Whad'Ya Know" (now
>> defunct), which ran on public radio stations in the U.S. for thirty
>> years, was 1-800-WHA-KNOW. Do such numbers need to be supported?
>>
>> To try to boil it down, the following number patterns are not
>> supported by the regex given; whether they should be is in each case a
>> policy question:
>>
>>     a  (+352) 999 99-99999
>>     b  999.999.9999
>>     c  0711/9999-999
>>     d  613  999-9999  (two blanks or one-em space after area code)
>>     e  +33 | 99 99 99 99
>>     f  800,999.9999 x349
>>     g  (1) 905 999 9999 x99999
>>     h  +420-2-9999-9999
>>     i  EM6-9999
>>     j  1-800-WHA-KNOW
>>
>> The following number patterns are supported (but use parentheses to
>> enclose something other than an area code)
>>
>>     z  (1) 905 999 9999
>>     y  +44 (0) 9999 999999
>>
>> Pattern h seems plausible but is not attested in the collection of
>> business cards I examined; patterns i and j are not attested but
>> specifications of phone numbers in this form are reasonably well
>> attested (though for pattern i all the attestations may be decades old
>> by now).
>>
>> I thank you for an entertaining couple of hours. (The project leader
>> to whom I will explain in 45 minutes that I got nothing done on that
>> project this morning because I was thinking about telephone numbers
>> may however be less inclined to thank you. Oh, well.)
>>
>> ********************************************
>> C. M. Sperberg-McQueen
>> Black Mesa Technologies LLC
>> cmsmcq@blackmesatech.com
>> http://www.blackmesatech.com
>> ********************************************
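
For anyone who wants to check numbers against the pattern quoted in this thread without the online form, here is a minimal sketch. It assumes Python's `re` module as a stand-in for the form's own regex engine, which may differ in details; the names `PHONE_PATTERN`, `is_accepted` and `SAMPLES` are made up for illustration. The sample numbers are taken from the thread.

```python
import re

# Pattern copied from the <pattern value="..."/> quoted in the thread,
# split over two string literals only for readability.
# Assumption: Python's `re` behaves like the form's regex engine here.
PHONE_PATTERN = re.compile(
    r"^\ *\+?([0-9]\ *(\-\ *[0-9]\ *)?)+"
    r"(\(([0-9]\ *(\-\ *[0-9]\ *)?)+\)\ *([0-9]\ *(\-\ *[0-9]\ *)?)+)?\ *$"
)


def is_accepted(number: str) -> bool:
    """Return True if `number` matches the quoted pattern."""
    return PHONE_PATTERN.match(number) is not None


# Sample numbers from the thread (9s replace real digits).
SAMPLES = [
    "+44 (0) 9999 999999",
    "+420 (2) 9999 9999",
    "+420 2 9999 9999",
    "+420-2-9999-9999",
    "(+352) 999 99-99999",
    "999.999.9999",
    "0711/9999-999",
    "(1) 905 999 9999",
    "1-800-WHA-KNOW",
]

if __name__ == "__main__":
    for number in SAMPLES:
        verdict = "accepted" if is_accepted(number) else "rejected"
        print(f"{number!r}: {verdict}")
```

Under this reading of the pattern, a number that starts with an opening parenthesis, such as "(1) 905 999 9999", is rejected, because the pattern requires at least one digit before the "(". The quoted message lists that number as supported, so the online form may use a different pattern or engine than this sketch assumes.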
Received on Wednesday, 13 September 2017 05:05:49 UTC