- From: Erik Bruchez <ebruchez@orbeon.com>
- Date: Tue, 12 Sep 2017 22:05:00 -0700
- To: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
- Cc: Steven Pemberton <Steven.Pemberton@cwi.nl>, XForms <public-xformsusers@w3.org>
- Message-ID: <CAAc0PEV3b-T5HNQCMS0Aipndg8J9foe29CDE0pgNQQUcdRKQqw@mail.gmail.com>
One piece of software I hit when doing a bit of research a while back is
this:
    https://github.com/googlei18n/libphonenumber
"Google's common Java, C++ and JavaScript library for parsing, formatting,
and validating international phone numbers."
-Erik
On Tue, Sep 12, 2017 at 10:01 PM, Erik Bruchez <ebruchez@orbeon.com> wrote:
> A few random thoughts:
>
> When giving a little bit of thought to phone numbers in the past, I was
> tempted to separate input and output completely:
>
> - Input would ignore pretty much anything besides digits (except maybe for
> the "+" sign).
> - Output would format the number according to a particular country's (or
> form author's) expectations, if the input is valid.
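The input half of that split can be sketched in a few lines of Python. This is only an illustration of the idea, not anything from Orbeon Forms; the function name `normalize_phone_input` is hypothetical, and treating a "+" as significant only when it comes before the first digit is an assumption.

```python
import re


def normalize_phone_input(raw: str) -> str:
    """Keep only digits, plus a leading '+' if one precedes the first digit.

    Hypothetical helper: every other character (spaces, hyphens,
    parentheses, dots, slashes, letters) is silently dropped.
    """
    # Find the first character that is either '+' or a digit.
    first = re.search(r"[+0-9]", raw)
    has_plus = first is not None and first.group() == "+"
    digits = re.sub(r"[^0-9]", "", raw)
    return ("+" if has_plus else "") + digits
```

With this sketch, "+44 (0) 9999 999999" and "(+352) 999 99-99999" both reduce to a "+" followed by digits only, and "0711/9999-999" reduces to its eleven digits. Formatting for output would be a separate, country-specific step.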
>
> We had a requirement, not implemented yet, to handle specifically UK phone
> formats, and this is pretty complicated if you want to validate the number
> of digits more strictly. Some pointers here:
>
>     https://github.com/orbeon/orbeon-forms/issues/3248
>
> I realize that maybe a phone number type wouldn't have to validate the
> number of digits as a first approximation.
>
> -Erik
>
>
> On Tue, Sep 12, 2017 at 11:38 AM, C. M. Sperberg-McQueen <
> cmsmcq@blackmesatech.com> wrote:
>
>> On Sep 12, 2017, at 3:43 AM, Steven Pemberton <Steven.Pemberton@cwi.nl>
>> wrote:
>>
>> > While we're at it, since it is a regularly occurring field in forms,
>> > how about adding a telephone number type?
>>
>> > I went looking for suitable standards to reference, and I found
>> > this:
>>
>> >       E.123 : Notation for national and international telephone
>> >       numbers, e-mail addresses and web addresses
>> >       http://www.itu.int/rec/T-REC-E.123-200102-I/en
>>
>> > This is actually about how to represent telephone numbers on printed
>> > materials, but seems usable.
>>
>> > It doesn't give a formal syntax, only examples and descriptions,
>> > from which I extract the following syntax:
>>
>> >  telephone: international | local.
>> >  international: "+" digits+
>> >  local: prefix? digits+
>> >  prefix: "(" digits+ ")"
>> >  digits: digit (spacing digit)?
>> >  digit: ["0"-"9"]
>> >  spacing: " " | "-"
>>
>> > However, looking at
>> > https://en.wikipedia.org/wiki/National_conventions_for_writing_telephone_numbers
>> > some countries seem to expect to be able to bracket the area code
>> > even in an international number,
>>
>> In some business cards in my collection, parentheses are used not
>> around an area code but around the country code, sometimes with the
>> leading "+" inside the parentheses.  A card from someone in the EU
>> bureaucracy (which I expect means the pattern is not idiosyncratic to
>> one individual) has:
>>
>>   (+352) 999 99-99999
>>
>> Here and elsewhere 9s replace some digits, in the interests of data
>> privacy.
>>
>> Other examples of this pattern (i.e. "(" "+"? country-code ")")
>> include numbers in the Czech Republic and (with no plus sign) Canada.
>>
>>   (+420) 999 999 999
>>   (1) 905 999 9999
>>
>> Sometimes parentheses are used around a part of the number used in
>> some cases but not in others.  In all the cases I remember seeing,
>> it’s a zero between the country code and the in-country number; the
>> parentheses appear to mean “when calling internationally, omit the
>> zero, and when calling within the country omit the country code and
>> include the zero”.  E.g.
>>
>>    +44 (0) 9999 999999
>>
>> I think I have seen this most often in German phone numbers, but the
>> examples in my collection of business cards include ones from the UK,
>> the Netherlands, and Sweden, as well (and most of my German cards
>> don't use this pattern, which tells me something but I'm not sure what).
>>
>> > so we should be a bit laxer; for instance:
>>
>> >  telephone: prefix? area? digits+
>> >  prefix: "+" digits+
>> >  area: "(" digits+ ")"
>> >  digits: digit (spacing digit)?
>> >  digit: ["0"-"9"]
>> >  spacing: " "+ | "-"
>>
>> > I think this regexp covers it:
>>
>> >       "+"? digs+ ("(" digs+ ")" digs+)?
>>
>> > where:
>>
>> >       digs: [0-9] ("-" [0-9])?
>>
>> > and after every terminal there may follow spaces.
>>
>> My collection of business cards includes examples with the following
>> characters used as separators in addition to parentheses, blank, and
>> '-':
>>
>>   . (i.e. full stop)
>>   · (i.e. mid-dot)
>>   /
>>   "  " (i.e. two blanks, or extra-wide blank)
>>   |
>>
>> I don’t know how important it might be to support such variations.  As
>> far as I know, they have no well-established meaning, although in my
>> examples / is used only (without white space) between the area code
>> and the local number, and | is sometimes used only (with white space on
>> either side) between the country code and the in-country number (and
>> in other cases as the sole separator character).
>>
>> > That produces this monstrosity:
>>
>> > <pattern value="^\ *\+?([0-9]\ *(\-\ *[0-9]\ *)?)+(\(([0-9]\ *(\-\ *[0-9]\ *)?)+\)\ *([0-9]\ *(\-\ *[0-9]\ *)?)+)?\ *$"/>
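The pattern is easier to read when it is assembled from the named pieces of the informal regexp. The following Python sketch does that; the names `SP`, `DIGS` and `PHONE` are introduced here for illustration only, and the `\ ` and `\-` escapes of the original are dropped because Python needs no escape for a space or a hyphen outside a character class.

```python
import re

SP = r" *"                               # spaces that may follow a terminal
DIGS = rf"[0-9]{SP}(-{SP}[0-9]{SP})?"    # digs: [0-9] ("-" [0-9])?

# "+"? digs+ ("(" digs+ ")" digs+)?
PHONE = rf"^{SP}\+?({DIGS})+(\(({DIGS})+\){SP}({DIGS})+)?{SP}$"

PHONE_RE = re.compile(PHONE)
```

Note that, as written, the pattern allows spaces after a digit, after a hyphen and after ")", but not directly after "+" or "(".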
>>
>> > which you can try out here (scroll to the bottom):
>>
>> >  http://homepages.cwi.nl/~steven/forms/tests/email.xml
>>
>> > Let me know if you find cases that don't work.
>>
>> In addition to the examples given above (some of which work and others
>> of which don't), some cards in my collection have numbers with the
>> patterns:
>>
>>   800,999.9999 x349
>>   (1) 905 999 9999 x99999
>>
>> If extensions are part of a number, these need to work; if not, not.
>>
>> Also: it appears to be a consequence of the regex that any two hyphens
>> must be separated by at least two decimal digits.  So the following
>> Czech number attested in my collection is legal (it appears that '2'
>> is one of the area codes for Prague)
>>
>>   +420 (2) 9999 9999
>>
>> and so is a number with just blanks:
>>
>>   +420 2 9999 9999
>>
>> but the following variant is not legal:
>>
>>   +420-2-9999-9999
>>
>> I have not seen anything with that last illegal pattern in my
>> collection, but the other examples in the collection do provide plenty
>> of examples of (1) separators before and after area codes, (2) use of
>> whitespace as the only separator, and (3) use of hyphen as the only
>> separator.  So I am inclined to think it might appear in Real Life.
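The hyphen rule described above follows from the `digs` production alone, so it can be checked on a reduced expression that ignores spaces, "+" and parentheses. A minimal Python sketch, with `DIGS_RUN` and `is_digs_run` as hypothetical names:

```python
import re

# digs+ with the spacing removed: each unit is one digit, optionally
# followed by a hyphen and exactly one more digit.
DIGS_RUN = re.compile(r"(?:[0-9](?:-[0-9])?)+")


def is_digs_run(text: str) -> bool:
    """True if the whole string is a sequence of digs units."""
    return DIGS_RUN.fullmatch(text) is not None
```

Here "9-99-9" is accepted because each hyphen sits inside its own digit pair, while "9-9-9" and "420-2-9999-9999" are rejected: the digit after a hyphen closes its unit, so it cannot also introduce the next hyphen.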
>>
>> Do any telephone systems in the world still use lettered exchanges?
>> The first telephone number I learned as a child began not "366" but
>> "EM-6" or "Emerson 6".  If that convention is still in use anywhere,
>> letters will be needed to represent it.
>>
>> And of course many commercial organizations use numbers that spell out
>> words; the phone-in number for the quiz show "Whad'Ya Know" (now
>> defunct), which ran on public radio stations in the U.S. for thirty
>> years, was 1-800-WHA-KNOW.  Do such numbers need to be supported?
>>
>> To try to boil it down, the following number patterns are not
>> supported by the regex given; whether they should be is in each case a
>> policy question:
>>
>>   a  (+352) 999 99-99999
>>   b  999.999.9999
>>   c  0711/9999-999
>>   d  613  999-9999 (two blanks or one-em space after area code)
>>   e  +33 | 99 99 99 99
>>   f  800,999.9999 x349
>>   g  (1) 905 999 9999 x99999
>>   h  +420-2-9999-9999
>>   i  EM6-9999
>>   j  1-800-WHA-KNOW
>>
>> The following number patterns are supported (but use parentheses to
>> enclose something other than an area code)
>>
>>   z  (1) 905 999 9999
>>   y  +44 (0) 9999 999999
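A selection of these cases can be replayed against the pattern as transcribed in this message. The sketch below assumes that Python's `re` module treats this particular expression the same way an XForms processor would, which has not been verified; the dictionary and function names are illustrative.

```python
import re

# The pattern from this thread, with the "\ " and "\-" escapes simplified.
PHONE = re.compile(
    r"^ *\+?([0-9] *(- *[0-9] *)?)+"
    r"(\(([0-9] *(- *[0-9] *)?)+\) *([0-9] *(- *[0-9] *)?)+)? *$"
)

REJECTED = {
    "a": "(+352) 999 99-99999",
    "b": "999.999.9999",
    "c": "0711/9999-999",
    "e": "+33 | 99 99 99 99",
    "f": "800,999.9999 x349",
    "g": "(1) 905 999 9999 x99999",
    "h": "+420-2-9999-9999",
    "i": "EM6-9999",
    "j": "1-800-WHA-KNOW",
}

ACCEPTED = {
    "y": "+44 (0) 9999 999999",
    "Prague, area code in parentheses": "+420 (2) 9999 9999",
    "Prague, blanks only": "+420 2 9999 9999",
}


def matches(number: str) -> bool:
    """True if the transcribed pattern accepts the number."""
    return PHONE.match(number) is not None


for label, number in {**REJECTED, **ACCEPTED}.items():
    print(f"{label:35} {number:28} {'ok' if matches(number) else 'rejected'}")
```

Keeping the cases in a table like this makes it cheap to rerun them whenever the pattern is relaxed, for example to admit "." or "/" as separators.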
>>
>> Pattern h seems plausible but is not attested in the collection of
>> business cards I examined; patterns i and j are not attested there
>> either, but specifications of phone numbers in this form are reasonably
>> well attested elsewhere (though for pattern i all the attestations may
>> be decades old by now).
>>
>> I thank you for an entertaining couple of hours.  (The project leader
>> to whom I will explain in 45 minutes that I got nothing done on that
>> project this morning because I was thinking about telephone numbers
>> may however be less inclined to thank you.  Oh, well.)
>>
>> ********************************************
>> C. M. Sperberg-McQueen
>> Black Mesa Technologies LLC
>> cmsmcq@blackmesatech.com
>> http://www.blackmesatech.com
>> ********************************************
>>
>>
>>
>>
>
Received on Wednesday, 13 September 2017 05:05:49 UTC