Re: Add scripts to XForms input-mode script list in Appendix E (PR#106)

From: Steven Pemberton <steven.pemberton@cwi.nl>
Date: Thu, 12 Jun 2008 16:23:03 +0200
To: "Martin Duerst" <duerst@it.aoyama.ac.jp>, "John Boyer" <boyerj@ca.ibm.com>
Cc: "Richard Ishida" <ishida@w3.org>, "Felix Sasaki" <fsasaki@w3.org>, "Forms WG" <public-forms@w3.org>
Message-ID: <op.ucm3kp0rsmjzpq@acer3010>

Hi Martin,

We are here at the Forms FtF and trying to come to some resolution on your  
last call comment.

Our basic problem (and why we originally asked if you would be willing to  
do the work) is that we don't understand the algorithm you used to select  
which entries in http://unicode.org/iso15924/iso15924-codes.html and  
http://java.sun.com/j2se/1.4.2/docs/api/java/lang/Character.UnicodeBlock.html  
should end up in the inputmode list.

Could you give us some help here: what is it about particular entries in
those two lists that makes them suitable or not for use as inputmode
values? Just to take one example early in the alphabet, why is cherokee in
the list, and cypriot not (both of which were added to the ISO list on the
same date)?

If you can help us understand this, maybe we can understand how to  
describe which future values are suitable for later addition.

Thanks!

Best wishes,

Steven

On Thu, 21 Feb 2008 07:48:58 +0100, Martin Duerst <duerst@it.aoyama.ac.jp>  
wrote:

> Hello John,
>
> Here's a resend of the mail I sent to Steven earlier this year.
>
>
> At 13:34 08/01/10, Martin Duerst wrote:
>> Hello Steven,
>>
>> Thanks for contacting me. Hope everything is well with you.
>>
>> [I cut out the thread because currently, my mailer seems to have
>> occasional weird problems with sending long messages.]
>>
>> At 01:30 08/01/10, Steven Pemberton wrote:
>>> Hi Martin,
>>>
>>> Any news on this?
>>
>> Well, yes and no. I have to admit that I had the editing token,
>> and didn't act on it. I also have to admit that I was a bit demotivated
>> by the fact that I did the actual work,
>
> That referred to my earlier creation of a list of script tokens that
> needed to be added.
>
>
>> and it would have been rather
>> easy for somebody on your side to contribute, e.g. at least for cross-
>> checking.
>>
>> But by chance, I got an idea that I think should meet all our
>> concerns in a simple way. What we want to do is to add tokens
>> for more scripts. You suggested that we simply say that other
>> scripts are also allowed. I responded that because there are
>> some irregularities/transformations with spelling, things are
>> not so easy. I still believe that to be the case, but I agree
>> that having to update the list by hand is work that we should
>> try to avoid. The solution to this may be quite easy, actually:
>>
>> Use ISO 15924 four-letter script codes.
>> (http://unicode.org/iso15924/iso15924-codes.html)
>>
>> As a result, I propose the following changes:
>>
>> In E.3.1, Script Tokens
>> (http://www.w3.org/TR/2007/REC-xforms-20071029/#mode-scripts,
>> or similar in whatever version of your spec that's actually affected),
>> add at the end of the first paragraph, the following sentence:
>>
>>>>>>
>> For scripts added to Unicode after version 3.2, use the four-letter
>> ISO 15924 script code, with the first letter in upper-case and the
>> remaining three letters in lower case.
>>>>>>
>
> If we want to add an example, we could do that. Here is a proposal:
> Add after the sentence above:
>
>>>>>
> For example, the script token for Tifinagh (used in North Africa)
> is Tfng.
>>>>>
>
>
>> Please add references as you see fit (different specs have somewhat
>> differing traditions on how much and what to add as references), but
>> here's a list of possible candidates:
>>
>> The standard itself, also available on the net at
>> http://unicode.org/iso15924/standard/index.html.
>> The registration authority Web page:
>> http://unicode.org/iso15924/.
>> The list of alphabetical codes:
>> http://unicode.org/iso15924/iso15924-codes.html
>>
>> Comments:
>> - The clause about case is necessary because these tokens are
>>  case-sensitive.
>> - There are currently two tokens with 4 letters, namely 'thai' and
>>  'user'. But because they are all lower-case, there is no potential
>>  for conflict.
>>
>> I think this is the fastest and cleanest (including future-proofness)
>> way to deal with this issue, and I'm sure that somebody in your
>> group can do the editing more quickly and safely than me, but I'd
>> be extremely glad to do some proofreading and cross-checking.
>>
>> Regards,    Martin.
>>
>> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
>> #-#-#  http://www.sw.it.aoyama.ac.jp      mailto:duerst@it.aoyama.ac.jp
>
> Regards,   Martin.
>
>
> #-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
> #-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
>
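
The rule proposed in the quoted message (a four-letter ISO 15924 code, first
letter upper-case, remaining three lower-case, compared case-sensitively)
can be illustrated with a short sketch. The function names below are
hypothetical and are not part of XForms or of any library. The check covers
only the shape of a token; it does not consult the ISO 15924 registry, so it
cannot tell whether a code is actually registered.

```python
import re

# The two existing four-letter tokens mentioned in the message.
# Both are all lower-case.
EXISTING_FOUR_LETTER_TOKENS = {"thai", "user"}

# Shape of the proposed tokens: one upper-case ASCII letter
# followed by exactly three lower-case ASCII letters.
_ISO_15924_SHAPE = re.compile(r"[A-Z][a-z]{3}")


def is_iso15924_style_token(token: str) -> bool:
    """Return True if token has the proposed shape (case-sensitive)."""
    return _ISO_15924_SHAPE.fullmatch(token) is not None


def to_script_token(code: str) -> str:
    """Normalize a four-letter script code to the proposed casing.

    Assumption: the caller has already looked the code up in the
    ISO 15924 list; this function only adjusts letter case.
    """
    if len(code) != 4 or not code.isascii() or not code.isalpha():
        raise ValueError(f"not a four-letter ASCII code: {code!r}")
    return code[0].upper() + code[1:].lower()


if __name__ == "__main__":
    print(to_script_token("TFNG"))
    print(is_iso15924_style_token("Tfng"))
    print(is_iso15924_style_token("thai"))
```

Because the comparison is case-sensitive, the existing tokens 'thai' and
'user' never match the proposed shape, which is the no-conflict point made
in the comments of the quoted message.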
Received on Thursday, 12 June 2008 14:24:09 UTC
