- From: Ian Hickson <ian@hixie.ch>
- Date: Thu, 17 Oct 2002 19:21:37 +0000 (GMT)
- To: Alexander Savenkov <w3@hotbox.ru>, Daniel Yacob <locales@geez.org>
- Cc: "kode@hotbox.ru" <kode@hotbox.ru>, "www-style@w3.org" <www-style@w3.org>
On Thu, 17 Oct 2002, Alexander Savenkov wrote:
>
>> Unfortunately, you do not provide detailed algorithms for your
>> cyrillic and ethiopic numbering styles and therefore they cannot be
>> added to the CSS3 Lists module at this stage. I considered adding the
>> coptic style, but could not fully understand the provided sample code
>> from a quick glance, and therefore decided to wait for a more easily
>> understandable description in English for this style as well.
>
> What about Cyrillic numbering styles all the necessary information is
> provided actually. The numbering algorithm for, say, 'lower-russian'
> is the same as for 'lower-alpha', i. e. all the list items are listed
> in the note. I really see no problem here.
>
> Please let us know what details exactly are needed.

Well, there is the suggestion at one point in the specification that
'cyrillic' repeats the last character ('many'), which is different from
'lower-latin'. Also, your notes do not say which systems have a
significant zero value, or what the suffix character should be (Ethiopic
systems use U+1366, most others use U+002E). In particular, you give
upper-hexadecimal in the same table as upper-alpha, even though they use
different algorithms.

See also my answers to Daniel, below.

> ...until now none of us has seen the Generated Content Module.

The relevant part can be summarised as the CSS2 Generated Content
chapter, with 'content' applying to every element, and 'display:marker'
replaced by '::marker'. This is actually described in the CSS3 Lists
module to some extent.

>> I have added lower- and upper- armenian, however I have conflicting
>> information for the character(s) to use for 7000. Are you sure that
>> the 7000 digit (U+0582/U+0552) should be combined with the 600 digit
>> (U+0578/U+0548)?
>
> In fact the 7000 digit represents one of the Armenian letters and this

Due to ambiguities with words like "character" and "letter", I always
try to be explicit with codepoints. Is 7000 represented by two
codepoints or one? I know nothing about the Armenian language.

> has been checked for a number of times. Can you tell us where did you
> get different information and what is this information?

It was probably based on information from Frank Tang's research.

On Thu, 17 Oct 2002, Daniel Yacob wrote:
>>
>> Unfortunately, you do not provide detailed algorithms for your
>> cyrillic and ethiopic numbering styles and therefore they cannot be
>> added to the CSS3 Lists module at this stage. I considered adding the
>
> I guess I don't know the sort of algorithm to provide here. Where
> character lists are given to use a basic alphabetic list type I
> thought the algorithm (works as a numeral system where the radix is
> the list length) was implicit. Can you send us the algorithm
> you have for lower-greek for us to use as a model?

Lower-greek is defined as an alphabetic system (numeric repeating with
no insignificant 0 value) defined for all numbers greater than zero,
using codepoints U+03B1, U+03B2, U+03B3, U+03B4, U+03B5, U+03B6, U+03B7,
U+03B8, U+03B9, U+03BA, U+03BB, U+03BC, U+03BD, U+03BE, U+03BF, U+03C0,
U+03C1, U+03C3, U+03C4, U+03C5, U+03C6, U+03C7, U+03C8, U+03C9, with a
base of 24, a suffix of U+002E, and no exceptions.

>> coptic style, but could not fully understand the provided sample code
>> from a quick glance, and therefore decided to wait for a more easily
>> understandable description in English for this style as well.
>
> Can you forward us what you have for the ethiopic-numeric style
> description in english? I'd like to see it out of curiousity but
> would also use it to model an english description for coptic.

Below is a plain text version of the algorithm I've included for
ethiopic-numeric.

   The following algorithm converts decimal digits to ethiopic numbers.

   1. Split the number into groups of two digits, starting with the
      least significant decimal digit.

   2. Number each group sequentially, starting from the least
      significant as group number zero.

   3. If the group has an odd number (as given in the previous step)
      and has the value 1, or if the group is the most significant one
      and has the value 1, or if the group has the value zero, then
      remove the digit (but leave the group, so it still has a
      separator appended below).

   4. For each remaining digit, substitute the relevant ethiopic
      character from the list below.

            Tens                  Units
        Values  Codepoints    Values  Codepoints
          10     U+1372          1     U+1369
          20     U+1373          2     U+136A
          30     U+1374          3     U+136B
          40     U+1375          4     U+136C
          50     U+1376          5     U+136D
          60     U+1377          6     U+136E
          70     U+1378          7     U+136F
          80     U+1379          8     U+1370
          90     U+137A          9     U+1371

   5. For each group with an odd number (as given in the second step),
      append U+137B.

   6. For each group with an even number (as given in the second step),
      except the group with number 0, append U+137C.

   7. Concatenate the groups into one string.

   This system is defined for all numbers greater than zero. For zero
   and negative numbers, the decimal system is used instead.

   The suffix for the ethiopic-numeric numbering systems is a dot,
   U+002E. [Is there a better suffix to use? The alphabetic ethiopic
   systems use a different suffix.]

   Examples:

   The decimal number 100, in ethiopic, is U+137B.

   The decimal number 78010092, in ethiopic, is U+1378 U+1370 U+137B
   U+1369 U+137C U+137B U+137A U+136A.

   The decimal number 780100000092, in ethiopic, is U+1378 U+1370
   U+137B U+1369 U+137C U+137B U+137C U+137B U+137A U+136A.

This is one of the 16 algorithmic styles with their own custom
algorithms. There are also five generic algorithms: glyph for things
that are the same character for every number (e.g. disc), numeric for
systems that repeat a set of digits with a particular base with an
insignificant zero value (such as decimal, binary, cambodian),
alphabetic for similar systems with no zero value (lower-alpha, amharic,
upper-norwegian), symbolic for systems that go through several
characters then repeat the characters if they are reused (footnotes),
and non-repeating for systems that are merely maps from decimal numbers
to codepoints (circled-decimal).

The numeric and alphabetic systems are defined in terms of codepoints, a
base, a suffix, and any notes on the rendering. The symbolic systems
just need a codepoint list. The non-repeating systems are defined in
terms of a map from number to codepoint(s).

-- 
Ian Hickson                                      )\._.,--....,'``.    fL
"meow"                                          /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
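
The ethiopic-numeric steps in the message above can be sketched in
Python as follows. This is only an illustration of the steps as they
are written there, not a definitive implementation. The names
ethiopic_numeric and codepoints are made up for this sketch, and the
special case for the number 1 is an assumption: the message does not
mention it, but the steps as written would otherwise produce an empty
string for 1. The U+002E suffix is not appended here.

```python
# Digit tables from step 4: 10..90 map to U+1372..U+137A,
# 1..9 map to U+1369..U+1371.
TENS = {i: chr(0x1371 + i) for i in range(1, 10)}
UNITS = {i: chr(0x1368 + i) for i in range(1, 10)}
ODD_SEPARATOR = "\u137B"   # appended to odd-numbered groups (step 5)
EVEN_SEPARATOR = "\u137C"  # appended to even-numbered groups except 0 (step 6)


def ethiopic_numeric(n: int) -> str:
    """Convert n following the seven steps as written in the message."""
    if n <= 0:
        # Zero and negative numbers fall back to decimal.
        return str(n)
    if n == 1:
        # Assumption, not in the message: avoid an empty result for 1.
        return UNITS[1]

    # Steps 1 and 2: two-digit groups, least significant first, so the
    # list index is the group number.
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits
    groups = [int(digits[i:i + 2]) for i in range(len(digits) - 2, -1, -2)]
    top = len(groups) - 1

    parts = []
    for index, value in enumerate(groups):
        odd = index % 2 == 1
        # Step 3: drop the digits but keep the group for its separator.
        drop = value == 0 or (value == 1 and (odd or index == top))
        text = ""
        if not drop:
            # Step 4: substitute the tens and units characters.
            tens, units = divmod(value, 10)
            text = TENS.get(tens, "") + UNITS.get(units, "")
        if odd:
            text += ODD_SEPARATOR
        elif index != 0:
            text += EVEN_SEPARATOR
        parts.append(text)

    # Step 7: concatenate, most significant group first.
    return "".join(reversed(parts))


def codepoints(text: str) -> str:
    """Render a string in the U+XXXX notation used in the message."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    for number in (100, 78010092):
        print(number, codepoints(ethiopic_numeric(number)))
```

Run as a script, this prints the codepoint sequences for 100 and
78010092 so they can be compared against the examples in the message.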
Received on Thursday, 17 October 2002 15:21:40 UTC