Re: Lists Module Comments (Technical Note)

From: Ian Hickson <ian@hixie.ch>
Date: Thu, 17 Oct 2002 19:21:37 +0000 (GMT)
To: Alexander Savenkov <w3@hotbox.ru>, Daniel Yacob <locales@geez.org>
Cc: "kode@hotbox.ru" <kode@hotbox.ru>, "www-style@w3.org" <www-style@w3.org>
Message-ID: <Pine.LNX.4.21.0210171844590.10805-100000@dhalsim.dreamhost.com>

On Thu, 17 Oct 2002, Alexander Savenkov wrote:
> 
>> Unfortunately, you do not provide detailed algorithms for your
>> cyrillic and ethiopic numbering styles and therefore they cannot be
>> added to the CSS3 Lists module at this stage. I considered adding the
>> coptic style, but could not fully understand the provided sample code
>> from a quick glance, and therefore decided to wait for a more easily
>> understandable description in English for this style as well.

> As for the Cyrillic numbering styles, all the necessary information is
> actually provided. The numbering algorithm for, say, 'lower-russian'
> is the same as for 'lower-alpha', i.e. all the list items are listed
> in the note. I really see no problem here.
> Please let us know exactly what details are needed.

Well, there is the suggestion at one point in the specification that
'cyrillic' repeats the last character ('many') which is different from
'lower-latin'. Also, your notes do not say which systems have a
significant zero value, or what the suffix character should be
(Ethiopic systems use U+1366, most others use U+002E).

In particular, you give upper-hexadecimal in the same table as
upper-alpha, even though they use different algorithms.

See also my answers to Daniel, below.


> ...until now none of us has seen the Generated Content Module.

The relevant part can be summarised as the CSS2 Generated Content chapter,
with 'content' applying to every element, and 'display:marker' replaced by
'::marker'. This is actually described in the CSS3 Lists module to some
extent.


>> I have added lower- and upper- armenian, however I have conflicting
>> information for the character(s) to use for 7000. Are you sure that
>> the 7000 digit (U+0582/U+0552) should be combined with the 600 digit
>> (U+0578/U+0548)?
>
> In fact the 7000 digit represents one of the Armenian letters and this

Due to ambiguities with words like "character" and "letter", I always try
to be explicit with codepoints. Is 7000 represented by two codepoints or
one? I know nothing about the Armenian language.


> has been checked a number of times. Can you tell us where you got
> different information, and what that information is?

It was probably based on information from Frank Tang's research.


On Thu, 17 Oct 2002, Daniel Yacob wrote:
>> 
>> Unfortunately, you do not provide detailed algorithms for your
>> cyrillic and ethiopic numbering styles and therefore they cannot be
>> added to the CSS3 Lists module at this stage. I considered adding the
> 
> I guess I don't know the sort of algorithm to provide here.  Where
> character lists are given to use a basic alphabetic list type I
> thought the algorithm (works as a numeral system where the radix is
> the list length) was implicit.  Can you send us the algorithm
> you have for lower-greek for us to use as a model?

Lower-greek is defined as an alphabetic system (a repeating numeric system
with no zero value), defined for all numbers greater than zero,
using codepoints U+03B1, U+03B2, U+03B3, U+03B4, U+03B5, U+03B6, U+03B7,
U+03B8, U+03B9, U+03BA, U+03BB, U+03BC, U+03BD, U+03BE, U+03BF, U+03C0,
U+03C1, U+03C3, U+03C4, U+03C5, U+03C6, U+03C7, U+03C8, U+03C9, with a
base of 24, a suffix of U+002E, and no exceptions.
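
To make that concrete, here is a minimal Python sketch of an alphabetic
system driven by the lower-greek codepoint list above. The names
LOWER_GREEK and alphabetic are only illustrative, and raising an error for
values outside the defined range is an assumption, since the definition
only covers numbers greater than zero.

```python
# Codepoints from the lower-greek definition: U+03B1..U+03C1 and
# U+03C3..U+03C9 (the final sigma, U+03C2, is skipped), 24 in total.
LOWER_GREEK = [chr(cp) for cp in (*range(0x03B1, 0x03C2),
                                  *range(0x03C3, 0x03CA))]


def alphabetic(n, glyphs, suffix="\u002E"):
    """Render n in an alphabetic system: base len(glyphs), no zero digit."""
    if n <= 0:
        # Assumption: the behaviour outside the defined range is not given.
        raise ValueError("alphabetic systems are defined for n > 0 only")
    base = len(glyphs)
    out = []
    while n > 0:
        n -= 1                      # shift so that 1 maps to the first glyph
        out.append(glyphs[n % base])
        n //= base
    return "".join(reversed(out)) + suffix


print(alphabetic(1, LOWER_GREEK))   # α.
print(alphabetic(24, LOWER_GREEK))  # ω.
print(alphabetic(25, LOWER_GREEK))  # αα.
```

The "n -= 1" step is what distinguishes this from an ordinary positional
system: there is no glyph for zero, so item 25 wraps to two first glyphs.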


>> coptic style, but could not fully understand the provided sample code
>> from a quick glance, and therefore decided to wait for a more easily
>> understandable description in English for this style as well.
> 
> Can you forward us what you have for the ethiopic-numeric style
> description in English?  I'd like to see it out of curiosity but
> would also use it to model an English description for coptic.

Below is a plain text version of the algorithm I've included for
ethiopic-numeric.

   The following algorithm converts decimal numbers to ethiopic numbers.

   1. Split the number into groups of two digits, starting with the least
      significant decimal digit.
   2. Number each group sequentially, starting from the least significant
      as group number zero.
   3. If the group has an odd number (as given in the previous step) and
      has the value 1, or if the group is the most significant one and has
      the value 1, or if the group has the value zero, then remove the
      digit (but leave the group, so it still has a separator appended
      below).
   4. For each remaining digit, substitute the relevant ethiopic character
      from the list below.

             Tens               Units
         Values Codepoints  Values Codepoints
             10 U+1372           1 U+1369
             20 U+1373           2 U+136A
             30 U+1374           3 U+136B
             40 U+1375           4 U+136C
             50 U+1376           5 U+136D
             60 U+1377           6 U+136E
             70 U+1378           7 U+136F
             80 U+1379           8 U+1370
             90 U+137A           9 U+1371

   5. For each group with an odd number (as given in the second step),
      append U+137B.
   6. For each group with an even number (as given in the second
      step), except the group with number 0, append U+137C.
   7. Concatenate the groups into one string.

   This system is defined for all numbers greater than zero. For zero
   and negative numbers, the decimal system is used instead.

   The suffix for the ethiopic-numeric numbering system is a dot, U+002E.
   [Is there a better suffix to use? The alphabetic ethiopic systems use
   a different suffix.]

   Examples:

   The decimal number 100, in ethiopic, is U+137B

   The decimal number 78010092, in ethiopic, is U+1378 U+1370 U+137B
   U+1369 U+137C U+137B U+137A U+136A.

   The decimal number 780100000092, in ethiopic, is U+1378 U+1370
   U+137B U+1369 U+137C U+137B U+137C U+137B U+137A U+136A.
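
The steps above can be sketched in Python as follows. This is only an
illustration that follows the seven steps literally; the function names
ethiopic_numeric and codepoints are made up for the example, and the
suffix is left out. Note that, read literally, step 3 also removes the
only digit of the number 1, so this sketch returns an empty string for it.

```python
# Step 4 table: tens 10..90 -> U+1372..U+137A, units 1..9 -> U+1369..U+1371.
TENS = {i: chr(0x1372 + i - 1) for i in range(1, 10)}
UNITS = {i: chr(0x1369 + i - 1) for i in range(1, 10)}
HUNDRED = "\u137B"       # appended to odd-numbered groups (step 5)
TEN_THOUSAND = "\u137C"  # appended to even-numbered groups except 0 (step 6)


def ethiopic_numeric(n):
    if n <= 0:
        return str(n)  # zero and negative numbers fall back to decimal
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits  # pad so every group has two digits
    # Step 1: groups of two digits, listed most significant first.
    pairs = [digits[i:i + 2] for i in range(0, len(digits), 2)]
    top = len(pairs) - 1
    out = []
    for pos, pair in enumerate(pairs):
        index = top - pos  # step 2: group number, 0 = least significant
        value = int(pair)
        odd = index % 2 == 1
        # Step 3: drop the digits but keep the group for its separator.
        drop = value == 0 or (value == 1 and (odd or index == top))
        text = ""
        if not drop:
            tens, units = divmod(value, 10)
            text = TENS.get(tens, "") + UNITS.get(units, "")
        if odd:
            text += HUNDRED
        elif index != 0:
            text += TEN_THOUSAND
        out.append(text)
    return "".join(out)  # step 7


def codepoints(s):
    """Show a string as U+XXXX codepoints, as in the examples."""
    return " ".join(f"U+{ord(c):04X}" for c in s)


print(codepoints(ethiopic_numeric(100)))
# U+137B
print(codepoints(ethiopic_numeric(78010092)))
# U+1378 U+1370 U+137B U+1369 U+137C U+137B U+137A U+136A
```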

This is one of the 16 algorithmic styles with their own custom
algorithms. There are also five generic algorithms: glyph for things
that are the same character for every number (e.g. disc), numeric for
systems that repeat a set of digits with a particular base with an
insignificant zero value (such as decimal, binary, cambodian),
alphabetic for similar systems with no zero value (lower-alpha,
amharic, upper-norwegian), symbolic for systems that go through
several characters and then repeat those characters once the list
runs out (footnotes), and non-repeating for systems that are merely
maps from decimal numbers to codepoints (circled-decimal).

The numeric and alphabetic systems are defined in terms of codepoints,
a base, a suffix, and any notes on the rendering. The symbolic systems
just need a codepoint list. The non-repeating systems are defined in
terms of a map from number to codepoint(s).
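
As a rough illustration of how little data these generic systems need,
here is a Python sketch of three of them. Everything in it is an
assumption made for the example rather than the module's wording: the
function names, the footnote symbol list, rendering the k-th pass of a
symbolic system by repeating the symbol k times, and falling back to
decimal when a non-repeating map has no entry.

```python
def numeric(n, digits):
    """Positional system: digits[0] is the zero digit, base is len(digits)."""
    if n < 0:
        raise ValueError("negative numbers are not handled in this sketch")
    base = len(digits)
    out = [digits[n % base]]
    n //= base
    while n > 0:
        out.append(digits[n % base])
        n //= base
    return "".join(reversed(out))


def symbolic(n, symbols):
    """Cycle through symbols; assumed: the k-th pass repeats the symbol k times."""
    if n <= 0:
        raise ValueError("symbolic systems are assumed to start at 1")
    passes, index = divmod(n - 1, len(symbols))
    return symbols[index] * (passes + 1)


def non_repeating(n, table):
    """Plain lookup; assumed: numbers missing from the map render as decimal."""
    return table.get(n, str(n))


# Circled numbers 1..20 occupy U+2460..U+2473.
CIRCLED = {i: chr(0x2460 + i - 1) for i in range(1, 21)}

print(numeric(10, "01"))           # 1010
print(symbolic(5, "*\u2020\u2021"))  # ††
print(non_repeating(3, CIRCLED))   # ③
print(non_repeating(21, CIRCLED))  # 21
```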

-- 
Ian Hickson                                      )\._.,--....,'``.    fL
"meow"                                          /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
Received on Thursday, 17 October 2002 15:21:40 GMT
