From: Daniel Yacob <locales@geez.org>

Date: Fri, 06 Dec 2002 16:09:45 -0500

To: bert@w3.org, ian@hixie.ch

Cc: www-style@w3.org

Message-Id: <E18KPjF-0004Nn-00@geez.org>

Greetings,

Some much-delayed comments on the Ethiopic list styles from the Nov-7 TR. The notes below make some corrections, answer questions in the TR, ask some new questions, and add some data.


TR Corrections
==============

While proofreading all the Ethiopic lists I discovered that U+1210 should not be in the Sidama list; it should be removed in the next draft.

In all of the halehame lists U+1330 should precede U+1338. They are reversed in the sidama, tigre and oromo lists. The reversal *is* correct in Blin.

The Afar list should include U+12F8 after U+12F0.

Suffix
------

Concerning the suffix for Ethiopic lists, there is no strong preference. I encounter U+002F more often than U+1366 (as used in the TR), but only slightly more often, so I don't think a strong preference for suffix choice can be demonstrated from the literature. This isn't critical so long as the CSS spec allows designers to set their suffix of choice. I've used U+002F in the algorithms below; feel free to change it.

Ethiopic-Numeric
----------------

Answering the boxed question in the TR, the best suffix for ethiopic-numeric is U+1361. In this case there is a strong suffix preference.

Otherwise a tweak is needed in the algorithm. All things considered, the simplest tweak would be to adjust step 3 to:

  3. If the group has an odd number (as given in the previous step) and has the value 1, or if the group is the most significant one and has the value 1, or if the group has the value zero, then remove the digit (but leave the group if the group has an even number, so it still has a separator appended below).

which adds the condition "...if the group has an even number" for holding onto a group. I've walked through this adjusted algorithm with the example numbers (Ian, remember 780000001092 was a typo for 780100000092) and it generates the proper values.
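For readers who want to trace the adjusted step 3, here is a minimal Python sketch. It is not the TR's text: the surrounding steps (grouping into pairs of decimal digits, numbering groups from zero at the least significant end, appending U+137B after odd-numbered groups and U+137C after even-numbered groups other than group zero) are assumptions taken from the ethiopic-numeric algorithm as later published in CSS Counter Styles, and the function name is hypothetical. The suffix is left out.

```python
# Assumed digit tables: U+1369..U+1371 are the units 1..9,
# U+1372..U+137A are the tens 10..90.
ONES = [""] + [chr(0x1369 + i) for i in range(9)]
TENS = [""] + [chr(0x1372 + i) for i in range(9)]
HUNDRED = "\u137B"        # separator after odd-numbered groups
TEN_THOUSAND = "\u137C"   # separator after even-numbered groups (not group 0)


def ethiopic_numeric(n: int) -> str:
    """Sketch of ethiopic-numeric with the adjusted step 3 (assumptions above)."""
    if n < 1:
        raise ValueError("defined only for numbers greater than zero")
    if n == 1:
        return ONES[1]

    # Split into groups of two decimal digits, least significant first.
    groups = []
    while n > 0:
        groups.append(n % 100)
        n //= 100

    parts = []
    last = len(groups) - 1
    for index, value in enumerate(groups):
        odd = index % 2 == 1
        # Adjusted step 3: drop the digits of a zero group, of an
        # odd-numbered group with value 1, and of a leading group with value 1.
        drop = value == 0 or (value == 1 and (odd or index == last))
        digits = "" if drop else TENS[value // 10] + ONES[value % 10]
        if odd:
            # A zero odd-numbered group is not kept, so it gets no separator.
            separator = HUNDRED if value != 0 else ""
        else:
            # An even-numbered group is kept even when empty.
            separator = TEN_THOUSAND if index != 0 else ""
        parts.append(digits + separator)

    # Most significant group first.
    return "".join(reversed(parts))
```

Under these assumptions, 780100000092 keeps two consecutive U+137C separators: one from the group holding the value 1 and one from the empty even-numbered group below it.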
Additional Ethiopic List Style Algorithms
=========================================

Concerning qualifications for the list styles that follow: they would be as valid as the styles other than ge'ez, amharic and tigrigna in the present draft proposal. Like the afar, oromo, sidama, somali, and tigre styles in the present draft, the information on the following is garnered from Ethiopic phonology tables and from literacy and orthography studies conducted over the last 20 years. So while a literacy commission proposed an Ethiopic character set for Afar, for example, whether or not the Afar later adopted the character set as shown I cannot say with any certainty. All of these groups, with no doubt whatsoever, have used Ethiopic at one time or another, but plausibly could have modified what was proposed in the referenced investigations. References for the list styles can be found here:

  http://www.ethiopic.org/Collation/OrderedLists.html

The blin style has been verified for list context by a Blin standards group. The Agaw, Harari, Me'en, and Silti list styles rely on references or direct communication with regional government offices that are only a few years old.

Omitted in the following are list styles for Bench and Sebatbeit. These languages require characters that are not yet in Unicode and likely will not be before version 6.0. They could however be supported now using a subset of the available Unicode characters, but some note would have to be made that the list styles are subject to a later revision. What would policy dictate here?

agaw is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1210, U+1218, U+1228, U+1230, U+1238, U+1240, U+1250, U+1260, U+1268, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12B8, U+12C8, U+12D0, U+12D8, U+12E0, U+12E8, U+12F0, U+1300, U+1308, U+1318, U+1320, U+1328, U+1330, U+1338, U+1348, U+1350, with a base of 31, a suffix of U+002F, and no exceptions.

ari is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1260, U+1268, U+1270, U+1278, U+1290, U+12A0, U+12A8, U+12B8, U+12C8, U+12D0, U+12D8, U+12E0, U+12E8, U+12F0, U+12F8, U+1300, U+1308, U+1328, U+1340, U+1350, with a base of 26, a suffix of U+002F, and no exceptions.

blin is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1210, U+1218, U+1230, U+1238, U+1228, U+1240, U+1250, U+1260, U+1270, U+1290, U+12A0, U+12A8, U+12B8, U+12C8, U+12D0, U+12E8, U+12F0, U+1300, U+1308, U+1318, U+1320, U+1328, U+1348, U+12D8, U+12E0, U+1278, U+1298, U+1338, U+1330, U+1350, U+1268, with a base of 33, a suffix of U+002F, and no exceptions.

dizi is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12C8, U+12D8, U+12E0, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1338, U+1340, U+1348, with a base of 26, a suffix of U+002F, and no exceptions.

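To make "alphabetic system (numeric repeating with no insignificant 0 value)" concrete, here is a minimal Python sketch using the ari codepoints and the U+002F suffix from the definition above. The function and constant names are hypothetical, and the conversion is the usual bijective base-N scheme, which I am assuming is what the TR's alphabetic systems intend.

```python
# Codepoints copied from the ari definition (26 symbols, base 26).
ARI = [
    0x1200, 0x1208, 0x1218, 0x1228, 0x1230, 0x1238, 0x1260, 0x1268,
    0x1270, 0x1278, 0x1290, 0x12A0, 0x12A8, 0x12B8, 0x12C8, 0x12D0,
    0x12D8, 0x12E0, 0x12E8, 0x12F0, 0x12F8, 0x1300, 0x1308, 0x1328,
    0x1340, 0x1350,
]
SUFFIX = "\u002F"  # "/", the suffix used in these definitions


def alphabetic_marker(n: int, codepoints=ARI, suffix=SUFFIX) -> str:
    """Return the list marker for item n (n >= 1) in an alphabetic system."""
    if n < 1:
        raise ValueError("alphabetic systems are defined only for n >= 1")
    base = len(codepoints)
    symbols = []
    while n > 0:
        n -= 1                      # no symbol stands for zero
        symbols.append(chr(codepoints[n % base]))
        n //= base
    return "".join(reversed(symbols)) + suffix
```

With this list, item 26 is the last single-symbol marker (U+1350) and item 27 wraps around to U+1200 U+1200, each followed by the suffix.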
gedeo is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1290, U+12A0, U+12A8, U+12C8, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1330, U+1338, U+1348, U+1350, with a base of 24, a suffix of U+002F, and no exceptions.

gumuz is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1210, U+1208, U+1210, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1268, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12C8, U+12D0, U+12D8, U+12E0, U+12E8, U+12F0, U+12F8, U+1308, U+1328, U+1330, U+1340, U+1350, with a base of 29, a suffix of U+002F, and no exceptions.

hadiyya is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1290, U+12A0, U+12A8, U+12C8, U+12D8, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1330, U+1348, U+1350, with a base of 24, a suffix of U+002F, and no exceptions.

harari is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1210, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12B8, U+12C8, U+12E0, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1348, with a base of 24, a suffix of U+002F, and no exceptions.

kaffa is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1210, U+1218, U+1220, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1280, U+1290, U+12A0, U+12A8, U+12C8, U+12D0, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1330, U+1348, U+1350, with a base of 27, a suffix of U+002F, and no exceptions.

kebena is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1290, U+12A0, U+12A8, U+12C8, U+12D0, U+12D8, U+12E0, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1330, U+1348, U+1350, with a base of 26, a suffix of U+002F, and no exceptions.

kembata is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1268, U+1270, U+1278, U+1290, U+12A0, U+12A8, U+12C8, U+12D8, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1330, U+1348, U+1350, with a base of 25, a suffix of U+002F, and no exceptions.

konso is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12B8, U+12C8, U+12D0, U+12E8, U+12F0, U+1300, U+1348, U+1350, with a base of 22, a suffix of U+002F, and no exceptions.

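Since every definition states both a codepoint list and a base, the two can be cross-checked mechanically before the lists go into a draft. The following Python sketch is one way to do that; the helper name and the message wording are hypothetical, and the konso list is copied from the definition above.

```python
def check_list_style(name, codepoints, stated_base):
    """Return a list of human-readable problems found in one definition."""
    problems = []
    # The base of an alphabetic system should equal the number of symbols.
    if len(codepoints) != stated_base:
        problems.append(
            f"{name}: {len(codepoints)} codepoints but base {stated_base}"
        )
    # A symbol listed twice would make two different items look identical.
    seen = set()
    for cp in codepoints:
        if cp in seen:
            problems.append(f"{name}: U+{cp:04X} appears more than once")
        seen.add(cp)
    return problems


# Codepoints copied from the konso definition (stated base: 22).
KONSO = [
    0x1200, 0x1208, 0x1218, 0x1228, 0x1230, 0x1238, 0x1240, 0x1260,
    0x1270, 0x1278, 0x1290, 0x1298, 0x12A0, 0x12A8, 0x12B8, 0x12C8,
    0x12D0, 0x12E8, 0x12F0, 0x1300, 0x1348, 0x1350,
]

if __name__ == "__main__":
    for line in check_list_style("konso", KONSO, 22):
        print(line)
```

An empty result only means the list is internally consistent; it says nothing about whether the characters are the right ones for the language.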
kunama is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1260, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12B8, U+12C8, U+12E8, U+12F0, U+1300, U+1308, U+1348, with a base of 20, a suffix of U+002F, and no exceptions.

meen is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1280, U+1290, U+1298, U+12A0, U+12A8, U+12C8, U+12D8, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1330, U+1350, U+12F8, U+1340, with a base of 27, a suffix of U+002F, and no exceptions.

saho is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1210, U+1218, U+1228, U+1230, U+1240, U+1260, U+1270, U+1290, U+12A0, U+12A8, U+12C8, U+12D0, U+12D8, U+12E8, U+12F0, U+1308, U+1320, U+1328, U+1330, U+1338, U+1348, with a base of 23, a suffix of U+002F, and no exceptions.

silti is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12B8, U+12C8, U+12D8, U+12E0, U+12E8, U+12F0, U+1300, U+1308, U+1320, U+1328, U+1330, U+1348, U+1350, with a base of 27, a suffix of U+002F, and no exceptions.

wolaita is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12C8, U+12D8, U+12E0, U+12E8, U+12F0, U+12F8, U+1230, U+1308, U+1320, U+1328, U+1330, U+1338, U+1340, U+1348, U+1350, with a base of 29, a suffix of U+002F, and no exceptions.

yemsa is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+1200, U+1208, U+1218, U+1228, U+1230, U+1238, U+1240, U+1260, U+1268, U+1270, U+1278, U+1290, U+1298, U+12A0, U+12A8, U+12C8, U+12D8, U+12E0, U+12E8, U+12F0, U+1300, U+1308, U+1318, U+1320, U+1328, U+1330, U+1348, U+1350, with a base of 28, a suffix of U+002F, and no exceptions.


Non-Ethiopic Ethiopian List Styles
==================================

Qubee List Styles
-----------------

Qubee is a writing system based on the Roman alphabet, used by the Oromo of Ethiopia in the regional government, legal and school systems. Qubee dates back to the 70s, but official use did not begin until the last change in the national government at the start of the 90s.

The Qubee Alphabet (also the collation order):

  A, AA, B, C, D, E, EE, F, G, H, I, II, J, K, L, M, N, O, OO, P, Q, R, S, T, U, UU, V, W, X, Y, Z, CH, DH, KH, NY, PH, SH

"V" and "Z" are kept for non-Oromo transcriptions. "KH" after "DH" is also added for transcriptions from Arabic.

A complication appears when a list length exceeds the base size. For example, a value of 38 for upper-oromo-qubee would be "AA" which, out of context, could be mistaken for the list item value of 2. There is no existing rule for how to avoid this conflict. Suggested solutions are either to do nothing and rely on the context (assume list values do not appear in isolation away from the list) or to add a space or punctuation as a delimiter.
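The collision described above is easy to reproduce. This Python sketch is illustrative only: the symbol list is the uppercase Qubee alphabet as given above, the function name and the delimiter parameter are hypothetical, and the list suffix is left out so that only the cycle delimiter is visible.

```python
# Uppercase Qubee alphabet in collation order (37 symbols).
QUBEE_UPPER = [
    "A", "AA", "B", "C", "D", "E", "EE", "F", "G", "H", "I", "II", "J",
    "K", "L", "M", "N", "O", "OO", "P", "Q", "R", "S", "T", "U", "UU",
    "V", "W", "X", "Y", "Z", "CH", "DH", "KH", "NY", "PH", "SH",
]


def qubee_marker(n: int, delimiter: str = "") -> str:
    """Marker text for item n (n >= 1), with an optional cycle delimiter."""
    if n < 1:
        raise ValueError("defined only for numbers greater than zero")
    base = len(QUBEE_UPPER)
    parts = []
    while n > 0:
        n -= 1                        # alphabetic systems have no zero symbol
        parts.append(QUBEE_UPPER[n % base])
        n //= base
    # Each part is one symbol of the alphabet; the delimiter marks where
    # one symbol ends and the next begins.
    return delimiter.join(reversed(parts))


if __name__ == "__main__":
    print(qubee_marker(2))        # the digraph for item 2
    print(qubee_marker(38))       # "A" followed by "A", same text as item 2
    print(qubee_marker(38, ","))  # the delimiter separates the two symbols
```

Without a delimiter, items 2 and 38 produce the same string; with a comma they differ, at the cost of a marker that no longer looks like plain letters.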
Relying on the list context is preferred and would be easiest to implement, since it requires no special treatment; otherwise a comma (U+002C, ",") is the suggested cycle delimiter.

upper-oromo-qubee is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+0041, U+0041U+0041, U+0042, U+0043, U+0044, U+0045, U+0045U+0045, U+0046, U+0047, U+0048, U+0049, U+0049U+0049, U+004A, U+004B, U+004C, U+004D, U+004E, U+004F, U+004FU+004F, U+0050, U+0051, U+0052, U+0053, U+0054, U+0055, U+0055U+0055, U+0056, U+0057, U+0058, U+0059, U+005A, U+0043U+0048, U+0044U+0048, U+004BU+0048, U+004EU+0059, U+0050U+0048, U+0053U+0048, with a base of 37, a suffix of U+002E, and no exceptions.

lower-oromo-qubee is defined as an alphabetic system (numeric repeating with no insignificant 0 value) defined for all positive numbers greater than zero, using codepoints U+0061, U+0061U+0061, U+0062, U+0063, U+0064, U+0065, U+0065U+0065, U+0066, U+0067, U+0068, U+0069, U+0069U+0069, U+006A, U+006B, U+006C, U+006D, U+006E, U+006F, U+006FU+006F, U+0070, U+0071, U+0072, U+0073, U+0074, U+0075, U+0075U+0075, U+0076, U+0077, U+0078, U+0079, U+007A, U+0063U+0068, U+0064U+0068, U+006BU+0068, U+006EU+0079, U+0070U+0068, U+0073U+0068, with a base of 37, a suffix of U+002E, and no exceptions.

Received on Friday, 6 December 2002 15:49:20 GMT

