- From: CE Whitehead <cewcathar@hotmail.com>
- Date: Sat, 16 Apr 2011 14:44:37 -0400
- To: <tiro@tiro.com>, <xn--mlform-iua@xn--mlform-iua.no>
- CC: <asmusf@ix.netcom.com>, <fantasai.lists@inkedblade.net>, <cowan@mercury.ccil.org>, <www-international@w3.org>, <public-i18n-core@w3.org>, <public-i18n-indic@w3.org>, <public-i18n-cjk@w3.org>, <www-style@w3.org>
- Message-ID: <SNT142-w208027E48785A22CCD9E5CB3AF0@phx.gbl>
Hi,

I tend to think some font behavior may be more language-specific than script-specific (however, I have seen spacing used for emphasis in English, at least in the past, so it may not be just German). Here are some resources; I hope that at least one is useful.

The following Unicode conference report discusses the kashida (stretching) in Arabic:
http://www.scribd.com/doc/238961/Authentic-Arabic-Typography-Technical-and-Aesthetic-Challenges

"Text block justification in Arabic involves more than in Latin script. The latter has two distinct mechanisms for justification: a global one, which indiscriminately inserts micro-spaces and a specific one to hyphenate words according to elaborate rules that vary from language to language. Islamic calligraphy has a device called the keshideh, a Persian and Ottoman Turkish term meaning 'stretching'. Keshideh is typeface-dependent, as the hyphen is language-dependent. That is, to get aesthetically acceptable results, a keshideh is placed according to a complex set of rules giving priority to certain letter combinations over others. These rules vary between calligraphic styles. The result is characteristically different for each kind of Arabic script. In other words, the keshideh is the equivalent of hyphenation and not of micro-justification."

(I have seen the kashida used to distinguish repeated characters at the end of a word, such as in words ending with the suffix "iyyin" -- for example, "misrayyiin" 'Egyptian' from "misra" 'Egypt' + suffix. I have not paid much attention to formatting myself; however, I have seen heaping at least on my own. I would not notice that stretching was being used for text justification, so it is probably more common than word heaping.)

http://www.idnforums.com/forums/1031-watch-out-there-is-arabic-hyphen.html
This Egyptian speaker says that there is an Arabic hyphen, although I gather it must be rare. It is slightly curved -- but I cannot find an example that I can see.

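
To make the keshideh idea in the quotation above concrete, here is a minimal Python sketch. It is only an illustration under loud assumptions: it approximates the keshideh with the plain-text tatweel character (U+0640), it ignores diacritics and typeface, and its placement rule (stretch at the last joining point in the word) is an invented simplification, not the calligraphic priority rules the report describes. The names `stretch_word`, `is_arabic_letter` and `NO_FORWARD_JOIN` are hypothetical.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, used here as a stand-in for a keshideh

# Simplified list of letters that never connect to the following letter:
# hamza, the alef forms, the waw forms, teh marbuta, dal, thal, reh, zain.
NO_FORWARD_JOIN = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648"
)


def is_arabic_letter(ch):
    """True for basic Arabic letters (U+0621..U+064A), excluding tatweel."""
    return "\u0621" <= ch <= "\u064A" and ch != TATWEEL


def stretch_word(word, count):
    """Insert `count` tatweels at the last joining point in `word`, if any.

    Assumption: at most one stretch per word, placed at the last pair of
    letters that actually connect. Real rules depend on style and typeface.
    """
    if count <= 0:
        return word
    for i in range(len(word) - 2, -1, -1):
        left, right = word[i], word[i + 1]
        joins = (
            is_arabic_letter(left)
            and left not in NO_FORWARD_JOIN
            and is_arabic_letter(right)
            and right != "\u0621"  # isolated hamza joins on neither side
        )
        if joins:
            return word[: i + 1] + TATWEEL * count + word[i + 1 :]
    return word  # no joining point: leave the word unstretched


# "kitab" (kaf, teh, alef, beh): the stretch lands between teh and alef.
print(stretch_word("\u0643\u062A\u0627\u0628", 2))
```

A word such as "dar" (dal, alef, reh) has no joining point at all, so the sketch returns it unchanged; a justifier would have to find the extra width elsewhere on the line.
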
http://www.tug.org/TUGboat/tb27-2/tb87benatia.pdf
Arabic text justification
Mohamed Jamal Eddine Benatia, Mohamed Elyaakoubi and Azzeddine Lazrek
Department of Computer Science, Faculty of Science, University Cadi Ayyad, P.O. Box 2390, Marrakesh, Morocco
lazrek (at) ucam dot ac dot ma

Here it is again:
http://www.ucam.ac.ma/fssm/rydarab
Arabic Text Justification, M. J. E. Benaatia, M. Elyaakoubi and A. Lazrek (Department of Computer Science, Cadi Ayyad University, Marrakesh)

"Calligraphers also build on other practices for justification, such as:
  - word heaping: putting certain words above others
  - moving the broken fragment above the hyphenated word
  - word hyphenation
  - word hyphenation in margin
  - decreasing of some words at the end of a line
  - curving of the baseline"

(The authors give details about the use of the kashida. They mention moving the hyphenated fragment to the margin for the Holy Qur'an and show an example; this was discussed previously, I think, on some list. I am still unable to see the hyphens. I'm attaching one image of word heaping and hyphenation.)

http://books.google.com/books?id=mONRrqREIAAC&pg=PR8&lpg=PR8&dq=Arabic+hyphen&source=bl&ots=bfQg-0ldLB&sig=lMqm1bAN22TKFjUBrTF4lt38Xnw&hl=fr&ei=Bf-oTaHUIMfniALL28TvDA&sa=X&oi=book_result&ct=result&resnum=10&ved=0CFcQ6AEwCTgU#v=onepage&q=Arabic%20hyphen&f=false
A Dictionary of Post-classical Yemeni Arabic, Vol. 1. Explains some hyphenation formats peculiar to Hebrew script (an apostrophe-like form is used).

A less useful resource, but it backs up what you all say -- that hyphenation is possible in Arabic script though not really for the Arabic language:
http://omega.enstb.org/yannis/pdf/marrakech.pdf

"• if they are given by ligature nodes, once again we have two possibilities: when the ligature is not broken then we have a single static glyph.
When the ligature is broken we return to “characters” (or at least to something which is a bit closer to the concept of character, even though it is not exactly a character) and apply the main loop again to the two parts (before and after the break), which sometimes results in new ligatures. But once again each node list obtained that way is unique."

"Dynamic typesetting is a method of typesetting where glyphs can change during the process of line breaking, for reasons which may depend on macrotypographic properties such as justification of the line or of the entire paragraph, or more global phenomena like glyphs on subsequent lines touching each other or to avoid rivers, etc. The keshideh is a curved pen stroke of definite length that slightly stretches a letter-compound. (The illustration shows 3 measures of keshideh commonly used in Naskh). Within a word, or rather letter-compound, usually no more than one such keshideh occurs. Some letters produce their own prolonged forms, in which case . . . ruled out"

"Step 1: Hyphenation
It has been said over and over again that Arabic is not hyphenated. This is true when we refer to Arabic language, but false when we refer to Arabic script. Indeed, there is one language written in Arabic script, namely Uighur, which uses hyphenation just like any European language. Uighur may use the Arabic script but is . . . not a Semitic language and hence does not use implicit short vowels: all vowels are explicitly written and one can easily identify syllables and hyphenate words between them. Uighur is indeed hyphenated but if we add soft hyphen characters we risk obstruction of contextual analysis, which is the next step. It is easier to add potential breakpoints as texteme properties, as is done for other languages."

(I am still reading the report and should have some comments on that soon.)

Best,
C. E. Whitehead
cewcathar@hotmail.com
Received on Saturday, 16 April 2011 18:53:29 UTC