- From: Felix Sasaki <fsasaki@w3.org>
- Date: Thu, 28 Apr 2005 15:02:13 +0900
- To: Richard Ishida <ishida@w3.org>
- Cc: GEO <public-i18n-geo@w3.org>, fsasaki@w3.org
Hi Richard,

Here are some comments on the writing system tutorial:

- General: It might be helpful to start with an overview of your basic terms like script, language, character, ..., and their relation (e.g. a language can be written with different scripts and / or mixed scripts).

- "There are a few local characters, such as for Cantonese in Hong Kong, that are not in widespread use. In Chinese these ideographs are called hanzi. They are often referred to as Han characters ... Unicode supports over 70,000 Han characters." This is confusing.

- "Note that each of the large glyphs ..." You should introduce the glyph concept, or mention that you will explain it later.

- "... for grammatical particles and endings." -> "to express grammatical information like time, and various particles."

- Example from an Indic script?

- "Before getting into this section it is important to draw attention to the difference between characters and glyphs. A character is a semantic unit representing an indivisible unit of text in memory. A glyph is the visual representation of a character or sequence of characters." You could summarize the relations between character and glyph before going into detail: 1:1 (common(?) case), 1:n (e.g. historic variants of a character), n:1 (e.g. ligatures).

- "Vertically oriented text is still very common .." Delete "still".

- Section on word boundaries: The following is from a former co-worker in a TEI task force [1], John Smith [2], an Indologist. It might be another interesting example, although it is too long for the tutorial and it goes beyond character encoding: problems of this kind can only be solved with markup.

"difficulty arises within the Devanāgarī script in which Sanskrit is normally written. Devanāgarī is a syllabary, in which one syllable consists of any number of consonants (in practice, from zero to five) followed by one vowel followed optionally by ṃ or ḥ (anusvāra or visarga). If a word ends in a consonant, it therefore has to share a syllable with the next word, so that āsīd rājā (‘there was a king’) is written ā - sī - drā - jā. To make matters worse, sandhi (phonological change at word boundaries) may fuse two consecutive vowels together, so that, even ignoring orthography, the words can no longer be divided; for example tathā api (‘even so’) becomes tathāpi, where the single vowel ā is shared by two inseparable words."

Best regards,
Felix

[1] www.tei-c.org
[2] http://bombay.oriental.cam.ac.uk/

Btw, which tutorials are for www05?

Richard Ishida wrote:
>In preparation (still) for the WWW2005 tutorial day I have (finally) produced another tutorial entitled:
>
>Ruby Markup and Styling
>http://www.w3.org/International/tutorials/ruby/
>
>Comments can be sent.
>I haven't added the text views yet. I plan to review the wording again next week.
>
>Cheers,
>RI
>
>
>============
>Richard Ishida
>W3C
>
>contact info:
>http://www.w3.org/People/Ishida/
>
>W3C Internationalization:
>http://www.w3.org/International/
>
>Publication blog:
>http://people.w3.org/rishida/blog/
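The syllable rule in the John Smith quotation above (any number of consonants, one vowel, optionally anusvāra or visarga) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it works on the romanized transliteration rather than on Devanāgarī code points, the vowel inventory is a simplified one chosen for this example, and the function name `aksharas` is hypothetical.

```python
import re
import unicodedata

# Assumption: a simplified romanized vowel inventory, digraphs listed first
# so that "ai" and "au" are tried before plain "a".
VOWELS = ["ai", "au", "a", "ā", "i", "ī", "u", "ū", "ṛ", "ṝ", "ḷ", "e", "o"]
VOWEL = "|".join(VOWELS)

# One written syllable: zero or more non-vowel characters (consonants),
# then exactly one vowel, then an optional ṃ (anusvāra) or ḥ (visarga).
SYLLABLE = re.compile(rf"(?:(?!{VOWEL}).)*(?:{VOWEL})[ṃḥ]?")


def aksharas(text):
    """Split romanized Sanskrit into written syllables, ignoring word spaces."""
    # NFC composes "a" + combining macron into the single character "ā".
    s = unicodedata.normalize("NFC", text).replace(" ", "")
    parts = []
    pos = 0
    for match in SYLLABLE.finditer(s):
        parts.append(match.group())
        pos = match.end()
    if pos < len(s):
        # Trailing consonants with no following vowel are kept as a last unit.
        parts.append(s[pos:])
    return parts


print(aksharas("āsīd rājā"))  # ['ā', 'sī', 'drā', 'jā']
print(aksharas("tathāpi"))    # ['ta', 'thā', 'pi']
```

The first call shows the final consonant of āsīd moving into the syllable drā of the next word, so the word boundary falls inside a syllable. The second shows that after sandhi no split of the syllables recovers tathā and api, which is why the quotation argues that markup is needed.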
Received on Thursday, 28 April 2005 06:02:23 UTC