Re: Review comments on Indic layout doc from Somnath Chandra on 2014-12-10 (public-i18n-indic@w3.org from October to December 2014)

From: Somnath Chandra <schandra@deity.gov.in>
Date: Wed, 10 Dec 2014 16:29:53 +0530
To: Richard Ishida <ishida@w3.org>
Cc: indic <public-i18n-indic@w3.org>, slata <slata@mit.gov.in>, prashant verma <vermaprashant1@gmail.com>
Message-id: <fc31f1139305.54887501@nic.in>

Hello Richard,

We have modified the document based on feedback received. Kindly find enclosed the latest document. The images have been given in a separate folder and the images are copyright-free. The older images may be overwritten with these new images.

Looking forward to hearing from you for further guidance, so that the document may be published as an FPWD.

With best regards,
Sincerely

Somnath



On 11/24/14 05:52 PM, Richard Ishida <ishida@w3.org> wrote:
> 
> good to hear from you Somnath.
> 
> I'm out of the office today, so i'll work on this tomorrow.
> 
> cheers,
> ri
> 
> 
> On 24/11/2014 09:48, Somnath Chandra wrote:
> >Dear Richard,
> >
> >Thanks for your valuable feedback.
> >
> >We have tried to minimize the errors.
> >
> >We are sending the HTML file for your kind reference and further
> >necessary suggestions.
> >
> >Looking forward to hearing from you for guidance, so that the document
> >can be published as an FPWD.
> >
> >regards,
> >
> >Dr. Somnath Chandra
> >Scientist-E
> >Dept. of Electronics & Information Technology
> >Ministry of Communications & Information Technology
> >Govt. of India
> >Tel:+91-11-24364744,24301856
> >Fax: +91-11-24363099
> >e-mail :schandra@mit.gov.in
> >
> >
> >
> >On 09/30/14 01:16 AM, Richard Ishida <ishida@w3.org> wrote:
> >>These are review comments on
> >>http://www.w3.org/International/docs/indic-layout/
> >>
> >>
> >>
> >>First, structure of the document.
> >>
> >>I suggest that section 5, ABNF segmentation, be moved to immediately
> >>after the introduction, since it is central to much of the rest of the
> >>document.
> >>
> >>The title 'Issues in Indic Layout' is a throwback to a previous
> >>version of the document. I think that if we keep that heading, we
> >>should change it to "Requirements for Indic Layout". However, the
> >>whole of the document is about requirements for indic layout, so I
> >>suggest that we adopt the following organisation:
> >>
> >>Introduction
> >>Units of text in Indic Scripts
> >>Text segmentation
> >>Indic Syllable boundaries (this is the current ABNF section)
> >>Line breaking
> >>First letter styling
> >>Letter spacing
> >>Vertical arrangements...
> >>Collation
> >>then the end matter
> >>
> >>====
> >>
> >>
> >>
> >>
> >>Now some more detailed comments, per section.
> >>
> >>
> >>
> >>Section 1.1 Indic language complexities
> >>
> >>[1] the document should indicate what SI No means
> >>
> >>[2] in addition to the link to South-Asian-Scripts, i suggest pointing
> >>the reader to Unicode Technical Note #10, An Introduction to Indic
> >>Scripts, the latest version of which is to be found at
> >>http://rishida.net/scripts/indic-overview/
> >>
> >>[3] fig 1's picture is 3579x4493 pixels - far too big to be included
> >>in the document, and I've had problems downloading it even on the
> >>desktop. We should create a smaller version. And by the way, are
> >>there any copyright issues in using it?
> >>
> >>
> >>
> >>Section 1.2 Basic components of Indian languages
> >>
> >>[1] section 1.2.1: "Unicode uses a 16 bit encoding that provides code
> >>point for more than 65000 characters (65536)." This is waaay out of
> >>date. Unicode has over 1 million codepoints available. Please correct.
> >>
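To make the point in comment [1] concrete, here is a minimal Python sketch of the size of the Unicode code space. The constant and function names are illustrative only and do not come from the reviewed document; the Kaithi code point is simply an example of an Indic character outside the first 65,536 code points.

```python
import sys

# Unicode code points run from U+0000 to U+10FFFF.
MAX_CODE_POINT = 0x10FFFF
TOTAL_CODE_POINTS = MAX_CODE_POINT + 1   # 1,114,112 code points
BMP_CODE_POINTS = 0x10000                # 65,536: what a single 16-bit unit can address


def utf16_units(ch: str) -> int:
    """Return how many 16-bit code units UTF-16 needs for one character."""
    return len(ch.encode("utf-16-le")) // 2


print(TOTAL_CODE_POINTS)            # 1114112
print(sys.maxunicode == MAX_CODE_POINT)
print(utf16_units("\u0938"))        # DEVANAGARI LETTER SA: one 16-bit unit
print(utf16_units("\U00011083"))    # a code point in the Kaithi block: two units (surrogate pair)
```

A 16-bit unit on its own covers only the first 65,536 code points; UTF-16 reaches the rest with pairs of units.
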
> >>[2] fig 3: again, we should check it's ok to use this
> >>
> >>[3] section 1.2.4: "This section provides the basic alphabet system of
> >>Devanagari Script i.e Consonants, Vowels, Modifiers, Matras, Halant,
> >>Nukta etc." should probably say "the basic alphabetic system of
> >>Devanagari script as used for Hindi"
> >>
> >>[4] to be consistent, we should explain the function of the visarga
> >>and the halant (and probably mention that the latter is called virama
> >>by Unicode).
> >>
> >>[5] section 1.2.1, CLDR: "It is a part of the W3C and Unicode Standard."
> >>It's not a W3C standard.
> >>
> >>
> >>
> >>Section 2.1 First letter
> >>
> >>[1] The first para, except the last sentence, and the para immediately
> >>after the 2 pictures are CSS-specific, and so should be removed from
> >>this document. (They may be useful in the other document that will map
> >>the requirements to technology in order to point out the delta.)
> >>
> >>[2] "the sequence of characters in the first syllable is as follows in
> >>memory:"
> >>I suggest:
> >>"the sequence of characters in the first syllable as stored in memory
> >>is as shown at the top of Figure 4."
> >>
> >>[3] "There are two default grapheme clusters here. The first includes
> >>the SA+VIRAMA+THA+I. (The second is the last two characters, T+II.)"
> >>That is incorrect. There are three grapheme clusters (which is why
> >>this is problematic, of course): SA+VIRAMA, THA+I and TA+II.
> >>
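The three-cluster split described in comment [3] can be illustrated with a rough Python sketch. It is a simplified approximation, not the full Unicode text segmentation algorithm: it only attaches each combining mark (general category M*) to the preceding base character. The function name is hypothetical, and the code points assume that "TA" means U+0924 DEVANAGARI LETTER TA.

```python
import unicodedata


def simple_clusters(text: str) -> list[str]:
    """Group each base character with the combining marks that follow it.

    Simplified stand-in for default grapheme clusters: a character whose
    general category starts with 'M' (Mn, Mc, Me) joins the previous cluster.
    """
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.category(ch).startswith("M"):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# SA + VIRAMA + THA + I + TA + II (assumed code points, see lead-in)
first_syllables = "\u0938\u094D\u0925\u093F\u0924\u0940"

for cluster in simple_clusters(first_syllables):
    print(" + ".join(unicodedata.name(ch) for ch in cluster))
```

Under this approximation the virama stays with SA and does not pull THA into the same cluster, so the conjunct that a reader sees as one unit is split across two clusters. That is why first-letter styling by grapheme cluster is problematic here.
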
> >>
> >>
> >>Section 2.2 Letter Spacing
> >>
> >>[1] <h3> markup is used for many lines of text in this section where it
> >>is not appropriate. Please remove/fix.
> >>
> >>[2] Fig 6: I suggest turning this picture into prose, so that
> >>explanations can be added. There appear to be 3 approaches
> >>illustrated: one is segmentation by grapheme cluster, another by
> >>syllable, and I'm not at all sure what the third one is.
> >>
> >>[3] I think the document needs to be clearer about which of the three
> >>approaches just mentioned are actually appropriate (all? some?), and
> >>give some idea of the frequency of use and what is preferred, or if
> >>that information is not available, to at least say so clearly.
> >>
> >>
> >>Section 3 Text segmentation
> >>
> >>[1] I think the following text would naturally sit after the rest of
> >>the text in this section:
> >>
> >>"Word boundaries are used in a number of different contexts. The most
> >>familiar ones are selection (double-click mouse selection, or “move to
> >>next word” control-arrow keys), and “Whole Word Search” for search and
> >>replace. They are also used in database queries, to determine whether
> >>elements are within a certain number of words of one another. Some
> >>special sentence boundaries like the double poorna virama, possibly
> >>with numbers (as in Sanskrit text, shlokas etc.)"
> >>
> >>[2] What is the requirement wrt word boundaries? This isn't clear.
> >>
> >>[3] "Solution :
> >>Grapheme Cluster Boundaries: Indic Syllable definition [See section 5]
> >>Possible Extension for handling some cases Mouse Selection: At Indic
> >>syllable and code point level"
> >>
> >>This needs significant expansion. (Note that grapheme cluster
> >>boundaries are not equivalent to the syllable definition.)
> >>
> >>[4] Give a picture of the danda. What about the double danda?
> >>
> >>[5] "The precise determination of text elements may vary according to
> >>orthographic conventions for a given script or language."
> >>In some scripts it also depends on the operation being applied by the
> >>application. Is that the case for Hindi?
> >>
> >>
> >>
> >>Section 4 Line breaking
> >>
> >>[1] "Hyphens are used when a word remains incomplete at the end of a
> >>line while writing or when specifying a range."
> >>This sentence is ambiguous wrt what follows. I suggest just dropping it.
> >>
> >>[2] "Rule 2: The definition of Indic syllable may be used to break the
> >>line and a hyphen should be at the breaking point so that word can be
> >>read intuitively"
> >>Can a Hindi word be broken at any syllable boundary? If so, we should
> >>say so.
> >>
> >>
> >>
> >>Section 5 ABNF ...
> >>
> >>[1] "needs to be evolved" -> "is provided here"
> >>
> >>[2] "V(upper case) is complete vowel"
> >>I think the generally used term is 'independent vowel', no?
> >>
> >>
> >>
> >>Section 6 Contributors
> >>
> >>[1] please remove the baroque styling.
> >>
> >>
> >>
> >>Other editorial
> >>
> >>There are still many editorial nits, such as spaces before
> >>punctuation, 'is' instead of 'are', words running together, missing 's'
> >>in plurals, etc. It would be good to clean these up as we work through
> >>the text.
> >>
> >>
> >>
> >>Hope that helps,
> >>ri
> >>
> 
> 
-- 

Dr. Somnath Chandra
Scientist-E
Dept. of Electronics & Information Technology
Ministry of Communications & Information Technology
Govt. of India
Tel:+91-11-24364744,24301856
Fax: +91-11-24363099
e-mail :schandra@mit.gov.in
Attachments

application/x-zip-compressed attachment: WII.zip
Received on Wednesday, 10 December 2014 11:00:42 UTC